At Peloton, my work has gradually, and then all at once, become about AI. I’ve spent the last two years building systems at the boundary of platform engineering and applied AI: productionising hybrid vector and keyword search, designing MCP servers, shipping one of the first apps on ChatGPT’s Apps SDK, and building agent platforms that operate across some genuinely unwieldy codebases.
What I’ve found is that most of the hard problems in AI aren’t about the model. They’re the same problems software engineering has always had: latency, reliability, cost, observability, and what happens when your system meets real users. I’m interested in applying rigorous engineering fundamentals (caching strategies, inference optimization, CI/CD, infrastructure design) to the layer that sits between a capable model and something that actually works in production.
On the side, I’m building my own agentic harness, writing about inference throughput and what faster models mean for how we architect loops, and benchmarking how much of what we call “AI problems” is really just distributed systems problems in disguise.
Looking to go deeper on AI deployment and infrastructure, the part of the stack where the interesting engineering is still being figured out.