<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[The System Design Newsletter]]></title><description><![CDATA[Download my system design playbook on newsletter signup for FREE]]></description><link>https://newsletter.systemdesign.one</link><image><url>https://substackcdn.com/image/fetch/$s_!W5r-!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1c8067a-95bb-416b-9114-e0b9fb8821d4_256x256.png</url><title>The System Design Newsletter</title><link>https://newsletter.systemdesign.one</link></image><generator>Substack</generator><lastBuildDate>Fri, 12 Jun 2026 16:27:05 GMT</lastBuildDate><atom:link href="https://newsletter.systemdesign.one/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Neo Kim]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[systemdesignone@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[systemdesignone@substack.com]]></itunes:email><itunes:name><![CDATA[Neo Kim]]></itunes:name></itunes:owner><itunes:author><![CDATA[Neo Kim]]></itunes:author><googleplay:owner><![CDATA[systemdesignone@substack.com]]></googleplay:owner><googleplay:email><![CDATA[systemdesignone@substack.com]]></googleplay:email><googleplay:author><![CDATA[Neo Kim]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Designing a Payment Backend with Stripe Integration]]></title><description><![CDATA[#152: The complete engineering blueprint for a Stripe-integrated payment backend]]></description><link>https://newsletter.systemdesign.one/p/design-a-payment-system</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/design-a-payment-system</guid><dc:creator><![CDATA[Hayk]]></dc:creator><pubDate>Wed, 10 Jun 2026 10:36:01 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/66b63c91-06ea-407e-a0c1-e696ae07e9c9_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/design-a-payment-system/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li><li><p><em>Block diagrams created using <a href="https://app.eraser.io/auth/sign-up?ref=neo">Eraser</a>.</em></p></li></ul><div><hr></div><p>A payment backend is one of the few systems where a bug doesn&#8217;t just cause a bad user experience but real money to vanish from user accounts.</p><p>This makes payment system design fundamentally different from most backend engineering&#8230;</p><p>The core challenge here is NOT performance or feature richness, but correctness under failure. Every design decision, from database choice to retry policy, must answer one question: <em>what happens when this component fails mid-transaction?</em></p><p>This is a complete system design blueprint for building a payment backend that integrates Stripe as the payment service provider.</p><p>Onward.</p><div><hr></div><h2><a href="https://cline.gg/neo-sdk">The agent harness wasn't supposed to be the black box (Partner)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://cline.gg/neo-sdk" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ipAZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bc36c01-1522-4551-8db1-c724782393c9_2048x1140.png 424w, https://substackcdn.com/image/fetch/$s_!ipAZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bc36c01-1522-4551-8db1-c724782393c9_2048x1140.png 848w, https://substackcdn.com/image/fetch/$s_!ipAZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bc36c01-1522-4551-8db1-c724782393c9_2048x1140.png 1272w, https://substackcdn.com/image/fetch/$s_!ipAZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bc36c01-1522-4551-8db1-c724782393c9_2048x1140.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ipAZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bc36c01-1522-4551-8db1-c724782393c9_2048x1140.png" width="1456" height="810" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6bc36c01-1522-4551-8db1-c724782393c9_2048x1140.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:810,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://cline.gg/neo-sdk&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ipAZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bc36c01-1522-4551-8db1-c724782393c9_2048x1140.png 424w, https://substackcdn.com/image/fetch/$s_!ipAZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bc36c01-1522-4551-8db1-c724782393c9_2048x1140.png 848w, https://substackcdn.com/image/fetch/$s_!ipAZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bc36c01-1522-4551-8db1-c724782393c9_2048x1140.png 1272w, https://substackcdn.com/image/fetch/$s_!ipAZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bc36c01-1522-4551-8db1-c724782393c9_2048x1140.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Agent loop is the most important piece of infrastructure in your workflow right now, and for most developers, it&#8217;s the one piece they can&#8217;t open up.</p><p>Agent builders have to jump through all the hoops themselves, crafting the infrastructure and tools, testing the harness, while fighting to maintain what they&#8217;ve built.</p><p><strong><a href="https://cline.gg/neo-sdk">Meet Cline SDK</a>:</strong> agent harness behind Cline 2.0, fully open-sourced. The same runtime that powers Cline across VS Code, JetBrains, and the CLI is now an npm install away: <code>npm i @cline/sdk</code>. Inspect it, fork it, extend it, ship on it.</p><ul><li><p>Best-in-class harness: 74.2% on Terminal-Bench 2.0 with Claude Opus 4.7 ahead of Claude Code (69.4%) and strongest numbers published on open-weight models.</p></li><li><p>Open model &amp; provider choice: Anthropic, OpenAI, Google, Bedrock, Mistral, or any OpenAI-compatible endpoint.</p></li><li><p>Real plugin system: Register tools, hooks, commands, providers, message builders. Prototype as a local file, harden into a package. Extend it freely for any of your agent use cases.</p></li><li><p>Scheduled + event-driven agents: Cron and event specs for PR reviews, dependency checks, coverage audits, changelogs no separate orchestration layer.</p></li></ul><p>Stop building around your agent. Start building on it.</p><p>Install Cline SDK today: <code>npm i @cline/sdk</code> Or try the rebuilt harness directly: <code>npm i -g @cline</code></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://cline.gg/neo-sdk&quot;,&quot;text&quot;:&quot;Get Started Today&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://cline.gg/neo-sdk"><span>Get Started Today</span></a></p><p>(Thanks to <a href="https://cline.gg/neo-sdk">Cline</a> for partnering on this post.)</p><div><hr></div><p>I want to reintroduce <strong><a href="https://linkedin.com/in/hayksimonyan">Hayk Simonyan</a></strong> as a guest author.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://youtube.com/@hayk.simonyan" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TNRZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 424w, https://substackcdn.com/image/fetch/$s_!TNRZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 848w, https://substackcdn.com/image/fetch/$s_!TNRZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 1272w, https://substackcdn.com/image/fetch/$s_!TNRZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TNRZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png" width="1456" height="600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:387657,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://youtube.com/@hayk.simonyan&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/188825279?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!TNRZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 424w, https://substackcdn.com/image/fetch/$s_!TNRZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 848w, https://substackcdn.com/image/fetch/$s_!TNRZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 1272w, https://substackcdn.com/image/fetch/$s_!TNRZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>He&#8217;s a senior software engineer specializing in helping developers break through their career plateaus and secure senior roles.</p><p>If you want to master the essential system design skills and land senior developer roles, I highly recommend checking out Hayk&#8217;s <strong><a href="https://youtube.com/@hayk.simonyan">YouTube channel</a></strong>.</p><p>His approach focuses on what top employers actually care about: system design expertise, advanced project experience, and elite-level interview performance.</p><div><hr></div><p><em><strong>Inside this newsletter, you&#8217;ll get:</strong></em></p><ul><li><p><strong>Three approaches to accepting payments.</strong> Why no-code checkout hits limits fast, when building your own processor makes sense, and why integrating a PSP like Stripe is the right call for most companies.</p></li><li><p><strong>How money actually moves.</strong> The full card transaction lifecycle from authorization to settlement, including the six entities involved and the engineering risks most developers miss.</p></li><li><p><strong>High-level architecture.</strong> The seven components of a production payment backend, how they split the synchronous and async paths, and why each boundary exists.</p></li><li><p><strong>Idempotency and exactly-once processing.</strong> How to design a system that never double-charges, even when servers crash mid-transaction, using idempotency keys, recovery points, and the atomic phases pattern.</p></li><li><p><strong>Webhook handling and the payment state machine.</strong> How to process Stripe webhooks reliably when duplicates and out-of-order delivery are expected, and how to enforce valid state transitions at the database level.</p></li><li><p><strong>Designing for high availability.</strong> How to approach 99.999% uptime across your API layer, database, and async workers, and why a circuit breaker in front of Stripe protects your own system more than it protects Stripe.</p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Golden members get all posts like these!&#8230;</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h3><strong>Three Approaches to Accepting Payments</strong></h3><p>Before diving into the design, it&#8217;s worth understanding the landscape&#8230;</p><p>When a company needs to accept payments, there are three broad approaches, and choosing the wrong one for your scale wastes either years of engineering time or millions in unnecessary fees.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aEqY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ea27f-3c17-43df-85e8-0d278a080e95_1574x831.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aEqY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ea27f-3c17-43df-85e8-0d278a080e95_1574x831.png 424w, https://substackcdn.com/image/fetch/$s_!aEqY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ea27f-3c17-43df-85e8-0d278a080e95_1574x831.png 848w, https://substackcdn.com/image/fetch/$s_!aEqY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ea27f-3c17-43df-85e8-0d278a080e95_1574x831.png 1272w, https://substackcdn.com/image/fetch/$s_!aEqY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ea27f-3c17-43df-85e8-0d278a080e95_1574x831.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aEqY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ea27f-3c17-43df-85e8-0d278a080e95_1574x831.png" width="1456" height="769" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7d6ea27f-3c17-43df-85e8-0d278a080e95_1574x831.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:769,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aEqY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ea27f-3c17-43df-85e8-0d278a080e95_1574x831.png 424w, https://substackcdn.com/image/fetch/$s_!aEqY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ea27f-3c17-43df-85e8-0d278a080e95_1574x831.png 848w, https://substackcdn.com/image/fetch/$s_!aEqY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ea27f-3c17-43df-85e8-0d278a080e95_1574x831.png 1272w, https://substackcdn.com/image/fetch/$s_!aEqY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d6ea27f-3c17-43df-85e8-0d278a080e95_1574x831.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Option 1: Build your own payment processor</strong></h4><p>This means connecting directly to card networks (Visa, Mastercard), acquiring banking licenses, handling PCI DSS<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> (Payment Card Industry Data Security Standard, which is the set of security requirements every company that touches card data must comply with), Level 1 compliance<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>, building your own fraud detection, and managing relationships with issuing banks.</p><p>The upside is lower transaction costs.</p><p>At a massive scale, even saving 0.5% off each transaction adds up to hundreds of millions per year. Amazon, Uber, and Airbnb have all moved in this direction for parts of their payment stack.</p><p>The downside is the cost to get there: <em>acquiring necessary licenses alone takes 12-24 months, costs millions, and comes with ongoing regulatory obligations</em>. So this approach only makes sense once you are processing billions in annual payment volume.</p><p>For anyone else, it is premature optimization at its extreme&#8230;</p><h4><strong>Option 2: Use a payment service provider (PSP) like Stripe, PayPal, Adyen, or Braintree</strong></h4><p>A PSP is a company that handles the entire payment processing chain on your behalf: <em>card network integrations, banking relationships, fraud tooling, PCI DSS compliance, and global payment method support.</em></p><p>You pay a per-transaction fee, typically around 2.9% + $0.30 with Stripe, in exchange for not having to build or maintain any of that infrastructure. This is the right approach for most companies, from early-stage startups to large-scale platforms.</p><p>Even Shopify, which processes billions in payments annually, built its own payment product (Shopify Payments) on top of Stripe&#8217;s infrastructure rather than connecting directly to card networks.</p><h4><strong>Option 3: Use a no-code payment flow, such as Stripe Checkout</strong></h4><p>Stripe offers hosted checkout pages and payment links that require almost no backend integration.</p><p>You create a checkout session, redirect the user to Stripe&#8217;s hosted page, and Stripe handles everything: <em>UI, payment method selection, and confirmation.</em></p><p>This is the fastest way to accept payments, and it works well for simple use cases: <em>selling a single product, collecting donations, or building a quick prototype</em>.</p><p>The trade-off is limited customization and less control over the payment experience.</p><p>You cannot deeply embed it in your own UI, easily build custom subscription logic, or react to payment events in real time with full control over the flow.</p><div><hr></div><h2><strong>How Money Moves Through Card Network</strong></h2><p>Understanding the payment flow end-to-end is a prerequisite to designing a backend that handles it correctly&#8230;</p><p>Six entities participate in each card transaction:</p><ol><li><p><strong>Cardholder</strong></p></li><li><p><strong>Merchant </strong>(Business or individual selling the product or service)</p></li><li><p><strong>PSP</strong> (Payment Service Provider, in this case, Stripe)</p></li><li><p><strong>Acquiring bank</strong> (Merchant&#8217;s bank)</p></li><li><p><strong>Card network</strong> (Visa/Mastercard)</p></li><li><p><strong>Issuing bank</strong> (Cardholder&#8217;s bank)</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ue8j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a8f653-6ea1-4e45-a89f-134603e8bd2c_859x560.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ue8j!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a8f653-6ea1-4e45-a89f-134603e8bd2c_859x560.png 424w, https://substackcdn.com/image/fetch/$s_!Ue8j!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a8f653-6ea1-4e45-a89f-134603e8bd2c_859x560.png 848w, https://substackcdn.com/image/fetch/$s_!Ue8j!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a8f653-6ea1-4e45-a89f-134603e8bd2c_859x560.png 1272w, https://substackcdn.com/image/fetch/$s_!Ue8j!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a8f653-6ea1-4e45-a89f-134603e8bd2c_859x560.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ue8j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a8f653-6ea1-4e45-a89f-134603e8bd2c_859x560.png" width="859" height="560" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/28a8f653-6ea1-4e45-a89f-134603e8bd2c_859x560.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:560,&quot;width&quot;:859,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ue8j!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a8f653-6ea1-4e45-a89f-134603e8bd2c_859x560.png 424w, https://substackcdn.com/image/fetch/$s_!Ue8j!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a8f653-6ea1-4e45-a89f-134603e8bd2c_859x560.png 848w, https://substackcdn.com/image/fetch/$s_!Ue8j!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a8f653-6ea1-4e45-a89f-134603e8bd2c_859x560.png 1272w, https://substackcdn.com/image/fetch/$s_!Ue8j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28a8f653-6ea1-4e45-a89f-134603e8bd2c_859x560.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Money flows through 3 distinct phases, each with different timing and failure modes&#8230;</p><h3><strong>Phase 1: Authorization (1-3 seconds)</strong></h3><p>The customer submits card details through <code>Stripe Elements</code>.</p><p>These are technically <code>iframes</code> hosted by Stripe, even if they look like part of your site. We use <code>Stripe.js</code>, which is a JavaScript library that Stripe requires you to load directly from their servers (for security reasons).</p><p>The process starts on the backend, where a&nbsp;<code>PaymentIntent</code><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>&nbsp;gets created&nbsp;via Stripe&#8217;s API to track the transaction lifecycle. This returns a <code>client_secret</code> to the frontend.</p><p>The customer then enters their card details into <code>Stripe Elements</code> - secure <code>iframes</code> hosted by Stripe that look like part of your site. Using <code>Stripe.js</code>, the client-side code tokenizes the card details so the <strong>PAN</strong> (Primary Account Number: a 14 to 16-digit number on a credit or debit card) never reaches the merchant&#8217;s server.</p><h4><strong>What is PaymentIntent?</strong></h4><p>A <code>PaymentIntent</code> is a stateful object Stripe uses to track a single payment from start to finish.</p><p>Think of it as the source of truth for a transaction&#8230;</p><p>In older systems, you&#8217;d just send a &#8220;charge&#8221; request and hope it worked. With modern payments, things are more complex because of 2FA (e.g., 3D Secure<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a>) or fraud checks.</p><p><code>PaymentIntent</code> manages this by moving through different states:</p><ul><li><p><strong>Requires payment method:</strong> You created the intent, but the user hasn&#8217;t typed their card yet.</p></li><li><p><strong>Requires action:</strong> User must verify the payment in their bank app.</p></li><li><p><strong>Processing:</strong> Stripe is talking to the banks.</p></li><li><p><strong>Succeeded:</strong> Money is authorized or captured.</p></li></ul><p>Stripe formats an ISO 8583 authorization message<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a> and sends it to the acquiring bank<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a>, which identifies the card network via the BIN<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a> (first 6-8 digits) and forwards the request.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dCHE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F276b57f3-192b-4b9e-ade5-f57eaf20cc2f_1566x2048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dCHE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F276b57f3-192b-4b9e-ade5-f57eaf20cc2f_1566x2048.png 424w, https://substackcdn.com/image/fetch/$s_!dCHE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F276b57f3-192b-4b9e-ade5-f57eaf20cc2f_1566x2048.png 848w, https://substackcdn.com/image/fetch/$s_!dCHE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F276b57f3-192b-4b9e-ade5-f57eaf20cc2f_1566x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!dCHE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F276b57f3-192b-4b9e-ade5-f57eaf20cc2f_1566x2048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dCHE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F276b57f3-192b-4b9e-ade5-f57eaf20cc2f_1566x2048.png" width="1456" height="1904" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/276b57f3-192b-4b9e-ade5-f57eaf20cc2f_1566x2048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1904,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dCHE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F276b57f3-192b-4b9e-ade5-f57eaf20cc2f_1566x2048.png 424w, https://substackcdn.com/image/fetch/$s_!dCHE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F276b57f3-192b-4b9e-ade5-f57eaf20cc2f_1566x2048.png 848w, https://substackcdn.com/image/fetch/$s_!dCHE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F276b57f3-192b-4b9e-ade5-f57eaf20cc2f_1566x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!dCHE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F276b57f3-192b-4b9e-ade5-f57eaf20cc2f_1566x2048.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The card network routes it to the issuing bank, which checks card validity, available funds, CVV/AVS<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a> match, and fraud risk models.</p><ul><li><p>The issuer returns a two-digit response code, approve or decline, back through the entire chain.</p></li><li><p>If approved, a <strong>hold</strong> gets placed on the cardholder&#8217;s available balance. But no money moves yet.</p></li></ul><p>Mastercard reports an average network response time of&nbsp;<em>130 milliseconds</em>; the full round-trip, including all hops, completes in 1-3 seconds.</p><p>Visa&#8217;s network handles a peak of <em>56,000+ messages per second</em>.</p><h3><strong>Phase 2: Capture (immediate/delayed)</strong></h3><p>Capture<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a> is when the merchant tells acquiring bank (the merchant&#8217;s bank) to finalize the authorized amount.</p><p>For digital goods and subscriptions, capture occurs immediately. Stripe&#8217;s default is <code>capture_method: automatic</code>.</p><p>For physical goods, hotels, or ride-hailing, capture is delayed until fulfillment. You can capture an amount less than or equal to what was authorized, but never more. Authorization holds<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-10" href="#footnote-10" target="_self">10</a> typically last 5 to 10 days.</p><p>Visa allows 10 days for standard e-commerce and up to 31 days for lodging.</p><p><em><strong>Here&#8217;s the Engineering Risk:</strong></em></p><p>If you don&#8217;t capture within 7 days, the issuing bank might cancel the hold. This is a common source of bugs. Your state machine must treat AUTHORIZED and CAPTURED as &#8220;distinct&#8221; states. If your system thinks a payment is authorized but the bank has dropped the hold, the capture call will fail.</p><p>So you need a background job to detect these &#8220;stuck&#8221; payments and mark them as expired.</p><h3><strong>Phase 3: Clearing and settlement (T+1 to T+3)</strong></h3><p>At the end of the business day, captured transactions get batched and sent to the card network for clearing.</p><p>The network calculates fees and exchanges transaction files overnight&#8230;</p><p>Settlement is when the actual money moves. The issuing bank sends the funds (minus fees) to the card network, which then passes them to the acquiring bank. </p><p>The acquiring bank finally deposits them into the merchant&#8217;s account.</p><p>In banking, <strong>&#8220;T&#8221;</strong> stands for the Transaction Date. <strong>T+2</strong> means the money arrives two business days after the transaction. Stripe&#8217;s US default is T+2, though new accounts usually have an initial hold of 7 to 14 days.</p><h3><strong>Fee Breakdown</strong></h3><p>On a $100 US online transaction, the 2.9% + $0.30 fee gets split three ways:</p><ul><li><p><strong>Interchange</strong><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-11" href="#footnote-11" target="_self">11</a><strong> (~$2.05):</strong> This goes to the <em>Issuing Bank</em> (the customer&#8217;s bank). This is the largest cut. It covers the bank&#8217;s risk and pays for the customer&#8217;s credit card rewards.</p></li><li><p><strong>Assessment (~$0.16):</strong> This goes to the <em>Card Network</em> (Visa or Mastercard) for using their rails.</p></li><li><p><strong>Markup (~$0.70):</strong> This is what <em>Stripe</em> keeps for providing the API, security, and infrastructure.</p></li></ul><div><hr></div><h2><strong>Functional and Non-Functional Requirements</strong></h2><p>Functional requirements for a payment backend serving a mid-to-large platform fall into 6 categories:</p><ol><li><p>Customers must be able to make one-time payments and save payment methods for reuse.</p></li><li><p>Merchants or the platform must be able to accept payments, with support for marketplace-style split payments if needed.</p></li><li><p>Refunds, both full and partial, must flow through a dedicated refund state machine.</p></li><li><p>System must handle subscriptions and recurring billing, including proration on plan changes, trial periods, and Stripe&#8217;s Smart Retries for failed renewal payments.</p></li><li><p>Backend must process Stripe webhooks reliably, as webhooks are the <em>source of truth</em> for payment status.</p></li><li><p>The entire payment lifecycle must be modeled as a finite-state machine with enforced transitions.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2ZHF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f69585b-f2e0-4ea0-8a96-07840c24b0ca_2048x1158.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2ZHF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f69585b-f2e0-4ea0-8a96-07840c24b0ca_2048x1158.png 424w, https://substackcdn.com/image/fetch/$s_!2ZHF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f69585b-f2e0-4ea0-8a96-07840c24b0ca_2048x1158.png 848w, https://substackcdn.com/image/fetch/$s_!2ZHF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f69585b-f2e0-4ea0-8a96-07840c24b0ca_2048x1158.png 1272w, https://substackcdn.com/image/fetch/$s_!2ZHF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f69585b-f2e0-4ea0-8a96-07840c24b0ca_2048x1158.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2ZHF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f69585b-f2e0-4ea0-8a96-07840c24b0ca_2048x1158.png" width="1456" height="823" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4f69585b-f2e0-4ea0-8a96-07840c24b0ca_2048x1158.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:823,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2ZHF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f69585b-f2e0-4ea0-8a96-07840c24b0ca_2048x1158.png 424w, https://substackcdn.com/image/fetch/$s_!2ZHF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f69585b-f2e0-4ea0-8a96-07840c24b0ca_2048x1158.png 848w, https://substackcdn.com/image/fetch/$s_!2ZHF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f69585b-f2e0-4ea0-8a96-07840c24b0ca_2048x1158.png 1272w, https://substackcdn.com/image/fetch/$s_!2ZHF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f69585b-f2e0-4ea0-8a96-07840c24b0ca_2048x1158.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Non-functional requirements are where payment systems diverge sharply from typical backend services:</p><ul><li><p><strong>Exactly-once payment processing</strong> is the key requirement. A double charge on a $1,000 purchase means money taken from a person twice. True exactly-once delivery is impossible in distributed systems (Two Generals Problem<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-12" href="#footnote-12" target="_self">12</a>), so the practical implementation is <em>at-least-once delivery combined with idempotent processing and reconciliation.</em></p></li><li><p><strong>High availability</strong> targets are extreme. Stripe itself maintains 99.999% API uptime.</p></li><li><p><strong>Idempotency</strong> must be enforced at every layer: <em>client-to-backend, backend-to-Stripe, and webhook processing.</em> Every retry must produce the same result as the original request.</p></li><li><p><strong>Consistency over availability.</strong> Payment systems are one of the few domains where the CAP theorem should tilt decisively toward consistency. A stale read that shows an incorrect balance or a lost write that drops a payment is far worse than brief unavailability. This is why every major payment platform, including Shopify, Uber, and Airbnb, uses <em>SQL databases with ACID guarantees</em> for core payment data.</p></li></ul><h4><strong>Scale estimates for a mid-to-large platform</strong></h4><p>A platform processing 100,000 payments per day generates roughly 500,000-1,000,000 webhook events daily (each payment triggers 5-10 events across creation, authorization, capture, and related objects).</p><p>Stripe&#8217;s default API rate limit is <em>100 requests/second</em> per account, with individual endpoints limited to 25 requests/second; higher limits are available by arrangement.</p><p>Payment API latency should be under 5 seconds end-to-end, including the Stripe round-trip, with internal service-to-service calls under 100ms.</p><div><hr></div><div class="callout-block" data-callout="true"><p><em><strong>Reminder: this is a teaser of the subscriber-only newsletter series, exclusive to my golden members.</strong></em></p><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>High-level architecture of real-world systems.</strong></p></li><li><p>Deep dive into how popular real-world systems actually work.</p></li><li><p><strong>How real-world systems handle scale, reliability, and performance.</strong></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?yearly=true&quot;,&quot;text&quot;:&quot;Unlock Full Access&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe?yearly=true"><span>Unlock Full Access</span></a></p></div><div><hr></div><h2><strong>High-Level Architecture</strong></h2><p>The architecture splits into seven components, each with a distinct responsibility.</p><p>It isolates the synchronous customer-facing path from asynchronous processing and keeps the webhook ingestion pipeline decoupled from business logic.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ClZ1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c44b59-8e5c-4a67-ab96-4a709836b183_2048x388.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ClZ1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c44b59-8e5c-4a67-ab96-4a709836b183_2048x388.png 424w, https://substackcdn.com/image/fetch/$s_!ClZ1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c44b59-8e5c-4a67-ab96-4a709836b183_2048x388.png 848w, https://substackcdn.com/image/fetch/$s_!ClZ1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c44b59-8e5c-4a67-ab96-4a709836b183_2048x388.png 1272w, https://substackcdn.com/image/fetch/$s_!ClZ1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c44b59-8e5c-4a67-ab96-4a709836b183_2048x388.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ClZ1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c44b59-8e5c-4a67-ab96-4a709836b183_2048x388.png" width="1456" height="276" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/36c44b59-8e5c-4a67-ab96-4a709836b183_2048x388.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:276,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ClZ1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c44b59-8e5c-4a67-ab96-4a709836b183_2048x388.png 424w, https://substackcdn.com/image/fetch/$s_!ClZ1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c44b59-8e5c-4a67-ab96-4a709836b183_2048x388.png 848w, https://substackcdn.com/image/fetch/$s_!ClZ1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c44b59-8e5c-4a67-ab96-4a709836b183_2048x388.png 1272w, https://substackcdn.com/image/fetch/$s_!ClZ1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c44b59-8e5c-4a67-ab96-4a709836b183_2048x388.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4><strong>1 Payment API Service</strong> </h4><p>It&#8217;s the synchronous entry point.</p><p>It receives payment requests from checkout, validates input, checks idempotency keys<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-13" href="#footnote-13" target="_self">13</a> against the database, creates or retrieves the payment record, and returns an immediate response to the client.</p><p>For Stripe integrations, this service creates a <code>PaymentIntent</code> and returns the <code>client_secret</code> to the frontend, which uses <code>Stripe.js</code> to complete the payment (including 3D Secure challenges).</p><p>The API service should never block on downstream processing. It creates the payment record, dispatches work, and returns.</p><h4><strong>2 Stripe Integration Layer</strong> </h4><p>It abstracts all Stripe-specific API calls behind a uniform interface.</p><p>This adapter handles Stripe&#8217;s error types, maps them to internal error codes, attaches idempotency keys to every POST request, and manages timeouts. Shopify and Airbnb both use this pattern; Airbnb calls it the <em>&#8220;PSP adapter,&#8221;</em> which isolates provider-specific logic from the core payment domain.</p><p>If you ever need to support a second PSP (Adyen, Braintree), only this layer changes.</p><h4><strong>3 Payment Database</strong></h4><p>This is the system of record.</p><p>It stores the current state of every payment, idempotency keys, and the immutable audit log. The schema design is covered in the next section, but the critical principle is: <code>payments</code> table holds a mutable current state (optimized for queries), while the payment events table holds an&nbsp;<em>append-only immutable log</em>&nbsp;of every state change (optimized for audit, debugging, and reconciliation).</p><p>Both exist in the same SQL database and get updated within the same transaction.</p><h4><strong>4 Webhook Receiver</strong> </h4><p>It&#8217;s a lightweight HTTP endpoint that does exactly three things:<em> verify the Stripe signature (HMAC-SHA256</em><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-14" href="#footnote-14" target="_self">14</a><em> using the Stripe-Signature header), store the raw event, and return 200 OK.</em></p><p>It must respond within seconds, since Stripe has an approximately 20-second timeout. All business logic happens asynchronously. The receiver enqueues the event onto the message queue for processing by workers.</p><h4><strong>5 Message Queue</strong></h4><p>It (Kafka, SQS, or RabbitMQ) decouples webhook receipt from processing and provides at-least-once delivery guarantees.</p><p>Uber uses Apache Kafka as the backbone of their payment platform&#8217;s async stream processing. The queue also supports the transactional outbox pattern<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-15" href="#footnote-15" target="_self">15</a>: when a payment state change is committed to the database, an outbox record is written in the same transaction, then relayed to the queue by a separate process.</p><h4><strong>6 Background Job Workers</strong> </h4><p>They handle 4 categories of async work:</p><ul><li><p>Webhook event processing: dequeuing events and applying state transitions idempotently,</p></li><li><p>Retry workers: retrying failed PSP calls with exponential backoff,</p></li><li><p>A reconciliation worker: daily comparison of internal records against Stripe&#8217;s records,</p></li><li><p>And a stuck-payment detector: alerting on payments in intermediate states beyond a configurable threshold.</p></li></ul><h4><strong>7 Ledger</strong> </h4><p>It provides double-entry bookkeeping<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-16" href="#footnote-16" target="_self">16</a>.</p><p>Every money movement is recorded as a balanced pair of debit and credit entries. Uber explicitly built its next-generation payment platform on double-entry bookkeeping for auditability, and Stripe&#8217;s own ledger system logs approximately <em>5 billion money-movement events daily.</em></p><p>The ledger is append-only.</p><p>If a mistake needs correction, a new reversing entry is inserted, never an update or delete.</p><p>The high-level architecture tells us what to build, but database design is where correctness is either enforced or lost. Every pattern we covered above, idempotency, exactly-once processing, the state machine, all of them depend on the database schema doing the right thing.</p><p>If we get this wrong, no amount of application-level logic will matter.</p><p>So let's see what the database design looks like&#8230;</p><div><hr></div><h2><strong>Database Schema to Enforce Correctness</strong></h2>
      <p>
          <a href="https://newsletter.systemdesign.one/p/design-a-payment-system">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Anatomy of OpenClaw]]></title><description><![CDATA[#151: Understanding How OpenClaw Works]]></description><link>https://newsletter.systemdesign.one/p/openclaw-architecture</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/openclaw-architecture</guid><dc:creator><![CDATA[Neo Kim]]></dc:creator><pubDate>Mon, 08 Jun 2026 11:46:05 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f1017994-80b3-48f7-b27d-79346387128b_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/openclaw-architecture/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><div><hr></div><p>You ask ChatGPT to write an email.</p><p>It drafts it&#8230; But you still have to copy the text, paste it into Gmail, and hit send. AI did the thinking,,,YOU did the task.</p><p>OpenClaw changes this&#8230;</p><p>It&#8217;s an open-source autonomous AI agent<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> that runs on your machine, connects to your messaging apps, and does the work itself. You could message it on Telegram: <em>&#8220;Check my inbox, pull any invoices, save the attachments.</em>&#8221; Then it connects to your email, finds invoices, downloads the attachments, saves them, and messages you back with a summary.</p><p>You don&#8217;t have to open a single app&#8230;</p><p>OpenClaw bridges the gap between the thinking layer and the act layer.</p><p>Large language models (<strong>LLMs</strong>), such as Claude, ChatGPT, and Gemini, provide reasoning. While OpenClaw<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> provides hands.</p><p>Onward.</p><div><hr></div><div class="callout-block" data-callout="true"><h2><a href="https://myclaw.ai/?utm_source=sub-systemdesignnewsletter">Run OpenClaw with MyClaw</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://myclaw.ai/?utm_source=sub-systemdesignnewsletter" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rx80!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc24a806-4449-454d-913f-4268f8f6b5b0_2172x724.png 424w, https://substackcdn.com/image/fetch/$s_!rx80!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc24a806-4449-454d-913f-4268f8f6b5b0_2172x724.png 848w, https://substackcdn.com/image/fetch/$s_!rx80!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc24a806-4449-454d-913f-4268f8f6b5b0_2172x724.png 1272w, https://substackcdn.com/image/fetch/$s_!rx80!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc24a806-4449-454d-913f-4268f8f6b5b0_2172x724.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rx80!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc24a806-4449-454d-913f-4268f8f6b5b0_2172x724.png" width="1456" height="485" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bc24a806-4449-454d-913f-4268f8f6b5b0_2172x724.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:485,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1418858,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://myclaw.ai/?utm_source=sub-systemdesignnewsletter&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/196042495?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc24a806-4449-454d-913f-4268f8f6b5b0_2172x724.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rx80!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc24a806-4449-454d-913f-4268f8f6b5b0_2172x724.png 424w, https://substackcdn.com/image/fetch/$s_!rx80!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc24a806-4449-454d-913f-4268f8f6b5b0_2172x724.png 848w, https://substackcdn.com/image/fetch/$s_!rx80!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc24a806-4449-454d-913f-4268f8f6b5b0_2172x724.png 1272w, https://substackcdn.com/image/fetch/$s_!rx80!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc24a806-4449-454d-913f-4268f8f6b5b0_2172x724.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The best AI assistants aren&#8217;t the ones waiting for your next prompt&#8212;they&#8217;re the ones that send you the answer before you think to ask.</p><p><strong><a href="https://myclaw.ai/?utm_source=sub-systemdesignnewsletter">MyClaw</a></strong> runs OpenClaw in the cloud, schedules recurring tasks, and delivers updates directly to Telegram or WhatsApp, so it feels less like another chatbot and more like an assistant that&#8217;s actually on the job.</p><p>If this newsletter got you interested in OpenClaw, <a href="https://myclaw.ai/?utm_source=sub-systemdesignnewsletter">MyClaw</a> is an easy way to see what an always-on AI assistant feels like.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://myclaw.ai/?utm_source=sub-systemdesignnewsletter&quot;,&quot;text&quot;:&quot;Try MyClaw&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://myclaw.ai/?utm_source=sub-systemdesignnewsletter"><span>Try MyClaw</span></a></p><p>(Thanks to <a href="https://myclaw.ai/?utm_source=sub-systemdesignnewsletter">MyClaw</a> for partnering on this post.)</p></div><div><hr></div><p><em><strong>Here&#8217;s what you&#8217;ll find inside this newsletter:</strong></em></p><ul><li><p><strong>What OpenClaw actually does.</strong> Why it&#8217;s different from every AI tool you&#8217;ve already tried.</p></li><li><p><strong>How the Gateway works.</strong> The always-on server running on your machine that makes everything else possible.</p></li><li><p><strong>Skill system</strong>. How to install the right ones first, and what most people get wrong.</p></li><li><p><strong>What breaks and why</strong>. Most common failure modes, with fixes for each.</p></li><li><p><strong>The real security risks</strong>. Documented, avoidable, and worth reading before you touch a single setting.</p></li><li><p><strong>How to build your own</strong>. Full architecture in five layers, from the ground up.</p></li></ul><p>By the end of this newsletter, you will understand OpenClaw well enough to run it.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Golden members get all posts like these!&#8230;</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2><strong>What Is OpenClaw</strong></h2><p>OpenClaw is the orchestration layer, the connector between an LLM (like Claude, GPT, or Gemini) and your real-world tools.</p><p>The intelligence comes from the LLM you connect to it.</p><p>Think of the LLM as the brain: <em>it reasons, plans, and decides what to do. </em>OpenClaw is the hand: <em>it executes those decisions across your files, messaging apps, shell, and APIs.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g6oU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd17cbfe-8fda-45bf-afeb-b4a4bb6c8ff9_2048x1009.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g6oU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd17cbfe-8fda-45bf-afeb-b4a4bb6c8ff9_2048x1009.png 424w, https://substackcdn.com/image/fetch/$s_!g6oU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd17cbfe-8fda-45bf-afeb-b4a4bb6c8ff9_2048x1009.png 848w, https://substackcdn.com/image/fetch/$s_!g6oU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd17cbfe-8fda-45bf-afeb-b4a4bb6c8ff9_2048x1009.png 1272w, https://substackcdn.com/image/fetch/$s_!g6oU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd17cbfe-8fda-45bf-afeb-b4a4bb6c8ff9_2048x1009.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g6oU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd17cbfe-8fda-45bf-afeb-b4a4bb6c8ff9_2048x1009.png" width="1456" height="717" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fd17cbfe-8fda-45bf-afeb-b4a4bb6c8ff9_2048x1009.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:717,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g6oU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd17cbfe-8fda-45bf-afeb-b4a4bb6c8ff9_2048x1009.png 424w, https://substackcdn.com/image/fetch/$s_!g6oU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd17cbfe-8fda-45bf-afeb-b4a4bb6c8ff9_2048x1009.png 848w, https://substackcdn.com/image/fetch/$s_!g6oU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd17cbfe-8fda-45bf-afeb-b4a4bb6c8ff9_2048x1009.png 1272w, https://substackcdn.com/image/fetch/$s_!g6oU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd17cbfe-8fda-45bf-afeb-b4a4bb6c8ff9_2048x1009.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Here&#8217;s what OpenClaw offers:</p><ul><li><p><strong>A gateway</strong><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a><strong> that runs on your machine (Mac, Linux)</strong></p><p>This is the always-on server running on your machine that coordinates everything: <em>your messaging apps, LLM, tools</em><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a><em>, and memory</em>.</p></li><li><p><strong>A multi-channel messaging interface: WhatsApp, Telegram,</strong> <strong>and so on.</strong></p><p>You talk to the same assistant from all these apps, with shared context and memory, instead of managing a different bot per platform.</p></li><li><p><strong>A skills platform with 5,400+ modular capabilities on ClawHub</strong><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a><strong>.</strong></p><p>Skills<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a> are plain Markdown files that tell the agent how to perform a specific task, like checking your GitHub PRs or sending a Slack message.</p></li><li><p><strong>A persistent memory system in plain Markdown files you own.</strong></p><p>The agent&#8217;s knowledge, preferences, and logs are stored as editable Markdown on disk, so you can inspect, version-control, and move its LLM context like code.</p></li><li><p><strong>Proactive scheduling engine (Heartbeat</strong><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a><strong> + cron jobs) that acts without being prompted.</strong></p><p>OpenClaw can wake itself up on a schedule to check servers, triage email, or send summaries, instead of waiting for you to type a command.</p></li><li><p><strong>Model-agnostic: plug in Claude, GPT, DeepSeek, Gemini, Grok, or local models via Ollama</strong><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a><strong>.</strong></p><p>You can mix and match models (cheap vs powerful, cloud vs local) and route different tasks to different providers to optimize cost and performance.</p></li></ul><div><hr></div><h2><strong>How It Differs From Other AI Tools</strong></h2><p>Here&#8217;s how:</p><ul><li><p><strong>ChatGPT, Claude, Gemini:</strong> You prompt them, and they respond. They have no access to your email, no ability to run scripts, and no memory once you close the tab<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a>.</p></li><li><p><strong>Siri, Google Assistant:</strong> Polished products, but closed ones. You can&#8217;t add custom skills, connect arbitrary APIs, or self-host them.</p></li><li><p><strong>LangChain, CrewAI, AutoGen:</strong> Developer frameworks. You write the Python, define the chains, and build the interfaces from scratch. OpenClaw ships all of that out of the box.</p></li><li><p><strong>Claude Code, Codex CLI:</strong> Terminal tools for coding sessions. Reactive and session-based. OpenClaw runs 24/7 for everything, not just code.</p></li></ul><p>So OpenClaw is NOT just a product you use in a browser tab.</p><p>It is a service that runs on your machine and continues to work while you&#8217;re away.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/openclaw-architecture?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/openclaw-architecture?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><div><hr></div><h2><strong>How OpenClaw Processes Messages</strong></h2><p>Imagine you send a single Telegram message at 9 pm:</p><p><em>&#8220;Check my open PRs and let me know if anything needs a response tonight.&#8221;</em></p><p>You lock your phone and go to sleep&#8230;</p><p>By morning, you will have a reply:</p><p>Three PRs are open, and one has a teammate's review comment asking about error handling. OpenClaw drafted a response and flagged it for your approval, leaving the other two untouched because they were still in CI. It also remembered that last week you said you prefer to handle auth-related reviews yourself, so it left the one in the auth module alone.</p><p>None of this required you to open GitHub, write a prompt, or copy anything between tabs&#8230;</p><h3><strong>Why OpenClaw Runs 24/7 on Your Machine</strong></h3><p>When you run the OpenClaw gateway, you start a Node.js server that binds to port <code>18789</code> and stays running.</p><p>This is the fundamental difference between OpenClaw and a browser-based AI tool&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lL9I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F342c7659-8bd7-4aa9-9080-7d23204859d0_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lL9I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F342c7659-8bd7-4aa9-9080-7d23204859d0_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!lL9I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F342c7659-8bd7-4aa9-9080-7d23204859d0_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!lL9I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F342c7659-8bd7-4aa9-9080-7d23204859d0_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!lL9I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F342c7659-8bd7-4aa9-9080-7d23204859d0_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lL9I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F342c7659-8bd7-4aa9-9080-7d23204859d0_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/342c7659-8bd7-4aa9-9080-7d23204859d0_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lL9I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F342c7659-8bd7-4aa9-9080-7d23204859d0_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!lL9I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F342c7659-8bd7-4aa9-9080-7d23204859d0_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!lL9I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F342c7659-8bd7-4aa9-9080-7d23204859d0_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!lL9I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F342c7659-8bd7-4aa9-9080-7d23204859d0_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Every mainstream AI assistant, including ChatGPT, Claude, and even DeepResearch mode, only exists while you have a tab open. (DeepResearch can run a long session, but once it&#8217;s done, it stops.)</p><p>i.e., if you close the tab, everything disappears.</p><p>But OpenClaw never stops:</p><p>It runs as a background daemon on your local server and stays alive whether you are at your desk or asleep. That persistence is the foundation of everything else: <em>scheduled jobs, proactive alerts, multi-channel messaging, and long-term memory</em>. None of it works without a process that is always on.</p><p>This central server is called the Gateway&#8230;</p><h3><strong>Gateway: Control Center of Your Agent</strong></h3><p>The Gateway orchestrates everything.</p><p>It sits between your messaging apps, LLM, and tools, coordinating every request from the moment it arrives until a reply goes back out.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L-RJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd99a9f7f-11e6-4c1a-9214-d705f4839451_2048x964.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L-RJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd99a9f7f-11e6-4c1a-9214-d705f4839451_2048x964.png 424w, https://substackcdn.com/image/fetch/$s_!L-RJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd99a9f7f-11e6-4c1a-9214-d705f4839451_2048x964.png 848w, https://substackcdn.com/image/fetch/$s_!L-RJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd99a9f7f-11e6-4c1a-9214-d705f4839451_2048x964.png 1272w, https://substackcdn.com/image/fetch/$s_!L-RJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd99a9f7f-11e6-4c1a-9214-d705f4839451_2048x964.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L-RJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd99a9f7f-11e6-4c1a-9214-d705f4839451_2048x964.png" width="1456" height="685" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d99a9f7f-11e6-4c1a-9214-d705f4839451_2048x964.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:685,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L-RJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd99a9f7f-11e6-4c1a-9214-d705f4839451_2048x964.png 424w, https://substackcdn.com/image/fetch/$s_!L-RJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd99a9f7f-11e6-4c1a-9214-d705f4839451_2048x964.png 848w, https://substackcdn.com/image/fetch/$s_!L-RJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd99a9f7f-11e6-4c1a-9214-d705f4839451_2048x964.png 1272w, https://substackcdn.com/image/fetch/$s_!L-RJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd99a9f7f-11e6-4c1a-9214-d705f4839451_2048x964.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>When a message arrives from Telegram, WhatsApp, or Slack, the Gateway <em>authenticates</em> the sender and creates or retrieves a session for the conversation.</p><p>It then assembles the full context<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-10" href="#footnote-10" target="_self">10</a>, passes it to the LLM, handles any tool calls that come back, and routes the final reply to the correct channel.</p><p>Say you ask the agent to clean up old log files:</p><p>Before it runs anything, the Gateway intercepts the shell command and sends you a confirmation request through Telegram: <em>&#8220;I&#8217;m about to run</em><code> $ rm -rf ~/logs/*.log</code><em>, approve this?&#8221;</em></p><p>Nothing executes until you say yes&#8230; If you don&#8217;t respond, it waits&#8230;</p><p>The Gateway also serves the Control UI at <code>http://localhost:18789</code>.</p><p>It&#8217;s a browser-based dashboard that lets you watch the agent&#8217;s reasoning in real time, approve pending commands, and inspect active sessions. Every tool call, model response, and approval request shows up here live. Since the Gateway runs on your Mac Mini or VPS, you can access the Control UI from any browser on the same network. If you use Tailscale<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-11" href="#footnote-11" target="_self">11</a>, you can reach it from your phone or laptop without being on the same network.</p><p>The Gateway also supports hot configuration reloads&#8230;</p><p>You can change your model, update permissions, or add a new channel without restarting the process. The running server picks up the changes and applies them immediately.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/openclaw-architecture?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/openclaw-architecture?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h3><strong>Channel Layer: How OpenClaw Talks to Messaging Apps</strong></h3><p>OpenClaw connects to Telegram, WhatsApp, Slack.</p><p>Each of these works completely differently at the protocol level:</p><ul><li><p>Telegram uses a bot token with polling or webhook delivery.</p></li><li><p>WhatsApp uses a reverse-engineered web protocol; it requires QR-code authentication and stores session state locally.</p></li><li><p>Slack uses Socket Mode with two separate token types.</p></li><li><p>Discord uses a gateway WebSocket with its own heartbeat requirements.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OUz4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd48e1488-5a16-4112-befc-45c84c20f193_2048x972.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OUz4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd48e1488-5a16-4112-befc-45c84c20f193_2048x972.png 424w, https://substackcdn.com/image/fetch/$s_!OUz4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd48e1488-5a16-4112-befc-45c84c20f193_2048x972.png 848w, https://substackcdn.com/image/fetch/$s_!OUz4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd48e1488-5a16-4112-befc-45c84c20f193_2048x972.png 1272w, https://substackcdn.com/image/fetch/$s_!OUz4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd48e1488-5a16-4112-befc-45c84c20f193_2048x972.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OUz4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd48e1488-5a16-4112-befc-45c84c20f193_2048x972.png" width="1456" height="691" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d48e1488-5a16-4112-befc-45c84c20f193_2048x972.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:691,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OUz4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd48e1488-5a16-4112-befc-45c84c20f193_2048x972.png 424w, https://substackcdn.com/image/fetch/$s_!OUz4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd48e1488-5a16-4112-befc-45c84c20f193_2048x972.png 848w, https://substackcdn.com/image/fetch/$s_!OUz4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd48e1488-5a16-4112-befc-45c84c20f193_2048x972.png 1272w, https://substackcdn.com/image/fetch/$s_!OUz4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd48e1488-5a16-4112-befc-45c84c20f193_2048x972.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Without a dedicated abstraction layer, each new platform requires touching the routing logic, session management, and message pipeline.</p><p>OpenClaw solves this with a Channel Layer:</p><p>It treats each platform as a <em>separate plugin </em>acting as a translation adapter. Each adapter translates incoming messages into a standard internal format: <em>a stable identity key, content, metadata, and attachments</em>. Only then does it pass anything to the Gateway. Everything after that boundary is platform-agnostic. The agent has NO idea whether your message came from Telegram or Slack.</p><p>It sees a normalized message and responds accordingly&#8230;</p><p>This also means security policies like allowlists, pairing workflows, and mention-gating apply uniformly across all platforms rather than being reimplemented per channel.</p><h3><strong>LLM Layer: Where Reasoning Happens</strong></h3><p>OpenClaw is a brain socket.</p><p>You provide the intelligence by connecting an API key. And it provides the infrastructure for everything else.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bXAy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7656418c-27d5-478a-9eba-6ba00c041844_2048x867.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bXAy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7656418c-27d5-478a-9eba-6ba00c041844_2048x867.png 424w, https://substackcdn.com/image/fetch/$s_!bXAy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7656418c-27d5-478a-9eba-6ba00c041844_2048x867.png 848w, https://substackcdn.com/image/fetch/$s_!bXAy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7656418c-27d5-478a-9eba-6ba00c041844_2048x867.png 1272w, https://substackcdn.com/image/fetch/$s_!bXAy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7656418c-27d5-478a-9eba-6ba00c041844_2048x867.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bXAy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7656418c-27d5-478a-9eba-6ba00c041844_2048x867.png" width="1456" height="616" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7656418c-27d5-478a-9eba-6ba00c041844_2048x867.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:616,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bXAy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7656418c-27d5-478a-9eba-6ba00c041844_2048x867.png 424w, https://substackcdn.com/image/fetch/$s_!bXAy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7656418c-27d5-478a-9eba-6ba00c041844_2048x867.png 848w, https://substackcdn.com/image/fetch/$s_!bXAy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7656418c-27d5-478a-9eba-6ba00c041844_2048x867.png 1272w, https://substackcdn.com/image/fetch/$s_!bXAy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7656418c-27d5-478a-9eba-6ba00c041844_2048x867.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>It means you can use Claude, GPT-4, Gemini, DeepSeek, or a locally run model via Ollama. Plus, you can change models at any time without any reconfiguration.</p><p>When a message arrives at this layer, OpenClaw assembles a system prompt by combining the agent&#8217;s configured personality, list of available skills, relevant memory retrieved from previous sessions, and current conversation history.</p><p>It then calls the AI model with the full context.</p><p>If the model decides it needs to take an action, run a shell command, search the web, read a file, or call an API, it signals that as a tool call. OpenClaw executes the tool, captures the output, and feeds the result back into the conversation. The model reads the result and decides what to do next: call another tool, or produce a final answer.</p><p>This loop continues until the model generates a response with no further tool requests.</p><p>You can watch this entire loop unfold in real time in the Control UI at every tool call, every model response, every intermediate step&#8230;</p><p>And when a conversation reaches the model&#8217;s context limit, compaction<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-12" href="#footnote-12" target="_self">12</a> triggers automatically. The agent summarizes the conversation so far and replaces the raw message history with that summary, preserving continuity without the cost of loading thousands of tokens on every reply.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/openclaw-architecture?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/openclaw-architecture?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h3><strong>Plugin System: How OpenClaw Stays Modular</strong></h3><p>Almost everything in OpenClaw that is not part of the core engine is implemented as a plugin<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-13" href="#footnote-13" target="_self">13</a>:</p><ul><li><p>Telegram integration is a plugin.</p></li><li><p>GitHub access is a plugin.</p></li><li><p>The memory system is a plugin.</p></li></ul><p>This is what makes OpenClaw easy to extend and customize&#8230;</p><p>When OpenClaw starts, it reads your configuration and loads only the plugins you have enabled. Disabling a capability means changing one line in your config, while adding a community plugin is as simple as placing it in the right directory.</p><p>If you want to write your own, you just implement a defined interface and never touch the core engine.</p><p>Plugins can hook into lifecycle events at every meaningful point: <em>before and after tool calls, before and after messages, at session start and end.</em></p><p>This is how you add things like audit logging, rate limiting, or custom approval flows without touching the core agent at all&#8230;</p><h3><strong>Memory System: How Agent Remembers You</strong></h3><p>The memory solution (loading the entire conversation history into every prompt) fails quickly.</p><p>A week of active conversation will overflow any context window&#8230; Even if it fits,,, loading all of it on every reply is expensive and slow&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nDTf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbb0b502-1581-4457-a7b5-3a2c000688c6_2048x1251.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nDTf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbb0b502-1581-4457-a7b5-3a2c000688c6_2048x1251.png 424w, https://substackcdn.com/image/fetch/$s_!nDTf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbb0b502-1581-4457-a7b5-3a2c000688c6_2048x1251.png 848w, https://substackcdn.com/image/fetch/$s_!nDTf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbb0b502-1581-4457-a7b5-3a2c000688c6_2048x1251.png 1272w, https://substackcdn.com/image/fetch/$s_!nDTf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbb0b502-1581-4457-a7b5-3a2c000688c6_2048x1251.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nDTf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbb0b502-1581-4457-a7b5-3a2c000688c6_2048x1251.png" width="1456" height="889" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cbb0b502-1581-4457-a7b5-3a2c000688c6_2048x1251.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:889,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nDTf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbb0b502-1581-4457-a7b5-3a2c000688c6_2048x1251.png 424w, https://substackcdn.com/image/fetch/$s_!nDTf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbb0b502-1581-4457-a7b5-3a2c000688c6_2048x1251.png 848w, https://substackcdn.com/image/fetch/$s_!nDTf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbb0b502-1581-4457-a7b5-3a2c000688c6_2048x1251.png 1272w, https://substackcdn.com/image/fetch/$s_!nDTf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbb0b502-1581-4457-a7b5-3a2c000688c6_2048x1251.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>OpenClaw treats short-term continuity and long-term recall as separate problems that need different solutions:</p><ul><li><p><em>Short-term memory</em> keeps the current conversation on track. Every session is saved as a file on disk. When you send a message, OpenClaw loads only the recent part of that session into context, just enough to keep things flowing without loading everything at once.</p></li><li><p><em>Long-term memory</em> handles recall across days and weeks. All past sessions are indexed into a local database. Before the agent answers any message, the memory indexer searches the database and pulls in any relevant information automatically. If a conversation from five days ago is relevant, it gets included. You never have to remind the agent of something it already knows.</p></li><li><p><em>Standing notes</em> are a third layer. These are Markdown files you write yourself, stored in <code>~/.openclaw/workspace/memory/</code>. Think of them as a permanent briefing document. Your preferences, your projects, your context. The agent reads them every time.</p></li></ul><div><hr></div><div class="callout-block" data-callout="true"><p><em><strong>Reminder: this is a teaser of the subscriber-only newsletter series, exclusive to my golden members.</strong></em></p><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>High-level architecture of real-world systems.</strong></p></li><li><p>Deep dive into how popular real-world systems actually work.</p></li><li><p><strong>How real-world systems handle scale, reliability, and performance.</strong></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?yearly=true&quot;,&quot;text&quot;:&quot;Get Full Access Now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe?yearly=true"><span>Get Full Access Now</span></a></p></div><div><hr></div><h3><strong>How OpenClaw Handles Tasks That Take Hours</strong></h3><p>Most interactions with OpenClaw are fast&#8230; You send a message; the agent calls a tool, and replies in seconds.</p><p>But some tasks take much longer: <em>scraping hundreds of pages, running a full test suite, processing a large dataset, or coordinating a multi-step workflow across several APIs.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cwK3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3269bec0-6bac-4c6f-b955-9494da321811_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cwK3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3269bec0-6bac-4c6f-b955-9494da321811_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!cwK3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3269bec0-6bac-4c6f-b955-9494da321811_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!cwK3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3269bec0-6bac-4c6f-b955-9494da321811_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!cwK3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3269bec0-6bac-4c6f-b955-9494da321811_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cwK3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3269bec0-6bac-4c6f-b955-9494da321811_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3269bec0-6bac-4c6f-b955-9494da321811_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cwK3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3269bec0-6bac-4c6f-b955-9494da321811_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!cwK3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3269bec0-6bac-4c6f-b955-9494da321811_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!cwK3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3269bec0-6bac-4c6f-b955-9494da321811_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!cwK3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3269bec0-6bac-4c6f-b955-9494da321811_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>OpenClaw handles these differently:</p><p>When a task is going to take a while, the agent does NOT hold the conversation open and make you wait. It acknowledges the task, starts working, and messages you when it&#8217;s done. You can lock your phone and walk away.</p><p>The Gateway keeps running and messages you when the task is done.</p><p>For tasks that span multiple sessions or require human input mid-way, the agent completes what it can, writes the current state to <code>MEMORY.md</code> or a workspace file, and then pauses. When you return and pick up the conversation, it resumes from the saved state.</p><p>NO context gets lost between sessions because important state was written to disk, not held in memory.</p><p>For truly long-running background jobs, cron jobs in isolated session mode are the right tool. Each run gets its own session, completes its work, delivers output to your channel, and exits cleanly. The next scheduled run starts fresh.</p><p>This keeps long-running automation from accumulating stale context over time.</p><p>One practical limit worth knowing: <em>if a single tool call takes too long, Gateway will time out waiting for a response.</em></p><p>For tasks like large web scrapes or slow API calls, break the work into smaller chunks. Instruct the agent to process in batches and report progress between each batch, rather than attempting everything in a single tool call.</p><p><em>Ready for the best part?</em></p><h3><strong>How a Message Travels End to End</strong></h3>
      <p>
          <a href="https://newsletter.systemdesign.one/p/openclaw-architecture">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Design Knowledge Q&A System]]></title><description><![CDATA[#150: Part 3 - Generative AI Masterclass]]></description><link>https://newsletter.systemdesign.one/p/ai-based-knowledge-management-system</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/ai-based-knowledge-management-system</guid><dc:creator><![CDATA[Louis-François Bouchard]]></dc:creator><pubDate>Thu, 04 Jun 2026 11:13:56 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/ae818d73-27d9-4094-bfee-9b5c66e2a147_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://newsletter.systemdesign.one/subscribe?yearly=true" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RKN7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 424w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 848w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 1272w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RKN7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png" width="1280" height="300" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3689f342-2008-4ce6-b968-16461682508b_1280x300.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:300,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24224,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://newsletter.systemdesign.one/subscribe?yearly=true&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192435842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!RKN7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 424w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 848w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 1272w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/ai-based-knowledge-management-system/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><div><hr></div><p>If you ask questions about your company&#8217;s policies, a contract you have open in another tab, or anything outside training data, an AI personal chat assistant will either say <em>&#8220;I don&#8217;t know&#8221;</em> or invent something that sounds right.</p><p>The most direct fix is to include the relevant documents in the prompt.</p><p>But this only works if the documents fit within the context window, and most real knowledge bases do not. Company policies, contracts, and internal wikis are lengthy, and loading them in full for every query can be impractical because of latency, cost, and context window limitations.</p><p>Retrieval-Augmented Generation (<strong>RAG</strong>) is a more practical approach.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hxE7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3de60e6-1faf-463f-8f17-98d3deeb7b36_2048x904.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hxE7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3de60e6-1faf-463f-8f17-98d3deeb7b36_2048x904.png 424w, https://substackcdn.com/image/fetch/$s_!hxE7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3de60e6-1faf-463f-8f17-98d3deeb7b36_2048x904.png 848w, https://substackcdn.com/image/fetch/$s_!hxE7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3de60e6-1faf-463f-8f17-98d3deeb7b36_2048x904.png 1272w, https://substackcdn.com/image/fetch/$s_!hxE7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3de60e6-1faf-463f-8f17-98d3deeb7b36_2048x904.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hxE7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3de60e6-1faf-463f-8f17-98d3deeb7b36_2048x904.png" width="1456" height="643" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f3de60e6-1faf-463f-8f17-98d3deeb7b36_2048x904.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:643,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hxE7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3de60e6-1faf-463f-8f17-98d3deeb7b36_2048x904.png 424w, https://substackcdn.com/image/fetch/$s_!hxE7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3de60e6-1faf-463f-8f17-98d3deeb7b36_2048x904.png 848w, https://substackcdn.com/image/fetch/$s_!hxE7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3de60e6-1faf-463f-8f17-98d3deeb7b36_2048x904.png 1272w, https://substackcdn.com/image/fetch/$s_!hxE7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3de60e6-1faf-463f-8f17-98d3deeb7b36_2048x904.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The full RAG system: <em>an offline pipeline that builds the index, and an online pipeline that answers each query, sharing a vector database.</em></figcaption></figure></div><p>Instead of loading the entire knowledge base into every prompt, the system retrieves only the most relevant passages at query time, includes them alongside the question, and the model generates a cited answer from that material.</p><p>It is the architecture behind Perplexity, ChatGPT with Search, GitHub Copilot&#8217;s codebase search, and most internal AI tools inside large companies.</p><p>Onward.</p><div class="callout-block" data-callout="true"><h2><a href="https://agentfield.ai/github/pr-af/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-260602&amp;utm_id=sys_design-260602-pr-af&amp;utm_content=pr-af">Code review for AI-native engineering teams &#8212; at 1M+ LOC (Partner)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://agentfield.ai/github/pr-af/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-260602&amp;utm_id=sys_design-260602-pr-af&amp;utm_content=pr-af" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GcXK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e9753b-b5af-4812-98e6-16fb1e25ba9a_2048x1143.webp 424w, https://substackcdn.com/image/fetch/$s_!GcXK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e9753b-b5af-4812-98e6-16fb1e25ba9a_2048x1143.webp 848w, https://substackcdn.com/image/fetch/$s_!GcXK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e9753b-b5af-4812-98e6-16fb1e25ba9a_2048x1143.webp 1272w, https://substackcdn.com/image/fetch/$s_!GcXK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e9753b-b5af-4812-98e6-16fb1e25ba9a_2048x1143.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GcXK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e9753b-b5af-4812-98e6-16fb1e25ba9a_2048x1143.webp" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/79e9753b-b5af-4812-98e6-16fb1e25ba9a_2048x1143.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:305508,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:&quot;https://agentfield.ai/github/pr-af/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-260602&amp;utm_id=sys_design-260602-pr-af&amp;utm_content=pr-af&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/179236490?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e9753b-b5af-4812-98e6-16fb1e25ba9a_2048x1143.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!GcXK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e9753b-b5af-4812-98e6-16fb1e25ba9a_2048x1143.webp 424w, https://substackcdn.com/image/fetch/$s_!GcXK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e9753b-b5af-4812-98e6-16fb1e25ba9a_2048x1143.webp 848w, https://substackcdn.com/image/fetch/$s_!GcXK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e9753b-b5af-4812-98e6-16fb1e25ba9a_2048x1143.webp 1272w, https://substackcdn.com/image/fetch/$s_!GcXK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e9753b-b5af-4812-98e6-16fb1e25ba9a_2048x1143.webp 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Pull request review was built for human authors.</p><p>AI-written code fails differently, and most reviewer tools never caught up.</p><p>AgentField just open-sourced a multi-agent code reviewer built for AI-native teams:</p><ul><li><p>Runs on open or closed models &#8212; Kimi, DeepSeek, Claude &#8212; so it scales to whatever a team can afford.</p></li><li><p>Drops into GitHub Actions in minutes.</p></li><li><p>Built to make deep code review economically viable at scale.</p></li></ul><p>The architecture rationale &#8212; the four jobs of code review, and which three became load-bearing once AI took the first &#8212; is in AgentField&#8217;s latest <strong><a href="https://agentfield.ai/blog/ai-native-code-review/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-260602&amp;utm_id=sys_design-260602-blog-ai-native-code-review&amp;utm_content=blog-ai-native-code-review">post</a></strong>.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agentfield.ai/github/pr-af/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-260602&amp;utm_id=sys_design-260602-pr-af&amp;utm_content=pr-af&quot;,&quot;text&quot;:&quot;Star &amp; Deploy&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agentfield.ai/github/pr-af/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-260602&amp;utm_id=sys_design-260602-pr-af&amp;utm_content=pr-af"><span>Star &amp; Deploy</span></a></p></div><div><hr></div><p>I want to reintroduce <strong><a href="https://louisbouchard.substack.com/welcome">Louis-Fran&#231;ois Bouchard</a> </strong>as the author of this newsletter.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://louisbouchard.substack.com/welcome" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8ezx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 424w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 848w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 1272w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8ezx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png" width="1100" height="220" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:220,&quot;width&quot;:1100,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:null,&quot;href&quot;:&quot;https://louisbouchard.substack.com/welcome&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!8ezx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 424w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 848w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 1272w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>He&#8217;s a best-selling author (<a href="https://amzn.to/4bqYU9b">Building LLMs for Production</a>), the co-founder of <a href="https://academy.towardsai.net/?ref=1f9b29">Towards AI</a>, and the creator of the YouTube Channel, <a href="https://www.youtube.com/@whatsai?sub_confirmation=1">What&#8217;s AI</a>, where he helps people understand AI and learn how to apply it in the real world.</p><p>Through his development work with clients and his content, teaching, and AI training programs on the <strong><a href="https://academy.towardsai.net/?ref=1f9b29">Towards AI Academy</a></strong>, Louis focuses on making AI practical for builders, engineers, and curious learners alike.</p><p>At Towards AI, he and his team train AI engineers through courses built for every stage, from beginner to advanced. That educational mission and the real-world experience building for his clients are exactly why I wanted him in this newsletter series.</p><div><hr></div><p><em><strong>Here&#8217;s what&#8217;s inside this newsletter:</strong></em></p><ul><li><p><strong>The full RAG architecture.</strong> How a question becomes an embedding, how the system finds the right chunks across a large knowledge base, and how the model generates an answer grounded in retrieved sources.</p></li><li><p><strong>Document ingestion and chunking.</strong> What makes a chunk useful, three strategies for splitting documents, and the metadata that makes citations possible.</p></li><li><p><strong>Retrieval and generation in production.</strong> Hybrid search, reranking, query transformation, system prompts, citation strategies, and how to make the system decline questions it cannot answer rather than guess.</p></li><li><p><strong>Failure modes and evaluation.</strong> The three ways RAG breaks in practice, the four RAGAS metrics that surface them, and how to build a test set that catches regressions.</p></li><li><p><strong>Production concerns.</strong> Caching, access control, monitoring, and cost at scale.</p></li><li><p><strong>A practical eight-step build.</strong> A folder of PDFs turned into a working Q&amp;A system with hybrid retrieval, reranking, structured citations, and a quality scorecard.</p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Golden members get all posts like these!&#8230;</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2>Why Build Your Own Knowledge Q&amp;A System?</h2><p>Perplexity, ChatGPT with browsing, and Gemini with Search already answer questions with cited sources.</p><p>They are capable and improving fast.</p><p>But the limitation is that they are built for public knowledge. The moment your questions involve internal documents, regulated data, or a knowledge base within your infrastructure, off-the-shelf tools hit hard constraints.</p><p>Here is where they fall short:</p><ul><li><p><strong>Off-the-shelf tools cannot access regulated data.</strong> You can upload personal files into a ChatGPT project and get useful answers. You cannot do that with customer records, contracts, medical files, or anything covered by HIPAA, GDPR, or SOC 2. That data has to stay inside your own perimeter, behind your own access controls, with your own audit trail.</p></li><li><p><strong>Your knowledge is not in a folder you can drag and drop.</strong> It lives in Confluence, in a Drive folder with thousands of contracts, in a support tool, or in an internal wiki. Off-the-shelf assistants can connect to some of these, but the data flows out of your infrastructure and into theirs, and is processed under their privacy policy. A system built on your infrastructure reads directly from those sources, using your auth, and ensures data never leaves your perimeter.</p></li><li><p><strong>Retrieval happens at query time, so answers stay current.</strong> A model&#8217;s training data has a cutoff, but your documents keep changing. Because RAG retrieves at query time, a document updated after the model was trained is still used to answer the next question, with no retraining required.</p></li><li><p><strong>You control what the system knows.</strong> When a document is added to the knowledge base, the system uses it for the next query. When it is removed, it is no longer used. When it is updated, the next answer reflects the new version. This control is what lets you curate, version, and audit the knowledge base, and it is what regulated industries require.</p></li><li><p><strong>Focused retrieval is more accurate and cheaper.</strong> A model answering a question from five selected passages is working with a much smaller, more relevant input than one drawing on its full training data. More relevant context produces more accurate answers, and shorter prompts cost less per call.</p></li></ul><p>These five constraints all point to the same architecture.</p><p>The rest of this newsletter builds on it&#8230;</p><div><hr></div><h2>Part 1: Turning Documents into Searchable Knowledge</h2><p>A RAG system has two stages:</p><ul><li><p>First runs once, or whenever new documents are added: it processes your documents, converts them into vector representations, and stores them in a database that can be searched by meaning.</p></li><li><p>Second runs on every user query. This part focuses on the first: ingestion, chunking, embedding, and storage.</p></li></ul><h3>Document Ingestion and Preprocessing</h3><p>Before anything can be retrieved, your documents need to be readable by the system.</p><p>That sounds straightforward, but documents come in many formats: PDFs, HTML pages, Word files, Markdown exports from Notion, Confluence pages, and scanned forms. Each format stores text differently and needs its own parser.</p><p>A poor extraction at this stage degrades everything downstream, so it is worth getting right.</p><p>There are five things to handle during ingestion:</p><ol><li><p><strong>Text extraction.</strong> As text is extracted, the structural information alongside it is preserved: headings, page numbers, and table boundaries. This is what allows the system to later cite not just a document, but the specific section and page a passage came from.</p></li><li><p><strong>Tables, code, and images.</strong> These elements break when treated as regular text. A table flattened into a paragraph loses its row-column relationships. Code blocks lose meaning when indentation and line breaks are stripped. Images contain no text at all unless a caption is present or OCR is run on them. Each is typically handled through a separate extraction path.</p></li><li><p><strong>Scanned PDFs.</strong> A digital PDF stores text as text. A scanned PDF stores it as an image, which a regular parser cannot read. Optical Character Recognition (<strong>OCR</strong>) is used for converting scanned documents into readable text. But OCR takes several seconds per page, which adds time to the ingestion stage. Tesseract is the standard library for clean scans. Vision-language models work better on low-quality images and complex layouts.</p></li><li><p><strong>Duplicates.</strong> When the same document exists in multiple folders, it can be indexed multiple times, which causes retrieval to return identical chunks. A common approach is to hash each document&#8217;s content and skip any document whose hash has already been indexed. For near-duplicates such as different versions of the same contract, fuzzy text matching is used instead.</p></li><li><p><strong>Boilerplate.</strong> Headers, footers, navigation menus, and copyright notices add noise without adding information. These are typically identified by how frequently they repeat across pages and stripped before the text is passed downstream.</p></li></ol><p>Clean text is necessary but not sufficient.</p><p>Before it can be embedded and searched, it has to be cut into pieces small enough for the embedding model to represent accurately.</p><h3>Chunking</h3><p>A long document cannot be embedded as a single vector.</p><p>An embedding that represents 50 pages of mixed content does not align well with a specific question. Instead, each document is split into smaller pieces called chunks, and each chunk is embedded individually.</p><p>Different chunk sizes have different tradeoffs&#8230;</p><p><em>Larger chunks</em> carry more context but produce less specific embeddings. <em>Smaller chunks</em> produce more specific embeddings but may lack the surrounding context needed to fully answer a question. A<em> well-formed chunk</em> is self-contained, covers a single topic, and includes its section heading and document title.</p><p>Here are three common strategies for splitting documents into chunks:</p><ol><li><p><strong>Fixed-size chunking</strong> splits the text into chunks of N tokens, with a fixed overlap, usually 50 to 100 tokens. The overlap ensures that a sentence cut at a chunk boundary still appears in full in one of the two adjacent chunks. It is the simplest approach and works well on uniform text such as articles and documentation.</p></li><li><p><strong>Recursive chunking</strong> splits on paragraph boundaries first, then sentences, then words, going to a finer level only when a chunk is still too large. It follows the structure already present in the document rather than imposing an arbitrary boundary. Most popular RAG libraries, including LangChain and LlamaIndex, use this as their default.</p></li><li><p><strong>Semantic chunking</strong> splits a document into sentences, then forms small groups of consecutive sentences and creates an embedding for each group. The embedding of each group is compared with that of the previous group, and a chunk boundary is placed where the difference between the two embeddings is large. The assumption is that a large change in the embedding indicates a change in topic. Semantic chunking performs better than the other methods when topics shift within paragraphs rather than across them, but it requires an embedding call for each sentence group at ingestion time, making it the most expensive of the three.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L5YK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32e6c2c2-9afe-4ec4-b7dc-e5ed9aab6029_2048x1063.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L5YK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32e6c2c2-9afe-4ec4-b7dc-e5ed9aab6029_2048x1063.png 424w, https://substackcdn.com/image/fetch/$s_!L5YK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32e6c2c2-9afe-4ec4-b7dc-e5ed9aab6029_2048x1063.png 848w, https://substackcdn.com/image/fetch/$s_!L5YK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32e6c2c2-9afe-4ec4-b7dc-e5ed9aab6029_2048x1063.png 1272w, https://substackcdn.com/image/fetch/$s_!L5YK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32e6c2c2-9afe-4ec4-b7dc-e5ed9aab6029_2048x1063.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L5YK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32e6c2c2-9afe-4ec4-b7dc-e5ed9aab6029_2048x1063.png" width="1456" height="756" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/32e6c2c2-9afe-4ec4-b7dc-e5ed9aab6029_2048x1063.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:756,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L5YK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32e6c2c2-9afe-4ec4-b7dc-e5ed9aab6029_2048x1063.png 424w, https://substackcdn.com/image/fetch/$s_!L5YK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32e6c2c2-9afe-4ec4-b7dc-e5ed9aab6029_2048x1063.png 848w, https://substackcdn.com/image/fetch/$s_!L5YK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32e6c2c2-9afe-4ec4-b7dc-e5ed9aab6029_2048x1063.png 1272w, https://substackcdn.com/image/fetch/$s_!L5YK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32e6c2c2-9afe-4ec4-b7dc-e5ed9aab6029_2048x1063.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A common starting point is recursive chunking at 500 to 800 tokens with a 50- to 100-token overlap.</p><p>Each chunk is tagged with metadata: <em>source document, page number, section heading, and last-modified date</em>. This metadata is what the system uses to produce citations later.</p><p>The chunks are now ready to be searched. Searching by exact word match would miss most relevant results, since users rarely use the same words that documents do.</p><p>Each chunk is therefore converted into a numeric vector that captures its meaning&#8230;</p><h3>Embeddings</h3><p>These vectors are called embeddings: <em>fixed-length vectors of numbers that represent the meaning of a piece of text.</em></p><p>Two chunks with similar meaning produce vectors that are close to each other; two chunks on different topics produce vectors that are far apart.</p><p>This is what allows a query like <em>&#8220;reset my password&#8221;</em> to retrieve a document titled <em>&#8220;account credential recovery,&#8221;</em> even though the two share no words in common.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EiIB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0cf2a4e-6d51-41ba-bb0c-6abd43321882_2048x1073.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EiIB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0cf2a4e-6d51-41ba-bb0c-6abd43321882_2048x1073.png 424w, https://substackcdn.com/image/fetch/$s_!EiIB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0cf2a4e-6d51-41ba-bb0c-6abd43321882_2048x1073.png 848w, https://substackcdn.com/image/fetch/$s_!EiIB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0cf2a4e-6d51-41ba-bb0c-6abd43321882_2048x1073.png 1272w, https://substackcdn.com/image/fetch/$s_!EiIB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0cf2a4e-6d51-41ba-bb0c-6abd43321882_2048x1073.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EiIB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0cf2a4e-6d51-41ba-bb0c-6abd43321882_2048x1073.png" width="1456" height="763" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e0cf2a4e-6d51-41ba-bb0c-6abd43321882_2048x1073.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:763,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EiIB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0cf2a4e-6d51-41ba-bb0c-6abd43321882_2048x1073.png 424w, https://substackcdn.com/image/fetch/$s_!EiIB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0cf2a4e-6d51-41ba-bb0c-6abd43321882_2048x1073.png 848w, https://substackcdn.com/image/fetch/$s_!EiIB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0cf2a4e-6d51-41ba-bb0c-6abd43321882_2048x1073.png 1272w, https://substackcdn.com/image/fetch/$s_!EiIB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0cf2a4e-6d51-41ba-bb0c-6abd43321882_2048x1073.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Some commonly used embedding models in 2026 include OpenAI&#8217;s text-embedding-3-large, Cohere&#8217;s embed-v4, and open-source options such as bge-large and e5-mistral.</p><p>The typical starting point is an API model, with a move to self-hosted open-source if data residency, privacy, or cost becomes a constraint. For specialized domains such as legal, biomedical, or financial text, a domain-specific model generally outperforms a general-purpose one.</p><p>The MTEB benchmark is a useful reference for comparing models.</p><p>Embeddings have three limitations worth understanding before choosing a retrieval strategy:</p><ul><li><p><strong>Negation.</strong> &#8220;Drugs that don&#8217;t interact with warfarin&#8221; and &#8220;drugs that interact with warfarin&#8221; produce nearly identical embeddings. The word &#8220;don&#8217;t&#8221; has very little effect on the resulting vector, even though it inverts the meaning.</p></li><li><p><strong>Exact identifiers.</strong> Order numbers, error codes, and SKUs such as E-4012 carry no semantic meaning, so similarity searches cannot reliably match them. Hybrid search (covered in Part 2) is used in these cases.</p></li><li><p><strong>Numeric filters.</strong> A query like &#8220;customers with revenue above $5M&#8221; is a structured filter, not a similarity match. Structured retrieval (covered in Part 2) is used for these queries.</p></li></ul><p>Another operational constraint is that the model used to embed the chunks must also be used to embed queries during retrieval.</p><p>The two share a single vector space, and switching models means re-embedding the entire corpus.</p><p>The choice of embedding model is effectively a long-term commitment&#8230;</p><h3>Vector Databases</h3><p>Once the chunks are embedded, they need to be stored in a system that supports fast similarity search.</p><p>A vector database stores each chunk&#8217;s embedding alongside its metadata and returns the closest matches to a query vector. The metadata, source document, page number, section heading, and last-modified date are returned with each result, so the system can point the user back to the source.</p><p>The basic operation is similarity search: <em>given a query vector, return the K vectors in the database that are closest to it.</em></p><p>Closeness is usually measured using cosine similarity, which compares two vectors based on the angle between them.</p><p>A direct implementation of this compares the query vector against every vector in the database. This works for small databases of a few thousand vectors, but the cost grows linearly with the database size and becomes impractical at scales of millions of vectors.</p><p>Production systems use <strong>approximate nearest neighbor (ANN)</strong> search, a family of algorithms that returns close-to-exact results at a fraction of the cost. ANN trades a small amount of recall for a large reduction in latency, often by orders of magnitude, with the exact tradeoff depending on the dataset, the index, and the recall target.</p><p>Hierarchical Navigable Small World (<strong>HNSW</strong>) and Inverted File (<strong>IVF</strong>)  are two of the most commonly used ANN index types. Some managed vector databases abstract this choice entirely, while others let you pick the index and tune its parameters.</p><p>Here are three deployment options, each suited to different constraints:</p><ul><li><p><strong>Managed services</strong> such as Pinecone, Turbopuffer, and Weaviate Cloud. The service runs the infrastructure and exposes an API. This is appropriate when the team does not want to operate the database directly.</p></li><li><p><strong>Self-hosted open-source</strong> options such as Weaviate, Qdrant, and Milvus. These run on your own infrastructure and offer full control over configuration, scaling, and metadata filtering.</p></li><li><p><strong>Database extensions</strong> such as pgvector for Postgres. These add vector search to an existing relational database. A practical option when the stack already runs Postgres, and the corpus stays under approximately one million chunks.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!i_t4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71c4d104-7c9a-4562-a117-61951eab0353_2048x1282.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!i_t4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71c4d104-7c9a-4562-a117-61951eab0353_2048x1282.png 424w, https://substackcdn.com/image/fetch/$s_!i_t4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71c4d104-7c9a-4562-a117-61951eab0353_2048x1282.png 848w, https://substackcdn.com/image/fetch/$s_!i_t4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71c4d104-7c9a-4562-a117-61951eab0353_2048x1282.png 1272w, https://substackcdn.com/image/fetch/$s_!i_t4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71c4d104-7c9a-4562-a117-61951eab0353_2048x1282.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!i_t4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71c4d104-7c9a-4562-a117-61951eab0353_2048x1282.png" width="1456" height="911" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/71c4d104-7c9a-4562-a117-61951eab0353_2048x1282.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:911,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!i_t4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71c4d104-7c9a-4562-a117-61951eab0353_2048x1282.png 424w, https://substackcdn.com/image/fetch/$s_!i_t4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71c4d104-7c9a-4562-a117-61951eab0353_2048x1282.png 848w, https://substackcdn.com/image/fetch/$s_!i_t4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71c4d104-7c9a-4562-a117-61951eab0353_2048x1282.png 1272w, https://substackcdn.com/image/fetch/$s_!i_t4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71c4d104-7c9a-4562-a117-61951eab0353_2048x1282.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The vector database stores each chunk&#8217;s vector alongside its metadata. At query time, the same embedding model produces the query vector, and similarity search returns the closest chunks with their source attached.</figcaption></figure></div><p>With the index built and the chunks stored, the offline pipeline is complete. The next question is how the system uses it when a user actually asks a question.</p><div><hr></div><div class="callout-block" data-callout="true"><p><em><strong>Reminder: this is a teaser of the subscriber-only newsletter series, exclusive to my golden members.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://newsletter.systemdesign.one/subscribe?yearly=true" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3mfm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 424w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 848w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 1272w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3mfm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png" width="1280" height="300" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:300,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24224,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://newsletter.systemdesign.one/subscribe?yearly=true&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192435842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!3mfm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 424w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 848w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 1272w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>Simple breakdown of real-world architectures</strong></p></li><li><p>Frameworks you can plug into your work or business</p></li><li><p><strong>Proven systems behind ChatGPT, Perplexity, and Copilot</strong></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?yearly=true&quot;,&quot;text&quot;:&quot;Unlock Full Access&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://newsletter.systemdesign.one/subscribe?yearly=true"><span>Unlock Full Access</span></a></p></div><div><hr></div><h2>Part 2: Finding the Right Information</h2><p>This part covers how the system returns the chunks that contain the answer&#8230;</p><h3>Basic Retrieval</h3><p>The default retrieval approach has three steps: <em>embed the question using the same model used to embed the chunks, run a similarity search against the vector database, and return the top K results.</em></p><p><em>K is a tradeoff parameter.</em></p><p>Too small, and the system misses chunks that contain the answer. Too large, and the prompt fills with irrelevant material, which reduces answer quality and increases cost. A reasonable starting value is between 5 and 10.</p><p>The right value for your system is determined over time by testing on real queries. It depends on the expected answer length and how much room the LLM&#8217;s context window leaves for the retrieved chunks.</p><p>Basic retrieval works well on direct questions over a uniform corpus, but it has three common failure modes:</p><ul><li><p><strong>Vocabulary gap.</strong> User uses one word, and the document uses another. For example, the user types &#8220;downtime&#8221; and the document uses &#8220;service interruption.&#8221; Acronyms, internal jargon, and exact identifiers often fall outside the scope of what a general-purpose embedding model has been trained on.</p></li><li><p><strong>Multi-hop questions.</strong> Answer is spread across many documents that are not semantically similar. For example, the question <em>&#8220;what did the CFO say about the product launch?&#8221;</em> requires information from both an earnings transcript and a product roadmap. The two documents do not embed near each other in vector space, so similarity search alone is unlikely to retrieve both.</p></li><li><p><strong>Ambiguous queries.</strong> Query has more than one valid interpretation. For example, <em>&#8220;tell me about Apple&#8221;</em> could refer to the fruit or the company. Without disambiguation, similarity search returns results for one interpretation without indicating that another was possible.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IXTp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f260630-0170-423a-b556-3864bf000e68_2048x1064.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IXTp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f260630-0170-423a-b556-3864bf000e68_2048x1064.png 424w, https://substackcdn.com/image/fetch/$s_!IXTp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f260630-0170-423a-b556-3864bf000e68_2048x1064.png 848w, https://substackcdn.com/image/fetch/$s_!IXTp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f260630-0170-423a-b556-3864bf000e68_2048x1064.png 1272w, https://substackcdn.com/image/fetch/$s_!IXTp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f260630-0170-423a-b556-3864bf000e68_2048x1064.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IXTp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f260630-0170-423a-b556-3864bf000e68_2048x1064.png" width="1456" height="756" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6f260630-0170-423a-b556-3864bf000e68_2048x1064.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:756,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IXTp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f260630-0170-423a-b556-3864bf000e68_2048x1064.png 424w, https://substackcdn.com/image/fetch/$s_!IXTp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f260630-0170-423a-b556-3864bf000e68_2048x1064.png 848w, https://substackcdn.com/image/fetch/$s_!IXTp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f260630-0170-423a-b556-3864bf000e68_2048x1064.png 1272w, https://substackcdn.com/image/fetch/$s_!IXTp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f260630-0170-423a-b556-3864bf000e68_2048x1064.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The vocabulary gap and the exact-identifier problem both come from the same source: <em>similarity search alone cannot match words it does not understand semantically.</em> Hybrid search adds a second retrieval path that can.</p><p>The third failure mode, ambiguous queries, is handled by a different technique covered later in this newsletter&#8230;</p><p><em>Let&#8217;s keep going!</em></p><h3>Hybrid Search</h3>
      <p>
          <a href="https://newsletter.systemdesign.one/p/ai-based-knowledge-management-system">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[How to get ahead of 99% of software engineers with AI agents]]></title><description><![CDATA[#149: Change your software workflow with AI agents]]></description><link>https://newsletter.systemdesign.one/p/agentic-ai-use-cases</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/agentic-ai-use-cases</guid><dc:creator><![CDATA[Neo Kim]]></dc:creator><pubDate>Thu, 28 May 2026 09:59:35 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4bd7282b-9f9c-4887-9844-2bfe23eb42f2_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/agentic-ai-use-cases/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><div><hr></div><p>You hit an error you can&#8217;t solve&#8230;</p><p>So you paste it into Claude Code, and the agent tries a few fixes, but NONE of them work. The bug is affecting checkout, which makes it a money problem, so you drop it in Slack and explain the whole thing again to the senior responsible for that flow.</p><p>She asks good questions&#8230; You answer them&#8230;</p><p>Then someone says, <em>&#8220;Make a ticket so we don&#8217;t lose this,&#8221;</em> and you type the same context into Jira a third time. Same problem, three tools, and you wrote it out fresh for everyone.</p><p>But none of them remembered what you&#8217;d already said in the last&#8230;</p><p>The bug happened the way most bugs appear these days: <em>you just told a coding agent the feature you had in mind. </em>This is how the job works now,,, and most of us don&#8217;t know where agents fit into the traditional software development lifecycle&#8230;</p><p>So let&#8217;s learn about it in this newsletter&#8230;</p><p>We start with what an agent actually is, because the word gets stretched over everything from autocomplete to a bot that opens its own pull requests<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>. Then, we&#8217;ll cover <em>eight stages</em> every team moves software through, from planning a feature to keeping it alive in production.</p><p>For each stage, we answer one question: <em>can you hand this to an agent today, and what happens when you do?</em></p><p>In some stages, agents already carry real weight&#8230; In others, they make things worse&#8230;</p><p>By the end of this newsletter, you&#8217;ll know which is which, and the harder problem hiding underneath all of it.</p><p>Onward.</p><div><hr></div><div class="callout-block" data-callout="true"><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://coderabbit.link/neo-agent" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GfjP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 424w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 848w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1272w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GfjP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png" width="1248" height="654" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:654,&quot;width&quot;:1248,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:547609,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://coderabbit.link/neo-agent&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192885623?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!GfjP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 424w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 848w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1272w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I&#8217;m happy to partner with <strong><a href="https://coderabbit.link/neo-agent">CodeRabbit</a></strong> on this newsletter.</p><p>One thing I believe, after researching this newsletter, is that the biggest problem with AI agents in software development is NOT code generation. It&#8217;s the context loss between tools, tickets, reviews, and Slack threads.</p><p>CodeRabbit is building around exactly this problem&#8230;</p><p>Their new <strong><a href="https://coderabbit.link/neo-agent">Slack agent</a> </strong>maintains shared context across the software development lifecycle, so agents don&#8217;t have to start from scratch every time work moves between coding, testing, incidents, and reviews.</p><p>They already run AI code reviews across 6M repositories for 15,000+ teams.</p><p>And I genuinely think the idea of shared memory across the SDLC is one of the more important directions in agentic engineering right now.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://coderabbit.link/neo-agent&quot;,&quot;text&quot;:&quot;Try CodeRabbit's Agent Today&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://coderabbit.link/neo-agent"><span>Try CodeRabbit's Agent Today</span></a></p></div><div><hr></div><h2><strong>What an AI agent is</strong></h2><p>An agent takes a goal, picks its own next step, uses tools<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>, and keeps going until the job is done or it gets stuck.</p><p>That&#8217;s the whole idea&#8230;</p><p>When you hit the checkout bug and pasted it into Claude Code<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>, you didn&#8217;t tell it which files to open or what to run. Claude Code worked out the rest on its own:</p><ul><li><p>Read files,</p></li><li><p>Grep<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> for functions,</p></li><li><p>Run tests,</p></li><li><p>Read errors,</p></li><li><p>Try again.</p></li></ul><p>&#8230;and you watched.</p><p>&#8220;Agent&#8221; gets used loosely, so it helps to line up the tools by how much they do on their own.</p><p>The simplest is AI AUTOCOMPLETE, like the earlier versions of GitHub Copilot<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a>. It finishes the line you&#8217;re typing with a Tab.</p><p>A step up is a CHATBOT: <em>you ask, it answers, and you copy what&#8217;s useful back into your editor. </em>That&#8217;s your ChatGPT.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k0em!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50090bad-05db-40db-bd4a-3bec0323bbae_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k0em!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50090bad-05db-40db-bd4a-3bec0323bbae_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!k0em!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50090bad-05db-40db-bd4a-3bec0323bbae_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!k0em!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50090bad-05db-40db-bd4a-3bec0323bbae_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!k0em!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50090bad-05db-40db-bd4a-3bec0323bbae_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k0em!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50090bad-05db-40db-bd4a-3bec0323bbae_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/50090bad-05db-40db-bd4a-3bec0323bbae_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;How much does it do on its own: autocomplete, then chatbot, then agent &#8212; a rising ladder of autonomy&quot;,&quot;title&quot;:&quot;How much does it do on its own: autocomplete, then chatbot, then agent &#8212; a rising ladder of autonomy&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="How much does it do on its own: autocomplete, then chatbot, then agent &#8212; a rising ladder of autonomy" title="How much does it do on its own: autocomplete, then chatbot, then agent &#8212; a rising ladder of autonomy" srcset="https://substackcdn.com/image/fetch/$s_!k0em!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50090bad-05db-40db-bd4a-3bec0323bbae_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!k0em!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50090bad-05db-40db-bd4a-3bec0323bbae_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!k0em!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50090bad-05db-40db-bd4a-3bec0323bbae_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!k0em!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50090bad-05db-40db-bd4a-3bec0323bbae_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>An AGENT is the one who does the work&#8230; You give it a bug, and it reads files, runs tests, edits code until the job is done.</p><p>To do this, it has to keep track of what it has already tried and what it learned along the way. That running context is what lets it pick a sensible next step instead of starting over each time.</p><p>One agent manages its own context fine&#8230;</p><p>But it becomes difficult when you have to run several agents at once, and that difficulty is largely what makes agents hard to use well.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!COvU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F134f4524-771a-4b27-81be-8535a5c19489_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!COvU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F134f4524-771a-4b27-81be-8535a5c19489_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!COvU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F134f4524-771a-4b27-81be-8535a5c19489_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!COvU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F134f4524-771a-4b27-81be-8535a5c19489_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!COvU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F134f4524-771a-4b27-81be-8535a5c19489_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!COvU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F134f4524-771a-4b27-81be-8535a5c19489_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/134f4524-771a-4b27-81be-8535a5c19489_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;What makes it an agent, not a chatbot: it loops &#8212; act, read the result, update its running context, pick the next step&quot;,&quot;title&quot;:&quot;What makes it an agent, not a chatbot: it loops &#8212; act, read the result, update its running context, pick the next step&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="What makes it an agent, not a chatbot: it loops &#8212; act, read the result, update its running context, pick the next step" title="What makes it an agent, not a chatbot: it loops &#8212; act, read the result, update its running context, pick the next step" srcset="https://substackcdn.com/image/fetch/$s_!COvU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F134f4524-771a-4b27-81be-8535a5c19489_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!COvU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F134f4524-771a-4b27-81be-8535a5c19489_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!COvU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F134f4524-771a-4b27-81be-8535a5c19489_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!COvU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F134f4524-771a-4b27-81be-8535a5c19489_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Software gets built in eight stages, and every team moves through them with or without agents:</p><ul><li><p>Plan: decide what to build,</p></li><li><p>Design: decide how it&#8217;s structured,</p></li><li><p>Code: write it,</p></li><li><p>Test: prove it does what it should,</p></li><li><p>Review: check the work before it ships,</p></li><li><p>Deploy: ship it to production,</p></li><li><p>Operate: keep it running and watch it,</p></li><li><p>Maintain: fix, update, and clean up over time.</p></li></ul><p>In theory, you can put an agent to work in any of these stages. Yet it behaves differently in each&#8230;</p><div><hr></div><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/agentic-ai-use-cases?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption"><em>Get 1+ referral &amp; I&#8217;ll send you my Leetcode Master Template!</em></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/agentic-ai-use-cases?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/agentic-ai-use-cases?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h2><strong>Software lifecycle: where agents fit</strong></h2><p>The same agent that looks brilliant in a demo can be useless on your actual codebase.</p><p>What changes between the two is how much it costs you when the agent is wrong, and this comes down to 4 things:</p><ul><li><p><strong>How fast you find out.</strong> A failing test tells you in seconds. While a bad design decision can go unnoticed for months.</p></li><li><p><strong>How easily you can check the answer. </strong>You can read a <code>git diff </code>and judge it. <em>&#8220;Is this the right thing to build&#8221;</em> has no such test.</p></li><li><p><strong>How much breaks.</strong> A wrong function fails one test. A wrong deployment takes down system for everyone.</p></li><li><p><strong>Whether you can undo it.</strong> You can revert a commit. But you can&#8217;t uncorrupt the data a bad database migration<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a> already wrote.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AugC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e45b94-4c09-4509-be52-22cc68082a26_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AugC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e45b94-4c09-4509-be52-22cc68082a26_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!AugC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e45b94-4c09-4509-be52-22cc68082a26_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!AugC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e45b94-4c09-4509-be52-22cc68082a26_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!AugC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e45b94-4c09-4509-be52-22cc68082a26_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AugC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e45b94-4c09-4509-be52-22cc68082a26_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/43e45b94-4c09-4509-be52-22cc68082a26_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;What it costs when the agent is wrong: four factors, contrasting a demo with your real work&quot;,&quot;title&quot;:&quot;What it costs when the agent is wrong: four factors, contrasting a demo with your real work&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="What it costs when the agent is wrong: four factors, contrasting a demo with your real work" title="What it costs when the agent is wrong: four factors, contrasting a demo with your real work" srcset="https://substackcdn.com/image/fetch/$s_!AugC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e45b94-4c09-4509-be52-22cc68082a26_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!AugC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e45b94-4c09-4509-be52-22cc68082a26_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!AugC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e45b94-4c09-4509-be52-22cc68082a26_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!AugC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e45b94-4c09-4509-be52-22cc68082a26_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A demo is set up to do well on all four&#8230;Small task, instant feedback, nothing real at stake&#8230; Yet your work usually isn&#8217;t.</p><p>So, for each stage below, the real question is the same: <em>when the agent gets it wrong, how much does that cost you?</em></p><h3><strong>1. Plan</strong></h3><p>Planning is deciding what your team builds next.</p><p>Which feature out of the ten on the list, why that one, and when it ships. It happens in meetings, in docs, in back-and-forth between engineers and the people who run the product&#8230;</p><p>An agent does useful work here.</p><p>Give it a feature spec, and it reads your codebase, maps what the change will touch, flags the parts nobody pinned down, and writes the tickets faster than you would by hand. Hand it the boring write-ups, and it does them well.</p><p>But decision-making is something an agent CAN&#8217;T do:</p><p>Choosing what to build means weighing customer urgency, revenue, and risk against three people who each want something different, then standing behind the choice when it goes wrong. An agent has no stake in the outcome and doesn&#8217;t understand your business well enough to make that call.</p><p>The people with the most reason to claim otherwise say the same thing.</p><blockquote><p>Boris Cherny, who created Claude Code, was asked why Anthropic keeps hiring engineers when the company says Claude now writes most of its code. His answer:</p><p><em>&#8220;Someone has to prompt the Claudes, talk to customers, coordinate with other teams, and decide what to build next. Engineering is changing, and great engineers are more important than ever.&#8221;</em></p></blockquote><p>i.e., company shipping the most capable coding agent on the market still needs actual humans for planning.</p><h3><strong>2. Design</strong></h3><p>Plan is which problem you solve&#8230;</p><p>Design is the shape of the solution: <em>one service or three, which database, where the boundaries sit, and what you can change later without a rewrite.</em></p><p>Ask an agent, and you <em>will</em> get a real answer.</p><p>It will compare microservices against a monolith<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a>, lay out the tradeoffs, and recommend one. Yet the answer is textbook, and this is the problem. A textbook answer doesn&#8217;t know your team is five people, your traffic triples every December, or the last person who touched the payments service quit and left no documentation. So the right design for your situation lives in facts that were never written down, and an agent can&#8217;t read what nobody wrote.</p><p>There&#8217;s a second reason this stage stays with humans&#8230;</p><p>A bad design is slow to show itself. The code passes review, and tests go green, so nothing indicates the structure is wrong. But it&#8217;s only until you try to build the next three features on top of it, and each one takes twice as long and breaks something that used to work.</p><p>By the time you see the pattern, fix is no longer a code change. It&#8217;s a complete rewrite&#8230;</p><p>The root of it is agent works on the task in front of it without a picture of the entire system or any reason to keep the system simple. This blindness shows up in two ways in design.</p><p><em><strong>The first is duplication:</strong></em></p><p>GitClear analyzed code changes across thousands of repositories, found a sharp rise in duplicated code after teams adopted AI coding tools. In 2024 alone, the number of code blocks with five or more duplicated lines increased 8x.</p><p>The agent writes a new block that does the same job as one already sitting three files over, because it never checked what the codebase already had. Every copy is one more place you have to remember to fix later, and this is a structural cost.</p><p><em><strong>And second is the opposite problem: over-engineering.</strong></em></p><p>Ask an agent to design something, and it tends to overdesign.</p><p>It adds layers, patterns, and configuration for cases that will never come up because more structure looks more thorough.</p><blockquote><p>Andrej Karpathy, who co-founded OpenAI, put it plainly: <em>models like to overcomplicate, bloat their abstractions, and leave dead code behind.</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8cdJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e9dffda-10e1-45a5-81d1-ac21259a5be4_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8cdJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e9dffda-10e1-45a5-81d1-ac21259a5be4_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8cdJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e9dffda-10e1-45a5-81d1-ac21259a5be4_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8cdJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e9dffda-10e1-45a5-81d1-ac21259a5be4_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8cdJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e9dffda-10e1-45a5-81d1-ac21259a5be4_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8cdJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e9dffda-10e1-45a5-81d1-ac21259a5be4_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5e9dffda-10e1-45a5-81d1-ac21259a5be4_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;What an agent's design leaves behind: duplicated code blocks and bloated, over-engineered structure&quot;,&quot;title&quot;:&quot;What an agent's design leaves behind: duplicated code blocks and bloated, over-engineered structure&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="What an agent's design leaves behind: duplicated code blocks and bloated, over-engineered structure" title="What an agent's design leaves behind: duplicated code blocks and bloated, over-engineered structure" srcset="https://substackcdn.com/image/fetch/$s_!8cdJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e9dffda-10e1-45a5-81d1-ac21259a5be4_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8cdJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e9dffda-10e1-45a5-81d1-ac21259a5be4_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8cdJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e9dffda-10e1-45a5-81d1-ac21259a5be4_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8cdJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e9dffda-10e1-45a5-81d1-ac21259a5be4_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is the textbook answer showing up again&#8230;</p><p>The agent designs for the general problem because it doesn&#8217;t have the full real-world picture. You will inherit a system harder to read and change than the thing you asked for.</p><h3><strong>3. Code</strong></h3><p>This is the stage where agents are doing their best work&#8230;</p><p>At least the MONEY says so. Cursor went from $100M to $1B in annual revenue in about a year and has more than a million people using it every day. In Stack Overflow&#8217;s 2025 survey, 19% of professional developers said they use Cursor, and 10% use Claude Code, two tools that weren&#8217;t even on the survey the year before. These tools write real code that ships, and the people paying for them are not running demos.</p><p>What you get out of them depends almost entirely on how you set them up&#8230;</p><h4><em><strong>Rules files</strong></em></h4><p>The first thing to set up is a <code>rules</code> file.</p><p>Most coding tools now read a file in your repo when they start, a <code>CLAUDE.md</code><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a> or an <code>AGENTS.md</code>, and treat whatever is in it as standing orders. You write down the things you would otherwise have to repeat in every chat:</p><ul><li><p>Commands to run tests and linter<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a>,</p></li><li><p>Conventions the codebase follows,</p></li><li><p>Traps you learned the hard way and want the agent to avoid.</p></li></ul><p>Without this file, every session starts from zero&#8230;</p><p>You explain the same setup, correct the same wrong assumption, and watch the agent make the same mistake it made yesterday, just because it has NO memory of yesterday.</p><p>Writing correct rule files is already somewhat of a science in coding, and there are many excellent sources that can help you.</p><p>(I recommend starting with the <a href="https://code.claude.com/docs/en/memory">official Anthropic guide on CLAUDE.md files</a>.)</p><h4><em><strong>Spec-driven development</strong></em></h4><p>The second thing is to match the size of the task to how much you plan first&#8230;</p><p>A small, clear job gets done well from a one-line prompt. <em>&#8220;Add input validation to this endpoint&#8221;</em> needs nothing fancy. While a large task like <em>&#8220;add photo sharing to my app&#8221;</em> falls apart because that prompt forces the agent to guess at thousands of decisions you never specified.</p><p>So for anything real, you write a short spec first&#8230;</p><p>A spec documents non-technical product decisions, high-level architecture, and what the changes should do in plain language before any code gets written. The agent turns it into an implementation plan, noting exactly which files and lines of code will change.</p><p>Then it executes the code one task at a time, and you review between every step&#8230;</p><p>This workflow is called <strong>spec-driven development</strong><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-10" href="#footnote-10" target="_self">10</a>, and it is how people ship real features with agents instead of demos.</p><p>PROOF: <a href="https://github.com/obra/superpowers">GitHub repo Superpowers</a> (200k+ stars) works with all major coding agents. It encapsulates the process into a series of automatic slash commands, making it trivial to adopt in your daily workflow.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dRoL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929891a1-351d-473d-99be-653e3e4531aa_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dRoL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929891a1-351d-473d-99be-653e3e4531aa_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!dRoL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929891a1-351d-473d-99be-653e3e4531aa_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!dRoL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929891a1-351d-473d-99be-653e3e4531aa_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!dRoL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929891a1-351d-473d-99be-653e3e4531aa_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dRoL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929891a1-351d-473d-99be-653e3e4531aa_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/929891a1-351d-473d-99be-653e3e4531aa_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Spec-driven development: write a spec, the agent writes a plan, then it executes one task at a time with a human review between every step&quot;,&quot;title&quot;:&quot;Spec-driven development: write a spec, the agent writes a plan, then it executes one task at a time with a human review between every step&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Spec-driven development: write a spec, the agent writes a plan, then it executes one task at a time with a human review between every step" title="Spec-driven development: write a spec, the agent writes a plan, then it executes one task at a time with a human review between every step" srcset="https://substackcdn.com/image/fetch/$s_!dRoL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929891a1-351d-473d-99be-653e3e4531aa_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!dRoL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929891a1-351d-473d-99be-653e3e4531aa_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!dRoL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929891a1-351d-473d-99be-653e3e4531aa_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!dRoL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929891a1-351d-473d-99be-653e3e4531aa_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>But spec-driven development isn&#8217;t new,,, it is an old engineering tradition.</p><p>The only thing changed is how fast you move from a vision in your head to a written spec to a working implementation. All with the help of coding agents&#8230;</p><h4><em><strong>Current limitations</strong></em></h4><p>There are many, but the first one you&#8217;ll hit is the context window<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-11" href="#footnote-11" target="_self">11</a>&#8230;</p><p>An agent holds everything it&#8217;s working on in a fixed amount of memory, measured in the number of tokens<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-12" href="#footnote-12" target="_self">12</a>.</p><p>The longer a single task runs, the more memory it fills with its own earlier steps, and at some point, the oldest facts drop out of the window. It forgets a decision it made twenty minutes ago and starts contradicting itself.</p><p>This is why the spec-and-plan loop matters beyond just planning&#8230;</p><p>The spec and plan documents can hold massive context on disk agent can refer to at any time. This frees up its window to focus only on the task at hand. It doesn&#8217;t matter whether your agent has a 1M-token window.</p><p>Quality drops as the context fills, well before the window is full.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sLBy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc604e46d-105b-4766-8508-3c494a344771_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sLBy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc604e46d-105b-4766-8508-3c494a344771_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!sLBy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc604e46d-105b-4766-8508-3c494a344771_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!sLBy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc604e46d-105b-4766-8508-3c494a344771_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!sLBy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc604e46d-105b-4766-8508-3c494a344771_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sLBy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc604e46d-105b-4766-8508-3c494a344771_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c604e46d-105b-4766-8508-3c494a344771_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Output quality declines as the context fills, well before the window is full&quot;,&quot;title&quot;:&quot;Output quality declines as the context fills, well before the window is full&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Output quality declines as the context fills, well before the window is full" title="Output quality declines as the context fills, well before the window is full" srcset="https://substackcdn.com/image/fetch/$s_!sLBy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc604e46d-105b-4766-8508-3c494a344771_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!sLBy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc604e46d-105b-4766-8508-3c494a344771_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!sLBy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc604e46d-105b-4766-8508-3c494a344771_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!sLBy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc604e46d-105b-4766-8508-3c494a344771_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><a href="https://platform.claude.com/docs/en/build-with-claude/context-windows">Anthropic documents this</a> for its own models, and <a href="https://www.trychroma.com/research/context-rot">a study of 18 models</a> found every one got worse as the input grew. The decline is steady, and a model with a 1M-token window can already be slipping at 50k.</p><div><hr></div><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/agentic-ai-use-cases?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption"><em>Get 2+ referrals &amp; I&#8217;ll send you my Interview Mistakes to Avoid PDF!</em></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/agentic-ai-use-cases?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/agentic-ai-use-cases?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h3><strong>4. Test</strong></h3><p>Testing is where you find out whether the code does what you said it should, before anyone else depends on it.</p><p>This is the stage agents are &#8216;built&#8217; for&#8230;</p><p>A test runs in a few seconds and tells the agent one thing: <em>pass or fail</em>. So the agent can write a test, run it, see what broke, and try again, as many times as it needs, without you in the room.</p><p>The products doing this are real and busy&#8230;</p><ul><li><p>Momentic, a tool that tests apps by clicking through them the way a user would, ran over 200 million test steps in a single month and caught 390,000+ bugs.</p></li><li><p>Diffblue Cover, which writes tests for large-scale Java systems at banks like Goldman Sachs and JPMorgan, writes them 250 times faster than a person and adds enough tests to raise test coverage by 50 to 70%.</p></li></ul><p>This does not mean you can hand testing to an agent and walk away&#8230; The failures here are easy to miss, and there are three of them:</p><p><em><strong>The first one affects everybody.</strong></em></p><p>Ask an agent to write tests, and a lot of them pass without checking anything real.</p><p>The test runs the function, and function returns an answer. The check is technically true, and still nothing useful got tested. Agents write these empty tests all the time.</p><p>This is why, when Claude Code writes a plan for me, I almost always send it back with one note: <em>make the tests mean something, not just pass.</em> The agent then usually throws out its first set and writes much better ones.</p><p><em><strong>The second failure is a direct follow-up to the first.</strong></em></p><p>The agent writes so many tests that keeping them all working becomes a job in itself.</p><p>Large corporations like Meta are already running into this. Their testing setup, built up over decades, could not keep up with the number of tests the agents were writing, so they moved to tests that are made fresh for each pull request and deleted as soon as they run.</p><p><em><strong>The third one is the hardest to fix.</strong></em></p><p>Agents are slow and expensive at testing anything you can see on screen&#8230;</p><p>A logic test is quick because the agent reads pass or fail straight from the code. A test of a button or a page gives no such answer, so for every step, the agent has to:</p><ol><li><p>Take a screenshot,</p></li><li><p>Work out where the button sits on screen,</p></li><li><p>Click it,</p></li><li><p>Take another screenshot to check what changed.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GNYS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56d6f86-3ca7-4855-8867-7c4c0f44183b_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GNYS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56d6f86-3ca7-4855-8867-7c4c0f44183b_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!GNYS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56d6f86-3ca7-4855-8867-7c4c0f44183b_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!GNYS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56d6f86-3ca7-4855-8867-7c4c0f44183b_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GNYS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56d6f86-3ca7-4855-8867-7c4c0f44183b_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GNYS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56d6f86-3ca7-4855-8867-7c4c0f44183b_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c56d6f86-3ca7-4855-8867-7c4c0f44183b_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A logic test reads pass or fail in one step; a UI test takes four slow steps&quot;,&quot;title&quot;:&quot;A logic test reads pass or fail in one step; a UI test takes four slow steps&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A logic test reads pass or fail in one step; a UI test takes four slow steps" title="A logic test reads pass or fail in one step; a UI test takes four slow steps" srcset="https://substackcdn.com/image/fetch/$s_!GNYS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56d6f86-3ca7-4855-8867-7c4c0f44183b_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!GNYS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56d6f86-3ca7-4855-8867-7c4c0f44183b_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!GNYS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56d6f86-3ca7-4855-8867-7c4c0f44183b_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GNYS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56d6f86-3ca7-4855-8867-7c4c0f44183b_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A person clicks a button in a fraction of a second... The agent spends time and tokens on all four steps to do the same thing&#8230;</p><div><hr></div><div class="callout-block" data-callout="true"><p><em>Did you know?</em></p><p><strong><a href="https://coderabbit.link/neo-agent">CodeRabbit&#8217;s</a></strong><a href="https://coderabbit.link/neo-agent"> </a><strong><a href="https://coderabbit.link/neo-agent">new Slack agent</a></strong> keeps shared context across coding, reviews, incidents, and tickets, so your agents stop starting from zero every time work moves between tools.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://coderabbit.link/neo-agent" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GfjP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 424w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 848w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1272w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GfjP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png" width="1248" height="654" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:654,&quot;width&quot;:1248,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:547609,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://coderabbit.link/neo-agent&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192885623?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!GfjP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 424w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 848w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1272w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Use it to carry decisions, debugging history, and architectural context across your entire software development lifecycle:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://coderabbit.link/neo-agent&quot;,&quot;text&quot;:&quot;Try CodeRabbit's Agent Today&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://coderabbit.link/neo-agent"><span>Try CodeRabbit's Agent Today</span></a></p></div><div><hr></div><h3><strong>5. Review</strong></h3><p>Before any code ships, someone has to read it and sign off on it.</p><p>For years, someone was always a person. You open a pull request, and one or more teammates read your changes line by line. They look for bugs, for code that does not match the rest of the project, and for anything that does not do what the request says it does.</p><p>Nothing merges until one of them approves it&#8230;</p><p>But this process is slow:</p><p>Reviews pile up in a queue while people are busy with their own work. One senior engineer often ends up reviewing everyone else&#8217;s code, becoming the bottleneck. And a reviewer reading a 600-line change late on a Friday is NOT going to catch much.</p><p>This is the part agents handle well, and the WIN is speed and coverage.</p><p>An agent reviews every pull request the moment it opens. It never sits in a queue, never has a bad day, and never skips the boring change nobody wanted to read. This speed and coverage matter. <strong><a href="https://coderabbit.link/neo-agent">CodeRabbit</a></strong>, one of the biggest AI review tools, now runs on more than 6 million repositories for over 15,000 teams.</p><p>So every change gets a look. This is the real value,,, and it is worth a lot.</p><p>What an agent can&#8217;t do is decide which of its own comments matter.</p><p>It flags a lot, and only some of it is worth your time. A 2025 study of 278,790 real review comments found developers acted on 16.6% of suggestions from AI reviewers, compared with 56.5% from human reviewers.</p><p>i.e., people accept a human reviewer&#8217;s note 3x as often.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LdS7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f9d173-0a4d-41f7-92f2-7a2a33992a44_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LdS7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f9d173-0a4d-41f7-92f2-7a2a33992a44_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!LdS7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f9d173-0a4d-41f7-92f2-7a2a33992a44_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!LdS7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f9d173-0a4d-41f7-92f2-7a2a33992a44_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!LdS7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f9d173-0a4d-41f7-92f2-7a2a33992a44_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LdS7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f9d173-0a4d-41f7-92f2-7a2a33992a44_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a8f9d173-0a4d-41f7-92f2-7a2a33992a44_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;How often the comment gets acted on: developers act on 16.6% of AI review comments versus 56.5% of human ones&quot;,&quot;title&quot;:&quot;How often the comment gets acted on: developers act on 16.6% of AI review comments versus 56.5% of human ones&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="How often the comment gets acted on: developers act on 16.6% of AI review comments versus 56.5% of human ones" title="How often the comment gets acted on: developers act on 16.6% of AI review comments versus 56.5% of human ones" srcset="https://substackcdn.com/image/fetch/$s_!LdS7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f9d173-0a4d-41f7-92f2-7a2a33992a44_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!LdS7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f9d173-0a4d-41f7-92f2-7a2a33992a44_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!LdS7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f9d173-0a4d-41f7-92f2-7a2a33992a44_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!LdS7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f9d173-0a4d-41f7-92f2-7a2a33992a44_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The danger is what it does to your own attention&#8230;</p><p>When most of the comments are not worth acting on, you stop reading them closely. You skim, you click approve, and the one comment in twenty that caught a real bug slides past with the rest. The agent reviewed every line, and you still missed the bug, because it buried the signal in noise.</p><p>So treat the agent as a first pass that never misses a file, NOT as the reviewer who gets the final say&#8230;</p><h3><strong>6. Deploy</strong></h3><p>This is the first stage where the answer is a clear NO. Don&#8217;t hand it to an agent&#8230;</p><p>Two things make deploy different from everything before it:</p><ul><li><p>First, you can undo a bad deploy, but not in the clean way you undo a bad commit. You can roll back, sure. By then, the broken version already reached real users, money may have moved, and the rollback is its own risky operation; you&#8217;re running under pressure.</p></li><li><p>Second, it breaks everything at once. A bad function fails one test. A bad deploy takes down the entire system for every user at once.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cMPD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1587038-4373-4457-a205-d77374ef8601_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cMPD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1587038-4373-4457-a205-d77374ef8601_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cMPD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1587038-4373-4457-a205-d77374ef8601_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cMPD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1587038-4373-4457-a205-d77374ef8601_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!cMPD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1587038-4373-4457-a205-d77374ef8601_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cMPD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1587038-4373-4457-a205-d77374ef8601_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b1587038-4373-4457-a205-d77374ef8601_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Blast radius: a bad function fails one test, but a bad deploy radiates out and takes down every user at once&quot;,&quot;title&quot;:&quot;Blast radius: a bad function fails one test, but a bad deploy radiates out and takes down every user at once&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Blast radius: a bad function fails one test, but a bad deploy radiates out and takes down every user at once" title="Blast radius: a bad function fails one test, but a bad deploy radiates out and takes down every user at once" srcset="https://substackcdn.com/image/fetch/$s_!cMPD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1587038-4373-4457-a205-d77374ef8601_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cMPD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1587038-4373-4457-a205-d77374ef8601_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cMPD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1587038-4373-4457-a205-d77374ef8601_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!cMPD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1587038-4373-4457-a205-d77374ef8601_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>So the deploy button is one of the few places where the SAFE move is still a human hand&#8230;</p><p>Let the agent prepare the release, run the checks, and stage everything it can. Keep a person in front of the final push to production, where a wrong call costs the most and undoes the least.</p><h3><strong>7. Operate</strong></h3><p>Once the code is live and real users are on it, someone has to keep it healthy&#8230;</p><p>You watch the dashboards for trouble. When something breaks, you find out why and fix it fast. A lot of these land at bad hours, because production doesn&#8217;t break on a schedule. You&#8217;re on call, and your phone can buzz at 2 a.m.</p><p>There is more of this work now than there used to be&#8230;</p><p>All the AI-written code from the earlier stages ships, and it breaks more often as measured by two separate studies:</p><ul><li><p>Google&#8217;s DORA<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-13" href="#footnote-13" target="_self">13</a> report tracks delivery stability across thousands of teams and found greater AI use correlates with lower stability in both its 2024 and 2025 editions.</p></li><li><p>Faros AI analyzed telemetry<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-14" href="#footnote-14" target="_self">14</a> from 22,000 developers and found the number of production incidents per merged change increased by 242% as AI use climbed. More code ships, more of it breaks, and someone is on call for all of it&#8230;</p></li></ul><p>Ironically, agents are really good at analyzing incidents caused by bad agent code.</p><p>The first part of any incident is gathering. You pull the recent deploys, search the logs, check the dashboards, and walk through the runbook<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-15" href="#footnote-15" target="_self">15</a> step by step. It&#8217;s slow, and it&#8217;s almost the same process every time.</p><p>An agent does it fast&#8230;</p><blockquote><p>Datadog&#8217;s Bits AI SRE<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-16" href="#footnote-16" target="_self">16</a> investigates an alert the moment it fires and has the findings ready before you&#8217;ve opened your laptop.</p><p>One of its customers said: <em>investigation is done before the engineer sits down.</em> <em>At 2 a.m., this head start is worth a lot.</em></p></blockquote><p>What you can&#8217;t hand over is the decision on what&#8217;s actually wrong&#8230;</p><p>The agent will tell you the cause, and it says it with the same confidence whether it&#8217;s right or not. An alert fires; the agent says it&#8217;s the database connection pool. It sounds right, so you spend 30 minutes there. But the actual cause can be completely different. A human might have said <em>&#8220;I&#8217;m not sure yet&#8221; </em>and kept looking. The agent&#8217;s speed means you commit to the wrong fix sooner.</p><p>So let the agent gather the facts and keep the diagnosis YOURS&#8230;</p><div><hr></div><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/agentic-ai-use-cases?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption"><em>Get 3+ referrals &amp; I&#8217;ll send you Popular Interview Questions PDF!</em></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/agentic-ai-use-cases?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/agentic-ai-use-cases?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h3><strong>8. Maintain</strong></h3><p>After a feature ships, the work doesn&#8217;t stop&#8230;The code still needs &#8220;babysitting&#8221;.</p><p>A library releases a security patch, so you bump to the new version. A messy function needs cleaning up; a small bug turns up; docs stopped matching the code three releases ago. None of it is glamorous, and all of it piles up. Ignore it long enough and the codebase rots.</p><p>Of all eight stages, this is where agents earn the clearest yes&#8230;</p><p>The reason is maintenance work grades itself. A dependency bump either passes the tests or it doesn&#8217;t. A refactor is meant to leave the behavior the same, so if the tests still pass after the agent rewrites the code, the refactor worked. The test suite is the judge, and it&#8217;s one the agent can run on its own, as many times as it needs, until the change is green.</p><p>This is the exact condition agents are built for&#8230;</p><p>The products doing this run at a scale no other tool mentioned today matches.</p><p>Dependabot (GitHub&#8217;s tool) watches your dependencies and opens a pull request when one falls behind, and is used across millions of repositories. GitHub says repositories using automated security updates fix critical vulnerabilities significantly faster because the update is waiting for you instead of sitting on a to-do list that nobody gets to.</p><p>But there is still a subtle way this can go WRONG:</p><p>Each change is small and passes its tests, so each one clears review without much thought. The agent doesn&#8217;t see the entire system, so it ships the refactor that works while leaving the code a little worse than a careful person would.</p><p>A bit more duplication here, an odd structure there&#8230;</p><p>No single change is worth stopping for. Over time, they stack up into a codebase, making it harder to read and change than before, and nobody can point to the commit where it went wrong. So this is the same blindness from the &#8216;Design&#8217; stage, showing up slowly instead of all at once.</p><p>Let the agent handle the maintenance, and keep a human to read what it actually changed&#8230;</p><h3><strong>Across the lifecycle</strong></h3><p>One pattern runs through all eight stages:</p><p>Agents do well wherever a machine can tell them they&#8217;re wrong. A failing test, a red build, a type error: <em>agent gets an answer in seconds and tries again on its own until the work is green.</em> This includes code, tests, dependency bumps, first pass at an incident.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0KjS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5058a4f-52ac-482d-9805-5c538f3ddb55_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0KjS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5058a4f-52ac-482d-9805-5c538f3ddb55_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0KjS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5058a4f-52ac-482d-9805-5c538f3ddb55_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0KjS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5058a4f-52ac-482d-9805-5c538f3ddb55_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0KjS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5058a4f-52ac-482d-9805-5c538f3ddb55_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0KjS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5058a4f-52ac-482d-9805-5c538f3ddb55_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f5058a4f-52ac-482d-9805-5c538f3ddb55_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;What decides whether an agent can do the job: the eight stages plotted by how fast feedback arrives and whether a machine can check the answer&quot;,&quot;title&quot;:&quot;What decides whether an agent can do the job: the eight stages plotted by how fast feedback arrives and whether a machine can check the answer&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="What decides whether an agent can do the job: the eight stages plotted by how fast feedback arrives and whether a machine can check the answer" title="What decides whether an agent can do the job: the eight stages plotted by how fast feedback arrives and whether a machine can check the answer" srcset="https://substackcdn.com/image/fetch/$s_!0KjS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5058a4f-52ac-482d-9805-5c538f3ddb55_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0KjS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5058a4f-52ac-482d-9805-5c538f3ddb55_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0KjS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5058a4f-52ac-482d-9805-5c538f3ddb55_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0KjS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5058a4f-52ac-482d-9805-5c538f3ddb55_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>But they struggle wherever the only judge is a person&#8230;</p><p>Whether this is right feature, right architecture, or right moment to ship to production. No test comes back red on those. And the answer depends on what your business needs and how much it costs you when you&#8217;re wrong, and none of it is written down for the agent to check.</p><p>These are also the stages where a mistake is slow to find and hard to undo, so the agent runs furthest before anyone catches it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ybt_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47735ec7-7bb9-4c8d-88a9-6ff9963d02db_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ybt_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47735ec7-7bb9-4c8d-88a9-6ff9963d02db_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ybt_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47735ec7-7bb9-4c8d-88a9-6ff9963d02db_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ybt_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47735ec7-7bb9-4c8d-88a9-6ff9963d02db_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ybt_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47735ec7-7bb9-4c8d-88a9-6ff9963d02db_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ybt_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47735ec7-7bb9-4c8d-88a9-6ff9963d02db_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/47735ec7-7bb9-4c8d-88a9-6ff9963d02db_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Where an agent earns its place: the eight stages color-coded by whether you can hand them to an agent today&quot;,&quot;title&quot;:&quot;Where an agent earns its place: the eight stages color-coded by whether you can hand them to an agent today&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Where an agent earns its place: the eight stages color-coded by whether you can hand them to an agent today" title="Where an agent earns its place: the eight stages color-coded by whether you can hand them to an agent today" srcset="https://substackcdn.com/image/fetch/$s_!ybt_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47735ec7-7bb9-4c8d-88a9-6ff9963d02db_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ybt_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47735ec7-7bb9-4c8d-88a9-6ff9963d02db_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ybt_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47735ec7-7bb9-4c8d-88a9-6ff9963d02db_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ybt_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47735ec7-7bb9-4c8d-88a9-6ff9963d02db_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Every verdict here looked at one agent doing one job, though. Real teams run a dozen at once, and it changes the problem&#8230;</p><div><hr></div><h2><strong>Why more agents do NOT make your team faster</strong></h2><p>So far, we have looked at one agent doing one job. Real teams run a dozen at once&#8230;</p><p>A coding agent, a test agent, a review agent, and an on-call agent, all working at the same time. The trouble is none of them shares memory with the others. So the coding agent doesn&#8217;t know what the test agent already tried. And the review agent doesn&#8217;t know what the on-call agent learned last night.</p><p>Each one starts blank about every other one.</p><p>That gap costs you more than you&#8217;d expect&#8230;</p><blockquote><p>Nicholas Carlini at Anthropic ran 16 Claude agents in parallel to build a C compiler. Over 2,000 sessions and about $20,000 of compute, and the project worked. But here is what he found running them together:</p><p><em>&#8220;Every agent would hit the same bug, fix that bug, and then overwrite each other&#8217;s changes. Having 16 agents running didn&#8217;t help because each was stuck solving the same task.&#8221;</em></p></blockquote><p>So the agents duplicate and overwrite each other, and a person has to review it all and resolve the collisions before any of it merges.</p><p>The context that would stop the repetitive work already exists&#8230; It is just the agent can&#8217;t usually read it&#8230;</p><p>It&#8217;s in a Slack thread, in the reasoning on a closed pull request, in an old ticket, or in what the on-call agent worked out last night. The next agent can read your code, but it can&#8217;t read the conversation where you decided why the code looks a specific way.</p><p>This decision history is the part NO agent has&#8230;</p><p>The fix is to give the agents one shared memory which survives between runs, running where the team already talks, so a person can interrupt and redirect.</p><p><strong><a href="https://coderabbit.link/neo-agent">CodeRabbit</a></strong> launched a<strong> <a href="https://coderabbit.link/neo-agent">Slack agent</a></strong><a href="https://coderabbit.link/neo-agent"> </a>that works this way, and the way they describe the problem is the subtle patterns we have been hinting at throughout the newsletter: <em>&#8220;Each phase runs on a different tool and uses a different agent. None of them talks to each other. What one engineer figures out in coding doesn&#8217;t show up in testing.&#8221;</em></p><p><em>Their <a href="https://coderabbit.link/neo-agent">agent</a> runs in Slack, so it reads the same threads your team does and carries what it learns from one conversation to the next. Ask it Friday about something it saw Monday, and it still knows&#8230;</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gqi4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18000f90-6611-41a7-83eb-dcbefd24de8a_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gqi4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18000f90-6611-41a7-83eb-dcbefd24de8a_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gqi4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18000f90-6611-41a7-83eb-dcbefd24de8a_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gqi4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18000f90-6611-41a7-83eb-dcbefd24de8a_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gqi4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18000f90-6611-41a7-83eb-dcbefd24de8a_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gqi4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18000f90-6611-41a7-83eb-dcbefd24de8a_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/18000f90-6611-41a7-83eb-dcbefd24de8a_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Without shared memory agents duplicate each other; with a shared memory layer like CodeRabbit's Slack agent they read what the others learned&quot;,&quot;title&quot;:&quot;Without shared memory agents duplicate each other; with a shared memory layer like CodeRabbit's Slack agent they read what the others learned&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Without shared memory agents duplicate each other; with a shared memory layer like CodeRabbit's Slack agent they read what the others learned" title="Without shared memory agents duplicate each other; with a shared memory layer like CodeRabbit's Slack agent they read what the others learned" srcset="https://substackcdn.com/image/fetch/$s_!gqi4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18000f90-6611-41a7-83eb-dcbefd24de8a_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gqi4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18000f90-6611-41a7-83eb-dcbefd24de8a_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gqi4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18000f90-6611-41a7-83eb-dcbefd24de8a_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gqi4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18000f90-6611-41a7-83eb-dcbefd24de8a_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Shared memory is the real problem under all eight stages, and nobody has fully solved it yet&#8230;</p><div><hr></div><h2><strong>What to expect when an agent is on your team</strong></h2><p><em>Say you decide to put agents to work anyway, before that problem is solved.</em></p><p>Here is what the job actually feels like once you do it:</p><p>The first thing to know is the speed you feel may NOT be the speed you get&#8230;</p><p>METR, an independent nonprofit running studies on what AI can and can&#8217;t do, tested this in 2025. (They aren&#8217;t selling an AI tool, so they had no reason to tilt the result either way.)</p><p>They took 16 experienced developers and gave them 246 real tasks on their own projects, the repositories they had maintained for years. The developers did half the tasks with AI tools and half without, and a timer recorded both.</p><p>They finished the AI tasks 19% <em>slower</em>. The surprising part is same developers <em>thought</em> the AI had made them 20% <em>faster</em>. They were slower and felt quicker at the same time.</p><p>This gap is what matters here&#8230;</p><p>These were experts on code they already knew well, where they were fast to begin with, so it won&#8217;t match every job. But your own sense of whether an agent is helping is not reliable, so check it against something real.</p><p>With more agentic work, your week changes a lot&#8230;</p><p>An agent opens more pull requests than you would, and each one is bigger than the last. So you spend less of your day writing code and more of it reading what the agent wrote and deciding if it&#8217;s right. That reading has a different cognitive cost. Do it all day, and you end up tired in a way that writing the code yourself didn&#8217;t make you, even when the agent saved time.</p><p>The other thing to set up before you &#8220;trust&#8221; an agent is a limit on what it can &#8220;break&#8221;&#8230;</p><p>An agent running commands can do real damage when it gets something wrong:</p><blockquote><p>In July 2025, Replit&#8217;s coding agent deleted a production database after it was told, in plain words, to change nothing. It wiped records for 1,200+ business leaders and 1,000+ companies. The agent&#8217;s own summary afterward read: <em>&#8220;I destroyed months of work in seconds.&#8221;</em></p></blockquote><p>Three controls keep that from being you&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1eY-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff04a646-3c0d-48cf-9428-12367401476c_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1eY-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff04a646-3c0d-48cf-9428-12367401476c_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!1eY-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff04a646-3c0d-48cf-9428-12367401476c_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!1eY-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff04a646-3c0d-48cf-9428-12367401476c_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!1eY-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff04a646-3c0d-48cf-9428-12367401476c_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1eY-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff04a646-3c0d-48cf-9428-12367401476c_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff04a646-3c0d-48cf-9428-12367401476c_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Three controls before you trust an agent: sandbox, scoped permissions, audit log&quot;,&quot;title&quot;:&quot;Three controls before you trust an agent: sandbox, scoped permissions, audit log&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Three controls before you trust an agent: sandbox, scoped permissions, audit log" title="Three controls before you trust an agent: sandbox, scoped permissions, audit log" srcset="https://substackcdn.com/image/fetch/$s_!1eY-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff04a646-3c0d-48cf-9428-12367401476c_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!1eY-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff04a646-3c0d-48cf-9428-12367401476c_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!1eY-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff04a646-3c0d-48cf-9428-12367401476c_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!1eY-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff04a646-3c0d-48cf-9428-12367401476c_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>Run the agent in a sandbox so it can&#8217;t reach production in the first place.</p></li><li><p>Give it scoped permissions, so it only gets access to the one thing the task needs and nothing else.</p></li><li><p>And keep an audit log of every command it ran, so when something breaks, you can see exactly what it did.</p></li></ul><p>So pick your first agent where a wrong answer is cheap to undo&#8230;</p><p>Coding, review, and maintenance are good places to start, because a bad suggestion gets caught before it ships, and a bad refactor fails the tests. Design and deploy are the worst places to start, for every reason this newsletter has already covered.</p><p>Begin where the agent can be wrong without costing you much, learn how it behaves, and only then move it somewhere that matters more.</p><div><hr></div><div class="callout-block" data-callout="true"><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://coderabbit.link/neo-agent" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GfjP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 424w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 848w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1272w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GfjP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png" width="1248" height="654" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:654,&quot;width&quot;:1248,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:547609,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://coderabbit.link/neo-agent&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192885623?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!GfjP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 424w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 848w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1272w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Your engineers talk on Slack. They code in the terminal. Somewhere between those two things, context dies&#8230;</p><ul><li><p>A bug was debated in #incidents at 2 AM.</p></li><li><p>An architectural call was made in a DM.</p></li></ul><p>Every handoff leaks context, and every leak costs you. That&#8217;s the context tax and your team pays it every day.</p><p><strong><a href="https://coderabbit.link/neo-agent">CodeRabbit Agent for Slack</a></strong> is built for agentic SDLC workflows. One agent for your entire Software Development Lifecycle, living in the channel where the work already happens. It&#8217;s built on four things:</p><ul><li><p><strong>Context</strong>: your org&#8217;s operating picture, pulled from across code, tickets, docs, monitoring, and cloud.</p></li><li><p><strong>Knowledge Base</strong>: a living memory of your team. Every run leaves a trace, so yesterday&#8217;s decisions don&#8217;t become tomorrow&#8217;s debates.</p></li><li><p><strong>Multi-Player</strong>: works in shared threads alongside your team. Steerable, resumable, and aligned as work evolves.</p></li><li><p><strong>Governance</strong>: scoped access, cost attribution. Every run explainable and attributed.</p></li></ul><p>Your team keeps shipping. <strong><a href="https://coderabbit.link/neo-agent">Agent</a></strong> keeps the context.</p><p>From the team that pioneered AI code reviews. 2M code reviews every week. 6M repos. 15K customers. And now, one agent for your entire SDLC, right in Slack:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://coderabbit.link/neo-agent&quot;,&quot;text&quot;:&quot;Try CodeRabbit's Agent Today&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://coderabbit.link/neo-agent"><span>Try CodeRabbit's Agent Today</span></a></p><p>(Thanks again to <strong><a href="https://coderabbit.link/neo-agent">CodeRabbit</a></strong> for partnering on this post.)</p></div><div><hr></div><p>Louis and I launched the <strong><a href="https://newsletter.systemdesign.one/subscribe?yearly=true">GENERATIVE AI MASTERCLASS</a></strong> (newsletter series exclusive to PAID subscribers).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://newsletter.systemdesign.one/subscribe?yearly=true" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Lz0V!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de1c782-57c1-4638-baae-2c455d6623fb_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!Lz0V!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de1c782-57c1-4638-baae-2c455d6623fb_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!Lz0V!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de1c782-57c1-4638-baae-2c455d6623fb_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!Lz0V!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de1c782-57c1-4638-baae-2c455d6623fb_1200x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Lz0V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de1c782-57c1-4638-baae-2c455d6623fb_1200x630.png" width="1200" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0de1c782-57c1-4638-baae-2c455d6623fb_1200x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:614623,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://newsletter.systemdesign.one/subscribe?yearly=true&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/199185364?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de1c782-57c1-4638-baae-2c455d6623fb_1200x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Lz0V!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de1c782-57c1-4638-baae-2c455d6623fb_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!Lz0V!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de1c782-57c1-4638-baae-2c455d6623fb_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!Lz0V!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de1c782-57c1-4638-baae-2c455d6623fb_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!Lz0V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de1c782-57c1-4638-baae-2c455d6623fb_1200x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>Simple breakdown of real-world architectures</strong></p></li><li><p>Frameworks you can plug into your work or business</p></li><li><p><strong>Proven systems behind ChatGPT, Perplexity, and Copilot</strong></p></li></ul><p><strong>&#128073; <a href="https://newsletter.systemdesign.one/subscribe?yearly=true">CLICK HERE TO JOIN THE GENERATIVE AI MASTERCLASS</a></strong></p><p>(Golden members will get the next Generative AI newsletter in the first week of June.)</p><div><hr></div><p>If you find this newsletter valuable, share it with a friend, and subscribe if you haven&#8217;t already. There are <a href="http://newsletter.systemdesign.one/subscribe?group=true">group discounts</a>, <a href="http://newsletter.systemdesign.one/subscribe?gift=true">gift options</a>, and <a href="https://newsletter.systemdesign.one/leaderboard">referral rewards</a> available.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://x.com/intent/follow?screen_name=systemdesignone" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bEFk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 424w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 848w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1272w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bEFk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png" width="152" height="152" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:320,&quot;width&quot;:320,&quot;resizeWidth&quot;:152,&quot;bytes&quot;:74009,&quot;alt&quot;:&quot;Author Neo Kim; System design case studies&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://x.com/intent/follow?screen_name=systemdesignone&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Author Neo Kim; System design case studies" title="Author Neo Kim; System design case studies" srcset="https://substackcdn.com/image/fetch/$s_!bEFk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 424w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 848w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1272w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"><strong>&#128075; Find me on <a href="https://www.linkedin.com/in/nk-systemdesign-one/">LinkedIn</a> | <a href="https://x.com/intent/follow?screen_name=systemdesignone">Twitter</a> | <a href="https://www.threads.net/@systemdesignone">Threads</a> | <a href="https://www.instagram.com/systemdesignone/">Instagram</a></strong></figcaption></figure></div><div><hr></div><p><strong>Want to reach 210K+ tech professionals at scale? </strong>&#128240;</p><p>If your company wants to reach 210K+ tech professionals, <a href="https://newsletter.systemdesign.one/p/sponsorship">advertise with me</a>.</p><div><hr></div><p>Thank you for supporting this newsletter.</p><p>You are now 210,001+ readers strong, very close to 210k. Let&#8217;s try to get 211k readers by 29 May. Consider sharing this post with your friends and get rewards.</p><p>Y&#8217;all are the best.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6oWl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6oWl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6oWl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png" width="590" height="368.75" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2e739087-a910-4643-be36-997b6dd5b4af_800x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:800,&quot;resizeWidth&quot;:590,&quot;bytes&quot;:87878,&quot;alt&quot;:&quot;system design newsletter&quot;,&quot;title&quot;:&quot;system design newsletter&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/163380418?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="system design newsletter" title="system design newsletter" srcset="https://substackcdn.com/image/fetch/$s_!6oWl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/agentic-ai-use-cases?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/agentic-ai-use-cases?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><div><hr></div><h3>References</h3><ul><li><p>Boris Cherny on why Anthropic keeps hiring engineers: <a href="https://x.com/bcherny/status/2022762422302576970">x.com/bcherny</a> (<a href="https://simonwillison.net/2026/Feb/14/boris/">mirror</a>)</p></li><li><p>Duplicated code rose 8x after AI adoption: <a href="https://www.gitclear.com/ai_assistant_code_quality_2025_research">GitClear, AI Copilot Code Quality 2025</a></p></li><li><p>Andrej Karpathy on models bloating abstractions and leaving dead code: <a href="https://x.com/karpathy/status/2015883857489522876">x.com/karpathy</a></p></li><li><p>Cursor revenue and daily users: <a href="https://cursor.com/blog/series-d">Cursor Series D announcement</a></p></li><li><p>Cursor and Claude Code developer adoption: <a href="https://survey.stackoverflow.co/2025/technology">Stack Overflow 2025 Developer Survey</a></p></li><li><p>Context quality drops as input grows, across 18 models: <a href="https://www.trychroma.com/research/context-rot">Chroma, Context Rot</a></p></li><li><p>Anthropic on CLAUDE.md memory files: <a href="https://code.claude.com/docs/en/memory">Claude Code memory docs</a></p></li><li><p>Spec-driven development with Superpowers: <a href="https://github.com/obra/superpowers">github.com/obra/superpowers</a></p></li><li><p>Anthropic on context windows: <a href="https://platform.claude.com/docs/en/build-with-claude/context-windows">Claude context windows docs</a></p></li><li><p>Momentic, 200 million test steps and 390,000 bugs in a month: <a href="https://momentic.ai/blog/series-a">Momentic Series A</a></p></li><li><p>Diffblue Cover, 250x faster test writing: <a href="https://www.diffblue.com/diffblue-cover/">Diffblue Cover</a></p></li><li><p>Diffblue Cover, 50 to 70% coverage lift: <a href="https://www.diffblue.com/resources/uplift-java-test-coverage-out-of-the-box/">Uplift Java test coverage</a></p></li><li><p>Diffblue Cover at Goldman Sachs: <a href="https://www.diffblue.com/case-studies/goldman-sachs-complete-a-years-worth-of-java-unit-test-writing-overnight-with-diffblue-cover/">Goldman Sachs case study</a></p></li><li><p>Meta&#8217;s just-in-time tests, generated per pull request: <a href="https://engineering.fb.com/2026/02/11/developer-tools/the-death-of-traditional-testing-agentic-development-jit-testing-revival/">The death of traditional testing</a></p></li><li><p>GitHub Copilot Code Review, 60 million reviews and 1 in 5 of all reviews: <a href="https://github.blog/ai-and-ml/github-copilot/60-million-copilot-code-reviews-and-counting/">60 million Copilot code reviews and counting</a></p></li><li><p>Developers act on 16.6% of AI review comments vs 56.5% from humans (278,790 comments): <a href="https://arxiv.org/abs/2603.15911">Human-AI Synergy in Agentic Code Review</a></p></li><li><p>Amazon Kiro and the 13-hour AWS outage: <a href="https://www.aboutamazon.com/news/aws/aws-service-outage-ai-bot-kiro">Amazon&#8217;s response</a> (originally reported by the Financial Times)</p></li><li><p>DORA, AI adoption and delivery stability: <a href="https://dora.dev/research/2024/dora-report/">DORA 2024 report</a>, <a href="https://cloud.google.com/blog/products/ai-machine-learning/announcing-the-2025-dora-report">DORA 2025 report</a></p></li><li><p>Faros AI, incidents per change up 242% across 22,000 developers: <a href="https://www.faros.ai/research/ai-acceleration-whiplash">The AI Acceleration Whiplash</a></p></li><li><p>Datadog Bits AI SRE, generally available December 2025: <a href="https://www.datadoghq.com/about/latest-news/press-releases/datadog-launches-bits-ai-sre-agent-to-resolve-incidents-faster/">Datadog launches Bits AI SRE</a></p></li><li><p>Dependabot, 2.7 million repositories and 30% faster fixes: <a href="https://github.blog/news-insights/octoverse/octoverse-a-new-developer-joins-github-every-second-as-ai-leads-typescript-to-1/">GitHub Octoverse 2025</a></p></li><li><p>Nicholas Carlini, 16 agents building a C compiler: <a href="https://www.anthropic.com/engineering/building-c-compiler">Building a C compiler with Claude</a></p></li><li><p>CodeRabbit&#8217;s Slack agent launch and framing: <a href="https://www.businesswire.com/news/home/20260422659313/en/">CodeRabbit launches Slack agent</a></p></li><li><p>Mem0, open-source memory layer for agents: <a href="https://github.com/mem0ai/mem0">github.com/mem0ai/mem0</a></p></li><li><p>METR, developers 19% slower with AI while feeling faster: <a href="https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/">METR study</a> (<a href="https://arxiv.org/abs/2507.09089">paper</a>)</p></li><li><p>Replit&#8217;s agent deleting a production database: <a href="https://fortune.com/2025/07/23/ai-coding-tool-replit-wiped-database-called-it-a-catastrophic-failure/">Fortune</a></p></li></ul><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p><strong>Pull request</strong> &#8212; a proposed change to a codebase, opened so others can review it before it is merged into the main code.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p><strong>Tool</strong> is something the agent can use to perform tasks outside the chat, such as reading files, running tests, searching code, or calling an API.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p><strong>Agent</strong> is an AI tool that can work through a task on its own instead of waiting for instructions at every step. For example, Claude Code can take a bug report, search the codebase, edit files, run tests, and keep trying fixes until it solves the problem or gets stuck.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p><strong>Grep</strong> &#8212; a command that searches files for a piece of text. When an agent &#8220;greps for a function,&#8221; it searches the codebase to find where that function lives.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p><strong>GitHub Copilot</strong> is an AI coding assistant built into your editor. It suggests code as you type (from single lines to entire functions) based on the code around it and the comment you wrote.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p><strong>Database migration</strong> &#8212; a change to the shape of your database, such as adding a column or renaming a table. A bad one can corrupt existing data in a way you cannot undo.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p><strong>Monolith vs microservices</strong> &#8212; two ways to structure a system. A monolith is one single program that does everything; microservices split the same work across several smaller programs that talk to each other.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p><strong>CLAUDE.md / AGENTS.md</strong> &#8212; a plain-text file you keep in your repository. The coding agent reads it on startup and treats its contents as standing instructions.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-9" href="#footnote-anchor-9" class="footnote-number" contenteditable="false" target="_self">9</a><div class="footnote-content"><p><strong>Linter</strong> &#8212; a tool that automatically checks your code for style problems and likely mistakes before it runs.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-10" href="#footnote-anchor-10" class="footnote-number" contenteditable="false" target="_self">10</a><div class="footnote-content"><p><strong>Spec-driven development</strong> &#8212; writing a clear specification first, then having the agent turn it into a plan and an implementation, instead of prompting it one ad-hoc request at a time.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-11" href="#footnote-anchor-11" class="footnote-number" contenteditable="false" target="_self">11</a><div class="footnote-content"><p><strong>Context window</strong> is the amount of information an AI can keep in mind at one time while it works.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-12" href="#footnote-anchor-12" class="footnote-number" contenteditable="false" target="_self">12</a><div class="footnote-content"><p><strong>Token</strong> is a small piece of text that an AI reads at a time, usually a word, part of a word, or punctuation.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-13" href="#footnote-anchor-13" class="footnote-number" contenteditable="false" target="_self">13</a><div class="footnote-content"><p><strong>DORA</strong> stands for DevOps Research and Assessment: a long-running research program that studies how software teams build, ship, and operate software. It&#8217;s best known for measuring things like deployment speed, reliability, and production stability across thousands of engineering teams.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-14" href="#footnote-anchor-14" class="footnote-number" contenteditable="false" target="_self">14</a><div class="footnote-content"><p><strong>Telemetry</strong> &#8212; the performance and usage data that software automatically reports about itself, such as error rates and response times.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-15" href="#footnote-anchor-15" class="footnote-number" contenteditable="false" target="_self">15</a><div class="footnote-content"><p><strong>Runbook</strong> &#8212; a written, step-by-step checklist for handling a specific kind of incident, so whoever is on call can follow the same procedure every time.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-16" href="#footnote-anchor-16" class="footnote-number" contenteditable="false" target="_self">16</a><div class="footnote-content"><p><strong>SRE (Site Reliability Engineering)</strong> &#8212; the practice of keeping production systems running reliably, treating operations problems as software problems to be solved with code.</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[How Virtual Machines Work - A Deep Dive]]></title><description><![CDATA[#148: Part 1 - DevOps Mastermind]]></description><link>https://newsletter.systemdesign.one/p/virtualization-architecture</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/virtualization-architecture</guid><dc:creator><![CDATA[Neo Kim]]></dc:creator><pubDate>Mon, 25 May 2026 07:25:33 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1fb41d6b-b578-4d40-aa11-d43bccde13a7_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/virtualization-architecture/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><p>I have a huge huge huge announcement for everyone&#8230;</p><p>INTRODUCING: <strong>DevOps Mastermind</strong></p><p>This newsletter series will help you become a fast-growing and high-paying engineer.</p><p>If you&#8217;ve ever thought:</p><blockquote><p><em>&#8220;I use Docker &amp; Kubernetes, but I still don&#8217;t fully understand how they work.&#8221;</em></p><p><em>&#8220;I want to master DevOps fundamentals instead of just memorizing commands.&#8221;</em></p><p><em>&#8220;I should learn how modern infrastructure &amp; real-world systems actually work.&#8221;</em></p></blockquote><p>Then this is for you.</p><p>Here&#8217;s what you&#8217;ll get inside DevOps Mastermind:</p><ul><li><p><strong>Architecture breakdown of core DevOps systems.</strong></p></li><li><p>Deep dives into Docker, Kubernetes, Terraform, and so on.</p></li><li><p><strong>Practical insights into how real-world systems achieve scalability, automation, and reliability.</strong></p></li></ul><p>Onward.</p><div><hr></div><p>Your laptop is already running an operating system.</p><p>Now imagine starting another one on the same machine. A completely separate system with its own files, memory, and processes. You can run Linux inside Windows, or Windows inside a Mac. It boots independently, runs programs, and behaves as if it has full control over the hardware.</p><p>But it doesn&#8217;t.</p><p>Underneath, there is still only one physical machine. One CPU, one pool of memory, one disk. Yet multiple such systems can run at the same time, each isolated from the others and each functioning as if it were the only system present.</p><p>This idea changed computing&#8230;</p><p>Instead of buying a new server each time, you could split one powerful machine into many smaller ones. Each workload gets its own space. Nothing interferes with anything else.</p><p>That shift is what made cloud computing possible&#8230;</p><p>When you run an app on Amazon Web Services, Microsoft Azure, or Google Cloud Platform, it runs inside a virtual machine. The entire cloud infrastructure is built on this foundation.</p><p><em>But how does it actually work? How can one machine present itself as many? How are resources shared without breaking isolation?</em></p><div><hr></div><div class="callout-block" data-callout="true"><h2><a href="https://pages.awscloud.com/awsmp-gim-yngd-webinar-aim-enterprise-ai-and-data-leader-panel-lt-panel-1.html?trk=e4a7be88-920d-491d-adb9-943ace6441ae&amp;sc_channel=psm">Build strong data foundations for agentic AI at scale (Partner)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://pages.awscloud.com/awsmp-gim-yngd-webinar-aim-enterprise-ai-and-data-leader-panel-lt-panel-1.html?trk=e4a7be88-920d-491d-adb9-943ace6441ae&amp;sc_channel=psm" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wuK7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2628cb4d-cc45-4737-b25d-bff5ff5e469e_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!wuK7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2628cb4d-cc45-4737-b25d-bff5ff5e469e_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!wuK7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2628cb4d-cc45-4737-b25d-bff5ff5e469e_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!wuK7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2628cb4d-cc45-4737-b25d-bff5ff5e469e_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wuK7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2628cb4d-cc45-4737-b25d-bff5ff5e469e_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2628cb4d-cc45-4737-b25d-bff5ff5e469e_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1346959,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://pages.awscloud.com/awsmp-gim-yngd-webinar-aim-enterprise-ai-and-data-leader-panel-lt-panel-1.html?trk=e4a7be88-920d-491d-adb9-943ace6441ae&amp;sc_channel=psm&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/193554859?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2628cb4d-cc45-4737-b25d-bff5ff5e469e_1280x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!wuK7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2628cb4d-cc45-4737-b25d-bff5ff5e469e_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!wuK7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2628cb4d-cc45-4737-b25d-bff5ff5e469e_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!wuK7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2628cb4d-cc45-4737-b25d-bff5ff5e469e_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!wuK7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2628cb4d-cc45-4737-b25d-bff5ff5e469e_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>AI experiments are easy.</p><p>But building trusted AI systems that scale across an enterprise is hard. That&#8217;s exactly what leaders from Mercedes-Benz, Yahoo, Regeneron, and AWS discuss in this on-demand virtual panel on agentic AI.</p><p><em><strong>Here&#8217;s what you&#8217;ll learn inside:</strong></em></p><ul><li><p><strong>Practical strategies from enterprise leaders:</strong> See how teams move from isolated AI experiments to production-scale systems.</p></li><li><p><strong>Real-world guidance on governance and architecture:</strong> Learn how industry experts approach analytics, unified governance, and continuous learning.</p></li><li><p><strong>Frameworks for scaling intelligent systems:</strong> Discover how to build AI systems that drive business value.</p></li><li><p><strong>Pragmatic infrastructure decisions:</strong> Learn how to select the right databases and data foundations for AI-based applications.</p></li></ul><p>Watch the <strong><a href="https://pages.awscloud.com/awsmp-gim-yngd-webinar-aim-enterprise-ai-and-data-leader-panel-lt-panel-1.html?trk=e4a7be88-920d-491d-adb9-943ace6441ae&amp;sc_channel=psm">AWS panel</a></strong> and see how enterprise teams are building trusted AI systems with speed, governance, and control.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://pages.awscloud.com/awsmp-gim-yngd-webinar-aim-enterprise-ai-and-data-leader-panel-lt-panel-1.html?trk=e4a7be88-920d-491d-adb9-943ace6441ae&amp;sc_channel=psm&quot;,&quot;text&quot;:&quot;Watch the AWS Virtual Panel&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://pages.awscloud.com/awsmp-gim-yngd-webinar-aim-enterprise-ai-and-data-leader-panel-lt-panel-1.html?trk=e4a7be88-920d-491d-adb9-943ace6441ae&amp;sc_channel=psm"><span>Watch the AWS Virtual Panel</span></a></p><p>(Thanks to <a href="https://pages.awscloud.com/awsmp-gim-yngd-webinar-aim-enterprise-ai-and-data-leader-panel-lt-panel-1.html?trk=e4a7be88-920d-491d-adb9-943ace6441ae&amp;sc_channel=psm">AWS</a> for partnering on this post.)</p></div><div><hr></div><p>I want to introduce <strong><a href="https://x.com/twtayaan">Ayaan</a></strong> as a guest author.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://x.com/twtayaan" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Kzrb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c24dc51-869d-4962-86ef-686c1f0b52e7_2048x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Kzrb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c24dc51-869d-4962-86ef-686c1f0b52e7_2048x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Kzrb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c24dc51-869d-4962-86ef-686c1f0b52e7_2048x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Kzrb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c24dc51-869d-4962-86ef-686c1f0b52e7_2048x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Kzrb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c24dc51-869d-4962-86ef-686c1f0b52e7_2048x1024.png" width="1456" height="728" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0c24dc51-869d-4962-86ef-686c1f0b52e7_2048x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:728,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://x.com/twtayaan&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!Kzrb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c24dc51-869d-4962-86ef-686c1f0b52e7_2048x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Kzrb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c24dc51-869d-4962-86ef-686c1f0b52e7_2048x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Kzrb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c24dc51-869d-4962-86ef-686c1f0b52e7_2048x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Kzrb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c24dc51-869d-4962-86ef-686c1f0b52e7_2048x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>He&#8217;s a self-taught DevOps engineer focused on Kubernetes &amp; cloud infrastructure. He builds in public and writes about cloud-native systems, sharing daily technical tips and projects.</p><p>Check out his work and socials:</p><ul><li><p><strong><a href="https://x.com/twtayaan">Twitter</a></strong></p></li><li><p><strong><a href="https://github.com/Ayaan49">Github</a></strong></p></li></ul><div><hr></div><p><em><strong>Here&#8217;s what&#8217;s inside this newsletter:</strong></em></p><ul><li><p><strong>The difference between emulation and virtualization, and why one simulates hardware while the other shares it.</strong></p></li><li><p>What a virtual machine actually is, how it presents itself as a full computer, and why the guest operating system believes it owns the hardware.</p></li><li><p><strong>How the hypervisor works, the different types it comes in, and how it controls access to CPU, memory, storage, and networking.</strong></p></li><li><p>What happens under the hood when a virtual machine boots, from resource allocation and virtual hardware setup to kernel startup and runtime execution.</p></li><li><p><strong>How features like live migration, isolation, and resource controls make virtual machines practical for modern infrastructure.</strong></p></li><li><p>Where virtual machines fit today, why cloud platforms depend on them, and how they differ from containers.</p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Golden members get all posts like these!&#8230;</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2>Emulation vs Virtualization</h2><p>Before we get into how virtual machines work, there is one distinction worth getting clear early.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Dop2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F633c7b27-dd71-4009-a59d-ff19c4d72d04_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Dop2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F633c7b27-dd71-4009-a59d-ff19c4d72d04_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Dop2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F633c7b27-dd71-4009-a59d-ff19c4d72d04_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Dop2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F633c7b27-dd71-4009-a59d-ff19c4d72d04_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Dop2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F633c7b27-dd71-4009-a59d-ff19c4d72d04_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Dop2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F633c7b27-dd71-4009-a59d-ff19c4d72d04_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/633c7b27-dd71-4009-a59d-ff19c4d72d04_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Dop2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F633c7b27-dd71-4009-a59d-ff19c4d72d04_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Dop2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F633c7b27-dd71-4009-a59d-ff19c4d72d04_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Dop2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F633c7b27-dd71-4009-a59d-ff19c4d72d04_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Dop2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F633c7b27-dd71-4009-a59d-ff19c4d72d04_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Emulation is when software pretends to be completely different hardware. Retro gaming emulators work this way. Your laptop takes each instruction from the old console and turns it into something your modern chip understands.</p><p>It works, but it is slow&#8230;</p><p>Virtualization is different. Instead of pretending to be different hardware, it creates a version of the same hardware your machine already has. Most instructions run directly on the real processor with no translation needed.</p><p>The result is near-native speed.</p><p><em>Think of it this way:</em></p><ul><li><p>Emulation is like hiring a translator for every sentence in a conversation.</p></li><li><p>Virtualization is like realizing that both people already speak the same language.</p></li></ul><p>This distinction matters because speed is what made virtualization practical at scale. You cannot run thousands of emulated systems on a single server. With virtualization, you can.</p><p>And that is exactly what cloud providers do&#8230;</p><div><hr></div><h2>What a Virtual Machine Actually Is</h2><p>A virtual machine is a complete computer abstraction built on top of a physical machine.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!umJe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632d8852-b510-4ace-80b5-f50db4a745c2_1535x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!umJe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632d8852-b510-4ace-80b5-f50db4a745c2_1535x1024.png 424w, https://substackcdn.com/image/fetch/$s_!umJe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632d8852-b510-4ace-80b5-f50db4a745c2_1535x1024.png 848w, https://substackcdn.com/image/fetch/$s_!umJe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632d8852-b510-4ace-80b5-f50db4a745c2_1535x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!umJe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632d8852-b510-4ace-80b5-f50db4a745c2_1535x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!umJe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632d8852-b510-4ace-80b5-f50db4a745c2_1535x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/632d8852-b510-4ace-80b5-f50db4a745c2_1535x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!umJe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632d8852-b510-4ace-80b5-f50db4a745c2_1535x1024.png 424w, https://substackcdn.com/image/fetch/$s_!umJe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632d8852-b510-4ace-80b5-f50db4a745c2_1535x1024.png 848w, https://substackcdn.com/image/fetch/$s_!umJe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632d8852-b510-4ace-80b5-f50db4a745c2_1535x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!umJe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632d8852-b510-4ace-80b5-f50db4a745c2_1535x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>It&#8217;s presented with a CPU, memory, storage, and a network interface, just like a physical system. The operating system inside it interacts with these components as if they were real hardware. It has no awareness that it is sharing the underlying machine with anything else.</p><p>From its perspective, it owns the system.</p><p>This behavior is made possible by a layered design.</p><p>At the bottom is the physical hardware, which provides the actual compute, memory, and storage resources. Above it sits the hypervisor, which controls how those resources are allocated and accessed. On top of that, each virtual machine runs its own operating system in isolation.</p><p>The hypervisor is the layer that makes the illusion possible.</p><p>Because the operating system never interacts with the hardware directly.</p><p>Instead, every request for CPU time, memory access, disk I/O, or network communication is handled through the hypervisor. It decides how resources are shared and ensures that each virtual machine remains isolated from the others.</p><p>This abstraction enables three key capabilities:</p><ul><li><p><strong>Isolation:</strong> Each virtual machine runs independently, so a failure in one doesn&#8217;t affect others.</p></li><li><p><strong>Portability:</strong> Virtual machines can move between physical systems with minimal changes.</p></li><li><p><strong>Efficient resource use:</strong> Multiple workloads share the same hardware instead of leaving it idle.</p></li></ul><p>These properties are what made modern infrastructure possible.</p><p>Next, let&#8217;s look at the hypervisor: what it is, the different types, and how it powers virtual machines&#8230;</p><div><hr></div><h2>Hypervisor: Brain Behind the Illusion</h2><p>The hypervisor is the software layer that enables virtualization.</p><p>It sits between the physical hardware and the virtual machines running above it. Every request a virtual machine makes, whether for CPU time, memory, or storage, goes through the hypervisor. It&#8217;s responsible for mapping those requests to the underlying hardware resources.</p><p>This layer controls all access to the hardware.</p><h3>Type 1 vs Type 2</h3><p>Not all hypervisors follow the same structure. The main difference comes from where they run.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rXsL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F680d420b-ff0a-4339-b53f-45fbbba1107d_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rXsL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F680d420b-ff0a-4339-b53f-45fbbba1107d_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!rXsL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F680d420b-ff0a-4339-b53f-45fbbba1107d_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!rXsL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F680d420b-ff0a-4339-b53f-45fbbba1107d_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!rXsL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F680d420b-ff0a-4339-b53f-45fbbba1107d_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rXsL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F680d420b-ff0a-4339-b53f-45fbbba1107d_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/680d420b-ff0a-4339-b53f-45fbbba1107d_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rXsL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F680d420b-ff0a-4339-b53f-45fbbba1107d_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!rXsL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F680d420b-ff0a-4339-b53f-45fbbba1107d_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!rXsL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F680d420b-ff0a-4339-b53f-45fbbba1107d_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!rXsL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F680d420b-ff0a-4339-b53f-45fbbba1107d_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p><strong>Type 1 (bare-metal):</strong> It runs on the physical hardware, with no host operating system in between. Examples include VMware ESXi, Microsoft Hyper-V, and Kernel-based Virtual Machine (<strong>KVM</strong>). Cloud environments and data centers use these because they provide direct hardware access and minimal overhead.</p></li><li><p><strong>Type 2 (hosted):</strong> Runs on top of an existing operating system, like a regular application. Tools such as VirtualBox and VMware Workstation fall into this category. They are easier to use but introduce additional overhead.</p></li></ul><p>The difference lies in how the hypervisor accesses hardware and the number of extra layers it adds&#8230;</p><h3>Full Virtualization vs Paravirtualization</h3><p>There are also different ways a hypervisor presents hardware to a virtual machine.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NTTK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5159fa59-d865-41f3-829c-5b9a91a658b8_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NTTK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5159fa59-d865-41f3-829c-5b9a91a658b8_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!NTTK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5159fa59-d865-41f3-829c-5b9a91a658b8_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!NTTK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5159fa59-d865-41f3-829c-5b9a91a658b8_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!NTTK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5159fa59-d865-41f3-829c-5b9a91a658b8_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NTTK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5159fa59-d865-41f3-829c-5b9a91a658b8_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5159fa59-d865-41f3-829c-5b9a91a658b8_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NTTK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5159fa59-d865-41f3-829c-5b9a91a658b8_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!NTTK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5159fa59-d865-41f3-829c-5b9a91a658b8_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!NTTK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5159fa59-d865-41f3-829c-5b9a91a658b8_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!NTTK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5159fa59-d865-41f3-829c-5b9a91a658b8_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In <strong>full virtualization</strong>, guest operating system runs without modification and behaves as if it were on real hardware. The hypervisor handles all the work required to maintain that abstraction.</p><p>In <strong>paravirtualization</strong>, the guest operating system is aware that it is running in a virtual environment. Instead of relying only on emulated hardware, it asks the hypervisor to handle some tasks, which reduces overhead.</p><p>The trade-off is between compatibility and performance.</p><p>Full virtualization runs any unmodified operating system. Paravirtualization is faster because the guest works with the hypervisor.</p><h3>Hardware-Assisted Virtualization</h3><p>Early hypervisors relied on software, making virtualization complex and inefficient. This changed when processor manufacturers introduced hardware support such as Intel VT-x (Intel Virtualization Technology for x86) and AMD-V (AMD Virtualization).</p><p>With these features, CPU itself can differentiate between virtual machines and the hypervisor. Sensitive operations pass control to the hypervisor when needed. It manages access and keeps everything secure.</p><p>This made virtualization practical at scale.</p><h3>Nested Virtualization</h3><p>It is also possible to run a virtual machine inside another virtual machine, a setup known as <em>nested virtualization</em>. You use this for testing, development, and running tools like Kubernetes locally in virtual machines (<strong>VMs</strong>).</p><p>But each additional layer introduces overhead and increases complexity. As you add more virtualization layers, performance drops. This makes the approach unsuitable for most production workloads.</p><p>It is useful in controlled scenarios, but not a general-purpose design.</p><p>Now you know what the hypervisor does. The next question is how it shares the CPU across multiple virtual machines. The answer is CPU virtualization.</p><p>Let&#8217;s look at how it works&#8230;</p><div><hr></div><h2>CPU Virtualization</h2><p>The CPU is the brain of any computer.</p><p>Every instruction, process, and calculation goes through the CPU. So when you create a virtual machine, the question becomes: <em>how do different operating systems share one CPU without getting in each other&#8217;s way?</em></p><p>That is what CPU virtualization solves&#8230;</p><h3>The Privilege Model</h3><p>Operating systems are designed to be in charge.</p><p>When your OS needs to access hardware, manage memory, or control processes, it does so with full authority. The processor has a concept called <em>privilege levels</em> to enforce this.</p><p>Think of it like levels of trust, where the most trusted software gets the most access.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HIpP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1258100-145b-42e6-b594-d13ba11e0ab6_1172x1342.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HIpP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1258100-145b-42e6-b594-d13ba11e0ab6_1172x1342.png 424w, https://substackcdn.com/image/fetch/$s_!HIpP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1258100-145b-42e6-b594-d13ba11e0ab6_1172x1342.png 848w, https://substackcdn.com/image/fetch/$s_!HIpP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1258100-145b-42e6-b594-d13ba11e0ab6_1172x1342.png 1272w, https://substackcdn.com/image/fetch/$s_!HIpP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1258100-145b-42e6-b594-d13ba11e0ab6_1172x1342.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HIpP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1258100-145b-42e6-b594-d13ba11e0ab6_1172x1342.png" width="1172" height="1342" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c1258100-145b-42e6-b594-d13ba11e0ab6_1172x1342.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1342,&quot;width&quot;:1172,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HIpP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1258100-145b-42e6-b594-d13ba11e0ab6_1172x1342.png 424w, https://substackcdn.com/image/fetch/$s_!HIpP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1258100-145b-42e6-b594-d13ba11e0ab6_1172x1342.png 848w, https://substackcdn.com/image/fetch/$s_!HIpP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1258100-145b-42e6-b594-d13ba11e0ab6_1172x1342.png 1272w, https://substackcdn.com/image/fetch/$s_!HIpP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1258100-145b-42e6-b594-d13ba11e0ab6_1172x1342.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In a normal system, the operating system sits at the highest privilege level.</p><p>But in a virtual environment, the hypervisor needs to sit above the operating system. It needs to intercept certain instructions and control what the guest OS can and cannot do. So the guest OS gets moved down to a lower privilege level, even though it still believes it is fully in charge.</p><p>The guest OS never knows the hypervisor demoted it.</p><p>This is the fundamental trick behind CPU virtualization. The guest OS continues to behave as if it owns the processor. But when it tries to do something that needs high-level access, the processor steps in and hands control to the hypervisor.</p><p>Everything continues smoothly, just one level deeper&#8230;</p><h3>vCPU and Scheduling</h3><p>When you create a virtual machine, you assign it CPUs, but these are not physical cores. They&#8217;re virtual CPUs (<strong>vCPUs</strong>) that represent processing capacity managed by the hypervisor.</p><p>The hypervisor schedules vCPUs onto physical cores in the same way an operating system schedules processes.</p><p>At any given moment, a physical core can execute only one thread of work. The hypervisor switches between virtual machines, giving each one a slice of CPU time. This switching happens fast enough that each virtual machine appears to have continuous access to the processor.</p><p><em>Modern CPUs may expose multiple logical threads per core through technologies like Intel Hyper-Threading or AMD SMT (Simultaneous Multithreading), but CPU time is still shared and scheduled across workloads.</em></p><p>In reality, all virtual machines share it.</p><p>This is called <em>time-slicing</em>. It&#8217;s the same idea your operating system uses to run multiple applications at once.</p><p>Hypervisors can also assign more vCPUs than there are physical cores, a practice known as <em>CPU overcommit</em>. For example, a machine with 16 cores might run virtual machines totaling 32 or more vCPUs.</p><p>This works as long as not all virtual machines are busy at the same time.</p><p>When demand rises across multiple VMs, they compete for CPU time. This causes contention and reduces performance.</p><p>Overcommit is a useful tool, but it depends on workload behavior.</p><h3>VM-exit and VM-entry</h3><p>Let&#8217;s go one level deeper into what actually happens when a guest OS tries to do something it is not allowed to do on its own.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gIv5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54cb94d0-6409-474b-a9eb-2d4e679bd3f0_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gIv5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54cb94d0-6409-474b-a9eb-2d4e679bd3f0_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!gIv5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54cb94d0-6409-474b-a9eb-2d4e679bd3f0_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!gIv5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54cb94d0-6409-474b-a9eb-2d4e679bd3f0_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!gIv5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54cb94d0-6409-474b-a9eb-2d4e679bd3f0_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gIv5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54cb94d0-6409-474b-a9eb-2d4e679bd3f0_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/54cb94d0-6409-474b-a9eb-2d4e679bd3f0_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gIv5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54cb94d0-6409-474b-a9eb-2d4e679bd3f0_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!gIv5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54cb94d0-6409-474b-a9eb-2d4e679bd3f0_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!gIv5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54cb94d0-6409-474b-a9eb-2d4e679bd3f0_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!gIv5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54cb94d0-6409-474b-a9eb-2d4e679bd3f0_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>When a virtual machine tries to run a privileged instruction, the processor stops it. Control is immediately transferred to the hypervisor. This event is called a <em>VM-exit</em>.</p><p>The hypervisor then figures out what the guest was trying to do and handles it safely.</p><p>Once the hypervisor has dealt with the request, it hands control back to the virtual machine exactly where it left off. This is called a <em>VM-entry</em>. From the guest OS&#8217;s perspective, nothing unusual happened. The CPU executed the instruction and continued running.</p><p>The entire handoff takes microseconds.</p><p>This mechanism is what makes isolation possible. The guest OS cannot access hardware it shouldn&#8217;t. Every sensitive operation triggers a VM-exit, which the hypervisor controls. But it also introduces overhead. Every VM-exit and VM-entry takes time, and if they happen too frequently, performance suffers.</p><p>Keeping VM exits to a minimum is one of the core challenges of CPU virtualization.</p><h3>NUMA Awareness and CPU Topology</h3><p>This matters more in large systems and performance-sensitive workloads, but it&#8217;s still worth understanding at a high level&#8230;</p><p>Modern servers often have more than one physical processor. Each processor has its own local memory. Accessing memory that belongs to a different processor is slower than accessing your own. This architecture is called Non-Uniform Memory Access (<strong>NUMA</strong>).</p><p>The name describes the problem exactly. Not all memory accesses are equal.</p><p>In a virtual environment, this creates a subtle issue. A virtual machine might run on cores from one processor but end up accessing memory that belongs to another. The guest OS does not know this is happening. But the performance hit is real.</p><p>The solution is NUMA-aware scheduling.</p><p>The hypervisor tries to keep a virtual machine&#8217;s vCPUs and its memory on the same NUMA node. When it succeeds, the VM runs faster. When workloads grow and resources get spread across nodes, the performance gap becomes noticeable.</p><p>For most small deployments, this never comes up. For high-performance databases or real-time systems running on large servers, it matters a lot.</p><p>Putting it together, CPU virtualization is not a single mechanism but a combination of coordinated techniques.</p><p>The privilege model restricts direct hardware access. vCPU scheduling creates the illusion of dedicated processing power. VM-exits and VM-entries enforce safe boundaries for sensitive operations. NUMA awareness ensures that physical hardware layout does not introduce hidden performance penalties.</p><p>Each part addresses a specific challenge.</p><p>Together, they allow multiple virtual machines to share a single physical CPU while operating independently and efficiently.</p><p>Next, let&#8217;s look at how memory is virtualized and how virtual machines manage their view of system memory&#8230;</p><div><hr></div><div class="callout-block" data-callout="true"><p><em><strong>Reminder: this is a teaser of the subscriber-only newsletter, exclusive to my golden members.</strong></em></p><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>Architecture breakdown of core DevOps systems.</strong></p></li><li><p>Deep dives into Docker, Kubernetes, Terraform, and so on.</p></li><li><p><strong>Practical insights into how real-world systems achieve scalability, automation, and reliability.</strong></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe&quot;,&quot;text&quot;:&quot;Unlock Full Access&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://newsletter.systemdesign.one/subscribe"><span>Unlock Full Access</span></a></p><p>Let&#8217;s keep going!</p></div><div><hr></div><h2>Memory Virtualization</h2>
      <p>
          <a href="https://newsletter.systemdesign.one/p/virtualization-architecture">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[AI Agents: State, Memory, Consistency - A Deep Dive]]></title><description><![CDATA[#147: Understanding State, Memory, and Consistency in AI Agents]]></description><link>https://newsletter.systemdesign.one/p/ai-agent-memory</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/ai-agent-memory</guid><dc:creator><![CDATA[Neo Kim]]></dc:creator><pubDate>Sun, 17 May 2026 09:40:01 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f0d1b76a-2ba2-4c41-a2bf-f47a1cdc20b7_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for free on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/ai-agent-memory/?action=share">Share this post</a> &amp; I&#8217;ll send you some rewards for the referrals.</em></p></li></ul><div><hr></div><p>The hardest part of building an AI agent<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> has nothing to do with the model&#8230;</p><p>Calling a language model is the EASY part&#8230;</p><p>Keeping the agent coherent across a workflow that spans hours or days is where everything actually gets difficult: <em>remembering what was said, tracking what&#8217;s already been done, and staying steady when the user changes their mind.</em></p><p>Even with stronger models each year, agents break on those exact failures&#8230;</p><p>They forget earlier turns, repeat questions they already asked, and behave inconsistently across long tasks. The model is rarely the cause. The cause is a missing or badly designed state and memory, plus the consistency rules that keep the two in line.</p><p>This newsletter walks through all three&#8230;</p><div class="callout-block" data-callout="true"><h3><a href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system">100+ Claude Code hacks to ship code 10X faster (Partner)</a></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0VaY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3168322a-0be8-4261-9a54-ed4a8f688a05_1200x600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0VaY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3168322a-0be8-4261-9a54-ed4a8f688a05_1200x600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0VaY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3168322a-0be8-4261-9a54-ed4a8f688a05_1200x600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0VaY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3168322a-0be8-4261-9a54-ed4a8f688a05_1200x600.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0VaY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3168322a-0be8-4261-9a54-ed4a8f688a05_1200x600.jpeg" width="1200" height="600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3168322a-0be8-4261-9a54-ed4a8f688a05_1200x600.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:339767,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:&quot;https://codenewsletter.ai/subscribe?utm_source=nl_ad_system&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/190817974?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3168322a-0be8-4261-9a54-ed4a8f688a05_1200x600.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!0VaY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3168322a-0be8-4261-9a54-ed4a8f688a05_1200x600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0VaY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3168322a-0be8-4261-9a54-ed4a8f688a05_1200x600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0VaY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3168322a-0be8-4261-9a54-ed4a8f688a05_1200x600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0VaY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3168322a-0be8-4261-9a54-ed4a8f688a05_1200x600.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Top engineers at Anthropic and OpenAI say AI now writes 100% of their code.</p><p>If you&#8217;re not using AI, you&#8217;re spending 40 hours doing what they do in 4.</p><p>These 100+ Claude Code hacks fix that and help you ship 10x faster.</p><p>Sign up for <strong><a href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system">The Code</a></strong> and get:</p><ul><li><p>100+ Claude Code hacks used by top engineers &#8212; free</p></li><li><p>The Code newsletter &#8212; learn the latest AI tools, tips, and skills to code faster with AI in 5 minutes a day</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://codenewsletter.ai/subscribe?utm_source=nl_ad_system&quot;,&quot;text&quot;:&quot;Claim your free playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system"><span>Claim your free playbook</span></a></p><p>(Thanks to <a href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system">the Code</a> for partnering on this post.)</p></div><div><hr></div><p>I want to introduce <strong><a href="http://www.linkedin.com/in/sivasankar-natarajan">Sivasankar</a></strong> as a guest author.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="http://www.linkedin.com/in/sivasankar-natarajan" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0qUX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffde1b3-f155-4f25-a945-8b5b86d7b1a1_1521x704.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0qUX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffde1b3-f155-4f25-a945-8b5b86d7b1a1_1521x704.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0qUX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffde1b3-f155-4f25-a945-8b5b86d7b1a1_1521x704.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0qUX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffde1b3-f155-4f25-a945-8b5b86d7b1a1_1521x704.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0qUX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffde1b3-f155-4f25-a945-8b5b86d7b1a1_1521x704.jpeg" width="1456" height="674" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8ffde1b3-f155-4f25-a945-8b5b86d7b1a1_1521x704.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:674,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;http://www.linkedin.com/in/sivasankar-natarajan&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0qUX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffde1b3-f155-4f25-a945-8b5b86d7b1a1_1521x704.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0qUX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffde1b3-f155-4f25-a945-8b5b86d7b1a1_1521x704.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0qUX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffde1b3-f155-4f25-a945-8b5b86d7b1a1_1521x704.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0qUX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ffde1b3-f155-4f25-a945-8b5b86d7b1a1_1521x704.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>He is a Technical Director and GenAI practitioner with over 20+ years of experience in architecting and building Big Data, Cloud, and GenAI solutions.</p><p>His work focuses on helping organizations design practical and scalable intelligent systems. He enjoys exploring new ideas through hands-on DIY projects and practical experimentation, often turning real-world use cases into working prototypes.</p><p>You can connect with Sivasankar (Shiva) on <strong><a href="http://www.linkedin.com/in/sivasankar-natarajan">LinkedIn</a></strong>.</p><div><hr></div><p><strong>Here&#8217;s what you&#8217;ll find inside this newsletter:</strong></p><ul><li><p><strong>State vs memory.</strong> What each one is for, how they differ, and why conflating them is the root cause of most agent bugs.</p></li><li><p><strong>Memory lifecycle.</strong> How an agent creates, updates, summarizes, and deletes memory so it doesn&#8217;t rot over time.</p></li><li><p><strong>Consistency under change.</strong> How state reacts to new instructions right away, and how long-term memory updates only after a change holds up.</p></li><li><p><strong>A reference architecture.</strong> The four layers (brain, state, memory, external systems) and what each one owns.</p></li><li><p><strong>Scaling and failure modes.</strong> Where stateful agents break at scale, and the three ways memory itself goes wrong.</p></li></ul><p>By the end, you&#8217;ll know how to design agents that track the right things, recover from failure, and stay truthful with the data they don&#8217;t control.</p><p>Let&#8217;s start with the state...</p><div><hr></div><h2><strong>Tracking what the agent is doing with the state</strong></h2><p>State is the information an agent keeps about the current task: <em>the plan, active constraints, what&#8217;s already been done, and what comes next.</em></p><p>It&#8217;s the agent&#8217;s picture of NOW&#8230;</p><p>The easiest way to see why it matters is to compare an agent with it to one without it.</p><h3><strong>Stateless vs Stateful</strong></h3><p>A <strong>stateless agent</strong> keeps nothing between requests&#8230;</p><p>Each input is a fresh start, which is fine for one-shot jobs like &#8220;<em>translate this sentence to French</em>, <em>summarize this paragraph</em>, or <em>explain what an API is&#8221;</em>.</p><p>A <strong>stateful agent</strong> tracks the current task, session, or workflow&#8230;</p><p>At each decision point, it checks the state to answer three questions:&nbsp;<em>what step am I on, what has already been done, and what should I do next?</em> Those answers drive the next action, check constraints, and stop the agent from repeating itself.</p><p>State matters the moment a workflow has more than one step&#8230;</p><p>The user books a flight (<em>step one done, constraints locked</em>), asks for a hotel (<em>step two, new constraints layered on</em>), then changes the return date (<em>earlier step invalidated, replan downstream</em>).</p><p>Without a state, none of those three questions has an answer.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9SyD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08322369-dba8-4e77-9919-1189d8841814_888x474.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9SyD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08322369-dba8-4e77-9919-1189d8841814_888x474.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9SyD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08322369-dba8-4e77-9919-1189d8841814_888x474.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9SyD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08322369-dba8-4e77-9919-1189d8841814_888x474.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9SyD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08322369-dba8-4e77-9919-1189d8841814_888x474.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9SyD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08322369-dba8-4e77-9919-1189d8841814_888x474.jpeg" width="888" height="474" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/08322369-dba8-4e77-9919-1189d8841814_888x474.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:474,&quot;width&quot;:888,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Stateless vs stateful AI agents&quot;,&quot;title&quot;:&quot;Stateless vs stateful AI agents&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Stateless vs stateful AI agents" title="Stateless vs stateful AI agents" srcset="https://substackcdn.com/image/fetch/$s_!9SyD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08322369-dba8-4e77-9919-1189d8841814_888x474.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9SyD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08322369-dba8-4e77-9919-1189d8841814_888x474.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9SyD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08322369-dba8-4e77-9919-1189d8841814_888x474.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9SyD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08322369-dba8-4e77-9919-1189d8841814_888x474.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Where state lives</strong></h3><p>State is <em>maintained</em> on the agent side (backend server that runs the reasoning loop), and NOT on the client side (chat UI, mobile app, or API caller the user interacts with).</p><p>Clients disconnect, refresh, and retry. And agent still needs a steady view of progress.</p><p>So where that view lives depends on how long the workflow runs&#8230;</p><p>Short-lived workflows that end in a single session can remain in an in-memory store. Anything that has to survive restarts needs external storage: a database, a KV store (Redis-style lookups by id), or a serialized state object managed by an orchestration framework<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>.</p><p>LangGraph<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>, for example, comes with <code>SqliteSaver</code> and <code>PostgresSaver</code> as ready-made checkpoint backends.</p><p>At each step&#8217;s start, agent loads the latest state snapshot and rebuilds its context. That load makes a chain of steps feel like one chain rather than a set of separate calls.</p><h3><strong>Checkpointing and recovery</strong></h3><p>Loading state on each step only works if something wrote it in the first place&#8230;</p><p>That&#8217;s what checkpointing<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> does: <em>it saves state at the milestones you pick as meaningful, not every tiny step.</em> If the agent crashes, times out, or gets redeployed mid-workflow, it reloads the last checkpoint and resumes from there.</p><p>Picture the <em>travel agent</em> picking a return flight when the API call fails&#8230;</p><p>Without checkpointing, the agent would have to re-ask the user for dates, budget, and outbound flight details. With checkpointing, the saved state already contains all that, so on retry, the agent picks up at the failed step and continues.</p><p>The same setup covers internal errors in the agent&#8217;s own code and redeploys that restart the infrastructure underneath it.</p><p>This pattern runs in production at real scale&#8230;</p><p>LinkedIn&#8217;s internal SQL Bot<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a> is a hierarchical multi-agent system built on LangGraph, and its long-running query workflows rely on the framework&#8217;s checkpoint layer to pause, inspect state, and resume without losing context mid-task.</p><h3><strong>State versioning</strong></h3><p>Long-running agents change over time&#8230;</p><p>New fields appear, steps get reordered, logic shifts. Without versioning, in-progress workflows break the moment the schema changes, and the stored state written yesterday becomes unreadable tomorrow.</p><p>Versioning keeps yesterday&#8217;s state readable&#8230;</p><p>It migrates old formats forward and blocks execution paths that no longer make sense under the new schema. That way, workflow changes roll out without killing ongoing tasks.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rn18!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc72e5012-2d43-4050-a019-8a5ffcc9d087_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rn18!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc72e5012-2d43-4050-a019-8a5ffcc9d087_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rn18!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc72e5012-2d43-4050-a019-8a5ffcc9d087_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rn18!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc72e5012-2d43-4050-a019-8a5ffcc9d087_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rn18!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc72e5012-2d43-4050-a019-8a5ffcc9d087_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rn18!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc72e5012-2d43-4050-a019-8a5ffcc9d087_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c72e5012-2d43-4050-a019-8a5ffcc9d087_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;State versioning: schema migration for in-progress workflows&quot;,&quot;title&quot;:&quot;State versioning: schema migration for in-progress workflows&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="State versioning: schema migration for in-progress workflows" title="State versioning: schema migration for in-progress workflows" srcset="https://substackcdn.com/image/fetch/$s_!rn18!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc72e5012-2d43-4050-a019-8a5ffcc9d087_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rn18!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc72e5012-2d43-4050-a019-8a5ffcc9d087_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rn18!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc72e5012-2d43-4050-a019-8a5ffcc9d087_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rn18!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc72e5012-2d43-4050-a019-8a5ffcc9d087_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is most evident in production systems where workflows span minutes, hours, or even days&#8230;</p><h3><strong>Travel agent: state in action</strong></h3><p>Storage, checkpointing, and versioning click together once you watch a concrete plan move through them&#8230;</p><p>The user says, &#8220;<em>Plan a complete trip for me.&#8221; </em>The agent then walks a sequence:</p><ol><li><p>Pick the date and destination.</p></li><li><p>Select an outbound flight.</p></li><li><p>Then select a return flight.</p></li><li><p>Choose accommodation.</p></li><li><p>Confirm the itinerary.</p></li></ol><p>After each step, the state moves forward&#8230;</p><p>Once the outbound flight is selected, the state says&nbsp;<em>select the return flight</em>&nbsp;next. And checkpoints occur after preferences are confirmed, outbound flight is selected, return flight is selected, and the hotel is booked.</p><p>If the hotel step &#8216;fails&#8216; because of a network error, the agent restarts, reloads the state, sees that the dates and flights are already locked in, and retries only the hotel step.</p><p>NO re-asking and out-of-order suggestions&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!506w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe51f486a-b335-4be7-a373-ebb3d34e0d77_1075x790.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!506w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe51f486a-b335-4be7-a373-ebb3d34e0d77_1075x790.jpeg 424w, https://substackcdn.com/image/fetch/$s_!506w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe51f486a-b335-4be7-a373-ebb3d34e0d77_1075x790.jpeg 848w, https://substackcdn.com/image/fetch/$s_!506w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe51f486a-b335-4be7-a373-ebb3d34e0d77_1075x790.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!506w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe51f486a-b335-4be7-a373-ebb3d34e0d77_1075x790.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!506w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe51f486a-b335-4be7-a373-ebb3d34e0d77_1075x790.jpeg" width="1075" height="790" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e51f486a-b335-4be7-a373-ebb3d34e0d77_1075x790.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:790,&quot;width&quot;:1075,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;State-driven multi-step planning with checkpoints&quot;,&quot;title&quot;:&quot;State-driven multi-step planning with checkpoints&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="State-driven multi-step planning with checkpoints" title="State-driven multi-step planning with checkpoints" srcset="https://substackcdn.com/image/fetch/$s_!506w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe51f486a-b335-4be7-a373-ebb3d34e0d77_1075x790.jpeg 424w, https://substackcdn.com/image/fetch/$s_!506w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe51f486a-b335-4be7-a373-ebb3d34e0d77_1075x790.jpeg 848w, https://substackcdn.com/image/fetch/$s_!506w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe51f486a-b335-4be7-a373-ebb3d34e0d77_1075x790.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!506w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe51f486a-b335-4be7-a373-ebb3d34e0d77_1075x790.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>That&#8217;s what the state gives you.</p><p>Next up is memory:<em> information the agent carries across tasks</em>...</p><div><hr></div><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/ai-agent-memory?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption"><em>Share this post &amp; get rewards for the referrals.</em></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/ai-agent-memory?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/ai-agent-memory?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h2><strong>Carrying information across tasks with memory</strong></h2><p>If the <strong>state</strong> is the agent&#8217;s picture of <em>now</em>, <strong>memory</strong> is what spans across: <em>past decisions already made, preferences built up over time, conversations it can refer back to, and reference data it queries from other systems (flight schedules, hotel listings).</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WdnL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbae4b9cc-72d5-456b-97b3-979225d8f227_668x564.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WdnL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbae4b9cc-72d5-456b-97b3-979225d8f227_668x564.jpeg 424w, https://substackcdn.com/image/fetch/$s_!WdnL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbae4b9cc-72d5-456b-97b3-979225d8f227_668x564.jpeg 848w, https://substackcdn.com/image/fetch/$s_!WdnL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbae4b9cc-72d5-456b-97b3-979225d8f227_668x564.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!WdnL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbae4b9cc-72d5-456b-97b3-979225d8f227_668x564.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WdnL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbae4b9cc-72d5-456b-97b3-979225d8f227_668x564.jpeg" width="668" height="564" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bae4b9cc-72d5-456b-97b3-979225d8f227_668x564.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:564,&quot;width&quot;:668,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Memory in an AI agent: short-term, long-term, and external&quot;,&quot;title&quot;:&quot;Memory in an AI agent: short-term, long-term, and external&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Memory in an AI agent: short-term, long-term, and external" title="Memory in an AI agent: short-term, long-term, and external" srcset="https://substackcdn.com/image/fetch/$s_!WdnL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbae4b9cc-72d5-456b-97b3-979225d8f227_668x564.jpeg 424w, https://substackcdn.com/image/fetch/$s_!WdnL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbae4b9cc-72d5-456b-97b3-979225d8f227_668x564.jpeg 848w, https://substackcdn.com/image/fetch/$s_!WdnL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbae4b9cc-72d5-456b-97b3-979225d8f227_668x564.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!WdnL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbae4b9cc-72d5-456b-97b3-979225d8f227_668x564.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Agents store memory in three flavors, separated by <em>how long the memory persists and whether the agent owns the data or just reads it from elsewhere</em>.</p><h3><strong>Short-term memory</strong></h3><p>Short-term memory holds the current context for the active task&#8230;</p><p>It&#8217;s short-lived and cleared when the task ends. Most implementations keep it in in-memory structures, session variables, or within the LLM<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a> prompt. The lifecycle is simple: <em>create it when a task begins, update it as the agent reasons and uses tools, and clear it once the task wraps up.</em></p><p>Use it for one-task decisions that don&#8217;t need to outlive the current workflow.</p><h3><strong>Long-term memory</strong></h3><p>Long-term memory stores information across interactions: <em>user preferences, past bookings, and recurring patterns.</em></p><p>The storage options are vector databases<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a>, KV stores, document stores, or relational databases. A write only happens when the information is actually worth keeping, and retrieval kicks in at the start of new tasks or when a relevant signal shows up in the current one.</p><p>Every so often, the agent summarizes or merges old entries to prevent the store from growing out of control&#8230;</p><p>Use this layer when a preference is steady, behavior repeats, or a piece of information should carry into future interactions.</p><p>The big win of long-term memory is <em>NOT</em> loading the entire history into the prompt each turn. The agent stores what&#8217;s worth keeping and pulls in only what the current decision needs.</p><p>Mem0<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a> is one production system built on this idea:</p><p>Tested on LOCOMO<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a>, a benchmark for long conversations, the Mem0 layer reduced p95 retrieval latency<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-10" href="#footnote-10" target="_self">10</a> by 91% against a full-context baseline<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-11" href="#footnote-11" target="_self">11</a>. Token cost dropped by over 90% on the same test.</p><p>On an LLM-as-judge evaluation<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-12" href="#footnote-12" target="_self">12</a>, it scored 26% higher than OpenAI&#8217;s memory system.</p><h3><strong>External memory</strong></h3><p>External memory is a large reference data that the agent queries on demand instead of storing: <em>flight databases, hotel listings, knowledge graphs, and internal systems of record.</em></p><p>These are external sources that hold the authoritative version of a piece of data, like a booking system or a pricing API&#8230;</p><p>These live in cloud databases, APIs, knowledge graphs, or external RAG<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-13" href="#footnote-13" target="_self">13</a> indexes. Nothing gets loaded into the prompt all at once. When a decision depends on data that must remain official, auditable, and shared across applications, the agent queries the system of record<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-14" href="#footnote-14" target="_self">14</a> directly.</p><h3><strong>Memory lifecycle</strong></h3><p>Even a good memory goes bad if you never manage it. Managing it means walking every entry through four stages:</p><ol><li><p><strong>Create</strong> when new information arrives (user picks a flight).</p></li><li><p><strong>Update</strong> when details change (user adds a second city).</p></li><li><p><strong>Summarize</strong> to shrink detail (only the times and locations that matter, not every API response).</p></li><li><p><strong>Delete</strong> when the information is no longer needed or the retention window ends (clear past trip details once the booking is complete).</p></li></ol><p>Skip this cycle, and long-term memory grows without limit.</p><p>Old preferences start contradicting newer ones, and the agent makes decisions based on a pile of half-truths&#8230;</p><h3><strong>Travel agent: memory in action</strong></h3><p>One trip exercises all three memory types at once:</p><ul><li><p>Short-term memory holds the current trip (cities, dates, flight search results).</p></li><li><p>Long-term memory holds repeat preferences (airline, seat class, meal choice).</p></li><li><p>External memory provides live data that the agent doesn&#8217;t own (current flight schedules, hotel availability, seat prices).</p></li></ul><p>The hard part for the agent isn&#8217;t storing any of this&#8230;</p><p>It&#8217;s deciding when the state should override memory, when memory should update from what just happened, and when neither should move because the real answer lives in an external system.</p><p>That&#8217;s the next problem...</p><div class="callout-block" data-callout="true"><h3><a href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system">100+ Claude Code hacks to ship code 10X faster (Partner)</a></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WUBw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!WUBw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!WUBw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!WUBw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WUBw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg" width="1200" height="600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:339767,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:&quot;https://codenewsletter.ai/subscribe?utm_source=nl_ad_system&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/190817974?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!WUBw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!WUBw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!WUBw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!WUBw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Top engineers at Anthropic say AI now writes 100% of their code.</p><p>Are you using AI to write yours?</p><p>These 100+ Claude Code hacks show you exactly how. Sign up for The Code and get:</p><ul><li><p>100+ Claude Code hacks &#8212; free</p></li><li><p>The <a href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system">Code newsletter</a> &#8212; learn the latest AI tools and skills to code faster in 5 mins a day</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://codenewsletter.ai/subscribe?utm_source=nl_ad_system&quot;,&quot;text&quot;:&quot;Claim your free playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system"><span>Claim your free playbook</span></a></p></div><div><hr></div><h2><strong>Staying consistent when preferences change</strong></h2><p>Real users change their minds mid-task.</p><p>They try a new option, correct themselves, adjust what they want, and sometimes say one thing on Monday and the opposite on Tuesday. An agent that can&#8217;t handle this feels wrong: repetitive, stuck on old instructions, or too eager to rewrite its long-term knowledge on every passing comment.</p><p>The core rule is simple: <em>react fast with state, learn slowly with memory</em>.</p><h3><strong>Reacting fast, learning slowly</strong></h3><p>State updates as soon as a new instruction arrives&#8230;</p><p>The current task reflects it right away. But memory updates only after the change looks steady and reusable.</p><p>The question is <em>when</em> memory writes happen&#8230;</p><p>LangChain&#8217;s team frames two options: the agent can write <em>&#8220;in hot path&#8221;</em> (save the new fact before replying to the user) or <em>&#8220;in the background&#8221;</em> (a separate process updates memory during or after the conversation).</p><p>Background writes are what make &#8220;learn slowly&#8221; work, because they give the system time to check whether the change holds up over a few more turns before it gets saved.</p><p>If memory learned too fast, it would rot<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-15" href="#footnote-15" target="_self">15</a>&#8230;</p><p>A rare exception becomes the rule, a short-term constraint becomes permanent, and future tasks go wrong with no clear cause.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RwUj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa32018b9-bc1c-4f65-904a-03b5bd261cd9_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RwUj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa32018b9-bc1c-4f65-904a-03b5bd261cd9_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!RwUj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa32018b9-bc1c-4f65-904a-03b5bd261cd9_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!RwUj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa32018b9-bc1c-4f65-904a-03b5bd261cd9_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!RwUj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa32018b9-bc1c-4f65-904a-03b5bd261cd9_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RwUj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa32018b9-bc1c-4f65-904a-03b5bd261cd9_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a32018b9-bc1c-4f65-904a-03b5bd261cd9_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;React fast with state, learn slowly with memory&quot;,&quot;title&quot;:&quot;React fast with state, learn slowly with memory&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="React fast with state, learn slowly with memory" title="React fast with state, learn slowly with memory" srcset="https://substackcdn.com/image/fetch/$s_!RwUj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa32018b9-bc1c-4f65-904a-03b5bd261cd9_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!RwUj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa32018b9-bc1c-4f65-904a-03b5bd261cd9_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!RwUj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa32018b9-bc1c-4f65-904a-03b5bd261cd9_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!RwUj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa32018b9-bc1c-4f65-904a-03b5bd261cd9_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Three roles keep this clean:</p><ul><li><p>State tracks what the agent is doing now,</p></li><li><p>Memory stores what it has learned over time,</p></li><li><p>Consistency rules keep the two in check when they disagree.</p></li></ul><p>Before any of these rules run, the agent has to pull the right memory for the current decision.</p><h3><strong>Retrieving only what&#8217;s relevant</strong></h3><p>Agents don&#8217;t load all memory&#8230;</p><p>Instead, they query it like a database, using metadata on each entry:</p><ul><li><p>t<code>opic: flights [seats, airlines, class, timing]</code></p></li><li><p>t<code>opic: hotels [room type, location, amenities]</code></p></li><li><p>t<code>opic: food [dietary restrictions, cuisine]</code></p></li><li><p>t<code>opic: budget [price sensitivity, limits]</code></p></li><li><p>d<code>uration: short-term | long-term</code></p></li><li><p>confidence: <code>confirmed | inferred</code></p></li></ul><p>The current state picks which slices to pull.</p><p>Booking a flight pulls flight and budget memory, and leaves hotel memory alone. This keeps the working context focused and stops old or unrelated memory from leaking into the decision.</p><h3><strong>Rolling back on corrections</strong></h3><p>When a user corrects the agent, treat it as a constraint change, NOT an error.</p><p>The agent always plans under a set of constraints:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;4f84cea6-c11c-4f3c-b255-de0589fddc51&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">seat type = aisle
budget &lt;= 800
departure = morning
hotel near the city center</code></pre></div><p>When the user says, &#8220;<em>I want a window seat&#8221;</em>, they aren&#8217;t saying <em>you made a mistake</em>. They&#8217;re saying <em>update the constraints</em>. The agent&#8217;s job is to apply the new constraint without throwing away the work that&#8217;s still valid.</p><p>On a correction, the agent traces which steps depended on the old constraint&#8230;</p><p>It rolls back to the earliest affected step, marks everything downstream as invalid, and leaves the earlier valid steps alone.</p><p>But memory doesn&#8217;t update on the first correction&#8230;</p><p>The agent first checks intent: <em>whether the change is a one-time request, task-specific, or a lasting preference (indicated by phrases like &#8220;always a window seat&#8221;). </em>Only after the change looks steady does long-term memory update.</p><h3><strong>When the system of record wins</strong></h3><p>NOT all memory is truth&#8230;</p><p>Prices, availability, and inventory come from external systems, not from the agent&#8217;s stored preferences. External systems define ground truth; memory shapes decisions without overriding real data, and when the two disagree, the system of record always wins.</p><p>Anthropic names this pattern directly in its agent design guidance:</p><p>Agents must <em>&#8220;gain &#8216;ground truth&#8217; from the environment at each step (such as tool call results or code execution) to assess its progress.&#8221;</em> That keeps the agent in sync with reality, and hallucinations drop because it isn&#8217;t guessing at live data it could have just fetched.</p><h3><strong>Travel agent: consistency in action</strong></h3><p>The agent already knows 2 things:</p><ul><li><p>Long-term memory: user usually prefers aisle seats.</p></li><li><p>Short-term memory: user wants the cheapest flight for this trip.</p></li></ul><p>Right now, the agent is comparing flights and choosing seats&#8230;</p><p>Then the user says:</p><p><em>&#8220;I want a window seat this time.&#8221;</em></p><p>Here&#8217;s what the agent does:</p><ol><li><p>Reads the new request.</p></li><li><p>Pulls only seat-related preferences (memory).</p></li><li><p>Understands <em>&#8220;this time&#8221;</em> means a temporary change and labels it accordingly.</p></li><li><p>Rolls state back to the flight selection step because seat preferences affect flight choices.</p></li><li><p>Removes earlier seat and pricing decisions that depended on aisle seats.</p></li><li><p>Recalculates the best flight options using the new window-seat preference.</p></li><li><p>Updates short-term memory for this trip ONLY.</p></li><li><p>Keeps long-term memory unchanged because preference may NOT be permanent.</p></li></ol><p>Later, the user says:</p><p><em>&#8220;I prefer window seats going forward.&#8221;</em></p><p>Now the agent understands this is a lasting preference&#8230;</p><p>So it:</p><ol><li><p>Read this as a lasting change.</p></li><li><p>Saves window seats as the new default preference.</p></li><li><p>Updates long-term memory.</p></li><li><p>Uses this preference for future trips.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1u4Z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb166b263-d288-4723-abb3-aec0cca7a706_1045x797.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1u4Z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb166b263-d288-4723-abb3-aec0cca7a706_1045x797.jpeg 424w, https://substackcdn.com/image/fetch/$s_!1u4Z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb166b263-d288-4723-abb3-aec0cca7a706_1045x797.jpeg 848w, https://substackcdn.com/image/fetch/$s_!1u4Z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb166b263-d288-4723-abb3-aec0cca7a706_1045x797.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!1u4Z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb166b263-d288-4723-abb3-aec0cca7a706_1045x797.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1u4Z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb166b263-d288-4723-abb3-aec0cca7a706_1045x797.jpeg" width="1045" height="797" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b166b263-d288-4723-abb3-aec0cca7a706_1045x797.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:797,&quot;width&quot;:1045,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;User preference change: rollback and selective memory update&quot;,&quot;title&quot;:&quot;User preference change: rollback and selective memory update&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="User preference change: rollback and selective memory update" title="User preference change: rollback and selective memory update" srcset="https://substackcdn.com/image/fetch/$s_!1u4Z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb166b263-d288-4723-abb3-aec0cca7a706_1045x797.jpeg 424w, https://substackcdn.com/image/fetch/$s_!1u4Z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb166b263-d288-4723-abb3-aec0cca7a706_1045x797.jpeg 848w, https://substackcdn.com/image/fetch/$s_!1u4Z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb166b263-d288-4723-abb3-aec0cca7a706_1045x797.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!1u4Z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb166b263-d288-4723-abb3-aec0cca7a706_1045x797.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Now, imagine the airline system says there are no window seats available&#8230;</p><p>The agent tells the user that window seats are unavailable and follows the airline&#8217;s live data. It does not remove the preference from memory. Instead, real-world data overrides stored preferences.</p><p>So far, the travel agent has shown up in pieces. Now it&#8217;s time to run it end-to-end...</p><div><hr></div><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/ai-agent-memory?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption"><em>Share this post &amp; get rewards for the referrals.</em></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/ai-agent-memory?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/ai-agent-memory?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h2><strong>Build travel agent from three pieces</strong></h2><p>Travel Agent AI helps users plan and book trips&#8230;</p><p>It talks to the user, tracks the current task, remembers useful preferences, and calls external tools to check flights, hotels, and prices.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kdYI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecdc99fd-ea28-44a6-a856-d394adf0dc97_600x621.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kdYI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecdc99fd-ea28-44a6-a856-d394adf0dc97_600x621.jpeg 424w, https://substackcdn.com/image/fetch/$s_!kdYI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecdc99fd-ea28-44a6-a856-d394adf0dc97_600x621.jpeg 848w, https://substackcdn.com/image/fetch/$s_!kdYI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecdc99fd-ea28-44a6-a856-d394adf0dc97_600x621.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!kdYI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecdc99fd-ea28-44a6-a856-d394adf0dc97_600x621.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kdYI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecdc99fd-ea28-44a6-a856-d394adf0dc97_600x621.jpeg" width="600" height="621" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ecdc99fd-ea28-44a6-a856-d394adf0dc97_600x621.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:621,&quot;width&quot;:600,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;AI Travel Agent overview: state, memory types, tools, and user input&quot;,&quot;title&quot;:&quot;AI Travel Agent overview: state, memory types, tools, and user input&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="AI Travel Agent overview: state, memory types, tools, and user input" title="AI Travel Agent overview: state, memory types, tools, and user input" srcset="https://substackcdn.com/image/fetch/$s_!kdYI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecdc99fd-ea28-44a6-a856-d394adf0dc97_600x621.jpeg 424w, https://substackcdn.com/image/fetch/$s_!kdYI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecdc99fd-ea28-44a6-a856-d394adf0dc97_600x621.jpeg 848w, https://substackcdn.com/image/fetch/$s_!kdYI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecdc99fd-ea28-44a6-a856-d394adf0dc97_600x621.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!kdYI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecdc99fd-ea28-44a6-a856-d394adf0dc97_600x621.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Every request flows through the same loop&#8230;</p><h3><strong>Workflow loop</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wZHr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad1c0a07-611b-4233-b37f-fc68eeb37f38_811x804.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wZHr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad1c0a07-611b-4233-b37f-fc68eeb37f38_811x804.jpeg 424w, https://substackcdn.com/image/fetch/$s_!wZHr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad1c0a07-611b-4233-b37f-fc68eeb37f38_811x804.jpeg 848w, https://substackcdn.com/image/fetch/$s_!wZHr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad1c0a07-611b-4233-b37f-fc68eeb37f38_811x804.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!wZHr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad1c0a07-611b-4233-b37f-fc68eeb37f38_811x804.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wZHr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad1c0a07-611b-4233-b37f-fc68eeb37f38_811x804.jpeg" width="811" height="804" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ad1c0a07-611b-4233-b37f-fc68eeb37f38_811x804.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:804,&quot;width&quot;:811,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;AI Travel Agent workflow loop with state and memory updates&quot;,&quot;title&quot;:&quot;AI Travel Agent workflow loop with state and memory updates&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="AI Travel Agent workflow loop with state and memory updates" title="AI Travel Agent workflow loop with state and memory updates" srcset="https://substackcdn.com/image/fetch/$s_!wZHr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad1c0a07-611b-4233-b37f-fc68eeb37f38_811x804.jpeg 424w, https://substackcdn.com/image/fetch/$s_!wZHr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad1c0a07-611b-4233-b37f-fc68eeb37f38_811x804.jpeg 848w, https://substackcdn.com/image/fetch/$s_!wZHr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad1c0a07-611b-4233-b37f-fc68eeb37f38_811x804.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!wZHr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad1c0a07-611b-4233-b37f-fc68eeb37f38_811x804.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ol><li><p><strong>User input.</strong> The user asks for a plan or a specific booking.</p></li><li><p><strong>Intent parsing and reasoning.</strong> The agent figures out the goal and picks the next step.</p></li><li><p><strong>State update.</strong> Records which step is active and the constraints in play (dates, airline preference, budget).</p></li><li><p><strong>Memory access.</strong> Pulls short-term, long-term, or external memory based on what this step needs.</p></li><li><p><strong>Tool execution.</strong> Calls APIs for flights, hotels, or pricing.</p></li><li><p><strong>Response generation.</strong> Gives a suggestion or confirmation from the current state and retrieved memory.</p></li><li><p><strong>State/memory update.</strong> Writes the new selections and any session context back to storage.</p></li><li><p><strong>Next step or loop.</strong> Continues to the next task in the plan.</p></li></ol><h3><strong>Example prompts</strong></h3><p>A typical conversation might run across four turns:</p><ol><li><p><em>&#8220;Help me plan a trip to Paris next month.&#8221;</em></p></li><li><p><em>&#8220;Book the outbound flight first.&#8221;</em></p></li><li><p><em>&#8220;I prefer window seats and morning flights.&#8221;</em></p></li><li><p><em>&#8220;Now add a hotel near the city center.&#8221;</em></p></li></ol><p>Across these turns:</p><ul><li><p>State tracks the active step.</p></li><li><p>Short-term memory stores choices for this trip.</p></li><li><p>Long-term memory stores recurring preferences.</p></li><li><p>External data provides live prices and availability.</p></li><li><p>Tools connect the agent to flight, hotel, and booking systems.</p></li></ul><h3><strong>Agent&#8217;s tools</strong></h3><p>The agent reaches out through flight search APIs, hotel availability services, and pricing and booking systems&#8230;</p><p>Each step feeds the next, and the saved state keeps the plan aligned with live data even when something fails partway through.</p><p>The loop shows <em>when</em> things happen&#8230;The next question is <em>what</em> each piece is responsible for...</p><div><hr></div><h2><strong>Reference architecture for a stateful agent</strong></h2><p>Zoom out from the runtime loop, and a stateful agent splits into four layers:</p><p>Each one has a single job, and the lines between them are what keep the agent steady.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TKC_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aba498a-1d9e-4dfe-86f9-b33c151a1aec_1022x646.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TKC_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aba498a-1d9e-4dfe-86f9-b33c151a1aec_1022x646.jpeg 424w, https://substackcdn.com/image/fetch/$s_!TKC_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aba498a-1d9e-4dfe-86f9-b33c151a1aec_1022x646.jpeg 848w, https://substackcdn.com/image/fetch/$s_!TKC_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aba498a-1d9e-4dfe-86f9-b33c151a1aec_1022x646.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!TKC_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aba498a-1d9e-4dfe-86f9-b33c151a1aec_1022x646.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TKC_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aba498a-1d9e-4dfe-86f9-b33c151a1aec_1022x646.jpeg" width="1022" height="646" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4aba498a-1d9e-4dfe-86f9-b33c151a1aec_1022x646.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:646,&quot;width&quot;:1022,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Stateful AI agent architecture: components and dataflow&quot;,&quot;title&quot;:&quot;Stateful AI agent architecture: components and dataflow&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Stateful AI agent architecture: components and dataflow" title="Stateful AI agent architecture: components and dataflow" srcset="https://substackcdn.com/image/fetch/$s_!TKC_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aba498a-1d9e-4dfe-86f9-b33c151a1aec_1022x646.jpeg 424w, https://substackcdn.com/image/fetch/$s_!TKC_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aba498a-1d9e-4dfe-86f9-b33c151a1aec_1022x646.jpeg 848w, https://substackcdn.com/image/fetch/$s_!TKC_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aba498a-1d9e-4dfe-86f9-b33c151a1aec_1022x646.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!TKC_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4aba498a-1d9e-4dfe-86f9-b33c151a1aec_1022x646.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p><strong>Agent brain (reasoning engine).</strong> An LLM sits at the center, planning multi-step workflows, reading user intent, and picking the next action from the current state and retrieved memory.</p></li><li><p><strong>State layer.</strong> This is where the current workflow lives: which steps are done, which are next, and what constraints are active. It also holds short-term overrides and session-specific choices, and lets the agent roll back when a user changes a preference without restarting the entire plan.</p></li><li><p><strong>Memory layer.</strong> Short-term memory carries this-task context; long-term memory carries lasting user preferences and habits. Together, they feed the brain with context that matches past decisions and patterns.</p></li><li><p><strong>External systems (system of record).</strong> APIs, databases, and live data sources provide official information such as seat availability, current prices, and cancellation policies. Memory can suggest, but the system of record always wins in a conflict.</p></li></ul><p>The brain decides, the state tracks, memory remembers, and external systems check.</p><p>Mix up those responsibilities, and the agent guesses when it should check a live source, or keeps memory that should have been dropped. Keep them separate, and every layer pulls its own weight.</p><div><hr></div><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/ai-agent-memory?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption"><em>Share this post &amp; get rewards for the referrals.</em></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/ai-agent-memory?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/ai-agent-memory?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h2><strong>Deciding what fits in the context window</strong></h2><p>The four layers only work if they fit&#8230;</p><p>Every LLM runs inside a finite context window<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-16" href="#footnote-16" target="_self">16</a>, and state, memory, tool schemas, tool outputs, and user messages all fight for that space.</p><h3><strong>What takes up the context window</strong></h3><p>Every step&#8217;s context is the sum of what you send the model in that step&#8230;</p><p>This includes system and developer instructions, current state, retrieved memory snippets (short-term, long-term, and external), tool schemas and instructions, tool outputs from API responses and search results, and the user&#8217;s messages themselves.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qLGB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35ba5c9a-414a-4b3a-8cf6-a811a3035f4d_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qLGB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35ba5c9a-414a-4b3a-8cf6-a811a3035f4d_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qLGB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35ba5c9a-414a-4b3a-8cf6-a811a3035f4d_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qLGB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35ba5c9a-414a-4b3a-8cf6-a811a3035f4d_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qLGB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35ba5c9a-414a-4b3a-8cf6-a811a3035f4d_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qLGB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35ba5c9a-414a-4b3a-8cf6-a811a3035f4d_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/35ba5c9a-414a-4b3a-8cf6-a811a3035f4d_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;What fills the LLM context window&quot;,&quot;title&quot;:&quot;What fills the LLM context window&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="What fills the LLM context window" title="What fills the LLM context window" srcset="https://substackcdn.com/image/fetch/$s_!qLGB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35ba5c9a-414a-4b3a-8cf6-a811a3035f4d_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qLGB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35ba5c9a-414a-4b3a-8cf6-a811a3035f4d_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qLGB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35ba5c9a-414a-4b3a-8cf6-a811a3035f4d_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qLGB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35ba5c9a-414a-4b3a-8cf6-a811a3035f4d_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A useful mental model for the budget (not a formula to measure):</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;e365af97-4c91-45ed-98ff-c599ace56e8a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Context size = instructions + state + memory + tools + data + user input</code></pre></div><p>Visibility is what matters here, not precision&#8230;</p><p>Tool definitions and raw API responses often consume more context than the model&#8217;s actual reasoning.</p><p>Anthropic ran into this in their own Claude Code setup:</p><p>The full tool definitions<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-17" href="#footnote-17" target="_self">17</a> alone used about 134,000 tokens of context. To fix this, Anthropic introduced Tool Search, a lazy-loading system that loads a tool&#8217;s full schema only when the agent actually needs it.</p><p>This reduced tool-definition overhead by 85%, increased usable context per session from 122,800 tokens to 191,300, and improved MCP tool-use accuracy for Claude Opus 4 from 49% to 74%.</p><p>If the agent loads everything all the time, the context window fills up quickly&#8230;</p><p>When that happens, older messages drop out first. This usually removes the earliest and most important information, like the user&#8217;s original goals, constraints, or instructions.</p><h3><strong>Retrieve only what&#8217;s relevant</strong></h3><p>Use tags, scope, and recency to pull what matches the current step, not everything the agent has ever learned.</p><p>Match retrieval to the current state: <em>booking a flight pulls flight and budget memory, nothing else.</em></p><p>Relevance is really a question of how much history you want to pay for right now&#8230;</p><h3><strong>Summarize and remove old checkpoints</strong></h3><p>In long workflows, the agent&#8217;s state keeps growing after every step&#8230;</p><p>Keeping every raw log and tool response makes the context large, expensive, and noisy. Instead, the agent can periodically create a short summary of completed work using an LLM.</p><p>The summary should keep only the important information:</p><ul><li><p>decisions already made</p></li><li><p>constraints that are now fixed</p></li><li><p>tasks that still need to be completed</p></li></ul><p>The agent should remove detailed logs that no longer affect future decisions.</p><p>And summaries should not stay forever either.</p><p>If an old summary is no longer useful for future tasks, the agent should archive it or delete it. Outdated summaries can confuse the agent just as outdated memory does.</p><h3><strong>Keep memory outside the prompt</strong></h3><p>The agent should not keep all its memory in the prompt at all times&#8230;</p><p>Instead, memory should remain in external storage and be retrieved only when needed. Adding too much memory directly into the prompt quickly fills the context window, increases cost and latency, and makes reasoning less focused.</p><p>Most stored memory is only occasionally useful and unnecessary for every step&#8230;</p><p>Before adding memory to the context, the agent should ask:</p><ul><li><p><em>Does this change what I should do next?</em></p></li><li><p><em>Is this information still valid?</em></p></li><li><p><em>Would removing it break the current plan?</em></p></li></ul><p>If the answer to all three questions is &#8220;<em>no&#8221;</em>, the memory should stay out of the context.</p><p>Good agents do NOT <em>succeed</em> by remembering everything&#8230;Instead, they retrieve only the information that matters right now.</p><h3><strong>Cost, latency, and reliability trade-offs</strong></h3><p>Every reliability improvement comes with a cost&#8230;</p><p>Retrieving more memory increases latency. Extra tool calls increase costs. Additional consistency checks slow the response down.</p><p>At scale, fast agents are usually cheaper but less reliable. More careful agents are safer but slower. You cannot optimize for speed, cost, and reliability simultaneously.</p><p>In practice, teams focus on accuracy where mistakes are expensive, such as bookings or payments. Also, they cache tool results when possible to avoid repeated API calls.</p><p>Plus, batching API requests reduces overhead. Teams also monitor context size and tune memory retrieval using real usage data instead of guesses.</p><p>Now, imagine the travel agent after handling hundreds of trips&#8230;</p><p>Its long-term memory now contains many user preferences, budgets, airlines, and booking patterns. If the agent retrieves too much memory for every request, the context becomes large and inefficient.</p><p>A scalable travel agent:</p><ul><li><p>retrieves only trip-related preferences</p></li><li><p>summarizes old planning steps</p></li><li><p>trusts live pricing APIs over stored prices</p></li><li><p>keeps prompts small and focused as history grows</p></li></ul><p>Without this discipline, the agent gradually becomes slower, more expensive, and less reliable&#8230;</p><p>Efficiently managing context is only part of scaling an agent system.</p><p>The other challenge is handling memory failures correctly&#8230;</p><div class="callout-block" data-callout="true"><h3><a href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system">100+ Claude Code hacks to ship code 10X faster (Partner)</a></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WUBw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!WUBw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!WUBw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!WUBw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WUBw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg" width="1200" height="600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:339767,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:&quot;https://codenewsletter.ai/subscribe?utm_source=nl_ad_system&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/190817974?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!WUBw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!WUBw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!WUBw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!WUBw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F141492b2-1333-46cd-98b1-dd7f5df0b6c7_1200x600.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Top engineers at Anthropic say AI now writes 100% of their code.</p><p>Are you using AI to write yours?</p><p>These 100+ Claude Code hacks show you exactly how. Sign up for The Code and get:</p><ul><li><p>100+ Claude Code hacks &#8212; free</p></li><li><p>The <a href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system">Code newsletter</a> &#8212; learn the latest AI tools and skills to code faster in 5 mins a day</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://codenewsletter.ai/subscribe?utm_source=nl_ad_system&quot;,&quot;text&quot;:&quot;Claim your free playbook&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system"><span>Claim your free playbook</span></a></p></div><div><hr></div><h2><strong>How memory goes wrong</strong></h2><p>Even if an agent manages context well, its memory system can still fail&#8230;</p><p>Memory makes agents more capable, but it also creates new ways for them to behave incorrectly. In real systems, three common memory failures occur repeatedly.</p><p>These failures happen at different stages of the memory pipeline:</p><ul><li><p>What gets stored</p></li><li><p>What stays over time</p></li><li><p>What gets retrieved later</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-Vul!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd91bda6d-d735-434b-be85-6372893b86ad_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-Vul!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd91bda6d-d735-434b-be85-6372893b86ad_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!-Vul!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd91bda6d-d735-434b-be85-6372893b86ad_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!-Vul!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd91bda6d-d735-434b-be85-6372893b86ad_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!-Vul!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd91bda6d-d735-434b-be85-6372893b86ad_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-Vul!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd91bda6d-d735-434b-be85-6372893b86ad_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d91bda6d-d735-434b-be85-6372893b86ad_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;How memory goes wrong: stale memory, wrong capture, retrieval miss&quot;,&quot;title&quot;:&quot;How memory goes wrong: stale memory, wrong capture, retrieval miss&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="How memory goes wrong: stale memory, wrong capture, retrieval miss" title="How memory goes wrong: stale memory, wrong capture, retrieval miss" srcset="https://substackcdn.com/image/fetch/$s_!-Vul!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd91bda6d-d735-434b-be85-6372893b86ad_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!-Vul!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd91bda6d-d735-434b-be85-6372893b86ad_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!-Vul!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd91bda6d-d735-434b-be85-6372893b86ad_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!-Vul!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd91bda6d-d735-434b-be85-6372893b86ad_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>1 Stale memory</strong></h3><p>The agent relies on outdated information&#8230;</p><p><em>Example:</em> The agent remembers that the user prefers window seats, but the user recently switched to aisle seats. Without an update, the agent keeps suggesting window seats.</p><p><em>Why it matters:</em> Old preferences can override newer user intent if the system does not update or expire memory correctly.</p><h3><strong>2 Wrong information captured</strong></h3><p>The agent stores incorrect information&#8230;</p><p><em>Example:</em> During a conversation, the agent mistakenly saves the user&#8217;s phone number as their airline loyalty number.</p><p><em>Why it matters:</em> Once incorrect data enters long-term memory, it can affect many future tasks and become difficult to trace back to the original mistake.</p><h3><strong>3 Retrieval failures</strong></h3><p>The correct memory exists, but the agent fails to retrieve it when needed&#8230;</p><p><em>Example:</em> The user almost always books business class, but the agent fails to retrieve that preference and instead recommends economy.</p><p><em>Real-world example:</em> In July 2025, a Replit coding agent deleted a live production database during an active code freeze, even after the user repeatedly typed &#8220;DON&#8217;T DO IT.&#8221;</p><p>The instruction existed in the agent&#8217;s context, but the system failed to apply it correctly. The agent later generated thousands of fake user records to hide the issue. Replit&#8217;s CEO publicly apologized and later introduced an automatic separation between development and production environments.</p><p><em>Why it matters:</em> Stored memory is useless if retrieval, ranking, or relevance systems fail to surface the right information at the right moment.</p><p>Storing memory is easy. But managing it correctly over time is the hard part.</p><p>Preventing these three failures is a major part of building reliable AI agents&#8230;</p><div><hr></div><h2><strong>Remember the right things, NOT everything</strong></h2><p>Here&#8217;s a checklist for designing stateful, reliable AI agents:</p><ol><li><p><strong>Separate state from memory.</strong> State is the current workflow. Memory is short- and long-term knowledge.</p></li><li><p><strong>Checkpoint and roll back.</strong> Roll back only the steps affected by a correction. Keep the rest.</p></li><li><p><strong>Keep memory outside the prompt.</strong> Retrieve only what&#8217;s needed. Apply short-term and long-term rules differently.</p></li><li><p><strong>Budget context carefully.</strong> Treat it as a working set. Summarize old checkpoints. Drop stale summaries.</p></li><li><p><strong>Let the system of record win.</strong> External systems override stored memory when they disagree.</p></li><li><p><strong>Balance cost, latency, and reliability.</strong> Spend precision where mistakes are expensive.</p></li><li><p><strong>Monitor in production.</strong> Track context usage, retrieval hits, tool calls, and response times.</p></li><li><p><strong>Add safeguards.</strong> Freshness rules, clear approval before writing to long-term memory, scope-aware updates.</p></li></ol><p>When the three pieces work together, agents stop reacting and start adapting&#8230;</p><p>They survive failure, stay honest with real-world data, and scale without their context budget falling apart. The trick isn&#8217;t more memory or a bigger model. It&#8217;s knowing what to track, what to recall, and when to act.</p><div><hr></div><p>&#128075; I&#8217;d like to thank <strong><a href="http://www.linkedin.com/in/sivasankar-natarajan">Sivasankar</a></strong> for writing this newsletter!</p><p>He is a Technical Director and GenAI practitioner with over 20+ years of experience in architecting and building Big Data, Cloud, and GenAI solutions.</p><div><hr></div><p>Louis and I launched the <strong>GENERATIVE AI MASTERCLASS</strong> (newsletter series exclusive to PAID subscribers).</p><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>Simple breakdown of real-world architectures</strong></p></li><li><p>Frameworks you can plug into your work or business</p></li><li><p><strong>Proven systems behind ChatGPT, Perplexity, and Copilot</strong></p></li></ul><p><strong>&#128073; <a href="https://newsletter.systemdesign.one/subscribe?yearly=true">CLICK HERE TO JOIN THE GENERATIVE AI MASTERCLASS</a></strong></p><p>(Golden members will get the next Generative AI newsletter in the first week of June.)</p><div><hr></div><p>If you find this newsletter valuable, share it with a friend, and subscribe if you haven&#8217;t already. There are <a href="http://newsletter.systemdesign.one/subscribe?group=true">group discounts</a>, <a href="http://newsletter.systemdesign.one/subscribe?gift=true">gift options</a>, and <a href="https://newsletter.systemdesign.one/leaderboard">referral rewards</a> available.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://www.linkedin.com/in/nk-systemdesign-one/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bEFk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 424w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 848w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1272w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bEFk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png" width="152" height="152" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:320,&quot;width&quot;:320,&quot;resizeWidth&quot;:152,&quot;bytes&quot;:74009,&quot;alt&quot;:&quot;Author Neo Kim; System design case studies&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.linkedin.com/in/nk-systemdesign-one/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Author Neo Kim; System design case studies" title="Author Neo Kim; System design case studies" srcset="https://substackcdn.com/image/fetch/$s_!bEFk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 424w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 848w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1272w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"><strong>&#128075; Find me on <a href="https://www.linkedin.com/in/nk-systemdesign-one/">LinkedIn</a> | <a href="https://x.com/intent/follow?screen_name=systemdesignone">Twitter</a> | <a href="https://www.threads.net/@systemdesignone">Threads</a> | <a href="https://www.instagram.com/systemdesignone/">Instagram</a></strong></figcaption></figure></div><div><hr></div><p><strong>Want to reach 210K+ tech professionals at scale? </strong>&#128240;</p><p>If your company wants to reach 210K+ tech professionals, <a href="https://newsletter.systemdesign.one/p/sponsorship">advertise with me</a>.</p><div><hr></div><p>Thank you for supporting this newsletter.</p><p>You are now 210,001+ readers strong, very close to 210k. Let&#8217;s try to get 211k readers by 29 May. Consider sharing this post with your friends and get rewards.</p><p>Y&#8217;all are the best.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6oWl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6oWl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6oWl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png" width="590" height="368.75" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2e739087-a910-4643-be36-997b6dd5b4af_800x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:800,&quot;resizeWidth&quot;:590,&quot;bytes&quot;:87878,&quot;alt&quot;:&quot;system design newsletter&quot;,&quot;title&quot;:&quot;system design newsletter&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/163380418?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="system design newsletter" title="system design newsletter" srcset="https://substackcdn.com/image/fetch/$s_!6oWl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/ai-agent-memory?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/ai-agent-memory?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><div><hr></div><h2><strong>References</strong></h2><ul><li><p>Anthropic Engineering. <em>Building effective agents.</em> December 2024. <a href="https://www.anthropic.com/research/building-effective-agents">https://www.anthropic.com/research/building-effective-agents</a></p></li><li><p>Anthropic Engineering. <em>Effective context engineering for AI agents.</em> <a href="https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents">https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents</a></p></li><li><p>Anthropic. <em>Tool use with Claude.</em> Claude API documentation. <a href="https://platform.claude.com/docs/en/agents-and-tools/tool-use/overview">https://platform.claude.com/docs/en/agents-and-tools/tool-use/overview</a></p></li><li><p>LangChain. <em>Memory overview.</em> LangGraph documentation. <a href="https://docs.langchain.com/oss/python/langgraph/memory">https://docs.langchain.com/oss/python/langgraph/memory</a></p></li><li><p>LangChain. <em>Persistence.</em> LangGraph documentation. <a href="https://docs.langchain.com/oss/python/langgraph/persistence">https://docs.langchain.com/oss/python/langgraph/persistence</a></p></li><li><p>Lewis, P., et al. <em>Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.</em> Facebook AI Research, 2020. <a href="https://arxiv.org/abs/2005.11401">https://arxiv.org/abs/2005.11401</a></p></li><li><p>Packer, C., et al. <em>MemGPT: Towards LLMs as Operating Systems.</em> UC Berkeley, 2023. <a href="https://arxiv.org/abs/2310.08560">https://arxiv.org/abs/2310.08560</a></p></li><li><p>Pinecone. <em>What is a Vector Database &amp; How Does it Work?</em> <a href="https://www.pinecone.io/learn/vector-database/">https://www.pinecone.io/learn/vector-database/</a></p></li><li><p>Chhikara, P., et al. <em>Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory.</em> 2025. <a href="https://arxiv.org/abs/2504.19413">https://arxiv.org/abs/2504.19413</a></p></li><li><p>Anthropic Engineering. <em>Advanced tool use.</em> <a href="https://www.anthropic.com/engineering/advanced-tool-use">https://www.anthropic.com/engineering/advanced-tool-use</a></p></li><li><p>Anthropic Engineering. <em>Building effective agents.</em> December 2024. <a href="https://www.anthropic.com/research/building-effective-agents">https://www.anthropic.com/research/building-effective-agents</a></p></li><li><p>LangChain. <em>Memory for agents.</em> LangChain Blog, 2024. <a href="https://www.langchain.com/blog/memory-for-agents">https://www.langchain.com/blog/memory-for-agents</a></p></li><li><p>LangChain. <em>Is LangGraph used in production?</em> LangChain Blog. <a href="https://www.langchain.com/blog/is-langgraph-used-in-production">https://www.langchain.com/blog/is-langgraph-used-in-production</a></p></li><li><p>AI Incident Database. <em>Incident 1152: Replit AI agent deleted production database.</em> <a href="https://incidentdatabase.ai/cite/1152/">https://incidentdatabase.ai/cite/1152/</a></p></li></ul><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p><strong>AI agent</strong> is a system that uses AI to understand goals, make decisions, and take actions to complete tasks.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p><strong>Orchestration framework.</strong> A library that manages how an agent&#8217;s steps run, how state is stored between them, and how failures are handled. LangGraph is one common example.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><ul><li><p><strong>LangGraph</strong> is a framework for building AI agents and multi-step workflows with language models. It helps manage state, memory, tool usage, and workflow execution across many steps.</p></li><li><p><strong>SqliteSaver</strong> is a built-in LangGraph component that saves agent state and checkpoints in a local SQLite database. It is useful for development or smaller applications.</p></li><li><p><strong>PostgresSaver</strong> is a built-in LangGraph component that saves agent state and checkpoints in a PostgreSQL database. It is better suited for production systems that need scalability and reliability.</p></li></ul></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p><strong>Checkpointing.</strong> Saving the agent&#8217;s state at safe points so a crash, timeout, or redeploy doesn&#8217;t lose progress. On restart, the agent reloads the last checkpoint and continues.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p><strong>LinkedIn&#8217;s SQL Bot</strong> is an internal AI assistant that lets employees ask questions in natural language and automatically converts them into SQL queries to retrieve data from LinkedIn&#8217;s data warehouse.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p><strong>Large Language Model (LLM).</strong> A neural network trained on large amounts of text to predict the next token in a sequence. In this newsletter, the LLM is the reasoning engine at the core of the agent.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p><strong>Vector database</strong> is a database designed to store and search embeddings - numerical representations of text, images, or other data created by AI models. Instead of matching exact keywords like a traditional database, a vector database finds information based on semantic similarity, meaning similarity in meaning.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p><strong>Mem0</strong> is an open-source memory system for AI agents and AI assistants. It helps agents remember important information across conversations instead of sending the entire chat history to the model every time.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-9" href="#footnote-anchor-9" class="footnote-number" contenteditable="false" target="_self">9</a><div class="footnote-content"><p><strong>LOCOMO.</strong> A public benchmark of long, multi-session conversations used to test how well a system can answer questions about things said much earlier in the dialog.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-10" href="#footnote-anchor-10" class="footnote-number" contenteditable="false" target="_self">10</a><div class="footnote-content"><p><strong>p95 latency.</strong> Time it takes to complete an operation in 95 out of 100 calls. It describes the slow tail of the system, not the average.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-11" href="#footnote-anchor-11" class="footnote-number" contenteditable="false" target="_self">11</a><div class="footnote-content"><p><strong>Full-context baseline.</strong> Putting the entire conversation history into the prompt on every call, with no picking or shortening. Used as a point of comparison for memory systems that try to be smarter about what they include.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-12" href="#footnote-anchor-12" class="footnote-number" contenteditable="false" target="_self">12</a><div class="footnote-content"><p><strong>LLM-as-judge evaluation.</strong> Using another language model to score the quality of answers, instead of a human rater.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-13" href="#footnote-anchor-13" class="footnote-number" contenteditable="false" target="_self">13</a><div class="footnote-content"><p><strong>RAG</strong>: retrieval-augmented generation, where an agent fetches relevant documents at query time.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-14" href="#footnote-anchor-14" class="footnote-number" contenteditable="false" target="_self">14</a><div class="footnote-content"><p><strong>System of record.</strong> The external system that holds the authoritative version of a piece of data: a booking system, a pricing API, an inventory database. In a conflict with stored memory, the system of record always wins.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-15" href="#footnote-anchor-15" class="footnote-number" contenteditable="false" target="_self">15</a><div class="footnote-content"><p><strong>Memory rot</strong> happens when the agent saves too many weak or temporary signals as permanent knowledge.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-16" href="#footnote-anchor-16" class="footnote-number" contenteditable="false" target="_self">16</a><div class="footnote-content"><p><strong>Context window.</strong> The maximum number of tokens an LLM can process in one request. Instructions, state, memory, tool schemas, tool outputs, and user messages all fight for this space.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-17" href="#footnote-anchor-17" class="footnote-number" contenteditable="false" target="_self">17</a><div class="footnote-content"><p><strong>Tool definition</strong> is a description of a tool that tells the AI agent:</p><ul><li><p>What the tool does</p></li><li><p>What inputs it accept</p></li><li><p>What output it return</p></li><li><p>When and how to use it</p></li></ul><p>You can think of it like an API contract or a function specification for the model.</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[How ML Systems Actually Work: From Data to Deployment]]></title><description><![CDATA[#146: From Embeddings to Feature Stores - 38 Concepts That Power Netflix, Uber, and Spotify]]></description><link>https://newsletter.systemdesign.one/p/machine-learning-system-design-interview</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/machine-learning-system-design-interview</guid><dc:creator><![CDATA[Paolo Perrone]]></dc:creator><pubDate>Tue, 12 May 2026 09:02:04 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/debd753d-7c69-4f28-a82d-c7eaedc64177_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/machine-learning-system-design-interview/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><div><hr></div><p>You&#8217;ve probably walked into a machine learning (ML) system design interview, been asked to <em>&#8220;design a recommendation system,&#8221;</em> and suddenly everyone&#8217;s throwing around terms like embeddings, feature stores, and model drift, and you&#8217;re nodding along while silently drowning.</p><p>Most prep resources offer algorithm deep-dives or LeetCode-style drills.</p><p>But first, what you actually need is a solid understanding of how ML systems work: <em>vocabulary, building blocks, and how they fit together.</em></p><p>This newsletter is a plain-English field guide to ML system design.</p><p>You don&#8217;t need to know TensorFlow, write Python, or understand gradient descent math. Just what these terms mean in practice, so you finally understand how these systems actually work.</p><p>Along the way, we&#8217;ll trace how real systems (like Netflix recommendations, Uber&#8217;s ETA predictions, and Spotify&#8217;s Discover Weekly) actually work under the hood. Don&#8217;t worry about memorizing every detail; this guide exists so your next ML interview feels less like an interrogation and more like a conversation.</p><p>Let&#8217;s start with the question every ML interview is really asking.</p><div><hr></div><h2><a href="https://agentfield.ai/github/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-060512&amp;utm_id=sys_design-060512-github-cta&amp;utm_content=github-cta">Treat coding agents like services, not terminals (Partner)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://agentfield.ai/github/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-060512&amp;utm_id=sys_design-060512-github-cta&amp;utm_content=github-cta" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!66XB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47682dab-1a18-45f2-b475-7ccb4dd5a3ac_2048x1374.webp 424w, https://substackcdn.com/image/fetch/$s_!66XB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47682dab-1a18-45f2-b475-7ccb4dd5a3ac_2048x1374.webp 848w, https://substackcdn.com/image/fetch/$s_!66XB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47682dab-1a18-45f2-b475-7ccb4dd5a3ac_2048x1374.webp 1272w, https://substackcdn.com/image/fetch/$s_!66XB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47682dab-1a18-45f2-b475-7ccb4dd5a3ac_2048x1374.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!66XB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47682dab-1a18-45f2-b475-7ccb4dd5a3ac_2048x1374.webp" width="1456" height="977" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/47682dab-1a18-45f2-b475-7ccb4dd5a3ac_2048x1374.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:977,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:462800,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:&quot;https://agentfield.ai/github/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-060512&amp;utm_id=sys_design-060512-github-cta&amp;utm_content=github-cta&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/179236490?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47682dab-1a18-45f2-b475-7ccb4dd5a3ac_2048x1374.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!66XB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47682dab-1a18-45f2-b475-7ccb4dd5a3ac_2048x1374.webp 424w, https://substackcdn.com/image/fetch/$s_!66XB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47682dab-1a18-45f2-b475-7ccb4dd5a3ac_2048x1374.webp 848w, https://substackcdn.com/image/fetch/$s_!66XB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47682dab-1a18-45f2-b475-7ccb4dd5a3ac_2048x1374.webp 1272w, https://substackcdn.com/image/fetch/$s_!66XB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47682dab-1a18-45f2-b475-7ccb4dd5a3ac_2048x1374.webp 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>You use Claude Code, Codex, Cursor, or Gemini in your terminal every day. What if you could call them in code?</p><p><a href="https://agentfield.ai/github/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-060512&amp;utm_id=sys_design-060512-github-cta&amp;utm_content=github-cta">AgentField</a> just open-sourced <strong>harness orchestration</strong>.</p><ul><li><p>Call those same coding agents programmatically from Python, TypeScript, or Go.</p></li><li><p>Compose them into real automations: ship a PR, audit a cloud, run a code review - end-to-end pipelines, not toy demos.</p></li></ul><p>For the architecture rationale, their latest post - <em><a href="https://agentfield.ai/blog/harness-as-membrane/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-060512&amp;utm_id=sys_design-060512-blog-h-as-m&amp;utm_content=blog-h-as-m">harness-as-membrane</a></em> - is worth a read.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agentfield.ai/github/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-060512&amp;utm_id=sys_design-060512-github-cta&amp;utm_content=github-cta&quot;,&quot;text&quot;:&quot;Fork and build&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agentfield.ai/github/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-060512&amp;utm_id=sys_design-060512-github-cta&amp;utm_content=github-cta"><span>Fork and build</span></a></p><p>(Thanks to <a href="https://agentfield.ai/github/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-060512&amp;utm_id=sys_design-060512-github-cta&amp;utm_content=github-cta">AgentField</a> for partnering on this post.)</p><div><hr></div><p>I want to introduce <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Paolo Perrone&quot;,&quot;id&quot;:12567301,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d679f269-74f8-4ceb-971d-e06664411a38_742x465.png&quot;,&quot;uuid&quot;:&quot;abb2b322-4c05-4fa7-a521-286fa4edc445&quot;}" data-component-name="MentionToDOM"></span> as a guest author.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://theaiengineer.substack.com/welcome" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pss2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd058b6c7-0ac0-4749-8f5d-1eee2de48ede_1130x512.png 424w, https://substackcdn.com/image/fetch/$s_!pss2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd058b6c7-0ac0-4749-8f5d-1eee2de48ede_1130x512.png 848w, https://substackcdn.com/image/fetch/$s_!pss2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd058b6c7-0ac0-4749-8f5d-1eee2de48ede_1130x512.png 1272w, https://substackcdn.com/image/fetch/$s_!pss2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd058b6c7-0ac0-4749-8f5d-1eee2de48ede_1130x512.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pss2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd058b6c7-0ac0-4749-8f5d-1eee2de48ede_1130x512.png" width="1130" height="512" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d058b6c7-0ac0-4749-8f5d-1eee2de48ede_1130x512.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:512,&quot;width&quot;:1130,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://theaiengineer.substack.com/welcome&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pss2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd058b6c7-0ac0-4749-8f5d-1eee2de48ede_1130x512.png 424w, https://substackcdn.com/image/fetch/$s_!pss2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd058b6c7-0ac0-4749-8f5d-1eee2de48ede_1130x512.png 848w, https://substackcdn.com/image/fetch/$s_!pss2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd058b6c7-0ac0-4749-8f5d-1eee2de48ede_1130x512.png 1272w, https://substackcdn.com/image/fetch/$s_!pss2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd058b6c7-0ac0-4749-8f5d-1eee2de48ede_1130x512.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>He&#8217;s an ML engineer with 8+ years of production AI experience, read by 1M+ AI/ML engineers on LinkedIn and Substack. He runs a tech content agency serving NVIDIA, Google, and Y Combinator, and writes <strong><a href="https://theaiengineer.substack.com/welcome">The AI Engineer</a></strong>, the newsletter that makes engineers dangerously good at AI engineering.</p><div><hr></div><h2><strong>What is an ML System?</strong></h2><p>Traditional software runs on rules you write by hand.</p><p>If the temperature is above 90&#176;F, turn on the AC. If the user clicks &#8220;buy,&#8221; charge the card. Every decision is hardcoded in code.</p><p>ML systems work differently.</p><p>Instead of writing rules, you feed the system examples (thousands or millions of them), and it figures out the rules on its own. Show it a million past rides, and it learns to predict how long your Uber will take. Show it billions of watch sessions, and it learns which movie to recommend next.</p><p>This sounds magical, but it&#8217;s just data engineering. Data flows in, patterns get extracted, predictions flow out, and the system keeps learning from its own results.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ewAN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eb3f62-4db9-4e39-be8f-70728cc4ffc0_4452x585.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ewAN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eb3f62-4db9-4e39-be8f-70728cc4ffc0_4452x585.png 424w, https://substackcdn.com/image/fetch/$s_!ewAN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eb3f62-4db9-4e39-be8f-70728cc4ffc0_4452x585.png 848w, https://substackcdn.com/image/fetch/$s_!ewAN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eb3f62-4db9-4e39-be8f-70728cc4ffc0_4452x585.png 1272w, https://substackcdn.com/image/fetch/$s_!ewAN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eb3f62-4db9-4e39-be8f-70728cc4ffc0_4452x585.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ewAN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eb3f62-4db9-4e39-be8f-70728cc4ffc0_4452x585.png" width="1456" height="191" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b2eb3f62-4db9-4e39-be8f-70728cc4ffc0_4452x585.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:191,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:138975,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/195463236?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eb3f62-4db9-4e39-be8f-70728cc4ffc0_4452x585.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ewAN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eb3f62-4db9-4e39-be8f-70728cc4ffc0_4452x585.png 424w, https://substackcdn.com/image/fetch/$s_!ewAN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eb3f62-4db9-4e39-be8f-70728cc4ffc0_4452x585.png 848w, https://substackcdn.com/image/fetch/$s_!ewAN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eb3f62-4db9-4e39-be8f-70728cc4ffc0_4452x585.png 1272w, https://substackcdn.com/image/fetch/$s_!ewAN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eb3f62-4db9-4e39-be8f-70728cc4ffc0_4452x585.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Every concept in this newsletter maps to a stage in this pipeline.</p><p>Master the pipeline, master the interview&#8230;</p><div><hr></div><h2><strong>Raw Material: How Data Becomes Intelligence</strong></h2><blockquote><p><em>You can have the fanciest model in the world. If your data is garbage, your predictions will be garbage.</em></p></blockquote><h3><strong>1. Features</strong></h3><p>A <strong>feature</strong> is a single, measurable piece of information model uses to make predictions.</p><p>Think of features as the columns in a spreadsheet that describe each example.</p><p>If you&#8217;re predicting whether someone will click on an ad, your features might include the user&#8217;s age, the time of day, the device they&#8217;re using, and how many ads they&#8217;ve already seen this session. Each of these gives the model a different angle on the same question.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CBM7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F822f0b1f-d688-4fb0-8ba0-bec0204098c5_2368x2205.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CBM7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F822f0b1f-d688-4fb0-8ba0-bec0204098c5_2368x2205.png 424w, https://substackcdn.com/image/fetch/$s_!CBM7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F822f0b1f-d688-4fb0-8ba0-bec0204098c5_2368x2205.png 848w, https://substackcdn.com/image/fetch/$s_!CBM7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F822f0b1f-d688-4fb0-8ba0-bec0204098c5_2368x2205.png 1272w, https://substackcdn.com/image/fetch/$s_!CBM7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F822f0b1f-d688-4fb0-8ba0-bec0204098c5_2368x2205.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CBM7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F822f0b1f-d688-4fb0-8ba0-bec0204098c5_2368x2205.png" width="420" height="391.15384615384613" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/822f0b1f-d688-4fb0-8ba0-bec0204098c5_2368x2205.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1356,&quot;width&quot;:1456,&quot;resizeWidth&quot;:420,&quot;bytes&quot;:238196,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/195463236?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F822f0b1f-d688-4fb0-8ba0-bec0204098c5_2368x2205.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CBM7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F822f0b1f-d688-4fb0-8ba0-bec0204098c5_2368x2205.png 424w, https://substackcdn.com/image/fetch/$s_!CBM7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F822f0b1f-d688-4fb0-8ba0-bec0204098c5_2368x2205.png 848w, https://substackcdn.com/image/fetch/$s_!CBM7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F822f0b1f-d688-4fb0-8ba0-bec0204098c5_2368x2205.png 1272w, https://substackcdn.com/image/fetch/$s_!CBM7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F822f0b1f-d688-4fb0-8ba0-bec0204098c5_2368x2205.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In Spotify&#8217;s Discover Weekly, features might include your most-played genres, the time of day you usually listen, and how often you skip songs. The model doesn&#8217;t &#8220;hear&#8221; the music. Instead, it reads the features.</p><p>The quality of your features often matters more than the sophistication of your model. Which raises the question:<em> where do good features come from?</em></p><h3><strong>2. Feature Engineering</strong></h3><p><strong>Feature engineering</strong> is the process of transforming raw data into features that help a model learn better.</p><p>Raw data is messy and unstructured. Feature engineering is about cleaning it up, reshaping it, and extracting the signal.</p><p>Think of it like preparing ingredients before cooking. You don&#8217;t throw a whole chicken into the pan. You cut it, season it, and lay it out first.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cAy4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f0855fc-6cbc-4f3f-a9e5-f7c73cf5ac43_4686x824.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cAy4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f0855fc-6cbc-4f3f-a9e5-f7c73cf5ac43_4686x824.png 424w, https://substackcdn.com/image/fetch/$s_!cAy4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f0855fc-6cbc-4f3f-a9e5-f7c73cf5ac43_4686x824.png 848w, https://substackcdn.com/image/fetch/$s_!cAy4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f0855fc-6cbc-4f3f-a9e5-f7c73cf5ac43_4686x824.png 1272w, https://substackcdn.com/image/fetch/$s_!cAy4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f0855fc-6cbc-4f3f-a9e5-f7c73cf5ac43_4686x824.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cAy4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f0855fc-6cbc-4f3f-a9e5-f7c73cf5ac43_4686x824.png" width="1456" height="256" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f0855fc-6cbc-4f3f-a9e5-f7c73cf5ac43_4686x824.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:256,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:176350,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/195463236?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f0855fc-6cbc-4f3f-a9e5-f7c73cf5ac43_4686x824.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cAy4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f0855fc-6cbc-4f3f-a9e5-f7c73cf5ac43_4686x824.png 424w, https://substackcdn.com/image/fetch/$s_!cAy4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f0855fc-6cbc-4f3f-a9e5-f7c73cf5ac43_4686x824.png 848w, https://substackcdn.com/image/fetch/$s_!cAy4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f0855fc-6cbc-4f3f-a9e5-f7c73cf5ac43_4686x824.png 1272w, https://substackcdn.com/image/fetch/$s_!cAy4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f0855fc-6cbc-4f3f-a9e5-f7c73cf5ac43_4686x824.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>For example, Uber&#8217;s raw data might say <code>&#8220;ride requested at 2024-12-25 08:47:32 UTC.&#8221;</code></p><p>Feature engineering transforms that into: day of week (Wednesday), time of day (morning rush), holiday (yes), weather (rainy). Each transformation gives the model a more useful signal than the raw timestamp alone.</p><p>Sometimes the most powerful features aren&#8217;t in the raw data at all. A feature like <em>&#8220;average number of rides this user takes per week&#8221;</em> requires aggregation across many records. That kind of creativity is what separates good feature engineering from great ones.</p><p>But features alone aren&#8217;t enough. The model also needs to know what the &#8220;right answer&#8221; looks like.</p><h3><strong>3. Labels &amp; Ground Truth</strong></h3><p>A <strong>label</strong> is the answer you&#8217;re training the model to predict. <strong>Ground truth</strong> is what actually happened in the real world.</p><p>If you&#8217;re building a spam filter, the label for each email is &#8220;spam&#8221; or &#8220;not spam.&#8221; If you&#8217;re predicting home prices, the label is the actual sale price. The model learns by comparing its guesses against these known answers.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rz1R!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89296f27-68e3-4dce-9026-8740b013e686_5039x1530.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rz1R!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89296f27-68e3-4dce-9026-8740b013e686_5039x1530.png 424w, https://substackcdn.com/image/fetch/$s_!rz1R!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89296f27-68e3-4dce-9026-8740b013e686_5039x1530.png 848w, https://substackcdn.com/image/fetch/$s_!rz1R!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89296f27-68e3-4dce-9026-8740b013e686_5039x1530.png 1272w, https://substackcdn.com/image/fetch/$s_!rz1R!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89296f27-68e3-4dce-9026-8740b013e686_5039x1530.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rz1R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89296f27-68e3-4dce-9026-8740b013e686_5039x1530.png" width="1456" height="442" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/89296f27-68e3-4dce-9026-8740b013e686_5039x1530.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:442,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:269939,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/195463236?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89296f27-68e3-4dce-9026-8740b013e686_5039x1530.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rz1R!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89296f27-68e3-4dce-9026-8740b013e686_5039x1530.png 424w, https://substackcdn.com/image/fetch/$s_!rz1R!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89296f27-68e3-4dce-9026-8740b013e686_5039x1530.png 848w, https://substackcdn.com/image/fetch/$s_!rz1R!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89296f27-68e3-4dce-9026-8740b013e686_5039x1530.png 1272w, https://substackcdn.com/image/fetch/$s_!rz1R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89296f27-68e3-4dce-9026-8740b013e686_5039x1530.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Ground truth sounds simple, but it&#8217;s often surprisingly hard to get right.</p><p>When Netflix asks <em>&#8220;did the user enjoy this movie?&#8221;,</em> the ground truth is ambiguous. Did they watch to the end? Did they rate it? Did they watch something similar next? Different definitions of ground truth lead to different model behaviors.</p><blockquote><p>&#128161; How you define your label shapes what the model actually learns. Choose carefully. Pick the wrong definition of success, and the model will optimize for the wrong thing.</p></blockquote><p>With features and labels defined, the next step is splitting the data so the model can actually learn and prove it learned the right things.</p><h3><strong>4. Training, Validation &amp; Test Sets</strong></h3><p>You don&#8217;t test a student with the same questions they practiced on.</p><p>The same principle applies to ML. The data gets split into three groups, each with a specific job.</p><ol><li><p><strong>Training set</strong> is what the model actually learns from. It sees these examples during training.</p></li><li><p><strong>Validation set</strong> is your mid-training checkpoint. It tells you which model version performs best.</p></li><li><p><strong>Test set</strong> is the final exam. If the model performs well here, it&#8217;s ready for the real world.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bgKu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555a0173-02bf-4359-b0cd-8bf5b78cad8e_2573x1570.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bgKu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555a0173-02bf-4359-b0cd-8bf5b78cad8e_2573x1570.png 424w, https://substackcdn.com/image/fetch/$s_!bgKu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555a0173-02bf-4359-b0cd-8bf5b78cad8e_2573x1570.png 848w, https://substackcdn.com/image/fetch/$s_!bgKu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555a0173-02bf-4359-b0cd-8bf5b78cad8e_2573x1570.png 1272w, https://substackcdn.com/image/fetch/$s_!bgKu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555a0173-02bf-4359-b0cd-8bf5b78cad8e_2573x1570.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bgKu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555a0173-02bf-4359-b0cd-8bf5b78cad8e_2573x1570.png" width="521" height="317.75274725274727" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/555a0173-02bf-4359-b0cd-8bf5b78cad8e_2573x1570.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:888,&quot;width&quot;:1456,&quot;resizeWidth&quot;:521,&quot;bytes&quot;:186263,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/195463236?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555a0173-02bf-4359-b0cd-8bf5b78cad8e_2573x1570.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bgKu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555a0173-02bf-4359-b0cd-8bf5b78cad8e_2573x1570.png 424w, https://substackcdn.com/image/fetch/$s_!bgKu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555a0173-02bf-4359-b0cd-8bf5b78cad8e_2573x1570.png 848w, https://substackcdn.com/image/fetch/$s_!bgKu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555a0173-02bf-4359-b0cd-8bf5b78cad8e_2573x1570.png 1272w, https://substackcdn.com/image/fetch/$s_!bgKu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F555a0173-02bf-4359-b0cd-8bf5b78cad8e_2573x1570.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Most teams split their data 70/15/15 or 80/10/10.</p><p>The exact ratio depends on how much data you have, but one rule is non-negotiable: test set must never influence model development. Otherwise, you&#8217;ve let the student peek at the final exam.</p><p>At Netflix, this might mean training on viewing data from January through October, validating in November, and testing in December. This time-based split is especially important for ML systems because real-world data changes over time, a pattern we&#8217;ll revisit in Section 5.</p><p>Now that we have our data organized, there&#8217;s one more data problem that trips up many real-world systems.</p><h3><strong>5. Class Imbalance &amp; Data Leakage</strong></h3><p><strong>Class imbalance</strong> happens when one outcome is far more common than the others.</p><p>In fraud detection, maybe 1 in 10,000 transactions is fraudulent. In medical diagnosis, most patients don&#8217;t have the rare disease. The model can achieve 99.99% accuracy by simply predicting &#8220;not fraud&#8221; every time and still be completely useless.</p><p>Fixing class imbalance usually involves techniques like oversampling the rare class, undersampling the common class, or adjusting the model&#8217;s loss function to penalize mistakes on rare examples more heavily.</p><p><strong>Data leakage</strong> occurs when the answer is hidden within the training data.</p><p>You&#8217;re building a model to predict late deliveries, but your dataset includes &#8220;customer complaint filed.&#8221; That complaint only exists because the delivery was already late. Of course, the model looks accurate. Remove that feature, and it&#8217;s useless.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!s70K!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaef5547-0cc8-4aac-8b74-e3897aeedadf_900x400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s70K!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaef5547-0cc8-4aac-8b74-e3897aeedadf_900x400.png 424w, https://substackcdn.com/image/fetch/$s_!s70K!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaef5547-0cc8-4aac-8b74-e3897aeedadf_900x400.png 848w, https://substackcdn.com/image/fetch/$s_!s70K!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaef5547-0cc8-4aac-8b74-e3897aeedadf_900x400.png 1272w, https://substackcdn.com/image/fetch/$s_!s70K!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaef5547-0cc8-4aac-8b74-e3897aeedadf_900x400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!s70K!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaef5547-0cc8-4aac-8b74-e3897aeedadf_900x400.png" width="900" height="400" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/daef5547-0cc8-4aac-8b74-e3897aeedadf_900x400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:900,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19106,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/195463236?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaef5547-0cc8-4aac-8b74-e3897aeedadf_900x400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!s70K!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaef5547-0cc8-4aac-8b74-e3897aeedadf_900x400.png 424w, https://substackcdn.com/image/fetch/$s_!s70K!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaef5547-0cc8-4aac-8b74-e3897aeedadf_900x400.png 848w, https://substackcdn.com/image/fetch/$s_!s70K!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaef5547-0cc8-4aac-8b74-e3897aeedadf_900x400.png 1272w, https://substackcdn.com/image/fetch/$s_!s70K!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaef5547-0cc8-4aac-8b74-e3897aeedadf_900x400.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p>&#128161; If your model scores above 95% on the first try, check for leakage. If it scores above 99%, check for class imbalance too.</p></blockquote><p>Clean data is only useful if it actually reaches your model. That&#8217;s where pipelines come in.</p><h3><strong>6. Data Pipelines &amp; ETL</strong></h3><p><strong>Data pipelines</strong> are automated workflows that move data from its raw sources (databases, event logs, third-party APIs) through a series of transformations and into a format the model can use.</p><p><strong>ETL</strong> stands for Extract, Transform, Load. Extract the raw data from its source.</p><p>Transform: clean, join, aggregate, and engineer features from it. Load the results into a feature store or a training dataset.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SV45!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4fc3e29-9d95-4c17-a252-ca24be2f50ed_4446x2080.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SV45!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4fc3e29-9d95-4c17-a252-ca24be2f50ed_4446x2080.png 424w, https://substackcdn.com/image/fetch/$s_!SV45!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4fc3e29-9d95-4c17-a252-ca24be2f50ed_4446x2080.png 848w, https://substackcdn.com/image/fetch/$s_!SV45!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4fc3e29-9d95-4c17-a252-ca24be2f50ed_4446x2080.png 1272w, https://substackcdn.com/image/fetch/$s_!SV45!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4fc3e29-9d95-4c17-a252-ca24be2f50ed_4446x2080.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SV45!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4fc3e29-9d95-4c17-a252-ca24be2f50ed_4446x2080.png" width="1456" height="681" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e4fc3e29-9d95-4c17-a252-ca24be2f50ed_4446x2080.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:681,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:341799,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/195463236?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4fc3e29-9d95-4c17-a252-ca24be2f50ed_4446x2080.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SV45!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4fc3e29-9d95-4c17-a252-ca24be2f50ed_4446x2080.png 424w, https://substackcdn.com/image/fetch/$s_!SV45!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4fc3e29-9d95-4c17-a252-ca24be2f50ed_4446x2080.png 848w, https://substackcdn.com/image/fetch/$s_!SV45!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4fc3e29-9d95-4c17-a252-ca24be2f50ed_4446x2080.png 1272w, https://substackcdn.com/image/fetch/$s_!SV45!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4fc3e29-9d95-4c17-a252-ca24be2f50ed_4446x2080.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is the plumbing part of ML.</p><p>And like real plumbing, nobody thinks about it until something breaks. Not the model, not the algorithm. A dropped row, a null value, a late pipeline run. No errors thrown, just silently worse predictions.</p><p>Uber runs thousands of data pipelines daily. Everyone has monitoring, alerting, and quality checks. Bring this up in an interview, and you&#8217;ll immediately stand out as the guy who actually thinks about how ML works in practice.</p><p>Pipelines produce data, but how do you keep track of which data a given model was trained on?</p><h3><strong>7. Data Versioning</strong></h3><p><strong>Data versioning</strong> means tracking exactly which version of a dataset was used to train each model.</p><p>Just like software engineers use Git to track code changes, ML teams need to track data changes.</p><p><em>Why does this matter?</em> Imagine your model&#8217;s performance suddenly drops. Without data versioning, you have no way to answer: <em>&#8220;What changed? Was it the model or the data?&#8221; </em>With versioning, you can compare the current training data with previous versions and pinpoint exactly what changed.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qFcN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6eb35d9-1632-4c90-9271-4a7dff5d0819_5032x1090.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qFcN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6eb35d9-1632-4c90-9271-4a7dff5d0819_5032x1090.png 424w, https://substackcdn.com/image/fetch/$s_!qFcN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6eb35d9-1632-4c90-9271-4a7dff5d0819_5032x1090.png 848w, https://substackcdn.com/image/fetch/$s_!qFcN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6eb35d9-1632-4c90-9271-4a7dff5d0819_5032x1090.png 1272w, https://substackcdn.com/image/fetch/$s_!qFcN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6eb35d9-1632-4c90-9271-4a7dff5d0819_5032x1090.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qFcN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6eb35d9-1632-4c90-9271-4a7dff5d0819_5032x1090.png" width="1456" height="315" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d6eb35d9-1632-4c90-9271-4a7dff5d0819_5032x1090.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:315,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:324591,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/195463236?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6eb35d9-1632-4c90-9271-4a7dff5d0819_5032x1090.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qFcN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6eb35d9-1632-4c90-9271-4a7dff5d0819_5032x1090.png 424w, https://substackcdn.com/image/fetch/$s_!qFcN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6eb35d9-1632-4c90-9271-4a7dff5d0819_5032x1090.png 848w, https://substackcdn.com/image/fetch/$s_!qFcN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6eb35d9-1632-4c90-9271-4a7dff5d0819_5032x1090.png 1272w, https://substackcdn.com/image/fetch/$s_!qFcN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6eb35d9-1632-4c90-9271-4a7dff5d0819_5032x1090.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Data versioning also enables <strong>reproducibility</strong>, a requirement in regulated industries like finance and healthcare.</p><p>If an auditor asks, &#8220;Why did the model make this decision six months ago?&#8221;, you need to be able to recreate the exact model and training data from that point in time.</p><p>Tools like <strong>DVC</strong> (Data Version Control) and MLflow help teams manage this.</p><p>In an interview, mentioning data versioning shows you understand that ML systems aren&#8217;t just about models. They&#8217;re about the entire lifecycle of data and code working together.</p><p>With clean, well-structured, and tracked data in hand, it&#8217;s time to teach the model how to find patterns in it.</p><div><hr></div><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/machine-learning-system-design-interview?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption"><em>Share this post &amp; get rewards for the referrals.</em></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/machine-learning-system-design-interview?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/machine-learning-system-design-interview?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h2><strong>Pattern Machines: How Models Actually Learn</strong></h2><blockquote><p>A model doesn&#8217;t &#8220;know&#8221; anything. It finds patterns in data and uses them to make predictions.</p></blockquote><h3><strong>8. Model Training &amp; Loss Functions</strong></h3><p><strong>Model training</strong> is the process of showing the model examples and letting it adjust its internal settings until its predictions get closer to the ground truth.</p><p>Think of it as a student taking practice tests over and over, getting a little better each time.</p><p>The mechanism that drives improvement is the <strong>loss function</strong>. A loss function measures how wrong the model&#8217;s predictions are. After each batch of examples, the model calculates the loss (how far off it was) and adjusts its internal settings to reduce that number.</p><p>Over millions of examples, those small adjustments add up.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zsR_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4216f281-5cca-4a0b-a9a7-25992d4b09fc_5031x637.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zsR_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4216f281-5cca-4a0b-a9a7-25992d4b09fc_5031x637.png 424w, https://substackcdn.com/image/fetch/$s_!zsR_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4216f281-5cca-4a0b-a9a7-25992d4b09fc_5031x637.png 848w, https://substackcdn.com/image/fetch/$s_!zsR_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4216f281-5cca-4a0b-a9a7-25992d4b09fc_5031x637.png 1272w, https://substackcdn.com/image/fetch/$s_!zsR_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4216f281-5cca-4a0b-a9a7-25992d4b09fc_5031x637.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zsR_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4216f281-5cca-4a0b-a9a7-25992d4b09fc_5031x637.png" width="1456" height="184" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4216f281-5cca-4a0b-a9a7-25992d4b09fc_5031x637.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:184,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:167743,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/195463236?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4216f281-5cca-4a0b-a9a7-25992d4b09fc_5031x637.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zsR_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4216f281-5cca-4a0b-a9a7-25992d4b09fc_5031x637.png 424w, https://substackcdn.com/image/fetch/$s_!zsR_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4216f281-5cca-4a0b-a9a7-25992d4b09fc_5031x637.png 848w, https://substackcdn.com/image/fetch/$s_!zsR_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4216f281-5cca-4a0b-a9a7-25992d4b09fc_5031x637.png 1272w, https://substackcdn.com/image/fetch/$s_!zsR_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4216f281-5cca-4a0b-a9a7-25992d4b09fc_5031x637.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Different problems need different loss functions.</p><p>Predicting a number (like Uber&#8217;s ETA) might use mean squared error, penalizing big misses more than small ones. Classifying something (like spam vs. not spam) typically uses cross-entropy loss, which measures how confident the model was in the wrong answer.</p><p>The choice of loss function shapes what the model optimizes for. A model trained with mean squared error will try to be accurate on average. A model trained with mean absolute error will be more robust to outliers. Small decisions here have big downstream effects.</p><p>But what exactly is the model adjusting? That&#8217;s where parameters come in.</p><h3><strong>9. Parameters &amp; Hyperparameters</strong></h3><p><strong>Parameters</strong> are what the model actually learns.</p><p>Think of them as the model&#8217;s memory: every pattern gets encoded as a number. GPT-4 has hundreds of billions. A simple recommendation model, a few million. You don&#8217;t write them. The model discovers them on its own.</p><p><strong>Hyperparameters</strong> are the settings that control how the model trains.</p><p>You set them before training starts: how fast the model adjusts (learning rate), how many examples it processes at once (batch size), how many times it loops through the data (epochs), and the model&#8217;s architecture.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sg4t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59b4f090-5343-4484-8587-ff1e0c52fecd_900x400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sg4t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59b4f090-5343-4484-8587-ff1e0c52fecd_900x400.png 424w, https://substackcdn.com/image/fetch/$s_!sg4t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59b4f090-5343-4484-8587-ff1e0c52fecd_900x400.png 848w, https://substackcdn.com/image/fetch/$s_!sg4t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59b4f090-5343-4484-8587-ff1e0c52fecd_900x400.png 1272w, https://substackcdn.com/image/fetch/$s_!sg4t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59b4f090-5343-4484-8587-ff1e0c52fecd_900x400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sg4t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59b4f090-5343-4484-8587-ff1e0c52fecd_900x400.png" width="900" height="400" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/59b4f090-5343-4484-8587-ff1e0c52fecd_900x400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:900,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24095,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/195463236?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59b4f090-5343-4484-8587-ff1e0c52fecd_900x400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sg4t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59b4f090-5343-4484-8587-ff1e0c52fecd_900x400.png 424w, https://substackcdn.com/image/fetch/$s_!sg4t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59b4f090-5343-4484-8587-ff1e0c52fecd_900x400.png 848w, https://substackcdn.com/image/fetch/$s_!sg4t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59b4f090-5343-4484-8587-ff1e0c52fecd_900x400.png 1272w, https://substackcdn.com/image/fetch/$s_!sg4t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59b4f090-5343-4484-8587-ff1e0c52fecd_900x400.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If the model is a student, parameters are the knowledge in their head, and hyperparameters are the study plan: <em>how many hours a day, which subjects first, and how many practice tests.</em></p><p>Bad hyperparameters can make a good model fail.</p><p>A learning rate that&#8217;s too high, and the model never converges. Too low, and it takes forever to train. Too few epochs and it underfits. Too many and it overfits. In practice, teams run many training experiments with different hyperparameters and compare results on the validation set.</p><p>There&#8217;s one type of parameter that powers almost every ML system you&#8217;ll encounter in interviews: embeddings.</p><h3><strong>10. Embeddings</strong></h3><p>An <strong>embedding</strong> is a way to represent something (a word, a user, a product) as a list of numbers that captures its meaning.</p><p>Instead of treating &#8220;cat&#8221; and &#8220;kitten&#8221; as completely different labels, embeddings place them close together in a vectorial space because they&#8217;re related concepts.</p><p>Think of it as a map. On a regular map, cities that are near each other geographically are plotted close together. In an embedding space, concepts that are similar in meaning are plotted close together. &#8220;Comedy&#8221; and &#8220;rom-com&#8221; would be neighbors; &#8220;horror&#8221; would be distant, but still in the same galaxy as other genres. &#8220;Accounting&#8221; would be in another universe.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CUND!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb95cf045-9819-4032-b782-816ea117eca9_700x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CUND!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb95cf045-9819-4032-b782-816ea117eca9_700x500.png 424w, https://substackcdn.com/image/fetch/$s_!CUND!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb95cf045-9819-4032-b782-816ea117eca9_700x500.png 848w, https://substackcdn.com/image/fetch/$s_!CUND!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb95cf045-9819-4032-b782-816ea117eca9_700x500.png 1272w, https://substackcdn.com/image/fetch/$s_!CUND!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb95cf045-9819-4032-b782-816ea117eca9_700x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CUND!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb95cf045-9819-4032-b782-816ea117eca9_700x500.png" width="700" height="500" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b95cf045-9819-4032-b782-816ea117eca9_700x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:700,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24501,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/195463236?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb95cf045-9819-4032-b782-816ea117eca9_700x500.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CUND!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb95cf045-9819-4032-b782-816ea117eca9_700x500.png 424w, https://substackcdn.com/image/fetch/$s_!CUND!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb95cf045-9819-4032-b782-816ea117eca9_700x500.png 848w, https://substackcdn.com/image/fetch/$s_!CUND!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb95cf045-9819-4032-b782-816ea117eca9_700x500.png 1272w, https://substackcdn.com/image/fetch/$s_!CUND!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb95cf045-9819-4032-b782-816ea117eca9_700x500.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Embeddings are everywhere in real ML systems.</p><p>Spotify uses song embeddings to find tracks that are musically similar to what you&#8217;ve been listening to. Netflix uses movie embeddings and user embeddings to match viewers with films they&#8217;re likely to enjoy.</p><p>Embeddings are how ML systems make sense of the real world. A user, a product, a sentence. Turn them into vectors, and suddenly you can calculate which ones are similar, which ones are related, and which ones to recommend.</p><p>With all this learning machinery, there&#8217;s a critical danger: the model might learn too well.</p><div class="callout-block" data-callout="true"><p><em><strong>Reminder: this is a teaser of the subscriber-only newsletter, exclusive to my golden members.</strong></em></p><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>High-level architecture of real-world systems.</strong></p></li><li><p>Deep dive into how popular real-world systems work.</p></li><li><p><strong>How real-world systems handle scale, reliability, and performance.</strong></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe&quot;,&quot;text&quot;:&quot;Unlock Full Access&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://newsletter.systemdesign.one/subscribe"><span>Unlock Full Access</span></a></p><p>Let&#8217;s keep going!</p></div><h3><strong>11. Overfitting &amp; Regularization</strong></h3>
      <p>
          <a href="https://newsletter.systemdesign.one/p/machine-learning-system-design-interview">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[I struggled with mobile system design until I learned these 53 concepts]]></title><description><![CDATA[#145: Part 2 - Conflict resolution, certificate pinning, rendering performance, and 17 others.]]></description><link>https://newsletter.systemdesign.one/p/mobile-system-design-concepts</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/mobile-system-design-concepts</guid><dc:creator><![CDATA[Shefali Jangid]]></dc:creator><pubDate>Thu, 07 May 2026 09:22:29 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a52a5973-7a78-4ca1-81bf-1e1c1d55883a_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/mobile-system-design-concepts/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li><li><p><em>Block diagrams created using <a href="https://app.eraser.io/auth/sign-up?ref=neo">Eraser</a>.</em></p></li></ul><div><hr></div><p>Onwards &#8216;n downwards:</p><p>Following is the second of a premium 3-part newsletter series&#8230;</p><p>If you design backend systems and want to avoid building APIs that break under flaky networks, retries, and real-world usage patterns&#8230; read this newsletter and start designing for production apps&#8230;</p><p>On with part 2 of the newsletter:</p><div><hr></div><h2><strong><a href="http://a0.to/1kh">Less prototype. More production. &#128640; (Partner)</a></strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="http://a0.to/1kh" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5MG0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae9d50a-7a0d-4bd0-b5e8-3d70a9271254_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!5MG0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae9d50a-7a0d-4bd0-b5e8-3d70a9271254_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!5MG0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae9d50a-7a0d-4bd0-b5e8-3d70a9271254_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!5MG0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae9d50a-7a0d-4bd0-b5e8-3d70a9271254_1920x1080.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5MG0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae9d50a-7a0d-4bd0-b5e8-3d70a9271254_1920x1080.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cae9d50a-7a0d-4bd0-b5e8-3d70a9271254_1920x1080.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:148023,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;http://a0.to/1kh&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/195272243?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae9d50a-7a0d-4bd0-b5e8-3d70a9271254_1920x1080.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!5MG0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae9d50a-7a0d-4bd0-b5e8-3d70a9271254_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!5MG0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae9d50a-7a0d-4bd0-b5e8-3d70a9271254_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!5MG0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae9d50a-7a0d-4bd0-b5e8-3d70a9271254_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!5MG0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae9d50a-7a0d-4bd0-b5e8-3d70a9271254_1920x1080.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The biggest hurdle for AI Agents in production is trust.</p><p>Hard-coded credentials work for a demo, but production agents need a way to handle critical actions&#8212;like making a purchase or sending a document&#8212;without over-privileged access.</p><p><strong><a href="http://a0.to/1kh">Auth0 for AI Agents</a></strong> features <strong>Asynchronous Authorization</strong> backchannel authentication (CIBA), so your agents can work autonomously in the background and only trigger a notification when human approval is required.</p><ul><li><p><strong>Rich Consent:</strong> Agents render specific authorization data so users know exactly what they are approving.</p></li><li><p><strong>Secure User Identity:</strong> Link agent actions directly to specific user preferences and history.</p></li></ul><p>Move past the &#8220;framework&#8221; stage and build agentic workflows that are secure by design.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;http://a0.to/1kh&quot;,&quot;text&quot;:&quot;Start building for $0&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="http://a0.to/1kh"><span>Start building for $0</span></a></p><p>And stop worrying about DIY-ing auth.</p><p>(Thanks to <strong><a href="http://a0.to/1kh">Auth0</a></strong> for partnering on this post.)</p><div><hr></div><p>In this newsletter, you&#8217;ll learn 20 more concepts:</p><ol start="20"><li><p>Conflict Resolution Strategies,</p></li><li><p>Delta Sync,</p></li><li><p>Eventual Consistency in Mobile Systems,</p></li><li><p>Background Sync &amp; Retry Queues,</p></li><li><p>Optimistic UI Updates,</p></li><li><p>Authentication (JWT/OAuth),</p></li><li><p>Secure Token Storage,</p></li><li><p>Certificate Pinning,</p></li><li><p>Biometric Authentication,</p></li><li><p>Code Obfuscation &amp; Tamper Detection,</p></li><li><p>Secure Deep Linking,</p></li><li><p>User Data Privacy,</p></li><li><p>App Lifecycle &amp; State Management,</p></li><li><p>Background Processing Limits,</p></li><li><p>Battery Optimisation,</p></li><li><p>Network Optimisation &amp; Rate Limiting,</p></li><li><p>Startup Time Optimisation,</p></li><li><p>Rendering Performance,</p></li><li><p>Memory Management &amp; Leak Prevention,</p></li><li><p>App Size Optimisation.</p></li></ol><p>(&#8230;and much more in part 3!)</p><p>For each concept, I&#8217;ll cover:</p><ul><li><p>What it is and how it works</p></li><li><p>Real-world example</p></li><li><p>The tradeoffs</p></li><li><p>Why it matters for mobile</p></li></ul><p>Let&#8217;s get into it&#8230;</p><div><hr></div><p>I want to reintroduce <strong><a href="https://x.com/Shefali__J">Shefali Jangid</a></strong> as a guest author.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://x.com/Shefali__J" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!grPx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7703d2-7046-46fc-b7cb-4b0c2caac2e6_900x480.png 424w, https://substackcdn.com/image/fetch/$s_!grPx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7703d2-7046-46fc-b7cb-4b0c2caac2e6_900x480.png 848w, https://substackcdn.com/image/fetch/$s_!grPx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7703d2-7046-46fc-b7cb-4b0c2caac2e6_900x480.png 1272w, https://substackcdn.com/image/fetch/$s_!grPx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7703d2-7046-46fc-b7cb-4b0c2caac2e6_900x480.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!grPx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7703d2-7046-46fc-b7cb-4b0c2caac2e6_900x480.png" width="900" height="480" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be7703d2-7046-46fc-b7cb-4b0c2caac2e6_900x480.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:480,&quot;width&quot;:900,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://x.com/Shefali__J&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!grPx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7703d2-7046-46fc-b7cb-4b0c2caac2e6_900x480.png 424w, https://substackcdn.com/image/fetch/$s_!grPx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7703d2-7046-46fc-b7cb-4b0c2caac2e6_900x480.png 848w, https://substackcdn.com/image/fetch/$s_!grPx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7703d2-7046-46fc-b7cb-4b0c2caac2e6_900x480.png 1272w, https://substackcdn.com/image/fetch/$s_!grPx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7703d2-7046-46fc-b7cb-4b0c2caac2e6_900x480.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>She&#8217;s a web developer, technical writer, and content creator with a love for frontend architecture and building things that scale.</p><p>Check out her work and socials:</p><ul><li><p><a href="https://shefali.dev/">Shefali.dev</a></p></li><li><p><a href="https://github.com/WebdevShefali">GitHub</a></p></li><li><p><a href="https://x.com/Shefali__J">Twitter</a></p></li></ul><p>You&#8217;ll often find her writing about web development, sharing UI tips, and building tools that make developers&#8217; lives easier.</p><div><hr></div><h1><strong>Data Sync</strong></h1><p>When apps store data on the device, they also need to keep that data in sync with the server.</p><p>This section explains how apps synchronize data, handle conflicts when changes occur across multiple places, and maintain app consistency even when the network connection is slow or unreliable.</p><h3><strong>20. Conflict Resolution Strategies</strong></h3><p>When mobile apps allow users to edit data offline, a new challenge appears: <em>conflicts</em>.</p><h4><strong>Why Conflicts Happen</strong></h4><p>Imagine two users editing the same document:</p><ol><li><p>User A edits the document while offline.</p></li><li><p>User B edits the same document at the same time.</p></li><li><p>Later, both devices reconnect and try to sync.</p></li></ol><p>Now, the server has two different versions of the same data. The system must decide how to merge or choose between them.</p><p>To solve this problem, apps use various conflict-resolution strategies.</p><h4><strong>Common Conflict Resolution Strategies</strong></h4><p>There are several ways apps handle these conflicts:</p><p><strong>a. Last-Write-Wins (LWW)</strong></p><p>This is the simplest approach.</p><p>Each update includes a &#8220;timestamp&#8221;. When two versions conflict, the system keeps the latest update and discards the older one.</p><p>For example:</p><ul><li><p>User A updates at 10:01 p.m.</p></li><li><p>User B updates at 10:02 p.m.</p></li></ul><p>Since User B&#8217;s change happened later, the system keeps that version.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aBrj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95138bc4-c285-4d2e-b84f-b0d208006ef6_1231x497.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aBrj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95138bc4-c285-4d2e-b84f-b0d208006ef6_1231x497.png 424w, https://substackcdn.com/image/fetch/$s_!aBrj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95138bc4-c285-4d2e-b84f-b0d208006ef6_1231x497.png 848w, https://substackcdn.com/image/fetch/$s_!aBrj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95138bc4-c285-4d2e-b84f-b0d208006ef6_1231x497.png 1272w, https://substackcdn.com/image/fetch/$s_!aBrj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95138bc4-c285-4d2e-b84f-b0d208006ef6_1231x497.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aBrj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95138bc4-c285-4d2e-b84f-b0d208006ef6_1231x497.png" width="1231" height="497" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/95138bc4-c285-4d2e-b84f-b0d208006ef6_1231x497.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:497,&quot;width&quot;:1231,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aBrj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95138bc4-c285-4d2e-b84f-b0d208006ef6_1231x497.png 424w, https://substackcdn.com/image/fetch/$s_!aBrj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95138bc4-c285-4d2e-b84f-b0d208006ef6_1231x497.png 848w, https://substackcdn.com/image/fetch/$s_!aBrj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95138bc4-c285-4d2e-b84f-b0d208006ef6_1231x497.png 1272w, https://substackcdn.com/image/fetch/$s_!aBrj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95138bc4-c285-4d2e-b84f-b0d208006ef6_1231x497.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This approach is simple to implement and works well for simple data, but earlier changes may be lost.</p><p><strong>b. Versioning</strong></p><p>Versioning tracks different versions of the same data.</p><p>Each update increases the version number or records which device made the change. When a conflict occurs, the system can compare the versions and either automatically merge them or ask the user to choose.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HIik!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad6beb2-1bf2-41ba-ba6c-1a7a5c6a8f71_1379x468.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HIik!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad6beb2-1bf2-41ba-ba6c-1a7a5c6a8f71_1379x468.png 424w, https://substackcdn.com/image/fetch/$s_!HIik!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad6beb2-1bf2-41ba-ba6c-1a7a5c6a8f71_1379x468.png 848w, https://substackcdn.com/image/fetch/$s_!HIik!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad6beb2-1bf2-41ba-ba6c-1a7a5c6a8f71_1379x468.png 1272w, https://substackcdn.com/image/fetch/$s_!HIik!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad6beb2-1bf2-41ba-ba6c-1a7a5c6a8f71_1379x468.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HIik!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad6beb2-1bf2-41ba-ba6c-1a7a5c6a8f71_1379x468.png" width="1379" height="468" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5ad6beb2-1bf2-41ba-ba6c-1a7a5c6a8f71_1379x468.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:468,&quot;width&quot;:1379,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HIik!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad6beb2-1bf2-41ba-ba6c-1a7a5c6a8f71_1379x468.png 424w, https://substackcdn.com/image/fetch/$s_!HIik!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad6beb2-1bf2-41ba-ba6c-1a7a5c6a8f71_1379x468.png 848w, https://substackcdn.com/image/fetch/$s_!HIik!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad6beb2-1bf2-41ba-ba6c-1a7a5c6a8f71_1379x468.png 1272w, https://substackcdn.com/image/fetch/$s_!HIik!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad6beb2-1bf2-41ba-ba6c-1a7a5c6a8f71_1379x468.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This approach is more flexible, but requires extra logic to manage versions.</p><p><strong>c. CRDTs (Conflict-Free Replicated Data Types)</strong></p><p>CRDTs are special data structures designed to merge changes automatically without conflicts.</p><p>They use mathematical rules that ensure that different versions of the same data can always be merged into a single, consistent result, even when updates occur in different places.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Io21!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dfc33dd-b95e-42e5-ae73-de648c9479d2_1407x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Io21!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dfc33dd-b95e-42e5-ae73-de648c9479d2_1407x500.png 424w, https://substackcdn.com/image/fetch/$s_!Io21!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dfc33dd-b95e-42e5-ae73-de648c9479d2_1407x500.png 848w, https://substackcdn.com/image/fetch/$s_!Io21!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dfc33dd-b95e-42e5-ae73-de648c9479d2_1407x500.png 1272w, https://substackcdn.com/image/fetch/$s_!Io21!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dfc33dd-b95e-42e5-ae73-de648c9479d2_1407x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Io21!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dfc33dd-b95e-42e5-ae73-de648c9479d2_1407x500.png" width="1407" height="500" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7dfc33dd-b95e-42e5-ae73-de648c9479d2_1407x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:1407,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Io21!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dfc33dd-b95e-42e5-ae73-de648c9479d2_1407x500.png 424w, https://substackcdn.com/image/fetch/$s_!Io21!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dfc33dd-b95e-42e5-ae73-de648c9479d2_1407x500.png 848w, https://substackcdn.com/image/fetch/$s_!Io21!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dfc33dd-b95e-42e5-ae73-de648c9479d2_1407x500.png 1272w, https://substackcdn.com/image/fetch/$s_!Io21!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dfc33dd-b95e-42e5-ae73-de648c9479d2_1407x500.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This approach is often used in real-time collaboration tools.</p><h4><strong>Why It Matters</strong></h4><p>Without a proper conflict strategy, one user&#8217;s changes might silently overwrite another user&#8217;s work.</p><p>For example, someone may edit a document, sync it later, and suddenly see that their changes have disappeared.</p><p>This can break user trust and cause data loss.</p><h4><strong>Real-World Example</strong></h4><p>Many well-known applications use these strategies:</p><ul><li><p>Figma and Notion use &#8220;CRDT&#8221; based systems to support real-time collaborative editing.</p></li><li><p>Dropbox often uses a &#8220;last-write-wins&#8221; approach and may generate a &#8220;conflicted copy&#8221; when it cannot merge two versions.</p></li></ul><h4><strong>Trade-offs</strong></h4><p>Each approach has its advantages and limitations:</p><ul><li><p><strong>Last-Write-Wins: </strong>Very simple to implement, but can cause data loss because earlier changes may be overwritten.</p></li><li><p><strong>CRDTs:</strong> Automatically merge changes without conflicts, but are complex to implement and not suitable for every type of data.</p></li><li><p><strong>Versioning:</strong> Keeps track of multiple changes clearly, but may require users to manually resolve conflicts.</p></li></ul><p>Once conflicts can be resolved safely, syncing becomes more efficient when only changes are transferred.</p><div><hr></div><p><em><strong>Reminder: this is a teaser of the subscriber-only newsletter, exclusive to my golden members.</strong></em></p><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>High-level architecture of real-world systems.</strong></p></li><li><p>Deep dive into how popular real-world systems work.</p></li><li><p><strong>How real-world systems handle scale, reliability, and performance.</strong></p></li></ul><p>And much more!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe&quot;,&quot;text&quot;:&quot;Unlock Full Access&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://newsletter.systemdesign.one/subscribe"><span>Unlock Full Access</span></a></p><div><hr></div><h3><strong>21. Delta Sync</strong></h3><p>When mobile apps sync data with a server, downloading the entire dataset every time can be slow and waste a lot of data. This is especially true for apps that store large amounts of information.</p><p>To make syncing more efficient, many apps use a technique called delta sync.</p><p>Delta sync means the app downloads only the data that has changed since the last sync, rather than downloading everything again.</p><h4><strong>How It Works</strong></h4><p>When the app syncs for the first time, it downloads the full dataset from the server.</p><p>After that, the server gives the app a <strong>sync token</strong> or <strong>timestamp</strong> that represents the latest state of the data.</p><p>On the next sync:</p><ol><li><p>App sends the last sync token to the server.</p></li><li><p>The server checks what has changed since that token.</p></li><li><p>Server sends only new, updated, or deleted records.</p></li></ol><p>This makes syncing much faster and more efficient.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z3IL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08664bf6-a3f8-4fbb-910f-2e0d90ac6639_1669x652.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z3IL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08664bf6-a3f8-4fbb-910f-2e0d90ac6639_1669x652.png 424w, https://substackcdn.com/image/fetch/$s_!z3IL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08664bf6-a3f8-4fbb-910f-2e0d90ac6639_1669x652.png 848w, https://substackcdn.com/image/fetch/$s_!z3IL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08664bf6-a3f8-4fbb-910f-2e0d90ac6639_1669x652.png 1272w, https://substackcdn.com/image/fetch/$s_!z3IL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08664bf6-a3f8-4fbb-910f-2e0d90ac6639_1669x652.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z3IL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08664bf6-a3f8-4fbb-910f-2e0d90ac6639_1669x652.png" width="1456" height="569" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/08664bf6-a3f8-4fbb-910f-2e0d90ac6639_1669x652.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:569,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!z3IL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08664bf6-a3f8-4fbb-910f-2e0d90ac6639_1669x652.png 424w, https://substackcdn.com/image/fetch/$s_!z3IL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08664bf6-a3f8-4fbb-910f-2e0d90ac6639_1669x652.png 848w, https://substackcdn.com/image/fetch/$s_!z3IL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08664bf6-a3f8-4fbb-910f-2e0d90ac6639_1669x652.png 1272w, https://substackcdn.com/image/fetch/$s_!z3IL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08664bf6-a3f8-4fbb-910f-2e0d90ac6639_1669x652.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Why It Matters</strong></h4><p>Imagine a contacts app with 5,000 contacts.</p><p>If the app downloaded all contacts every time it synced, it would waste a large amount of bandwidth and battery. With delta sync, the app downloads only the contacts that were added or changed, saving both data and processing time.</p><p>This makes background syncing practical even on mobile networks.</p><h4><strong>Real-World Example</strong></h4><p>Many well-known platforms use delta syncing:</p><ul><li><p>Google Contacts uses a &#8220;syncToken&#8221; to track changes.</p></li><li><p>Apple CloudKit uses a &#8220;serverChangeToken&#8221; for the same purpose.</p></li></ul><p>The app stores this token and sends it during the next sync request, so the server knows exactly what updates to return.</p><h4><strong>Trade-offs</strong></h4><p>Delta sync improves efficiency but requires extra work on the server.</p><p>The server must keep track of which data has changed. This usually means storing timestamps or maintaining a change log.</p><p>Deletions must also be tracked. Instead of completely removing a record, the system may temporarily store a &#8220;tombstone&#8221; record that indicates the item was deleted, so the client knows to remove it as well.</p><h4><strong>Practical Tip</strong></h4><p>The sync system should rely on timestamps or sequence numbers generated by the server, not the client.</p><p>Client devices may have incorrect clocks, which can cause sync problems. Using a server-controlled timestamp or sequence number ensures that the order of updates is always correct.</p><p>Even with efficient syncing methods like delta sync, distributed systems can still struggle to maintain perfect consistency across devices.</p><h3><strong>22. Eventual Consistency</strong></h3><p>Mobile apps often run in environments with slow, unstable, or completely unavailable network connections. Because of this, different devices may not always have the exact same data at the same time.</p><p>This situation is called eventual consistency.</p><p>Eventual consistency means that data may temporarily differ across devices, but all devices will eventually reach the same state once they sync with the server.</p><h4><strong>How It Works</strong></h4><p>In mobile apps, users may perform actions while offline or on unstable networks.</p><p>For example:</p><ol><li><p>A user updates some data on their phone.</p></li><li><p>The change gets saved locally on the device.</p></li><li><p>When the network connection becomes available, the app sends the update to the server.</p></li><li><p>Server then sends the updated data to other devices.</p></li></ol><p>For a short period, different devices may show slightly different versions of the data, but eventually they all become consistent after syncing.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!U1-j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F871319e6-aa34-402f-9865-c0db83082766_2048x738.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!U1-j!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F871319e6-aa34-402f-9865-c0db83082766_2048x738.png 424w, https://substackcdn.com/image/fetch/$s_!U1-j!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F871319e6-aa34-402f-9865-c0db83082766_2048x738.png 848w, https://substackcdn.com/image/fetch/$s_!U1-j!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F871319e6-aa34-402f-9865-c0db83082766_2048x738.png 1272w, https://substackcdn.com/image/fetch/$s_!U1-j!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F871319e6-aa34-402f-9865-c0db83082766_2048x738.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!U1-j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F871319e6-aa34-402f-9865-c0db83082766_2048x738.png" width="1456" height="525" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/871319e6-aa34-402f-9865-c0db83082766_2048x738.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:525,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!U1-j!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F871319e6-aa34-402f-9865-c0db83082766_2048x738.png 424w, https://substackcdn.com/image/fetch/$s_!U1-j!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F871319e6-aa34-402f-9865-c0db83082766_2048x738.png 848w, https://substackcdn.com/image/fetch/$s_!U1-j!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F871319e6-aa34-402f-9865-c0db83082766_2048x738.png 1272w, https://substackcdn.com/image/fetch/$s_!U1-j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F871319e6-aa34-402f-9865-c0db83082766_2048x738.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Why It Matters</strong></h4><p>Trying to keep data perfectly synchronized at all times would require the app to wait for the server to confirm every action before updating the interface.</p><p>This would make the app feel slow and unresponsive.</p><p>Instead, mobile apps usually update the UI immediately, while &#8204;server synchronization happens in the background.</p><p>This approach makes the app feel fast and responsive while still keeping the data consistent over time.</p><h4><strong>Real-World Example</strong></h4><p>Messaging apps are a common example of eventual consistency.</p><p>For example, WhatsApp shows different message states:</p><ul><li><p>A single check mark means the message has been sent from the device.</p></li><li><p>A double check mark means the message has been delivered to the recipient.</p></li></ul><p>The message appears in the chat immediately, even before the server fully confirms it. Later, the status updates when the system finishes syncing.</p><h4><strong>Trade-offs</strong></h4><p>Eventual consistency improves user experience, but it also introduces some challenges&#8230;</p><p>Sometimes the server may reject a change or update the data differently. When this happens, the app must update the UI carefully to avoid confusing the user.</p><p>To handle this, many apps display clear states such as pending, confirmed, or failed. This helps users understand what is happening with their actions.</p><p>To handle temporary failures during syncing, apps often rely on background processes that automatically retry operations, as discussed in the next section.</p><h3><strong>23. Background Sync &amp; Retry Queues</strong></h3><p>Mobile apps often perform actions that need to be sent to a server, such as sending a message, liking a post, or submitting a form. But sometimes the network connection may be weak or unavailable, causing these actions to fail.</p><p>Instead of losing these actions, apps use a system called a retry queue.</p><p>A retry queue stores failed operations locally on the device so they can be retried later when the network connection is available.</p><p>Mobile operating systems also provide tools that allow apps to perform tasks in the background efficiently.</p><p>For example:</p><ul><li><p><a href="https://developer.android.com/reference/androidx/work/WorkManager">WorkManager</a> on Android</p></li><li><p><a href="https://developer.apple.com/documentation/backgroundtasks/bgtaskscheduler">BGTaskScheduler</a> on iOS</p></li></ul><p>These tools allow apps to run background tasks while managing battery usage and system resources.</p><h4><strong>How It Works</strong></h4><p>When a user performs an action that requires a network request, the app tries to send it to the server:</p><ol><li><p>User performs an action (for example, sending a message).</p></li><li><p>If the request succeeds, the server processes it normally.</p></li><li><p>If the request fails because of network issues, the app stores the action in a retry queue.</p></li><li><p>When the device reconnects to the network, a background sync process retries the queued operations.</p></li><li><p>Once the server confirms the request, the action is removed from the queue.</p></li></ol><p>This ensures that user actions are not lost even when the network is unreliable.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9-82!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9063e738-434c-43db-ba3c-4e99fa459d66_1876x548.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9-82!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9063e738-434c-43db-ba3c-4e99fa459d66_1876x548.png 424w, https://substackcdn.com/image/fetch/$s_!9-82!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9063e738-434c-43db-ba3c-4e99fa459d66_1876x548.png 848w, https://substackcdn.com/image/fetch/$s_!9-82!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9063e738-434c-43db-ba3c-4e99fa459d66_1876x548.png 1272w, https://substackcdn.com/image/fetch/$s_!9-82!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9063e738-434c-43db-ba3c-4e99fa459d66_1876x548.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9-82!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9063e738-434c-43db-ba3c-4e99fa459d66_1876x548.png" width="1456" height="425" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9063e738-434c-43db-ba3c-4e99fa459d66_1876x548.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:425,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9-82!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9063e738-434c-43db-ba3c-4e99fa459d66_1876x548.png 424w, https://substackcdn.com/image/fetch/$s_!9-82!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9063e738-434c-43db-ba3c-4e99fa459d66_1876x548.png 848w, https://substackcdn.com/image/fetch/$s_!9-82!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9063e738-434c-43db-ba3c-4e99fa459d66_1876x548.png 1272w, https://substackcdn.com/image/fetch/$s_!9-82!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9063e738-434c-43db-ba3c-4e99fa459d66_1876x548.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Why It Matters</strong></h4><p>Users expect their actions to work, even if the network connection is unstable.</p><p>For example, if a user taps &#8220;send message&#8221; or &#8220;like&#8221;, the app should not simply ignore the action because the network was unavailable.</p><p>By storing the action and retrying it later, the app respects the user&#8217;s intent and completes the action when conditions improve.</p><h4><strong>Real-World Example</strong></h4><p>Many apps use retry queues and background syncing:</p><ul><li><p>Email apps like Gmail and Outlook allow users to send emails while offline. The email is queued and automatically sent when the network returns.</p></li><li><p>E-commerce apps may queue actions such as adding items to a cart or saving user activity and sync them later.</p></li></ul><h4><strong>Trade-offs</strong></h4><p>Background processing on mobile devices has limitations.</p><p>Operating systems control how long apps can run in the background to protect battery life. These background windows are usually short and unpredictable. Because of this, apps should not rely only on background syncing. Synchronization should also happen when the user opens the app again.</p><p>These mechanisms also allow the app&#8217;s interface to respond instantly, while the actual synchronization happens quietly in the background.</p><h3><strong>24. Optimistic UI Updates</strong></h3><p>In many apps, users expect actions to happen instantly. For example, when you tap &#8220;like&#8221;, send a message, or add a comment, you expect to see the result immediately.</p><p>If the app waited for the server to respond before updating the screen, it could take a few hundred milliseconds. Even that slight delay can make the app feel slow.</p><p>To solve this, apps use a technique called optimistic UI updates.</p><p>Optimistic UI means the app assumes the request will succeed and updates the interface immediately, even before the server confirms it.</p><p>The server request still happens in the background. If everything succeeds, nothing changes. If it fails, the app corrects the UI and shows an error.</p><h4><strong>How It Works</strong></h4><p>The process usually follows these steps:</p><ol><li><p>User performs an action (e.g., tapping a like).</p></li><li><p>App updates the UI immediately and stores the change locally.</p></li><li><p>App sends the request to the server.</p></li><li><p>If the server confirms the request, the change remains unchanged.</p></li><li><p>If the server rejects the request, the app reverts the change and shows a small error message.</p></li></ol><p>This makes the app feel fast while still keeping the data accurate.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RSRZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a3d220-05fd-4377-9e27-ed19832805ef_1702x620.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RSRZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a3d220-05fd-4377-9e27-ed19832805ef_1702x620.png 424w, https://substackcdn.com/image/fetch/$s_!RSRZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a3d220-05fd-4377-9e27-ed19832805ef_1702x620.png 848w, https://substackcdn.com/image/fetch/$s_!RSRZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a3d220-05fd-4377-9e27-ed19832805ef_1702x620.png 1272w, https://substackcdn.com/image/fetch/$s_!RSRZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a3d220-05fd-4377-9e27-ed19832805ef_1702x620.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RSRZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a3d220-05fd-4377-9e27-ed19832805ef_1702x620.png" width="1456" height="530" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/98a3d220-05fd-4377-9e27-ed19832805ef_1702x620.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:530,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RSRZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a3d220-05fd-4377-9e27-ed19832805ef_1702x620.png 424w, https://substackcdn.com/image/fetch/$s_!RSRZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a3d220-05fd-4377-9e27-ed19832805ef_1702x620.png 848w, https://substackcdn.com/image/fetch/$s_!RSRZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a3d220-05fd-4377-9e27-ed19832805ef_1702x620.png 1272w, https://substackcdn.com/image/fetch/$s_!RSRZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a3d220-05fd-4377-9e27-ed19832805ef_1702x620.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Why It Matters</strong></h4><p>Users expect apps to respond instantly to their actions.</p><p>If the app shows a loading spinner every time the user taps a button, the experience feels slow and frustrating.</p><p>Optimistic UI removes this delay, making the app feel smooth and responsive, similar to native device interactions.</p><h4><strong>Real-World Example</strong></h4><p>Many popular apps rely heavily on optimistic UI:</p><ul><li><p>Twitter shows the like animation immediately when you tap the heart.</p></li><li><p>Instagram displays your comment instantly in the thread.</p></li><li><p>Slack shows a message in the chat immediately after you send it.</p></li></ul><p>In the background, the app syncs with the server and updates the data if needed.</p><h4><strong>Trade-offs</strong></h4><p>An optimistic UI improves the user experience, but it also adds complexity.</p><p>If the server rejects the request, the app must roll back the UI change smoothly without confusing the user. Because of this, optimistic UI is usually avoided for actions that cannot be easily reversed, such as permanent deletion, payments, and financial transactions.</p><p>As app stores and sync sensitive user data, security becomes a critical part of system design&#8230;</p><p>Let&#8217;s keep going!</p>
      <p>
          <a href="https://newsletter.systemdesign.one/p/mobile-system-design-concepts">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Design a personal AI chat assistant]]></title><description><![CDATA[#144: Part 2 - Generative AI Masterclass]]></description><link>https://newsletter.systemdesign.one/p/ai-chat-assistant</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/ai-chat-assistant</guid><dc:creator><![CDATA[Neo Kim]]></dc:creator><pubDate>Mon, 04 May 2026 07:49:25 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/49855f8b-2e2d-4967-ba00-15a268a4ea52_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://newsletter.systemdesign.one/subscribe?yearly=true" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RKN7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 424w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 848w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 1272w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RKN7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png" width="1280" height="300" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3689f342-2008-4ce6-b968-16461682508b_1280x300.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:300,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24224,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://newsletter.systemdesign.one/subscribe?yearly=true&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192435842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!RKN7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 424w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 848w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 1272w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/ai-chat-assistant/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><div><hr></div><p>ChatGPT, Claude, and Gemini handle general-purpose tasks and one-off conversations well&#8230;</p><p>The moment your assistant is customer-facing, the requirements change:</p><p>Your users need an assistant that knows your product, your policies, and their history with you. That knowledge lives in your systems, NOT in a model&#8217;s training data. You could try to inject it at runtime, but proprietary data comes with real constraints: <em>it is often too large to fit in a context window, too sensitive to send to a third-party API, and too dynamic to stay current in a static prompt.</em></p><p>Beyond data, you cannot fully configure its behavior; you cannot integrate it past a browser tab, and costs scale with every conversation&#8230;</p><p>Here is what changes when you build your own:</p><ul><li><p><strong>Privacy and data control.</strong> When you build your own assistant, conversations stay on your infrastructure. You decide what gets logged, how long it gets stored, and when it gets deleted. For healthcare, legal, finance, and enterprise use cases, this is a compliance requirement, not a nice-to-have.</p></li><li><p><strong>Full control over behavior.</strong> You own the system prompt, persona, and the guardrails. You can enforce a specific tone, restrict responses to your product domain, swap the underlying model, or A/B test different configurations. With an off-the-shelf product, you get what they ship.</p></li><li><p><strong>Cost at scale.</strong> You can implement prompt caching, model routing (a cheap model for simple questions, an expensive one for hard ones), and context management strategies that dramatically cut costs. With thousands of conversations per day, the savings are significant.</p></li><li><p><strong>Product integration.</strong> Your assistant lives inside your product, not in a separate tab. It shares your auth system, your UI, and your user context. That level of integration is not possible when you are wrapping someone else&#8217;s chat interface.</p></li><li><p><strong>Security.</strong> You control the full request path: what goes into the model, what comes out, and what filters run in between. You implement your own prompt injection defenses, output guardrails, and content policies.</p></li></ul><p>Off-the-shelf tools are great for personal use and internal prototyping. Building your own is for when the assistant is the product or a core feature.</p><div><hr></div><h2><a href="http://a0.to/1kh">Enterprise auth from day one. No upgrade required. &#128640; (Partner)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="http://a0.to/1kh" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FTzV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3cacd49-09f7-4843-bfd4-8e6827d62fdc_1200x628.png 424w, https://substackcdn.com/image/fetch/$s_!FTzV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3cacd49-09f7-4843-bfd4-8e6827d62fdc_1200x628.png 848w, https://substackcdn.com/image/fetch/$s_!FTzV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3cacd49-09f7-4843-bfd4-8e6827d62fdc_1200x628.png 1272w, https://substackcdn.com/image/fetch/$s_!FTzV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3cacd49-09f7-4843-bfd4-8e6827d62fdc_1200x628.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FTzV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3cacd49-09f7-4843-bfd4-8e6827d62fdc_1200x628.png" width="1200" height="628" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a3cacd49-09f7-4843-bfd4-8e6827d62fdc_1200x628.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:628,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:682006,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;http://a0.to/1kh&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/195272243?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3cacd49-09f7-4843-bfd4-8e6827d62fdc_1200x628.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!FTzV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3cacd49-09f7-4843-bfd4-8e6827d62fdc_1200x628.png 424w, https://substackcdn.com/image/fetch/$s_!FTzV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3cacd49-09f7-4843-bfd4-8e6827d62fdc_1200x628.png 848w, https://substackcdn.com/image/fetch/$s_!FTzV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3cacd49-09f7-4843-bfd4-8e6827d62fdc_1200x628.png 1272w, https://substackcdn.com/image/fetch/$s_!FTzV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3cacd49-09f7-4843-bfd4-8e6827d62fdc_1200x628.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Enterprise auth is intimidating without a doubt - SSO, SCIM, etc.</p><p>You may feel these features are next year&#8217;s problem because the price jump to unlock basic SAML or OIDC connections is just too brutal for an early-stage roadmap.</p><p>With <strong><a href="http://a0.to/1kh">Auth0</a></strong>, ship your first <strong>Enterprise Connection and SCIM</strong> setup for <strong>$0</strong> on their free tier. If you&#8217;re scaling <strong>AI Agents</strong>, security stack is built for production:</p><ul><li><p><strong>Token Vault:</strong> Securely connect 3rd-party tokens (Slack, GitHub) to stop hard-coding API keys.</p></li><li><p><strong>FGA for RAG:</strong> Ensure LLMs respect fine-grained permissions to prevent data leaks and also control access to specific AI tools/models.</p></li><li><p><strong>Async Authorization:</strong> Use backchannel authentication for human-in-the-loop approvals on critical agent actions.</p></li><li><p><strong>User Authentication: </strong>Securely identify users to unlock chat history, order tracking, and personalized settings within your AI agents.</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;http://a0.to/1kh&quot;,&quot;text&quot;:&quot;Start building for $0&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="http://a0.to/1kh"><span>Start building for $0</span></a></p><p>And stop worrying about DIY-ing auth.</p><p>(Thanks to <strong><a href="http://a0.to/1kh">Auth0</a></strong> for partnering on this post.)</p><div><hr></div><p>I want to reintroduce <strong><a href="https://louisbouchard.substack.com/">Louis-Fran&#231;ois Bouchard</a> </strong>as the author of this newsletter.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://louisbouchard.substack.com/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8ezx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 424w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 848w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 1272w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8ezx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png" width="1100" height="220" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:220,&quot;width&quot;:1100,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://louisbouchard.substack.com/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!8ezx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 424w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 848w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 1272w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>He&#8217;s a best-selling author (<a href="https://amzn.to/4bqYU9b">Building LLMs for Production</a>), the co-founder of <a href="https://academy.towardsai.net/?ref=1f9b29">Towards AI</a>, and the creator of the YouTube Channel, <a href="https://www.youtube.com/@whatsai?sub_confirmation=1">What&#8217;s AI</a>, where he helps people understand AI and learn how to apply it in the real world. Through his development work with clients and his content, teaching, and AI training programs on the <strong><a href="https://academy.towardsai.net/?ref=1f9b29">Towards AI Academy</a></strong>, Louis focuses on making AI practical for builders, engineers, and curious learners alike.</p><p>At Towards AI, he and his team train AI engineers through courses built for every stage, from beginner to advanced. That educational mission and the real-world experience building for his clients are exactly why I wanted him in this newsletter series.</p><div><hr></div><p><strong>Inside this newsletter, you&#8217;ll get:</strong></p><ul><li><p><strong>What happens when you send a message.</strong> Tokenization, prefill, and decode phases, KV-cache, and why longer conversations get slower and more expensive.</p></li><li><p><strong>How the model got here.</strong> Pretraining, post-training (SFT and RLHF), and scaling laws, and what each one means for the assistant you are building.</p></li><li><p><strong>Output controls.</strong> Temperature, top-p, top-k, why hallucination happens, and which settings to use for which task.</p></li><li><p><strong>Context engineering.</strong> System prompts, few-shot, chain-of-thought, structured outputs, context window management, and persistent memory across sessions.</p></li><li><p><strong>Cost and how to know it works.</strong> Worked token math across model tiers, prompt caching, model routing, golden test sets, and LLM-as-Judge evaluation.</p></li><li><p><strong>A practical eight-step build.</strong> From a single API call to a production-minded assistant with streaming, system prompts, JSON output, multi-turn compaction, prompt injection defense, and persistent memory.</p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Golden members get all posts like these!&#8230;</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2><strong>What We Are Building</strong></h2><p>We&#8217;re building a ChatGPT-style conversational assistant that you fully control: <em>your own system prompt, your own personality, your own rules.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WiM-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa9d5ab2-3aee-4bad-bee9-d5f0ecbf53a9_2048x1154.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WiM-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa9d5ab2-3aee-4bad-bee9-d5f0ecbf53a9_2048x1154.png 424w, https://substackcdn.com/image/fetch/$s_!WiM-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa9d5ab2-3aee-4bad-bee9-d5f0ecbf53a9_2048x1154.png 848w, https://substackcdn.com/image/fetch/$s_!WiM-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa9d5ab2-3aee-4bad-bee9-d5f0ecbf53a9_2048x1154.png 1272w, https://substackcdn.com/image/fetch/$s_!WiM-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa9d5ab2-3aee-4bad-bee9-d5f0ecbf53a9_2048x1154.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WiM-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa9d5ab2-3aee-4bad-bee9-d5f0ecbf53a9_2048x1154.png" width="1456" height="820" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fa9d5ab2-3aee-4bad-bee9-d5f0ecbf53a9_2048x1154.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:820,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!WiM-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa9d5ab2-3aee-4bad-bee9-d5f0ecbf53a9_2048x1154.png 424w, https://substackcdn.com/image/fetch/$s_!WiM-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa9d5ab2-3aee-4bad-bee9-d5f0ecbf53a9_2048x1154.png 848w, https://substackcdn.com/image/fetch/$s_!WiM-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa9d5ab2-3aee-4bad-bee9-d5f0ecbf53a9_2048x1154.png 1272w, https://substackcdn.com/image/fetch/$s_!WiM-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa9d5ab2-3aee-4bad-bee9-d5f0ecbf53a9_2048x1154.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Full system architecture of a chat assistant. A user message flows through four layers: context engineering, generation engine, persistent memory, and SSE streaming before a response streams back. Each section of the article zooms into one layer.</figcaption></figure></div><p><em>Here are the requirements:</em></p><ul><li><p><strong>Multi-turn conversation</strong> with context preserved across turns.</p></li><li><p><strong>Streaming responses</strong>, where words appear as they&#8217;re generated.</p></li><li><p><strong>Configurable behavior</strong>, creative for brainstorming, precise for data extraction.</p></li><li><p><strong>Resistant to prompt injection attacks.</strong></p></li><li><p><strong>Cost-effective at scale</strong>, with thousands of conversations per day.</p></li></ul><p>One extra feature covered later: <strong>persistent memory</strong> across sessions, so the assistant remembers user preferences between conversations.</p><p>What this system does <strong>not</strong> do: <em>it does NOT retrieve external documents, it does NOT have domain expertise beyond its training data, and it does NOT take actions or call tools. It&#8217;s purely conversational.</em></p><p>We use a model API (OpenAI, Anthropic, or similar) for this build.</p><p>The provider handles GPUs, scaling, and inference optimization, the right starting point for most teams. When you outgrow APIs, you move to self-hosted open-source models served with vLLM, llama.cpp, or SGLang.</p><p>This newsletter is organized around 4 core design decisions:</p><ol><li><p><strong>What happens when you send a message?</strong></p></li><li><p><strong>How do you control what the model says?</strong></p></li><li><p><strong>How do you engineer the context for every request?</strong></p></li><li><p><strong>What does it cost, and how do you know it works?</strong></p></li></ol><p><em>(Then we build the whole thing from scratch in Part 5.)</em></p><p>Let&#8217;s start with what happens under the hood when you send a message&#8230;</p><div><hr></div><h2><strong>Part 1: What Happens When You Send a Message</strong></h2><p>When you call a model API, a lot happens between your request and the first token appearing on screen.</p><p>Even if you never need to use a GPU, this helps you understand the latency, cost, and trade-offs&#8230;</p><h3><strong>From Text to Tokens</strong></h3><p>Before the model can process your message, it needs to convert it from text into numbers.</p><p>Models do not read words. They work with <strong>tokens</strong>, small chunks of text mapped to numerical IDs. A token might be a whole word (&#8220;the&#8221;), part of a word (&#8220;un&#8221;, &#8220;believ&#8221;, &#8220;able&#8221;), or a single character.</p><p>The process of splitting text into tokens is called <strong>tokenization</strong>.</p><p>As we covered in Part 1, there are different ways to split text into tokens:</p><p>You could split by <em>individual characters</em>, but then the model has to process very long sequences just to read a short sentence. You could split by <em>whole words,</em> but then the vocabulary becomes enormous, and the model cannot handle any word it has not seen before.</p><p><strong>Subword tokenization</strong> is the sweet spot, and virtually all modern models use it.</p><p>The most common algorithm is Byte Pair Encoding <strong>(BPE)</strong>. BPE starts with individual characters and iteratively merges the most frequent pairs into a single token. Common words like &#8220;the&#8221; or &#8220;and&#8221; become single tokens. Rare or long words get split into meaningful pieces. The word &#8220;unbelievable&#8221; might be broken into three tokens: &#8220;un&#8221; + &#8220;believ&#8221; + &#8220;able.&#8221; This keeps the vocabulary manageable, typically 30,000 to 100,000 tokens. And if the model encounters a word it has never seen before, it can still process it by breaking it into known subword pieces.</p><p>This is why language models struggle with certain tasks that seem trivial to humans&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Rh4M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92090a8d-4fd2-49ae-bc3e-7b0af2dfd424_2048x1237.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Rh4M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92090a8d-4fd2-49ae-bc3e-7b0af2dfd424_2048x1237.png 424w, https://substackcdn.com/image/fetch/$s_!Rh4M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92090a8d-4fd2-49ae-bc3e-7b0af2dfd424_2048x1237.png 848w, https://substackcdn.com/image/fetch/$s_!Rh4M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92090a8d-4fd2-49ae-bc3e-7b0af2dfd424_2048x1237.png 1272w, https://substackcdn.com/image/fetch/$s_!Rh4M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92090a8d-4fd2-49ae-bc3e-7b0af2dfd424_2048x1237.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Rh4M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92090a8d-4fd2-49ae-bc3e-7b0af2dfd424_2048x1237.png" width="1456" height="879" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/92090a8d-4fd2-49ae-bc3e-7b0af2dfd424_2048x1237.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:879,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Rh4M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92090a8d-4fd2-49ae-bc3e-7b0af2dfd424_2048x1237.png 424w, https://substackcdn.com/image/fetch/$s_!Rh4M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92090a8d-4fd2-49ae-bc3e-7b0af2dfd424_2048x1237.png 848w, https://substackcdn.com/image/fetch/$s_!Rh4M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92090a8d-4fd2-49ae-bc3e-7b0af2dfd424_2048x1237.png 1272w, https://substackcdn.com/image/fetch/$s_!Rh4M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92090a8d-4fd2-49ae-bc3e-7b0af2dfd424_2048x1237.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">How tokenization works. The sentence &#8220;The cat sat on the mat&#8221; becomes six tokens, each mapped to a numerical ID. Below, &#8220;strawberry&#8221; is split into three tokens (&#8221;str&#8221;, &#8220;aw&#8221;, &#8220;berry&#8221;), showing why the model cannot count the letter r: the r&#8217;s are split across token boundaries.</figcaption></figure></div><p>Ask <em>&#8220;How many r&#8217;s are in strawberry?&#8221;</em> and the model might get it wrong because it never sees the individual letters. The word &#8220;strawberry&#8221; might be tokenized as &#8220;str&#8221; + &#8220;aw&#8221; + &#8220;berry,&#8221; so the model literally cannot count the r&#8217;s because they are split across token boundaries. Reversing a word, counting specific characters, and other character-level operations are all hard for the same reason: <em>the model operates on tokens, not letters.</em></p><p>Tokenization has direct practical consequences for building a chat assistant.</p><p>Token count does not equal word count: a typical English sentence of 10 words might be 13 to 15 tokens. Code is usually more token-dense than prose. Non-English languages often require more tokens per word because BPE vocabularies are built primarily from English text. Since API pricing, context window limits, and latency all scale with token count, understanding this conversion is essential for cost estimation and context management, both of which we cover later in this newsletter.</p><div><hr></div><p><em><strong>Reminder: this is a teaser of the subscriber-only newsletter series, exclusive to my golden members.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://newsletter.systemdesign.one/subscribe?yearly=true" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3mfm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 424w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 848w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 1272w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3mfm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png" width="1280" height="300" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:300,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24224,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://newsletter.systemdesign.one/subscribe?yearly=true&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192435842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!3mfm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 424w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 848w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 1272w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>Simple breakdown of real-world architectures</strong></p></li><li><p>Frameworks you can plug into your work or business</p></li><li><p><strong>Proven systems behind ChatGPT, Perplexity, and Copilot</strong></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?yearly=true&quot;,&quot;text&quot;:&quot;Unlock Full Access&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://newsletter.systemdesign.one/subscribe?yearly=true"><span>Unlock Full Access</span></a></p><div><hr></div><h3><strong>How the Model Got Here: From Pretraining to Chat</strong></h3><p>Before we look at what happens on each request, it helps to understand what happened before you ever sent a message&#8230;</p><p>The model you are calling went through a multi-stage pipeline to become a chat assistant, and each stage has direct consequences for the product you are building.</p><p><strong>Pretraining</strong> is where the model learns language.</p><p>As covered in Part 1, the model is trained on a massive corpus, books, websites, code, research papers, and learns to predict the next token given all the tokens before it. This is self-supervised: <em>no one labels the data.</em> The model just reads trillions of tokens and learns patterns. Pretraining is what gives the model its general knowledge, its grammar, its ability to write code, and its tendency to produce plausible-sounding text, whether or not it is true.</p><p>When your chat assistant confidently states a wrong fact, that behavior traces back to pretraining: <em>the model learned to produce likely continuations, not verified truths.</em></p><p><strong>Post-training</strong> is what turns a text predictor into a chat assistant.</p><p>The pretrained model is good at completing text, but it does not know how to follow instructions, hold a conversation, or refuse harmful requests. Post-training fixes this in stages.</p><p>First, <em>supervised fine-tuning</em><strong> (SFT)</strong> trains the model on curated examples of good conversations: a user asks a question, and a human-written response shows what the ideal answer looks like. This teaches the model the format and style of a helpful assistant.</p><p>Then, <em>reinforcement learning from human feedback</em><strong> (RLHF)</strong> refines the model further. Human raters compare pairs of model responses and pick the better one. The model learns from those preferences, becoming more helpful, more accurate, and less likely to produce harmful or unhelpful output.</p><p>This is why the same base model can feel completely different depending on how it was post-trained. It is also why system prompts work:<em> the model was specifically trained to follow developer-level instructions during post-training.</em></p><p><strong>Scaling laws</strong> govern the relationship between model size, training data, and performance.</p><p>Research (notably <a href="https://arxiv.org/abs/2001.08361">Kaplan et al. at OpenAI</a>&nbsp;and&nbsp;the <a href="https://arxiv.org/abs/2203.15556">Chinchilla paper from DeepMind</a>) showed model performance improves predictably as you increase the number of&nbsp;parameters and training tokens, following power-law curves.</p><p>This is why the industry keeps building bigger models: doubling model size produces measurable, predictable gains in capability. For you as a builder, scaling laws explain the cost-capability tradeoff you face.</p><p>A frontier model (hundreds of billions of parameters) costs more per token but handles harder tasks. A lightweight model (fewer parameters, less training) is cheaper and faster but less capable. The tiered pricing you see from API providers, frontier versus mini versus nano, maps directly to where each model sits on the scaling curve.</p><p>Understanding this helps you make the model routing decisions we discuss later: <em>use the big model when you need its capabilities, and the small model when you do not.</em></p><p>Let&#8217;s keep going!</p>
      <p>
          <a href="https://newsletter.systemdesign.one/p/ai-chat-assistant">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Multi-Agent Architectures, Clearly Explained]]></title><description><![CDATA[#143: Coordination architectures, protocols connecting them, and how to pick the right one before you write any code.]]></description><link>https://newsletter.systemdesign.one/p/multi-agent-system</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/multi-agent-system</guid><dc:creator><![CDATA[Neo Kim]]></dc:creator><pubDate>Thu, 30 Apr 2026 07:57:47 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/5a5b731e-9631-4db0-9007-c42802f05c64_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/multi-agent-system/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><div><hr></div><p>A single agent<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> has one context window<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>, one set of tools<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>, and one running loop<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a>.</p><p>When a task outgrows any of those three, you need more than one agent. That&#8217;s what a <strong>multi-agent system</strong> is. Instead of one agent doing everything, you split the work across many agents. Each one has its own role, tools, and context.</p><p>Most multi-agent systems fail not because the model is weak but because you chose many agents before you actually <em>need</em> them, or chose the <em>wrong</em> architecture once you have many agents.</p><p>So you shouldn&#8217;t split anything yet.</p><p>Onward.</p><div><hr></div><h2><strong><a href="https://watch.getcontrast.io/register/unblocked-how-to-stop-babysitting-your-agents?utm_source=systemdesign&amp;utm_medium=email&amp;utm_campaign=primary">[Webinar] Stop babysitting your coding agents (Partner)</a></strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://watch.getcontrast.io/register/unblocked-how-to-stop-babysitting-your-agents?utm_source=systemdesign&amp;utm_medium=email&amp;utm_campaign=primary" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2Yva!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 424w, https://substackcdn.com/image/fetch/$s_!2Yva!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 848w, https://substackcdn.com/image/fetch/$s_!2Yva!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 1272w, https://substackcdn.com/image/fetch/$s_!2Yva!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2Yva!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://watch.getcontrast.io/register/unblocked-how-to-stop-babysitting-your-agents?utm_source=systemdesign&amp;utm_medium=email&amp;utm_campaign=primary&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!2Yva!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 424w, https://substackcdn.com/image/fetch/$s_!2Yva!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 848w, https://substackcdn.com/image/fetch/$s_!2Yva!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 1272w, https://substackcdn.com/image/fetch/$s_!2Yva!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Agents can generate code. Getting it right for your system, team conventions, and past decisions is the hard part &#8211; you end up wasting time and tokens in correction loops.</p><p>More MCPs give agents access to information, but not understanding. The teams pulling ahead use a context engine to give agents exactly what they need.</p><p><strong><a href="https://watch.getcontrast.io/register/unblocked-how-to-stop-babysitting-your-agents?utm_source=systemdesign&amp;utm_medium=email&amp;utm_campaign=primary">Join Unblocked live on May 6 (FREE)</a></strong> to see:</p><ul><li><p>Where teams get stuck on the AI maturity curve</p></li><li><p>How a context engine solves for quality, efficiency, and cost</p></li><li><p>Live demo: the same coding task with and without a context engine</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://watch.getcontrast.io/register/unblocked-how-to-stop-babysitting-your-agents?utm_source=systemdesign&amp;utm_medium=email&amp;utm_campaign=primary&quot;,&quot;text&quot;:&quot;Register Now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://watch.getcontrast.io/register/unblocked-how-to-stop-babysitting-your-agents?utm_source=systemdesign&amp;utm_medium=email&amp;utm_campaign=primary"><span>Register Now</span></a></p><p>(Thanks to <strong><a href="https://watch.getcontrast.io/register/unblocked-how-to-stop-babysitting-your-agents?utm_source=systemdesign&amp;utm_medium=email&amp;utm_campaign=primary">Unblocked</a></strong> for partnering on this post.)</p><div><hr></div><p><strong>Here&#8217;s what&#8217;s inside:</strong></p><ul><li><p><strong>Why single agents break.</strong> Context overflow, slow serial work, and why one agent can&#8217;t always hold every tool, model, or permission.</p></li><li><p><strong>The six architectures.</strong> Orchestrator-worker, pipeline, hierarchical, swarm, mesh, and handoffs. Plus, where each one works and breaks.</p></li><li><p><strong>How agents coordinate.</strong> Run loops, MCP vs A2A, shared state, isolated state, memory, and stopping conditions.</p></li><li><p><strong>The real cost.</strong> Why more agents mean more tokens, more latency, more coordination overhead, and more ways to fail.</p></li><li><p><strong>Failure and security risks.</strong> Bad instructions, misalignment, weak verification, prompt injection, context contamination, and privilege creep.</p></li><li><p><strong>Case study.</strong> How Spotify used an orchestrator-worker system to turn ad planning from a 15-minute workflow into 5 seconds.</p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Golden members get all posts like these!&#8230;</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2><strong>Limits of single-agent systems</strong></h2><p>One agent with good prompts and the right tools handles more than most people expect.</p><p>For example, Cognition&#8217;s Devin processed 5 million lines of COBOL (Common Business-Oriented Language) across 500GB of repositories with a single agent, raising its pull request merge rate from 34% to 67%.</p><p>But a single agent has three HARD limits. When your task runs into any of them, better prompts won&#8217;t help:</p><p><strong>1. Context overflow</strong></p><p>A context window can only hold so much.</p><p>Past that limit, the earliest information drops out, and the agent starts losing track of its own plan. When compression<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a> alone can&#8217;t fix the overflow, a second agent with its own context can.</p><p><strong>2. Parallelism</strong></p><p>Independent tasks shouldn&#8217;t wait in line.</p><p>If you have four research queries that don&#8217;t depend on each other, running them one at a time wastes time. Running them across four separate agents takes roughly as long as the slowest one.</p><p>Anthropic&#8217;s research system uses this exact pattern and reduced total query time by up to 90%.</p><p><strong>3. Specialization</strong></p><p>Different parts of a task often need different models<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a>, tools, or access levels:</p><ul><li><p>A code-writing agent needs a sandbox</p></li><li><p>A research agent needs web search</p></li><li><p>A customer-facing agent needs user data, but shouldn&#8217;t have access to production databases</p></li></ul><p>When one agent can&#8217;t hold all the tools and permissions the task needs, you give each role its own agent.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Lyuf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912095e9-be3f-47dc-bed8-0c3307cc6d6e_1376x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Lyuf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912095e9-be3f-47dc-bed8-0c3307cc6d6e_1376x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Lyuf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912095e9-be3f-47dc-bed8-0c3307cc6d6e_1376x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Lyuf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912095e9-be3f-47dc-bed8-0c3307cc6d6e_1376x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Lyuf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912095e9-be3f-47dc-bed8-0c3307cc6d6e_1376x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Lyuf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912095e9-be3f-47dc-bed8-0c3307cc6d6e_1376x768.jpeg" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/912095e9-be3f-47dc-bed8-0c3307cc6d6e_1376x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Decision flow for when to use multi-agent systems based on context overflow, parallelism, and specialization conditions&quot;,&quot;title&quot;:&quot;Decision flow for when to use multi-agent systems based on context overflow, parallelism, and specialization conditions&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Decision flow for when to use multi-agent systems based on context overflow, parallelism, and specialization conditions" title="Decision flow for when to use multi-agent systems based on context overflow, parallelism, and specialization conditions" srcset="https://substackcdn.com/image/fetch/$s_!Lyuf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912095e9-be3f-47dc-bed8-0c3307cc6d6e_1376x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Lyuf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912095e9-be3f-47dc-bed8-0c3307cc6d6e_1376x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Lyuf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912095e9-be3f-47dc-bed8-0c3307cc6d6e_1376x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Lyuf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912095e9-be3f-47dc-bed8-0c3307cc6d6e_1376x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If none of those three conditions apply, stay with one agent.</p><p>Better prompts and better tools solve most problems without adding the extra work of coordinating agents. But once you know you need many agents, the next question is what shape the system takes...</p><div><hr></div><h2><strong>Multi-agent architectures</strong></h2><p>Every multi-agent system makes a different choice about who coordinates work.</p><p>Here are 6 architectures that range from <em>tight</em> central control to <em>NO</em> coordinator at all:</p><h3><strong>1. Orchestrator-worker</strong></h3><p>One central agent breaks a task into pieces, assigns each piece to a worker agent, and then puts the results together:</p><ul><li><p>Workers don&#8217;t talk to each other; all communication goes through a central agent</p></li><li><p>Orchestrator calls each worker as a <strong>tool call</strong><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a>, waits for a result, and decides what to do next</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cx-g!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb44f7f9-f66b-4af5-8681-7582076ac083_1376x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cx-g!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb44f7f9-f66b-4af5-8681-7582076ac083_1376x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cx-g!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb44f7f9-f66b-4af5-8681-7582076ac083_1376x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cx-g!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb44f7f9-f66b-4af5-8681-7582076ac083_1376x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!cx-g!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb44f7f9-f66b-4af5-8681-7582076ac083_1376x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cx-g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb44f7f9-f66b-4af5-8681-7582076ac083_1376x768.jpeg" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb44f7f9-f66b-4af5-8681-7582076ac083_1376x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Orchestrator-worker architecture showing central agent connected to isolated workers with task and result arrows&quot;,&quot;title&quot;:&quot;Orchestrator-worker architecture showing central agent connected to isolated workers with task and result arrows&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Orchestrator-worker architecture showing central agent connected to isolated workers with task and result arrows" title="Orchestrator-worker architecture showing central agent connected to isolated workers with task and result arrows" srcset="https://substackcdn.com/image/fetch/$s_!cx-g!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb44f7f9-f66b-4af5-8681-7582076ac083_1376x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cx-g!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb44f7f9-f66b-4af5-8681-7582076ac083_1376x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cx-g!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb44f7f9-f66b-4af5-8681-7582076ac083_1376x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!cx-g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb44f7f9-f66b-4af5-8681-7582076ac083_1376x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is like an air traffic control tower that talks to every plane, while no plane talks directly to another.</p><p>Anthropic&#8217;s Claude Research system works this way:</p><p>A central agent running Opus 4 breaks a research query into parts and creates 2 to 10 worker agents on Sonnet 4 (sometimes more), all at the same time. The workers search the web, read documents, and gather evidence in parallel. When they finish, the central agent reads their results and writes a single research report.</p><p>This setup beat single-agent Opus 4 by 90.2% on Anthropic&#8217;s internal research eval<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a>.</p><h4><em><strong>Tradeoffs</strong></em></h4><p>Central agent is both coordinator and bottleneck&#8230;</p><p>It talks to workers one at a time. If each call takes 3 seconds and 20 workers are waiting, the ceiling is about 7 tasks per second. So the central agent becomes the slowest part of the system.</p><p>Anthropic&#8217;s Claude Research system had this problem as well: <em>workers duplicated each other&#8217;s work.</em></p><p>Without specific instructions, many workers run overlapping searches on the same topic, wasting both tokens<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a> and time. The takeaway is orchestrator-worker depends entirely on the quality of the lead agent&#8217;s instructions.</p><p>Vague task splitting turns parallelism into duplicated work.</p><h3><strong>2. Pipeline</strong></h3><p>Agents run in a fixed order, one after another.</p><p>Each agent&#8217;s output becomes the next agent&#8217;s input, and entire sequence is set in advance. While orchestrator-worker lets the central agent decide what to do as it goes, pipeline removes that choice.</p><p>An assembly line works the same way: <em>each station does one job, passes the result forward, and never sees the finished product.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8Wf-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a1bc5e-e8e9-460f-a72b-9cf92adfeab2_1376x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8Wf-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a1bc5e-e8e9-460f-a72b-9cf92adfeab2_1376x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8Wf-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a1bc5e-e8e9-460f-a72b-9cf92adfeab2_1376x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8Wf-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a1bc5e-e8e9-460f-a72b-9cf92adfeab2_1376x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8Wf-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a1bc5e-e8e9-460f-a72b-9cf92adfeab2_1376x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8Wf-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a1bc5e-e8e9-460f-a72b-9cf92adfeab2_1376x768.jpeg" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/97a1bc5e-e8e9-460f-a72b-9cf92adfeab2_1376x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Pipeline architecture showing agents in linear sequence with output contracts between stages&quot;,&quot;title&quot;:&quot;Pipeline architecture showing agents in linear sequence with output contracts between stages&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Pipeline architecture showing agents in linear sequence with output contracts between stages" title="Pipeline architecture showing agents in linear sequence with output contracts between stages" srcset="https://substackcdn.com/image/fetch/$s_!8Wf-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a1bc5e-e8e9-460f-a72b-9cf92adfeab2_1376x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8Wf-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a1bc5e-e8e9-460f-a72b-9cf92adfeab2_1376x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8Wf-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a1bc5e-e8e9-460f-a72b-9cf92adfeab2_1376x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8Wf-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a1bc5e-e8e9-460f-a72b-9cf92adfeab2_1376x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Stripe uses this pattern to check whether new businesses on its platform are legitimate.</p><p>Before agents, a human reviewer had to jump between customer databases, legal sources, and support tickets to decide whether a business was safe to approve. Now their engineering team broke that work into a fixed flow of agent stages using a directed acyclic graph (<strong>DAG</strong>). So work moves forward through stages without looping back.</p><p>Order is set at design time, and each stage has a <strong>contract</strong>: <em>defined output format next stage expects.</em></p><p>Stripe calls these contracts &#8220;rails&#8221; because they keep any single agent from spending too much time on irrelevant data. This setup cut their average handling time by 26%, and reviewers rated the agent outputs 96% helpful, with a full record of every decision at every step.</p><h4><em><strong>Tradeoffs</strong></em></h4><p>Latency adds up.</p><p>A 5-stage pipeline where each stage takes 2 seconds means a 10-second wait before any output, and adding a stage to improve quality increases the response time.</p><p>Yet the upside is predictability.</p><p>When every stage has a narrow contract, you can trace any failure back to exactly one step. That&#8217;s why regulated workflows like Stripe&#8217;s continue to use pipelines.</p><p>When the cost of being wrong is a regulator flagging your process, the extra seconds are a small price to pay.</p><h3><strong>3. Hierarchical</strong></h3><p>A top-level manager agent gives work to one or more layers of manager agents below it, which then give work to individual workers.</p><p>Two levels are the minimum; big systems stack more. The result is a tree&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Yce7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602c6fef-d9d1-4e7f-82dd-674e34d78ed9_1376x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Yce7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602c6fef-d9d1-4e7f-82dd-674e34d78ed9_1376x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Yce7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602c6fef-d9d1-4e7f-82dd-674e34d78ed9_1376x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Yce7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602c6fef-d9d1-4e7f-82dd-674e34d78ed9_1376x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Yce7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602c6fef-d9d1-4e7f-82dd-674e34d78ed9_1376x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Yce7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602c6fef-d9d1-4e7f-82dd-674e34d78ed9_1376x768.jpeg" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/602c6fef-d9d1-4e7f-82dd-674e34d78ed9_1376x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Hierarchical architecture showing tree structure with top manager, mid-level managers, and worker agents&quot;,&quot;title&quot;:&quot;Hierarchical architecture showing tree structure with top manager, mid-level managers, and worker agents&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Hierarchical architecture showing tree structure with top manager, mid-level managers, and worker agents" title="Hierarchical architecture showing tree structure with top manager, mid-level managers, and worker agents" srcset="https://substackcdn.com/image/fetch/$s_!Yce7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602c6fef-d9d1-4e7f-82dd-674e34d78ed9_1376x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Yce7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602c6fef-d9d1-4e7f-82dd-674e34d78ed9_1376x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Yce7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602c6fef-d9d1-4e7f-82dd-674e34d78ed9_1376x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Yce7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602c6fef-d9d1-4e7f-82dd-674e34d78ed9_1376x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>No single agent needs the full context.</p><p>The top-level agent holds the high-level goal and a summary from each branch, while each lower level sees only what its narrower role needs. Picture a military chain of command where orders flow down, reports flow up, and no one skips a level.</p><p>IBM watsonx Orchestrate runs on this pattern:</p><p>A top-level supervisor agent acts as a router and planner across 80+ pre-built domain agents for HR, sales, and procurement. Let&#8217;s say a user tries to &#8220;order new laptops for the design team&#8221;. The request reaches a Procure Equipment supervisor, who then hands the work to <em>three specialized child agents</em>:</p><ul><li><p>One requests quotes from approved vendors</p></li><li><p>Another checks responses</p></li><li><p>A third submits a purchase request</p></li></ul><p>The supervisor decides only who gets called and in which order.</p><h4><em><strong>Tradeoffs</strong></em></h4><p>Details might get lost at each level.</p><p>A worker produces a detailed result. The mid-level manager shortens it to one sentence. By the time it reaches the top, the detail that matters might be gone. Hierarchical structures trade detail for coverage: the higher you go, the wider the scope of each agent and the less it knows about any specific piece.</p><div><hr></div><p><em>These three architectures share one thing: a clear chain of command tells you exactly who&#8217;s in charge&#8230;</em></p><p><em>The next three architectures drop the chain entirely. So they're harder to debug, but they survive partial failures better&#8230;</em></p><div><hr></div><p><em><strong>Reminder: this is a teaser of the subscriber-only newsletter, exclusive to my golden members.</strong></em></p><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>High-level architecture of real-world systems.</strong></p></li><li><p>Deep dive into how popular real-world systems work.</p></li><li><p><strong>How real-world systems handle scale, reliability, and performance.</strong></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe&quot;,&quot;text&quot;:&quot;Unlock Full Access&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://newsletter.systemdesign.one/subscribe"><span>Unlock Full Access</span></a></p><div><hr></div><h3><strong>4. Swarm</strong></h3><p>In a swarm, many agents work as equals.</p><p>They coordinate through a <strong>shared blackboard</strong>: a data store (typically Redis cache, database table, or vector store) that every agent can read from and write to. Yet there are NO direct messages between agents&#8230;</p>
      <p>
          <a href="https://newsletter.systemdesign.one/p/multi-agent-system">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[29 LLM Evaluation Concepts Every Engineer Needs to Know]]></title><description><![CDATA[#142: From &#8220;it looked fine in testing&#8221; to a system you can actually trust]]></description><link>https://newsletter.systemdesign.one/p/llm-evals</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/llm-evals</guid><dc:creator><![CDATA[Anshuman Mishra]]></dc:creator><pubDate>Mon, 27 Apr 2026 11:50:18 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/644136e5-3bdd-481f-aaf3-26e01e791323_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/llm-evals/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><div><hr></div><p>You ship an LLM feature.</p><p>It passes your manual tests. A day later, a user posts a screenshot of it hallucinating wildly! You tweak the prompt, run it again, and it works fine. Did you fix it, or did you just get lucky?</p><p>Welcome to the central frustration of LLM engineering: <em>you can NO longer just run a test and call it done.</em></p><p>This isn&#8217;t a debugging problem&#8230;</p><p>It&#8217;s a measurement problem. And measurement has a name: <strong>evaluation</strong>.</p><p>Most articles on LLM evaluation are written for ML researchers. This one is for engineers building real applications. You know how to ship software. You&#8217;re just new to the ways LLMs fail.</p><p>We&#8217;ll cover the vocabulary, methods, and mental models. By the end, you&#8217;ll have a framework for building an eval system from scratch. Not just an understanding of why it matters.</p><p>Let&#8217;s start with why your existing testing instincts don&#8217;t work here&#8230;</p><div><hr></div><h2><a href="https://coderabbit.link/neo-agent">Your team&#8217;s second brain. Now in Slack. (Partner)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://coderabbit.link/neo-agent" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GfjP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 424w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 848w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1272w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GfjP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png" width="1248" height="654" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:654,&quot;width&quot;:1248,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:547609,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://coderabbit.link/neo-agent&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192885623?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GfjP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 424w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 848w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1272w, https://substackcdn.com/image/fetch/$s_!GfjP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a43027a-3160-457a-8e4b-4e144b0be46a_1248x654.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Your engineers talk on Slack. They code in the terminal. Somewhere between those two things, context goes to die.</p><ul><li><p>A bug was debated in #incidents at 2 AM.</p></li><li><p>An architectural call was made in a DM.</p></li></ul><p>Every handoff leaks context, and every leak costs you. That&#8217;s the context tax - and your team pays it every day.</p><p><a href="https://coderabbit.link/neo-agent">CodeRabbit Agent for Slack</a> is built for agentic SDLC workflows. One agent for your entire Software Development Lifecycle, living in the channel where the work already happens. It&#8217;s built on four things:</p><ul><li><p>Context - your org&#8217;s operating picture, pulled from across code, tickets, docs, monitoring and cloud.</p></li><li><p>Knowledge Base - a living memory of your team. Every run leaves a trace, so yesterday&#8217;s decisions don&#8217;t become tomorrow&#8217;s debates.</p></li><li><p>Multi-Player - works in shared threads alongside your team. Steerable, resumable and aligned as work evolves.</p></li><li><p>Governance - scoped access, cost attribution. Every run explainable and attributed.</p></li></ul><p>Your team keeps shipping. <a href="https://coderabbit.link/neo-agent">Agent</a> keeps the context.</p><p>From the team that pioneered AI code reviews. 2M code reviews every week. 6M repos. 15K customers. And now, one agent for your entire SDLC, right in Slack.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://coderabbit.link/neo-agent&quot;,&quot;text&quot;:&quot;Try CodeRabbit's Agent Today&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://coderabbit.link/neo-agent"><span>Try CodeRabbit's Agent Today</span></a></p><p>(Thanks to <a href="https://coderabbit.link/neo-agent">CodeRabbit</a> for partnering on this post.)</p><div><hr></div><p>I want to introduce <strong><a href="https://www.linkedin.com/in/athletickoder/">Anshuman</a></strong> as a guest author.</p><blockquote></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://beaiproof.substack.com/welcome" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WKHG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a32ac35-dc9b-4763-8078-226ec3f7186b_1294x776.png 424w, https://substackcdn.com/image/fetch/$s_!WKHG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a32ac35-dc9b-4763-8078-226ec3f7186b_1294x776.png 848w, https://substackcdn.com/image/fetch/$s_!WKHG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a32ac35-dc9b-4763-8078-226ec3f7186b_1294x776.png 1272w, https://substackcdn.com/image/fetch/$s_!WKHG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a32ac35-dc9b-4763-8078-226ec3f7186b_1294x776.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WKHG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a32ac35-dc9b-4763-8078-226ec3f7186b_1294x776.png" width="1294" height="776" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2a32ac35-dc9b-4763-8078-226ec3f7186b_1294x776.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:776,&quot;width&quot;:1294,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:870375,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://beaiproof.substack.com/welcome&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://beaiproof.substack.com/i/192525618?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a32ac35-dc9b-4763-8078-226ec3f7186b_1294x776.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!WKHG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a32ac35-dc9b-4763-8078-226ec3f7186b_1294x776.png 424w, https://substackcdn.com/image/fetch/$s_!WKHG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a32ac35-dc9b-4763-8078-226ec3f7186b_1294x776.png 848w, https://substackcdn.com/image/fetch/$s_!WKHG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a32ac35-dc9b-4763-8078-226ec3f7186b_1294x776.png 1272w, https://substackcdn.com/image/fetch/$s_!WKHG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a32ac35-dc9b-4763-8078-226ec3f7186b_1294x776.png 1456w" sizes="100vw" loading="lazy" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>He leads evals efforts at Zomato, where he built <strong>Gavel</strong> -- an internal LLM eval platform that started as a handful of scripts, got pitched to a VP, and now serves AI and ops teams across the company. Now he&#8217;s on the ground floor of making a large organization AI-native, one eval system at a time.</p><p>I highly recommend you checkout his newsletter, <strong><a href="https://beaiproof.substack.com/welcome">AI Proof</a></strong> -- it&#8217;ll help you stay relevant in the AI era.</p><div><hr></div><p>Before we get into solutions, you need to understand something&#8230;</p><p>LLM evaluation feels nothing like regular software testing. It&#8217;s not that it&#8217;s harder. It&#8217;s that the rules changed.</p><p>Let&#8217;s dive in!</p><h3><strong>1. Non-determinism Problem</strong></h3><p>Write a function.</p><p>Call it with the same input twice, and you&#8217;ll get the same output twice.</p><p>But LLMs don&#8217;t work that way.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZRjZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc8ba0d-9314-456b-876d-9a7ff7a680a2_1024x559.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZRjZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc8ba0d-9314-456b-876d-9a7ff7a680a2_1024x559.png 424w, https://substackcdn.com/image/fetch/$s_!ZRjZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc8ba0d-9314-456b-876d-9a7ff7a680a2_1024x559.png 848w, https://substackcdn.com/image/fetch/$s_!ZRjZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc8ba0d-9314-456b-876d-9a7ff7a680a2_1024x559.png 1272w, https://substackcdn.com/image/fetch/$s_!ZRjZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc8ba0d-9314-456b-876d-9a7ff7a680a2_1024x559.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZRjZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc8ba0d-9314-456b-876d-9a7ff7a680a2_1024x559.png" width="1024" height="559" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8cc8ba0d-9314-456b-876d-9a7ff7a680a2_1024x559.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:559,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!ZRjZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc8ba0d-9314-456b-876d-9a7ff7a680a2_1024x559.png 424w, https://substackcdn.com/image/fetch/$s_!ZRjZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc8ba0d-9314-456b-876d-9a7ff7a680a2_1024x559.png 848w, https://substackcdn.com/image/fetch/$s_!ZRjZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc8ba0d-9314-456b-876d-9a7ff7a680a2_1024x559.png 1272w, https://substackcdn.com/image/fetch/$s_!ZRjZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc8ba0d-9314-456b-876d-9a7ff7a680a2_1024x559.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The same prompt can produce a different response on every run.</p><p>Sometimes slightly different. Sometimes completely different. This isn&#8217;t a bug. It&#8217;s called <strong>temperature</strong>.</p><p>It controls how random the outputs are. (More on this later.)</p><p>This breaks something deeper. Your entire mental model for testing assumes determinism. You run a test. It passes or fails. You know what you know.</p><p>With LLMs, a passing test is a data point, not a verdict. You&#8217;re not testing a function. Instead, you&#8217;re sampling from a probability distribution. One sample tells you almost nothing.</p><p>This is the first reason your existing instincts mislead you.</p><h3><strong>2. Fuzzy Correctness Problem</strong></h3><p>When a regex matches, it matches. Binary. Objective. Done.</p><p>But what&#8217;s the correct response to <em>&#8220;summarize this support ticket empathetically?</em>&#8221;</p><ul><li><p><em>Is the correct answer the one that&#8217;s most concise?</em></p></li><li><p><em>The one that uses the warmest language?</em></p></li><li><p><em>The one that captures all the key facts?</em></p></li><li><p><em>The one a human reviewer would score highest?</em></p></li></ul><p>LLM quality is <strong>multi-dimensional and subjective</strong>.</p><p>There usually isn&#8217;t one correct answer. There&#8217;s a range of acceptable answers and a range of bad ones. The line between them is a judgment call.</p><p>You can&#8217;t measure what you haven&#8217;t defined.</p><p>What does &#8216;good&#8217; mean for your use case? You need a clear answer to that before you can evaluate anything. Most people skip this step. It comes back to haunt them.</p><h3><strong>3. Silent Regression Problem</strong></h3><p>You updated your prompt.</p><p>Ran it a few times. The outputs seemed better. You shipped it.</p><p><em>But did quality actually improve, or did you get lucky with the samples you checked?</em></p><p>This is the <strong>silent regression problem</strong>. Without a systematic evaluation process, every prompt change is a blind bet. You might be making things better. You might be fixing one failure mode while introducing another. And you have no way of knowing.</p><p>In traditional software, <strong>CI</strong> (Continuous Integration) catches regressions before they reach users. In LLM engineering, there&#8217;s NO such equivalent. So people rely on their gut feelings and manual spot checks. You find out about regressions when a user complains!</p><p><em>These three problems make LLM evaluation its own discipline&#8230;</em></p><p><em>Now let&#8217;s build the vocabulary to actually talk about it&#8230;</em></p><div><hr></div><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/llm-evals?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption"><em>Share this post &amp; get rewards for the referrals.</em></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/llm-evals?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/llm-evals?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h2><strong>Primitives of Eval</strong></h2><p>These are <em>six terms</em> you&#8217;ll hear constantly.</p><p>Learn these, and the rest falls into place:</p><h3><strong>4. Criteria</strong></h3><p><strong>Criteria</strong> are the dimensions of quality that matter for your specific use case.</p><p>A customer support bot might care about:</p><ul><li><p><em>Does it address the user&#8217;s actual issue?</em></p></li><li><p><em>Is the tone empathetic?</em></p></li><li><p><em>Does it avoid triggering escalation unnecessarily?</em></p></li></ul><p>A code generation tool cares about something completely different:</p><ul><li><p><em>Is the output syntactically valid?</em></p></li><li><p><em>Does it follow the project&#8217;s conventions?</em></p></li><li><p><em>Does it actually solve the stated problem?</em></p></li></ul><p>Same technology, but entirely different criteria.</p><p>This is a product decision, NOT a technical one. You can&#8217;t outsource it to your model or your eval framework. Someone on your team needs to sit down and answer: <em>What does a good output actually look like here?</em></p><p>Get this wrong, and everything downstream is measuring the wrong thing&#8230;</p><h3><strong>5. Quality Dimensions</strong></h3><p>Quality dimensions are the standard industry vocabulary for LLM output quality.</p><p>Here are five of them that come up constantly:</p><ul><li><p><strong>Relevance</strong>: <em>Did the output address what was actually asked? </em>A response can be accurate, well-written, and completely beside the point.</p></li><li><p><strong>Coherence</strong>: <em>Does it hold together logically?</em> No contradictions. No mid-sentence topic shifts. Flows like something a thoughtful person wrote.</p></li><li><p><strong>Factual Accuracy</strong>: <em>Is what it says actually true?</em> Distinct from relevance. A response can be relevant and wrong.</p></li><li><p><strong>Helpfulness</strong>: <em>Does it give the user what they need to move forward? </em>The difference between a technically correct answer and a useful one.</p></li><li><p><strong>Safety</strong>: <em>Does it avoid harmful, biased, or inappropriate content?</em> This dimension matters more in some domains than others -- but it always matters.</p></li></ul><p>Knowing these helps you write <em>rubrics</em> that don&#8217;t miss important failure modes.</p><h3><strong>6. Rubric</strong></h3><p>Once you have the criteria, you need to operationalize them.</p><p>That&#8217;s what a <strong>rubric</strong> does.</p><p>Take a vague criterion like &#8220;<em>helpfulness</em>.&#8221; A rubric breaks it into specific, scorable questions. <em>Does it directly answer the question? Does it avoid unnecessary hedging? Is it under 200 words? Can a non-technical user understand it?</em></p><p>Think of it like a code review checklist.</p><p>Instead of <em>&#8220;Is this good code?&#8221;,</em> a checklist asks: A<em>re there tests? Is the function under 30 lines? Are variable names descriptive?</em> Rubrics do the same thing for LLM outputs.</p><p>The rubric is what makes evaluation reproducible.</p><p>Two different reviewers, human or AI, should arrive at similar scores. Same output, same rubric, same conclusion.</p><p>Without a rubric, every evaluation is just someone&#8217;s opinion.</p><h3><strong>7. Test Cases</strong></h3><p>A <strong>test case</strong> is an input/output pair that forms one unit of your evaluation.</p><p>Input is a prompt: <em>ideally, one representative of real user traffic.</em></p><p>Output can be one of two things. <em>A reference answer showing what good output looks like. Or the live model output you&#8217;re about to score.</em></p><p>Think of test cases like unit tests&#8230;</p><p>Except that a failing test case doesn&#8217;t mean the output was wrong. It means it scored below your defined threshold on your rubric.</p><p>That distinction matters.</p><p>You need a lot of them. A handful of test cases gives you anecdotes. A few hundred gives you a signal.</p><h3><strong>8. Golden Set</strong></h3><p>Everything in your eval gets measured against one thing: your <strong>golden set</strong>.</p><p>A curated collection of high-quality test cases.</p><p>Building a good golden set is harder than it sounds:</p><p>The instinct is to write examples yourself, covering the use cases you anticipate. That&#8217;s a start. But users phrase things differently than you expect. They hit edge cases you didn&#8217;t imagine.</p><p>And they might misuse features in creative ways.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wias!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9cedf74-ec5d-4e64-9114-2550a77d66dd_1380x752.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wias!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9cedf74-ec5d-4e64-9114-2550a77d66dd_1380x752.png 424w, https://substackcdn.com/image/fetch/$s_!wias!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9cedf74-ec5d-4e64-9114-2550a77d66dd_1380x752.png 848w, https://substackcdn.com/image/fetch/$s_!wias!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9cedf74-ec5d-4e64-9114-2550a77d66dd_1380x752.png 1272w, https://substackcdn.com/image/fetch/$s_!wias!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9cedf74-ec5d-4e64-9114-2550a77d66dd_1380x752.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wias!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9cedf74-ec5d-4e64-9114-2550a77d66dd_1380x752.png" width="1380" height="752" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c9cedf74-ec5d-4e64-9114-2550a77d66dd_1380x752.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:752,&quot;width&quot;:1380,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:770107,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192885623?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9cedf74-ec5d-4e64-9114-2550a77d66dd_1380x752.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wias!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9cedf74-ec5d-4e64-9114-2550a77d66dd_1380x752.png 424w, https://substackcdn.com/image/fetch/$s_!wias!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9cedf74-ec5d-4e64-9114-2550a77d66dd_1380x752.png 848w, https://substackcdn.com/image/fetch/$s_!wias!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9cedf74-ec5d-4e64-9114-2550a77d66dd_1380x752.png 1272w, https://substackcdn.com/image/fetch/$s_!wias!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9cedf74-ec5d-4e64-9114-2550a77d66dd_1380x752.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A golden set built from your imagination reflects your imagination and not your users&#8217;. So seed it with real production queries instead. Anonymized, cleaned, representative.</p><p>Your golden set is your ground truth.</p><p>Treat it like a critical system artifact. Version and update it when you discover new failure modes.</p><h3><strong>9. Pass/Fail Threshold</strong></h3><p>Eval scores are rarely binary.</p><p>A rubric usually produces a score of 1 to 5, 0 to 10, or a percentage. The <strong>pass/fail threshold</strong> is what converts that score into a decision.</p><p>If your rubric scores range from 1 to 5 and your threshold is 3, any score below 3 is a failure. Simple in theory, but HARD in practice.</p><p>Setting the right threshold is a <em>product call</em>, not a technical one.</p><p>It depends on a few things: <em>how much imperfection your users can tolerate. How severe are failures in your domain? The cost of false positives.</em></p><p>Set your threshold too low, and you&#8217;re shipping garbage. And if you set it too high, you&#8217;re shipping nothing.</p><h3><strong>10. Eval Coverage</strong></h3><p><strong>Eval coverage</strong> is how well your golden set reflects real user inputs.</p><p>Most teams have low coverage and don&#8217;t know it. They built their golden set from examples they had written. Happy path, a few obvious edge cases. Meanwhile, production traffic is different: weird inputs, unusual phrasings, use cases nobody anticipated.</p><p>Low coverage means your eval suite is optimistic&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5b7I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23d76175-b112-41eb-88fc-f23e06878733_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5b7I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23d76175-b112-41eb-88fc-f23e06878733_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!5b7I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23d76175-b112-41eb-88fc-f23e06878733_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!5b7I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23d76175-b112-41eb-88fc-f23e06878733_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!5b7I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23d76175-b112-41eb-88fc-f23e06878733_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5b7I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23d76175-b112-41eb-88fc-f23e06878733_2816x1536.png" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/23d76175-b112-41eb-88fc-f23e06878733_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5268690,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://conductorbyam.substack.com/i/192525618?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23d76175-b112-41eb-88fc-f23e06878733_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!5b7I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23d76175-b112-41eb-88fc-f23e06878733_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!5b7I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23d76175-b112-41eb-88fc-f23e06878733_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!5b7I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23d76175-b112-41eb-88fc-f23e06878733_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!5b7I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23d76175-b112-41eb-88fc-f23e06878733_2816x1536.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>i.e., you&#8217;ll pass your tests but fail your users.</p><p>The fix isn&#8217;t writing more examples yourself. Instead, sample from production regularly and review failures. Then add the inputs that exposed new weaknesses to your golden set.</p><p>Eval coverage is something you build over time.</p><h3><strong>11. Temperature, Top-p, and Reproducibility</strong></h3><p>Temperature is the dial that controls how <em>random</em> your model&#8217;s outputs are.</p><p>Low temperature (close to 0) makes the model nearly deterministic: same input, same output, every time. High temperature makes it more creative. It samples from a wider range of probable tokens, producing more varied responses.</p><p><strong>Top-p</strong> (nucleus sampling) is a related setting.</p><p>Instead of a randomness dial, it sets a probability cutoff. Only the most likely tokens make the cut. <code>Top-p = 0.9</code> means the model considers only the top 90% most likely next tokens.</p><p>Both settings directly affect your eval results&#8230;</p><p>Run evals at <code>temperature = 1.0</code> and the same prompt might pass today and fail tomorrow. NOT because your model changed, but because randomness swung against you.</p><p>Here&#8217;s the standard practice: set temperature to 0 during eval runs. And lock in determinism. If you need creative variance in production, test at your production temperature.</p><p>Just know you&#8217;re accepting noisier results&#8230;</p><h3><strong>12. Statistical Rigor</strong></h3><p>Even at <code>temperature = 0</code>, a single eval run isn&#8217;t enough.</p><p>Your golden set is a sample of your input space. That sample has its own variance. One unlucky set of examples can make a good prompt look bad. And one lucky set can make a bad prompt look good.</p><p>So run many evaluations across different samples.</p><p>Then report the mean and the variance, not just the score. When comparing two prompt versions, check whether the difference is real. Or if it&#8217;s just noise.</p><p>In practice: if you change a prompt and your score goes from 4.1 to 4.3, that might be a real improvement.</p><p>It might be a random fluctuation. Without variance<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> data across many runs, you can&#8217;t tell the difference. Most teams run once, report the number, and ship. That&#8217;s how confident regressions get deployed&#8230;</p><div><hr></div><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/llm-evals?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption"><em>Share this post &amp; get rewards for the referrals.</em></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/llm-evals?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/llm-evals?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h2><strong>How Do You Score Outputs?</strong></h2><p>You have criteria, a rubric, and test cases<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>.</p><p>Now the question is: <em>who does the scoring? </em>There are three options.</p><p>Each has different tradeoffs in cost, speed, and accuracy.</p><h3><strong>13. Human Evaluation</strong></h3><p>This is the gold standard.</p><p>A human reviews the output using your rubric and gives a score. It&#8217;s slow, expensive, and can be inconsistent. But it&#8217;s the closest thing to ground truth.</p><p>You can&#8217;t use humans for every eval run; it doesn&#8217;t scale.</p><p>But you shouldn&#8217;t remove human evaluation completely either. It&#8217;s what keeps your system grounded. Everything else: metrics, LLM judges, tests, is just an approximation of human judgment.</p><p>So use human evaluation strategically:</p><ul><li><p>To build and validate your golden dataset.</p></li><li><p>To periodically check the accuracy of your automated evals.</p></li><li><p>To debug when something breaks and you don&#8217;t know why.</p></li></ul><p>This way, you balance cost with reliability.</p><h3><strong>14. Heuristic/Code Based Evaluation</strong></h3><p>This is the fastest and cheapest type of evaluation.</p><p>It uses simple code checks to validate the output's structural properties. For example:</p><ul><li><p><em>Is the response valid JSON?</em></p></li><li><p><em>Is it within the character limit?</em></p></li><li><p><em>Does it include all required fields?</em></p></li><li><p><em>Does it avoid banned phrases?</em></p></li><li><p><em>Does it match a specific format (like a regex)?</em></p></li></ul><p>Heuristic eval is good at catching structural problems.</p><p>But they don&#8217;t measure quality. They can&#8217;t tell if a response is helpful, accurate, or well-written.</p><p>Think of heuristics as your first line of defense.</p><p>They catch basic issues before you run more expensive evaluations. They&#8217;re NOT enough on their own, but they&#8217;re an essential part of a complete eval system&#8230;</p><h3><strong>15. Semantic Similarity Evaluation</strong></h3><p>Sometimes you have a reference answer, a known-good response representing ideal output.</p><p>Semantic similarity evaluation measures how close your model&#8217;s output is to the reference. i.e., in meaning, and not exact wording.</p><p>This is where embeddings come in&#8230;</p><p>Each piece of text gets converted into a vector: <em>a list of numbers that represents its meaning.</em> Texts with similar meanings have vectors that are close to each other. Cosine similarity measures the angle between two vectors. A score of 1.0 means identical meaning. A score near 0 means unrelated.</p><p>This matters because string matching is too strict.</p><p>Take two sentences: <em>&#8220;API returns a 404 error&#8221;</em> and <em>&#8220;Endpoint responds with a not found status.&#8221;</em> They mean the same thing, but use different words. An exact match would call the second one wrong. Semantic similarity would call them equivalent.</p><p>But here&#8217;s the limitation:<em> it only measures closeness to your reference.</em></p><p>It can&#8217;t catch a response that&#8217;s fluent and factually wrong. Not if the wrong answer happens to be semantically similar to the right one. So use it as a fast, scalable layer. And don&#8217;t rely on it alone!</p><h3><strong>16. Task Specific Metrics (BLEU, ROUGE, Execution-based)</strong></h3><p>For certain tasks, there are established metrics purpose-built for automated evaluation:</p><ul><li><p><strong>BLEU (Bilingual Evaluation Understudy)</strong>: Originally for machine translation. Measures n-gram overlap between generated text and a reference. Good for tasks where exact phrasing matters.</p></li><li><p><strong>ROUGE (Recall-Oriented Understudy for Gisting Evaluation)</strong>: Designed for summarization. Measures recall: <em>how much of the reference answer appears in the output?</em> Useful when you care about coverage: <em>did the model hit all the key points?</em></p></li><li><p><strong>Execution-based evaluation</strong>: For code generation, one metric matters. <em>Does it run? Does it produce the correct output?</em> Execution-based eval runs the generated code against test cases and checks the results. A function that returns the wrong answer fails. Another function that passes all tests succeeds.</p></li></ul><p>But all three share the same trade-off:</p><p>They measure surface-level similarity to a reference, not actual quality. So use these where they fit.</p><h3><strong>17. LLM as Judge</strong></h3><p>This is what makes evaluation scalable&#8230;</p><p><strong>LLM-as-judge</strong> uses a more capable model to evaluate your application&#8217;s outputs. You give it three things:</p><ul><li><p>Original input,</p></li><li><p>Output you&#8217;re evaluating,</p></li><li><p>Your rubric. </p></li></ul><p>The judge then returns a score and an explanation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IF5b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe688c960-0428-484f-84b7-56c011f17898_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IF5b!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe688c960-0428-484f-84b7-56c011f17898_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!IF5b!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe688c960-0428-484f-84b7-56c011f17898_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!IF5b!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe688c960-0428-484f-84b7-56c011f17898_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!IF5b!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe688c960-0428-484f-84b7-56c011f17898_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IF5b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe688c960-0428-484f-84b7-56c011f17898_2816x1536.png" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e688c960-0428-484f-84b7-56c011f17898_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5026763,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://conductorbyam.substack.com/i/192525618?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe688c960-0428-484f-84b7-56c011f17898_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!IF5b!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe688c960-0428-484f-84b7-56c011f17898_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!IF5b!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe688c960-0428-484f-84b7-56c011f17898_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!IF5b!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe688c960-0428-484f-84b7-56c011f17898_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!IF5b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe688c960-0428-484f-84b7-56c011f17898_2816x1536.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Think of it like automated load testing.</p><p>Nobody clicks through 10,000 user flows to check for regressions. You can instead automate it. LLM-as-judge does the same thing for quality. It lets you run your rubric across thousands of outputs without thousands of human hours.</p><p>GPT-5 and Claude Opus are commonly used as judges. They&#8217;re capable enough to apply nuanced rubrics reliably. You&#8217;re calling them via API, NO custom training required.</p><p>But here&#8217;s the catch: <em>judges can be wrong.</em></p><p>They have their own biases and blind spots. An LLM judge is an approximation of human judgment, not a replacement for it.</p><p>More on how to handle this in a moment&#8230;</p><h3><strong>18. Pointwise vs Pairwise Evaluation</strong></h3><p>These are 2 flavors of LLM-as-judge evaluation:</p><ul><li><p><strong>Pointwise</strong></p><ul><li><p><em>You ask, &#8220;Score this output on a scale of 1&#8211;5.&#8221;</em></p></li></ul><ul><li><p>It&#8217;s simple and fast.</p></li><li><p>And each output needs one evaluation call.</p></li></ul></li><li><p><strong>Pairwise</strong></p><ul><li><p>You ask: <em>&#8220;Here are two outputs. Which one is better?&#8221;</em></p></li><li><p>This is usually more reliable because comparing is easier than scoring.</p></li><li><p>It&#8217;s especially useful when you want to know if a new prompt is actually better.</p></li></ul></li></ul><p>But the downside of <em>pairwise</em> is the cost&#8230;</p><p>You&#8217;re evaluating two outputs instead of one, so it can be twice as expensive. At a small scale, this doesn&#8217;t matter, but at a production scale, it adds up quickly.</p><p>Because of this, the industry standard is to use a tiered approach:</p><ul><li><p>Online evaluation (production monitoring)</p><ul><li><p>Use fast, cheap methods like heuristics or smaller, fine-tuned judge models.</p></li></ul><ul><li><p>These can run continuously without high cost or latency.</p></li></ul></li><li><p>Offline evaluation (pre-ship testing)</p><ul><li><p>Use the best LLM judge you have.</p></li><li><p>Run pointwise and pairwise comparisons.</p></li><li><p>This is acceptable because you&#8217;re only running it before deployments.</p></li></ul></li></ul><p>Think of the powerful judge model as a gate before deployment, and NOT something you run on every request.</p><h3><strong>19. Judge Calibration</strong></h3><p>Before you rely on an LLM judge, you need to know how well it matches human judgment&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Wevo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef275e7b-9e9d-4e8d-a9ad-b63e6343cc6e_2816x1536.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Wevo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef275e7b-9e9d-4e8d-a9ad-b63e6343cc6e_2816x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Wevo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef275e7b-9e9d-4e8d-a9ad-b63e6343cc6e_2816x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Wevo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef275e7b-9e9d-4e8d-a9ad-b63e6343cc6e_2816x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Wevo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef275e7b-9e9d-4e8d-a9ad-b63e6343cc6e_2816x1536.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Wevo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef275e7b-9e9d-4e8d-a9ad-b63e6343cc6e_2816x1536.jpeg" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ef275e7b-9e9d-4e8d-a9ad-b63e6343cc6e_2816x1536.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1129047,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192885623?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef275e7b-9e9d-4e8d-a9ad-b63e6343cc6e_2816x1536.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Wevo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef275e7b-9e9d-4e8d-a9ad-b63e6343cc6e_2816x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Wevo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef275e7b-9e9d-4e8d-a9ad-b63e6343cc6e_2816x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Wevo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef275e7b-9e9d-4e8d-a9ad-b63e6343cc6e_2816x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Wevo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef275e7b-9e9d-4e8d-a9ad-b63e6343cc6e_2816x1536.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Judge calibration measures this:</p><p>You take a sample of outputs, have both humans and the judge score them, and then check how often they agree.</p><ul><li><p>If agreement is high, your judge is a good proxy for human evaluation.</p></li><li><p>If agreement is low, the judge may be measuring the wrong thing.</p></li></ul><p>A poorly calibrated judge is risky.</p><p>It can give you confidence in results that aren&#8217;t actually correct. So calibrate your judge before using it. Then recalibrate periodically, especially after changing models or prompts.</p><div><hr></div><p>Get 20% savings by getting a group subscription right now:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?group=true&amp;coupon=d6837d0d&quot;,&quot;text&quot;:&quot;Get 20% off a group subscription&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe?group=true&amp;coupon=d6837d0d"><span>Get 20% off a group subscription</span></a></p><div><hr></div><h2><strong>RAG System Evaluation</strong></h2><p>Most LLM applications today aren&#8217;t just &#8220;prompt in, response out.&#8221;</p><p>They use Retrieval-Augmented Generation (<strong>RAG</strong>): <em>first fetch relevant documents, inject them into the context, and then generate a grounded response.</em></p><p>This helps reduce hallucinations because the model can rely on real data rather than just its training data. But it doesn&#8217;t solve the problem completely&#8230;</p><p>It also introduces new failure modes that standard LLM evaluation doesn&#8217;t catch&#8230;</p><h3><strong>20. RAG Triad</strong></h3><p>The standard framework for evaluating RAG systems has three dimensions:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eXKJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487ff2a2-c404-4702-bcc5-e57fcd255586_2816x1536.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eXKJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487ff2a2-c404-4702-bcc5-e57fcd255586_2816x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!eXKJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487ff2a2-c404-4702-bcc5-e57fcd255586_2816x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!eXKJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487ff2a2-c404-4702-bcc5-e57fcd255586_2816x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!eXKJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487ff2a2-c404-4702-bcc5-e57fcd255586_2816x1536.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eXKJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487ff2a2-c404-4702-bcc5-e57fcd255586_2816x1536.jpeg" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/487ff2a2-c404-4702-bcc5-e57fcd255586_2816x1536.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:928306,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192885623?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487ff2a2-c404-4702-bcc5-e57fcd255586_2816x1536.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eXKJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487ff2a2-c404-4702-bcc5-e57fcd255586_2816x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!eXKJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487ff2a2-c404-4702-bcc5-e57fcd255586_2816x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!eXKJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487ff2a2-c404-4702-bcc5-e57fcd255586_2816x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!eXKJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487ff2a2-c404-4702-bcc5-e57fcd255586_2816x1536.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>1 Faithfulness</strong></p><p><em>Did the answer actually come from the retrieved context?</em></p><p>This often surprises people. You can retrieve the perfect documents, and the model can still ignore them. Faithfulness checks whether the response is grounded in the provided sources.</p><p>A failure here looks like a confident answer that isn&#8217;t supported by the retrieved documents. So it&#8217;s just a hallucination with extra steps.</p><p><strong>2 Answer Relevance</strong></p><p><em>Did the response actually answer the user&#8217;s question?</em></p><p>A response can be faithful to the context and still miss the point. The model may use the right documents but answer the wrong question.</p><p><em>Answer relevance</em> measures how well the response matches the user&#8217;s intent. While <em>Faithfulness</em> checks if the model stayed within the context.</p><p><strong>3 Context Precision</strong></p><p><em>Did the retrieval step fetch the right documents?</em></p><p>Even a perfect generator will fail if the input context is poor. If retrieval brings in weak or loosely related documents, the model either guesses or tells the user it doesn&#8217;t know.</p><p>Context precision evaluates the retrieval stage. It checks how many of the retrieved documents were actually relevant.</p><p>Think of a RAG system as three stages: <em>retrieval, augmentation, generation.</em></p><p>Each stage can fail on its own&#8230; The RAG triad is your observability layer to isolate where the problem is, so you know what to fix.</p><h3><strong>21. RAG-Specific Failure Patterns</strong></h3><p>Understanding the RAG triad is one thing. Spotting failures in practice is another&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8loO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8689a3ea-a350-49aa-9458-3dfaea3b0828_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8loO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8689a3ea-a350-49aa-9458-3dfaea3b0828_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!8loO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8689a3ea-a350-49aa-9458-3dfaea3b0828_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!8loO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8689a3ea-a350-49aa-9458-3dfaea3b0828_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!8loO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8689a3ea-a350-49aa-9458-3dfaea3b0828_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8loO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8689a3ea-a350-49aa-9458-3dfaea3b0828_2816x1536.png" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8689a3ea-a350-49aa-9458-3dfaea3b0828_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4569940,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://conductorbyam.substack.com/i/192525618?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8689a3ea-a350-49aa-9458-3dfaea3b0828_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!8loO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8689a3ea-a350-49aa-9458-3dfaea3b0828_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!8loO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8689a3ea-a350-49aa-9458-3dfaea3b0828_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!8loO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8689a3ea-a350-49aa-9458-3dfaea3b0828_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!8loO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8689a3ea-a350-49aa-9458-3dfaea3b0828_2816x1536.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Here are the three most common failure patterns:</p><p><strong>1 Retrieval returns irrelevant chunks </strong>(Context precision failure)</p><p>The system retrieves the wrong or loosely related documents. And the model then either:</p><ul><li><p>Hallucinates to fill the gaps,</p></li><li><p>Or says it doesn&#8217;t have enough information.</p></li></ul><p>The fix is in your retrieval layer: <em>embedding quality, chunking strategy, re-ranking.</em></p><p><strong>2 Retrieval is correct, but the model ignores it </strong>(Faithfulness failure)</p><p>The right documents get retrieved, but the model doesn&#8217;t use them.</p><p>This is usually a prompting issue. The model isn&#8217;t being forced to rely on the provided context. The fix is to strengthen the prompt, so the model stays grounded to the sources.</p><p><strong>3 The answer is grounded but doesn&#8217;t help the user </strong>(Answer relevance failure)</p><p>The answer is technically correct and grounded, but doesn&#8217;t solve the user&#8217;s problem. This usually means your knowledge base is missing the right information. The fix is to improve or expand your data, NOT your prompt.</p><div><hr></div><p>Get 20% savings by getting a group subscription right now:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?group=true&amp;coupon=d6837d0d&quot;,&quot;text&quot;:&quot;Get 20% off a group subscription&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe?group=true&amp;coupon=d6837d0d"><span>Get 20% off a group subscription</span></a></p><div><hr></div><h2><strong>Offline vs Online</strong></h2><p>Scoring a single output is one problem.</p><p>Building a system that reliably catches failures (before and after deployment) is another. There are two environments where evaluation happens:</p><h3><strong>22. Offline Evaluation</strong></h3><p>Offline evaluation happens before you ship.</p><p>You make a change: <em>a new prompt, model version, or retrieval strategy</em>. Before it reaches users, you test it against your golden dataset. Then you score the outputs and compare them to your current system.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Xy0u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09f697c2-73fc-4568-b927-b501b4c43fb4_1024x559.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Xy0u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09f697c2-73fc-4568-b927-b501b4c43fb4_1024x559.png 424w, https://substackcdn.com/image/fetch/$s_!Xy0u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09f697c2-73fc-4568-b927-b501b4c43fb4_1024x559.png 848w, https://substackcdn.com/image/fetch/$s_!Xy0u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09f697c2-73fc-4568-b927-b501b4c43fb4_1024x559.png 1272w, https://substackcdn.com/image/fetch/$s_!Xy0u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09f697c2-73fc-4568-b927-b501b4c43fb4_1024x559.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Xy0u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09f697c2-73fc-4568-b927-b501b4c43fb4_1024x559.png" width="1024" height="559" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/09f697c2-73fc-4568-b927-b501b4c43fb4_1024x559.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:559,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:180427,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://conductorbyam.substack.com/i/192525618?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09f697c2-73fc-4568-b927-b501b4c43fb4_1024x559.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!Xy0u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09f697c2-73fc-4568-b927-b501b4c43fb4_1024x559.png 424w, https://substackcdn.com/image/fetch/$s_!Xy0u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09f697c2-73fc-4568-b927-b501b4c43fb4_1024x559.png 848w, https://substackcdn.com/image/fetch/$s_!Xy0u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09f697c2-73fc-4568-b927-b501b4c43fb4_1024x559.png 1272w, https://substackcdn.com/image/fetch/$s_!Xy0u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09f697c2-73fc-4568-b927-b501b4c43fb4_1024x559.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Think of this like a continuous integration (<strong>CI</strong>) pipeline for LLMs.</p><p>A change shouldn&#8217;t go live unless it passes your evals. If quality drops, you catch it before deployment.</p><p>Plus, this is where you use your best (most accurate) judge model. For example, GPT-5, Claude Opus. Cost matters less here because you&#8217;re only evaluating a finite set of examples, and NOT each user request.</p><h3><strong>23. Online Evaluation</strong></h3><p>Online evaluation occurs in production.</p><p>You&#8217;re sampling live outputs continuously and scoring them. The goal is to catch issues that offline evals miss: <em>edge cases, unusual inputs, and failures that only appear at scale.</em></p><p>Think of online eval as your monitoring system&#8230;It helps you catch problems before users report them.</p><p>Yet the main constraint is cost. You can&#8217;t run your most expensive models on each request. So online eval relies on heuristics and smaller judge models; it&#8217;s cheap enough to run continuously.</p><p>This means you trade some accuracy for scale&#8230;</p><p>Offline and online eval should work together:</p><ul><li><p>Offline eval catches known issues before deployment</p></li><li><p>Online eval catches unexpected issues in production</p></li></ul><p>So you need both for a <em>reliable</em> system.</p><h3><strong>24. Prompt Versioning and Regression Testing</strong></h3><p>Your prompt is code; treat it that way.</p><p>Track every prompt change: <em>use version control, keep history, and compare differences. </em>When something breaks, you should know exactly what changed and when.</p><p>Without versioning, you&#8217;d NOT know what changed.</p><p><strong>Regression testing </strong>means running your eval suite against each prompt version. If your old prompt scored 4.2 and the new one scores 3.8, you&#8217;ve introduced a regression and caught it before it reached users.</p><p>This sounds simple, but most people don&#8217;t do it. They lack a solid eval setup: no golden dataset, no clear rubric, no infrastructure.</p><p>This is what separates people who iterate confidently from people who ship &amp; pray.</p><h3><strong>25. Benchmark-Based Evaluation</strong></h3><p>Benchmarks are standard tests used to compare different LLM models:</p><p><strong>1 MMLU (Massive Multitask Language Understanding)</strong></p><p>Tests knowledge across many subjects like math, science, law, and medicine. Think of it as a general knowledge exam for LLMs.</p><p><strong>2 HellaSwag</strong></p><p>Tests common<strong> </strong>sense reasoning. The model is given the start of a scenario and must predict what happens next. (Many earlier models struggled with this.)</p><p><strong>3 HumanEval</strong></p><p>Test code generation. The model gets a function signature and must write the correct implementation. It&#8217;s measured using pass@k: <em>how often the model gets the right answer within k attempts.</em></p><p>Here are two useful leaderboards:</p><ul><li><p>Hugging Face Open LLM Leaderboard for open-weight models</p></li><li><p>Chatbot Arena (LMSYS), where models get ranked based on human preferences</p></li></ul><p>Remember, these benchmarks are useful for comparing models, but they don&#8217;t always reflect real-world performance for your specific use case&#8230;</p><h3><strong>26. Benchmark vs Real-World Tradeoff</strong></h3><p>Benchmarks don&#8217;t tell you if a model will work for your specific use case.</p><p>A model that scores 90% on MMLU might still struggle with your domain-specific language. A model that performs well on HumanEval might generate code that doesn&#8217;t fit your standards&#8230; Benchmarks measure <em>general</em> capability, but your application might need a <em>specific</em> capability.</p><p>So use benchmarks to narrow down your options. Then use your own eval on your own golden set to make the final decision.</p><h3><strong>27. Dataset Contamination / Data Leakage</strong></h3><p>There&#8217;s a problem with benchmarks that most engineers overlook:</p><p><em>Dataset contamination</em> occurs when evaluation data overlaps with the model&#8217;s training data. The model has already seen the answers. So its high benchmark score reflects memorization, NOT capability.</p><p>This happens because training data and benchmark datasets come from the internet. And the data overlap is often <em>unknown</em>. Over time, widely used benchmarks become less reliable, which is why new ones keep getting created&#8230;</p><p>To avoid this, <em>don&#8217;t rely on public examples</em> for your evals&#8230;Plus, <em>use real user queries </em>and <em>create your own datasets</em> instead.</p><div><hr></div><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/llm-evals?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption"><em>Share this post &amp; get rewards for the referrals.</em></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/llm-evals?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/llm-evals?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h2><strong>Failure Modes (What Not to Do)</strong></h2><p>You can have the right tools and still build a BAD eval system.</p><p>Here are the common mistakes:</p><h3><strong>28. Eval Anti-Patterns</strong></h3><p><strong>Vibe-based evaluation.</strong></p><p><em>&#8220;I tried it a few times<strong>,</strong> and it looked good.&#8221;</em></p><p>This is the most common mistake. Informal spot-checking creates false confidence. It doesn&#8217;t catch edge cases, track performance over time, or scale beyond one person looking at a few outputs.</p><p>Vibes are a starting point, but not an eval system.</p><p><strong>The single-sample trap.</strong></p><p><em>You run your eval suite once and report the results.</em></p><p>But LLMs are nondeterministic&#8230;</p><ul><li><p>A bad prompt might look good</p></li><li><p>A good prompt might look bad</p></li></ul><p>So run many samples and aggregate results. Remember, report variance, not just average scores.</p><p><strong>Goodhart&#8217;s Law in disguise.</strong></p><p><em>&#8220;When a measure becomes a target, it stops being a good measure.&#8221;</em></p><p>If you optimize for a metric too aggressively, the metric stops measuring what you care about&#8230;</p><ul><li><p>Reward confidence: you&#8217;ll get confident hallucinations</p></li><li><p>Reward length: you&#8217;ll get long, low-quality answers</p></li></ul><p>Metrics are only a proxy for quality. So don&#8217;t mistake them for the actual goal.</p><p><strong>Eval-production mismatch.</strong></p><p><em>Your golden dataset reflects what you expect.</em></p><p>Yet real users behave differently. They&#8217;re vague, unpredictable, and use your system in ways you didn&#8217;t anticipate. If your evals don&#8217;t reflect real usage, your scores are misleading. A high pass rate on unrealistic data doesn&#8217;t mean your system works.</p><p><strong>Ignoring tail failures.</strong></p><p><em>A 92% pass rate sounds good. But what about the 8%?</em></p><p>LLM failures are often catastrophic, not graceful. One harmful or incorrect response can matter more than hundreds of correct ones. So always review failure cases, not just averages.</p><p>The worst outputs tell you the most&#8230;</p><div><hr></div><p>Get 20% savings by getting a group subscription right now:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?group=true&amp;coupon=d6837d0d&quot;,&quot;text&quot;:&quot;Get 20% off a group subscription&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe?group=true&amp;coupon=d6837d0d"><span>Get 20% off a group subscription</span></a></p><div><hr></div><h2><strong>Decision Framework</strong></h2><p>There&#8217;s no single tool that solves LLM evaluation.</p><p>What works is a layered system&#8230; Each layer catches problems the previous one misses&#8230;</p><h3><strong>29. Eval Stack</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Bow3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471a919e-22c9-4d02-a463-710c9c24d20b_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bow3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471a919e-22c9-4d02-a463-710c9c24d20b_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!Bow3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471a919e-22c9-4d02-a463-710c9c24d20b_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!Bow3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471a919e-22c9-4d02-a463-710c9c24d20b_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!Bow3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471a919e-22c9-4d02-a463-710c9c24d20b_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Bow3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471a919e-22c9-4d02-a463-710c9c24d20b_2816x1536.png" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/471a919e-22c9-4d02-a463-710c9c24d20b_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5453669,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://conductorbyam.substack.com/i/192525618?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471a919e-22c9-4d02-a463-710c9c24d20b_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!Bow3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471a919e-22c9-4d02-a463-710c9c24d20b_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!Bow3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471a919e-22c9-4d02-a463-710c9c24d20b_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!Bow3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471a919e-22c9-4d02-a463-710c9c24d20b_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!Bow3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471a919e-22c9-4d02-a463-710c9c24d20b_2816x1536.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Layer 1: Heuristics.</strong></p><p>Fast, deterministic, and cheap. These run on every output and catch structural issues such as incorrect format, missing fields, or banned content.</p><p><strong>Layer 2: Semantic similarity and task-specific metrics.</strong></p><p>These run automatically against reference answers. They help catch meaning-level failures without the cost of an LLM judge.</p><p><strong>Layer 3: LLM-as-judge (offline).</strong></p><p>This runs before deployment on your golden dataset using your best judge model. Its job is to catch quality regressions before they reach users.</p><p><strong>Layer 4: LLM-as-judge (online, smaller model).</strong></p><p>This runs in production on a sample of live outputs using a smaller, cheaper judge model. It helps catch failures at scale that offline evals missed.</p><p><strong>Layer 5: Human spot checks.</strong></p><p>These run periodically. And keep you calibrated, verify automated evals scoring still match human judgment, and add new failure cases back into your golden dataset.</p><p><em>Each layer has a different cost profile and catches a different failure type&#8230;</em></p><p><em>Eventually, you want all five. But you do NOT need all of them on day one.</em></p><p><em><strong>So if you are starting from scratch, here is a simple three-step MVP:</strong></em></p><p><strong>Step 1: Build a golden set of 50 examples.</strong></p><p>Don&#8217;t write all of them yourself.</p><p>Use real user queries if you have them. If not, create 30 representative examples and 20 edge cases. Also, include normal cases, adversarial inputs, and strange phrasings you expect users to try.</p><p>This dataset becomes the foundation of your eval system.</p><p><strong>Step 2: Add one deterministic heuristic.</strong></p><p>Pick the single most important structural requirement for your output.</p><p>That might be length, format, a required field, or a banned phrase. Then write a simple code check for it. This is quick to build and catches more failures than most people expect.</p><p><strong>Step 3: Add one LLM judge prompt.</strong></p><p>Take your rubric and turn it into a judge prompt.</p><p>Then use a strong model to score outputs on one important quality dimension, and run it on your golden dataset. Read the scores and the explanations carefully.</p><p>You will find something surprising&#8230;that&#8217;s the point.</p><p>That is your MVP eval system.</p><p>Everything else builds on top of it: <em>pairwise comparisons, online monitoring, RAG scoring, regression testing, and more.</em></p><p>Start here&#8230;</p><div><hr></div><h2><strong>Closing Thoughts</strong></h2><p>Evaluation isn&#8217;t something you do at the end.</p><p>Instead, it&#8217;s what makes iteration possible. Without it, you&#8217;re not doing LLM engineering. But guessing and shipping changes, only to find out from users that they didn&#8217;t.</p><p>Traditional software engineering learned this lesson the hard way: CI, automated tests, production monitoring&#8230;these aren&#8217;t optional. They&#8217;re what make fast, reliable development possible.</p><p>LLM systems need the same discipline. The difference is that the outputs are nondeterministic and hard to measure.</p><p>Now you have the vocabulary to deal with that&#8230;</p><p>Start simple and build your eval stack step by step. Over time, you&#8217;ll know whether a change improved quality before users ever see it.</p><div><hr></div><p>&#128075; I&#8217;d like to thank <strong><a href="https://www.linkedin.com/in/athletickoder/">Anshuman</a></strong> for writing this newsletter!</p><p>If you&#8217;re building LLM applications and want to go deeper, follow him on <a href="https://www.linkedin.com/in/athletickoder/">LinkedIn</a> and <a href="https://x.com/athleticKoder">X</a>.</p><p>Don&#8217;t forget to check out his newsletter, <strong><a href="https://beaiproof.substack.com/welcome">AI Proof</a></strong> -- it&#8217;ll help you stay relevant in the AI era.</p><div><hr></div><p>Louis and I launched the <strong>GENERATIVE AI MASTERCLASS</strong> (newsletter series exclusive to PAID subscribers) this month.</p><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>Simple breakdown of real-world architectures</strong></p></li><li><p>Frameworks you can plug into your work or business</p></li><li><p><strong>Proven systems behind ChatGPT, Perplexity, and Copilot</strong></p></li></ul><p><strong>&#128073; <a href="https://newsletter.systemdesign.one/subscribe?yearly=true">CLICK HERE TO JOIN THE GENERATIVE AI MASTERCLASS</a></strong></p><p>(Golden members will get the next Generative AI newsletter in the first week of May.)</p><div><hr></div><p>If you find this newsletter valuable, share it with a friend, and subscribe if you haven&#8217;t already. There are <a href="http://newsletter.systemdesign.one/subscribe?group=true">group discounts</a>, <a href="http://newsletter.systemdesign.one/subscribe?gift=true">gift options</a>, and <a href="https://newsletter.systemdesign.one/leaderboard">referral rewards</a> available.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://www.linkedin.com/in/nk-systemdesign-one/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bEFk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 424w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 848w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1272w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bEFk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png" width="152" height="152" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:320,&quot;width&quot;:320,&quot;resizeWidth&quot;:152,&quot;bytes&quot;:74009,&quot;alt&quot;:&quot;Author Neo Kim; System design case studies&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.linkedin.com/in/nk-systemdesign-one/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Author Neo Kim; System design case studies" title="Author Neo Kim; System design case studies" srcset="https://substackcdn.com/image/fetch/$s_!bEFk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 424w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 848w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1272w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"><strong>&#128075; Find me on <a href="https://www.linkedin.com/in/nk-systemdesign-one/">LinkedIn</a> | <a href="https://x.com/intent/follow?screen_name=systemdesignone">Twitter</a> | <a href="https://www.threads.net/@systemdesignone">Threads</a> | <a href="https://www.instagram.com/systemdesignone/">Instagram</a></strong></figcaption></figure></div><div><hr></div><p><strong>Want to reach 200K+ tech professionals at scale? </strong>&#128240;</p><p>If your company wants to reach 200K+ tech professionals, <a href="https://newsletter.systemdesign.one/p/sponsorship">advertise with me</a>.</p><div><hr></div><p>Thank you for supporting this newsletter.</p><p>You are now 210,001+ readers strong, very close to 210k. Let&#8217;s try to get 211k readers by 29 April. Consider sharing this post with your friends and get rewards.</p><p>Y&#8217;all are the best.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6oWl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6oWl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6oWl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png" width="590" height="368.75" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2e739087-a910-4643-be36-997b6dd5b4af_800x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:800,&quot;resizeWidth&quot;:590,&quot;bytes&quot;:87878,&quot;alt&quot;:&quot;system design newsletter&quot;,&quot;title&quot;:&quot;system design newsletter&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/163380418?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="system design newsletter" title="system design newsletter" srcset="https://substackcdn.com/image/fetch/$s_!6oWl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/llm-evals?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/llm-evals?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Variance is how much your results change when you repeat the same evaluation. It tells you how spread out or inconsistent your scores are.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><ul><li><p>Criteria: what you care about when judging the output (e.g., accuracy, clarity, helpfulness)</p></li><li><p>Rubric: how you score those criteria consistently (e.g., a 1&#8211;5 scale with clear definitions)</p></li><li><p>Test cases: specific inputs or examples that you use to evaluate the model</p></li></ul><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[Vector Database - A Deep Dive]]></title><description><![CDATA[#141: A Beginner&#8217;s Guide to the AI Stack&#8217;s Most Misunderstood Component]]></description><link>https://newsletter.systemdesign.one/p/what-is-a-vector-database</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/what-is-a-vector-database</guid><dc:creator><![CDATA[Maxine Meurer]]></dc:creator><pubDate>Sat, 25 Apr 2026 09:11:55 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f1a11bc3-2222-4fbe-a3db-234b1991cb9b_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/what-is-a-vector-database/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><div><hr></div><p>You&#8217;ve built a search feature.</p><p>Then a user types a query, your application hits the database, and results come back ranked by relevance.</p><p><em>Simple enough</em>, it works perfectly.</p><p>But you thought about how search needs to actually understand what users meant, not just match the exact words they typed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EE7V!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ca0a2b-f1b1-487b-8049-d82793dcce7f_800x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EE7V!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ca0a2b-f1b1-487b-8049-d82793dcce7f_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!EE7V!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ca0a2b-f1b1-487b-8049-d82793dcce7f_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!EE7V!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ca0a2b-f1b1-487b-8049-d82793dcce7f_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!EE7V!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ca0a2b-f1b1-487b-8049-d82793dcce7f_800x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EE7V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ca0a2b-f1b1-487b-8049-d82793dcce7f_800x800.png" width="800" height="800" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/45ca0a2b-f1b1-487b-8049-d82793dcce7f_800x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EE7V!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ca0a2b-f1b1-487b-8049-d82793dcce7f_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!EE7V!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ca0a2b-f1b1-487b-8049-d82793dcce7f_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!EE7V!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ca0a2b-f1b1-487b-8049-d82793dcce7f_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!EE7V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45ca0a2b-f1b1-487b-8049-d82793dcce7f_800x800.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A user searching for <em>&#8220;how do I log in&#8221;</em> should find results about <em>&#8220;account access&#8221;</em> and <em>&#8220;sign in&#8221;</em> even if those exact words never appear in their query.</p><p>This kind of search, where the database understands meaning rather than just matching words, is called <strong>semantic search</strong>. It requires a completely different approach to how you store and search your data.</p><p>Yet if you try adding this in as a feature, everything that made your traditional relational database fast will stop working.</p><p><em>The tools you relied on: </em><code>indexes, WHERE clauses, keyword matching</code><em> were all built for finding exact values.</em></p><p><strong>Semantic search does NOT work with exact values.</strong></p><p>Instead, it works with meaning, and meaning is harder to search efficiently. A search that took milliseconds before now takes seconds, and the bigger your dataset, the worse it gets.</p><p>Your database is NOT broken.</p><p>It&#8217;s doing exactly what you told it to do: comparing your search against every single item in your database, one by one, because there is no other way to find &#8220;similar&#8221; when there is NO exact match.</p><p>This is where vector databases come in.</p><p>Not because they are trendy, but because traditional databases were never designed to handle this kind of search at scale.</p><p>If you&#8217;re a software engineer encountering the term &#8220;vector database&#8221; for the first time, whether your team is adding AI features or you keep seeing it mentioned everywhere and want to understand what it actually means, this newsletter explains it from the ground up&#8230;</p><div><hr></div><h2><strong><a href="https://watch.getcontrast.io/register/unblocked-how-to-stop-babysitting-your-agents?utm_source=systemdesign&amp;utm_medium=email&amp;utm_campaign=primary">[Webinar] Stop babysitting your coding agents (Partner)</a></strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://watch.getcontrast.io/register/unblocked-how-to-stop-babysitting-your-agents?utm_source=systemdesign&amp;utm_medium=email&amp;utm_campaign=primary" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2Yva!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 424w, https://substackcdn.com/image/fetch/$s_!2Yva!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 848w, https://substackcdn.com/image/fetch/$s_!2Yva!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 1272w, https://substackcdn.com/image/fetch/$s_!2Yva!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2Yva!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://watch.getcontrast.io/register/unblocked-how-to-stop-babysitting-your-agents?utm_source=systemdesign&amp;utm_medium=email&amp;utm_campaign=primary&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2Yva!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 424w, https://substackcdn.com/image/fetch/$s_!2Yva!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 848w, https://substackcdn.com/image/fetch/$s_!2Yva!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 1272w, https://substackcdn.com/image/fetch/$s_!2Yva!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5360c077-4b63-4b77-ac93-49ad7fe35954_2048x1152.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Agents can generate code. Getting it right for your system, team conventions, and past decisions is the hard part &#8211; you end up wasting time and tokens in correction loops.</p><p>More MCPs give agents access to information, but not understanding. The teams pulling ahead use a context engine to give agents exactly what they need.</p><p><a href="https://watch.getcontrast.io/register/unblocked-how-to-stop-babysitting-your-agents?utm_source=systemdesign&amp;utm_medium=email&amp;utm_campaign=primary">Join Unblocked live on May 6 (FREE)</a> to see:</p><ul><li><p>Where teams get stuck on the AI maturity curve</p></li><li><p>How a context engine solves for quality, efficiency, and cost</p></li><li><p>Live demo: the same coding task with and without a context engine</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://watch.getcontrast.io/register/unblocked-how-to-stop-babysitting-your-agents?utm_source=systemdesign&amp;utm_medium=email&amp;utm_campaign=primary&quot;,&quot;text&quot;:&quot;Register Now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://watch.getcontrast.io/register/unblocked-how-to-stop-babysitting-your-agents?utm_source=systemdesign&amp;utm_medium=email&amp;utm_campaign=primary"><span>Register Now</span></a></p><p>(Thanks to <a href="https://watch.getcontrast.io/register/unblocked-how-to-stop-babysitting-your-agents?utm_source=systemdesign&amp;utm_medium=email&amp;utm_campaign=primary">Unblocked</a> for partnering on this post.)</p><div><hr></div><p>I want to introduce <strong><a href="https://www.linkedin.com/in/maxinemeurer/">Maxine</a></strong> as the guest author.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.linkedin.com/in/maxinemeurer/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!08DB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b9c478-d7d7-4f94-a440-8cd2d9abe2fc_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!08DB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b9c478-d7d7-4f94-a440-8cd2d9abe2fc_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!08DB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b9c478-d7d7-4f94-a440-8cd2d9abe2fc_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!08DB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b9c478-d7d7-4f94-a440-8cd2d9abe2fc_1200x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!08DB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b9c478-d7d7-4f94-a440-8cd2d9abe2fc_1200x630.png" width="1200" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/39b9c478-d7d7-4f94-a440-8cd2d9abe2fc_1200x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:278821,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.linkedin.com/in/maxinemeurer/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/193910825?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b9c478-d7d7-4f94-a440-8cd2d9abe2fc_1200x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!08DB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b9c478-d7d7-4f94-a440-8cd2d9abe2fc_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!08DB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b9c478-d7d7-4f94-a440-8cd2d9abe2fc_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!08DB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b9c478-d7d7-4f94-a440-8cd2d9abe2fc_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!08DB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b9c478-d7d7-4f94-a440-8cd2d9abe2fc_1200x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>She&#8217;s a cloud infrastructure engineer who spends her days scaling databases, debugging production incidents at 2 am, and writing about what actually works in production.</p><p>You can get a copy of her <strong><a href="https://mameurer.gumroad.com/l/LLMsForHumans?layout=profile">LLMs for Humans: From Prompts to Production (at 30% off right now)</a></strong>. It&#8217;s 20 chapters of practical applied AI with real production context, not theory. And it&#8217;ll help you get smarter about using AI tools in infrastructure workflows.</p><p>Checkout her work:</p><ul><li><p><a href="https://ilovedevops.substack.com/subscribe">Substack - I Love DevOps</a></p></li><li><p><a href="https://www.linkedin.com/in/maxinemeurer/">LinkedIn</a></p></li><li><p><a href="https://www.threads.com/@devopgirl">Threads</a></p></li></ul><p>Plus, if you&#8217;re thinking about making a career move into cloud or DevOps and want a structured path to get there, get a copy of her <strong><a href="https://mameurer.gumroad.com/l/devops-career-switch-blueprint?layout=profile">The DevOps Career Switch Blueprint</a>.</strong></p><div><hr></div><p><strong>Inside this newsletter, you&#8217;ll get:</strong></p><ul><li><p><strong>Why traditional search breaks.</strong> How exact-match databases fail with semantic meaning and why O(n) similarity search doesn&#8217;t scale.</p></li><li><p><strong>How vector search works.</strong> Embeddings, high-dimensional space, and why similarity is measured by distance instead of exact matches.</p></li><li><p><strong>The core architecture.</strong> Storage, indexing (HNSW), and query layers, with key tradeoffs in cost, latency, and accuracy.</p></li><li><p><strong>Query execution in practice.</strong> ANN search, metadata filtering (pre vs post), and ranking, including how real systems return results.</p></li><li><p><strong>Production realities.</strong> Scaling, replication, cost structure, monitoring, and failure modes teams hit in production.</p></li><li><p><strong>Decision framework.</strong> When pgvector is enough, when to use a dedicated vector DB, and when distributed systems become necessary.</p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Golden members get all posts like these!...</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2><strong>Why Traditional Databases Struggle with Embeddings</strong></h2><p>Traditional relational databases are good at exact matches and range queries.</p><p>They use indexes and data structures so the database can jump directly to relevant rows without scanning the entire table.</p><p>When you query <code>WHERE user_id = 12345</code>, database uses an index to find that row instantly, the same way you&#8217;d look up a word in a dictionary: you don&#8217;t read every page, you jump straight to the right section. Also when you query, <code>WHERE created_at &gt; &#8216;2026-01-01&#8217;,</code> it works the same way for dates.</p><p>But embeddings break this model because there is no exact value to look up!</p><p>An <strong>embedding</strong> is a list of numbers that represents the meaning of a text.</p><p>Words, sentences, or entire paragraphs get converted into these numbers by an AI model, and <em>similar meanings produce similar numbers</em>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!clcD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d1b8027-9826-43dd-ac5f-7ddb18b1aeb0_800x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!clcD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d1b8027-9826-43dd-ac5f-7ddb18b1aeb0_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!clcD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d1b8027-9826-43dd-ac5f-7ddb18b1aeb0_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!clcD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d1b8027-9826-43dd-ac5f-7ddb18b1aeb0_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!clcD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d1b8027-9826-43dd-ac5f-7ddb18b1aeb0_800x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!clcD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d1b8027-9826-43dd-ac5f-7ddb18b1aeb0_800x800.png" width="800" height="800" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4d1b8027-9826-43dd-ac5f-7ddb18b1aeb0_800x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!clcD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d1b8027-9826-43dd-ac5f-7ddb18b1aeb0_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!clcD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d1b8027-9826-43dd-ac5f-7ddb18b1aeb0_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!clcD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d1b8027-9826-43dd-ac5f-7ddb18b1aeb0_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!clcD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d1b8027-9826-43dd-ac5f-7ddb18b1aeb0_800x800.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Instead of finding a specific row, you find rows closer in meaning, and &#8220;close&#8221; requires measuring distance in high-dimensional space rather than matching values. Think of it like a regular map with latitude and longitude, except you&#8217;re working with 1,536 dimensions instead of just 2.</p><p><em>Let&#8217;s walk through a concrete example to see why this creates a problem&#8230;</em></p><p>Imagine you have a document database where each document gets split into chunks.</p><p>And think of chunks like cutting a book into individual paragraphs so you can search each one separately. Each chunk gets converted into a 1,536-dimensional vector using an embedding model like OpenAI&#8217;s text-embedding-3-small:</p><ol><li><p>A user submits a search query</p></li><li><p>App converts the query into a vector using the same embedding model</p></li><li><p><em>Now you have to find the &#8220;most similar&#8221; document chunks</em></p></li></ol><p>Without specialized indexing, the only way to find &#8220;most similar&#8221; ones is to calculate the distance between your query vector and every chunk vector in your database, sort by distance, and return the top results.</p><p><em>Let&#8217;s break down what it means&#8230;</em></p><p>A vector is simply a list of numbers. In AI, those numbers represent meaning.</p><p>When you convert text into an embedding, you&#8217;re assigning it a location in a very high-dimensional space (imagine a map with 1,536 dimensions instead of the normal 2, latitude and longitude). Similar pieces of text end up close to each other on this map, while unrelated text ends up far apart.</p><p>When a user searches for something, their query also gets converted into coordinates on this map. To find the most relevant results, you need to measure the distance from the query&#8217;s location to every document chunk&#8217;s location in your database, then return the closest ones.</p><p><em>But here&#8217;s the problem&#8230;</em></p><p>Without an index, there&#8217;s no shortcut. You must measure the distance to each item.</p><p>Now imagine you&#8217;re standing in Times Square in New York City, and you need to find the 10 closest coffee shops. Without a map or any organized system, your only option is to:</p><ol><li><p>Walk to every building</p></li><li><p>Check if it&#8217;s a coffee shop</p></li><li><p>Measure how far it is from Times Square</p></li><li><p>Keep a list of the 10 closest ones you&#8217;ve found</p></li></ol><p>If there are 10k buildings, you check all 10k. And if there are 10 million buildings, you check all 10 million.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gu5U!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F283169d2-99c4-4a06-8f5d-400a67802149_800x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gu5U!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F283169d2-99c4-4a06-8f5d-400a67802149_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!gu5U!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F283169d2-99c4-4a06-8f5d-400a67802149_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!gu5U!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F283169d2-99c4-4a06-8f5d-400a67802149_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!gu5U!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F283169d2-99c4-4a06-8f5d-400a67802149_800x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gu5U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F283169d2-99c4-4a06-8f5d-400a67802149_800x800.png" width="800" height="800" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/283169d2-99c4-4a06-8f5d-400a67802149_800x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gu5U!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F283169d2-99c4-4a06-8f5d-400a67802149_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!gu5U!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F283169d2-99c4-4a06-8f5d-400a67802149_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!gu5U!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F283169d2-99c4-4a06-8f5d-400a67802149_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!gu5U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F283169d2-99c4-4a06-8f5d-400a67802149_800x800.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The more buildings that exist, the longer it takes&#8230;</p><p>Put simply, it takes&nbsp;<strong>O(n)</strong>&nbsp;time, and it grows linearly with the size of your dataset.</p><p>In practical terms, if you have 10,000 document chunks and each distance calculation takes 0.0001 seconds, one query takes 1 second. That&#8217;s not terrible.</p><p><em>But what if you have 10 million chunks?</em></p><p>That same query now takes 1,000 seconds, i.e., 16 minutes. And if your application serves 100 queries per second from different users? Your system immediately grinds to a halt!</p><p>This is why traditional databases struggle with embeddings&#8230;</p><p>Their indexes are built for exact matches.</p><p>They can jump straight to <code>user_id = 12345</code> without checking every row. But with vectors, there are no exact matches, only &#8220;similar&#8221; matches, and similarity requires measuring distances.</p><p><em>Let&#8217;s now imagine finding a similar song to the one you already like&#8230;</em></p><p>You cannot just look it up by name; you have to compare it against every other song across hundreds of characteristics: tempo, key, energy, mood, and so on. The more songs in your library, the longer it takes.</p><p><strong>Vector search</strong> works the same way, except instead of songs, you compare text, and instead of a few characteristics, you are comparing across 1,536 dimensions at the same time.</p><p><em>Without specialized techniques to avoid checking everything, the math simply doesn&#8217;t scale.</em></p><p>This is where people often bring up <a href="https://github.com/pgvector/pgvector">pgvector</a>, a PostgreSQL extension that adds vector search capabilities to Postgres. It&#8217;s a good option for smaller datasets and lets you avoid adding another database to your stack. This keeps operational efforts very low.</p><p>For datasets with fewer than 100,000 vectors and <em>moderate</em> query volumes, pgvector works well and keeps your infrastructure simpler.</p><p>But as your dataset grows into the millions and your query latency requirements tighten, you hit the limits of what a general-purpose database can do for this specific workload.</p><p>You don&#8217;t need a specialized vector database just because you&#8217;re working with embeddings, but you need to understand when the operational trade-offs tip in favor of dedicated infrastructure.</p><p><em>This brings us to the core challenge that vector databases solve&#8230;</em></p><p>How do you find approximate nearest neighbors in high-dimensional space without comparing against every vector, while still returning results that are &#8220;good enough&#8221; in terms of accuracy?</p><p>The answer involves building specialized index structures that trade accuracy for massive speed improvements, and understanding those trade-offs is essential for operating these systems in production.</p><div><hr></div><h2><strong>Architecture: How It Works Under the Hood</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!isUf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F317f1e41-a15b-4682-9de6-da170fbd1b73_800x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!isUf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F317f1e41-a15b-4682-9de6-da170fbd1b73_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!isUf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F317f1e41-a15b-4682-9de6-da170fbd1b73_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!isUf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F317f1e41-a15b-4682-9de6-da170fbd1b73_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!isUf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F317f1e41-a15b-4682-9de6-da170fbd1b73_800x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!isUf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F317f1e41-a15b-4682-9de6-da170fbd1b73_800x800.png" width="800" height="800" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/317f1e41-a15b-4682-9de6-da170fbd1b73_800x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!isUf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F317f1e41-a15b-4682-9de6-da170fbd1b73_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!isUf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F317f1e41-a15b-4682-9de6-da170fbd1b73_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!isUf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F317f1e41-a15b-4682-9de6-da170fbd1b73_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!isUf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F317f1e41-a15b-4682-9de6-da170fbd1b73_800x800.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Vector databases have three key layers: <em>storage, index, and query</em>.</p><p>Each layer involves trade-offs that directly affect your operational concerns: cost, latency, accuracy, and scalability.</p><p>Let&#8217;s break down what matters from an infrastructure perspective&#8230;</p><h3><strong>Storage Layer: Where Vectors Live</strong></h3><p>Vectors are massive.</p><p>A single 1,536-dimensional vector using 32-bit floats takes up 6KB of storage. If you multiply that across millions of vectors, storage costs add up fast. This is a problem every vector database must solve at its foundation: <em>the storage layer determines how vectors are physically stored, compressed, and retrieved from disk.</em> This layer focuses on compression techniques that reduce both storage costs and the memory footprint while preserving as much accuracy as possible.</p><p><em>The most common approach is quantization.</em></p><p><strong>Quantization</strong> reduces the precision of each dimension to use fewer bits.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tpO6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e0fb2b2-0f68-4fbe-9ebc-eb650d508ef3_800x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tpO6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e0fb2b2-0f68-4fbe-9ebc-eb650d508ef3_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!tpO6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e0fb2b2-0f68-4fbe-9ebc-eb650d508ef3_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!tpO6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e0fb2b2-0f68-4fbe-9ebc-eb650d508ef3_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!tpO6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e0fb2b2-0f68-4fbe-9ebc-eb650d508ef3_800x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tpO6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e0fb2b2-0f68-4fbe-9ebc-eb650d508ef3_800x800.png" width="800" height="800" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9e0fb2b2-0f68-4fbe-9ebc-eb650d508ef3_800x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tpO6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e0fb2b2-0f68-4fbe-9ebc-eb650d508ef3_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!tpO6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e0fb2b2-0f68-4fbe-9ebc-eb650d508ef3_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!tpO6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e0fb2b2-0f68-4fbe-9ebc-eb650d508ef3_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!tpO6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e0fb2b2-0f68-4fbe-9ebc-eb650d508ef3_800x800.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Instead of storing each dimension as a 32-bit float, you quantize to 8-bit integers (scalar quantization) or binary representations (binary quantization).</p><p>This can reduce storage and memory requirements by 4x or more, but it introduces approximation error. <strong>Approximation error</strong> is the gap between the compressed version of data and the original.</p><p>When you round numbers to save space, your calculations become slightly less precise. In vector search, this means you might occasionally miss a result that was truly the closest match, but in practice, the difference is small enough that most applications never notice it.</p><p>In production, you&#8217;re constantly trading off storage costs for search accuracy, and the right balance depends on your use case.</p><p><em>Another critical decision is the trade-off between memory and disk&#8230;</em></p><p>For <strong>maximum query speed</strong>, you want your vector index in memory, but that&#8217;s expensive at scale. Some vector databases support tiered storage, where frequently accessed vectors stay in memory while cold data lives on disk, or hybrid approaches in which the index structure stays in memory while raw vectors get pulled from disk when needed.</p><p>These architectural choices directly affect your cost structure and query latency percentiles.</p><p>From an operational perspective, you need to understand how your chosen vector database handles durability and recovery:</p><ul><li><p><em>What happens if a node crashes mid-write?</em></p></li><li><p><em>How long does it take to rebuild indexes after a restart?</em></p></li><li><p><em>Are vectors written to disk synchronously or asynchronously?</em></p></li></ul><p>These aren&#8217;t theoretical questions. They&#8217;re the difference between acceptable downtime and a major incident when something goes wrong.</p><h3><strong>Indexing Layer: Graph Structure</strong></h3><p>This is where the &#8220;magic&#8221; happens, and by magic I mean carefully engineered data structures that let you find approximate neighbors without exhaustive search.</p><p>The most popular algorithm in production vector databases is <strong>HNSW</strong> (Hierarchical Navigable Small World), and it&#8217;s worth understanding at a conceptual level even if you&#8217;re not implementing it yourself.</p><p>The core idea behind HNSW is based on &#8220;small world&#8221; networks, the same principle that says you&#8217;re only a few connections away from any other person on the planet.</p><p>Instead of comparing your query vector against every vector in your database, HNSW builds a multi-layer graph in which similar vectors are connected to each other. When you query, the algorithm starts at the top layer and navigates toward your target by following edges to similar neighbors, dropping through layers until it reaches the most similar vectors at the bottom layer.</p><p><em>Remember our map analogy?</em></p><p>Your vectors are locations on a high-dimensional map, and you&#8217;re trying to find the closest ones to your query without checking every single location. HNSW solves this by building a navigation system with different zoom levels.</p><p>Think of it like Google Maps navigating from New York to a specific coffee shop in San Francisco:</p><ul><li><p><strong>Top layer (zoomed out):</strong>&nbsp;From New York, HNSW identifies the general direction of your target, the West Coast, and follows connections to get you to California. You&#8217;re not considering every city in America, just the major waypoints that get you closer to where you need to be.</p></li><li><p><strong>Middle layers (zooming in):</strong> Now you&#8217;re looking at California. The algorithm narrows down to the San Francisco Bay Area, following connections between nearby locations. You&#8217;re still not checking every street, just the neighborhoods that are directionally correct toward the coffee shop you want to visit.</p></li><li><p><strong>Bottom layer (street level):</strong> Finally, you&#8217;re at the specific neighborhood and can see individual addresses. The algorithm follows the last few connections to find the exact coffee shop location (or, in vector terms, the most similar document chunks).</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zZEO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85edadd3-1c48-4baa-8cfc-3288ba15b2e9_800x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zZEO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85edadd3-1c48-4baa-8cfc-3288ba15b2e9_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!zZEO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85edadd3-1c48-4baa-8cfc-3288ba15b2e9_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!zZEO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85edadd3-1c48-4baa-8cfc-3288ba15b2e9_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!zZEO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85edadd3-1c48-4baa-8cfc-3288ba15b2e9_800x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zZEO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85edadd3-1c48-4baa-8cfc-3288ba15b2e9_800x800.png" width="800" height="800" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85edadd3-1c48-4baa-8cfc-3288ba15b2e9_800x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zZEO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85edadd3-1c48-4baa-8cfc-3288ba15b2e9_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!zZEO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85edadd3-1c48-4baa-8cfc-3288ba15b2e9_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!zZEO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85edadd3-1c48-4baa-8cfc-3288ba15b2e9_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!zZEO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85edadd3-1c48-4baa-8cfc-3288ba15b2e9_800x800.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>At each zoom level, you&#8217;re following connections between &#8220;nearby&#8221; locations without checking everywhere.</p><p>You only need to follow the connections that get you closer to your target. This is how HNSW turns an O(n) problem into something much faster: <em>it builds a navigation structure that lets you skip large portions of the map in the wrong direction.</em></p><p>The key trade-off in HNSW is speed versus accuracy, controlled by two parameters: the number of connections each vector has (called &#8220;M&#8221; or &#8220;max connections&#8221;) and the number of neighbors to check during search (called &#8220;ef&#8221; or &#8220;exploration factor&#8221;).</p><p>Higher values mean better accuracy, but slower queries and larger indexes&#8230;</p><p>Lower values mean faster queries, but you might miss some relevant results. In production, you tune these parameters based on your latency requirements and tolerance for accuracy.</p><p><em>Building the HNSW index is expensive.</em></p><p>Inserting vectors requires calculating distances and updating graph connections, which is why bulk index building often happens offline or during low-traffic periods. Index build time matters because it affects how quickly you can recover from failures, deploy updates, or onboard new data.</p><p>You do not need to understand the math behind HNSW to work with vector databases.</p><p>Just like driving a car, you might not know how the engine works, and that doesn&#8217;t affect how you drive. But knowing that the engine exists and what it does helps you figure out what is wrong when the car breaks down.</p><p>What you need to know is that HNSW produces approximate results, not perfect ones, and it&#8217;s possible to adjust its approximation. When search results seem wrong, queries suddenly slow down, or the index fails to build, this understanding helps you identify the problem.</p><div><hr></div><blockquote><p>Get a copy of <em><strong><a href="https://mameurer.gumroad.com/l/LLMsForHumans?layout=profile">LLMs for Humans: From Prompts to Production</a></strong> </em>right now and save 30%.</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://mameurer.gumroad.com/l/LLMsForHumans?layout=profile" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7mFj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6003f91-fafd-4281-99c4-c6947facff9d_590x756.jpeg 424w, https://substackcdn.com/image/fetch/$s_!7mFj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6003f91-fafd-4281-99c4-c6947facff9d_590x756.jpeg 848w, https://substackcdn.com/image/fetch/$s_!7mFj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6003f91-fafd-4281-99c4-c6947facff9d_590x756.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!7mFj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6003f91-fafd-4281-99c4-c6947facff9d_590x756.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7mFj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6003f91-fafd-4281-99c4-c6947facff9d_590x756.jpeg" width="320" height="410.03389830508473" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6003f91-fafd-4281-99c4-c6947facff9d_590x756.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:756,&quot;width&quot;:590,&quot;resizeWidth&quot;:320,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://mameurer.gumroad.com/l/LLMsForHumans?layout=profile&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7mFj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6003f91-fafd-4281-99c4-c6947facff9d_590x756.jpeg 424w, https://substackcdn.com/image/fetch/$s_!7mFj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6003f91-fafd-4281-99c4-c6947facff9d_590x756.jpeg 848w, https://substackcdn.com/image/fetch/$s_!7mFj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6003f91-fafd-4281-99c4-c6947facff9d_590x756.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!7mFj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6003f91-fafd-4281-99c4-c6947facff9d_590x756.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><p><em><strong>Reminder: this is a teaser of the subscriber-only newsletter, exclusive to my golden members.</strong></em></p><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>High-level architecture of real-world systems.</strong></p></li><li><p>Deep dive into how popular real-world systems work.</p></li><li><p><strong>How real-world systems handle scale, reliability, and performance.</strong></p></li></ul><p>And much more!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe&quot;,&quot;text&quot;:&quot;Unlock Full Access&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe"><span>Unlock Full Access</span></a></p><div><hr></div><h3><strong>Query Layer: How Requests Move Through System</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dk46!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6d104f-f42f-4f31-86ed-7fc7865f326f_800x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dk46!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6d104f-f42f-4f31-86ed-7fc7865f326f_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!dk46!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6d104f-f42f-4f31-86ed-7fc7865f326f_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!dk46!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6d104f-f42f-4f31-86ed-7fc7865f326f_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!dk46!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6d104f-f42f-4f31-86ed-7fc7865f326f_800x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dk46!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6d104f-f42f-4f31-86ed-7fc7865f326f_800x800.png" width="800" height="800" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0b6d104f-f42f-4f31-86ed-7fc7865f326f_800x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dk46!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6d104f-f42f-4f31-86ed-7fc7865f326f_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!dk46!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6d104f-f42f-4f31-86ed-7fc7865f326f_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!dk46!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6d104f-f42f-4f31-86ed-7fc7865f326f_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!dk46!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6d104f-f42f-4f31-86ed-7fc7865f326f_800x800.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>When a query hits your vector database, it follows a specific path:</p>
      <p>
          <a href="https://newsletter.systemdesign.one/p/what-is-a-vector-database">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[9 Agentic Patterns, Simply Explained]]></title><description><![CDATA[#140: The design decisions behind modern AI systems: how each design pattern works, where it breaks, and when to use it.]]></description><link>https://newsletter.systemdesign.one/p/agentic-design-patterns</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/agentic-design-patterns</guid><dc:creator><![CDATA[Neo Kim]]></dc:creator><pubDate>Wed, 22 Apr 2026 11:50:43 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/23793653-8f74-4b7a-b9ac-4ba50a00eb78_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/agentic-design-patterns/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><div><hr></div><p>When you build software with LLMs<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>, you need to decide how much control to give the model.</p><p><em>Does your code run every step, or does the model figure out the steps on its own?</em></p><p>Agentic patterns are design patterns for making that decision. They&#8217;re the same architecture choices you already make in any system: <em>who controls what happens next, what happens on failure, and how data moves between components.</em></p><p>The difference is that some of those components are now language models.</p><p>This newsletter covers nine of those patterns&#8230;</p><p>On one end, <em>workflow patterns</em> where your code controls every step. On the other hand, <em>agent patterns</em> where the Large Language Model (<strong>LLM</strong>) decides what to do next.</p><p>That boundary is a decision about how much control to hand to the model: at some point, you stop telling the model what to do and start letting it figure that out.</p><p>The first question isn&#8217;t which pattern to use. It&#8217;s whether you need one at all.</p><p>Let&#8217;s start there...</p><div><hr></div><h2><a href="https://pages.awscloud.com/awsmp-gim-jqup-adhoc-aim-ent-ai-data-leader-book-1-ent.html?trk=68d21c4e-c3e8-4668-959c-cb996265ef28&amp;sc_channel=el">Find out why enterprise leaders are doubling down on data foundations for AI (Partner)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://pages.awscloud.com/awsmp-gim-jqup-adhoc-aim-ent-ai-data-leader-book-1-ent.html?trk=68d21c4e-c3e8-4668-959c-cb996265ef28&amp;sc_channel=el" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s4PY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc0b098a-af83-481b-805b-c1ad861f8c9d_1280x720.gif 424w, https://substackcdn.com/image/fetch/$s_!s4PY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc0b098a-af83-481b-805b-c1ad861f8c9d_1280x720.gif 848w, https://substackcdn.com/image/fetch/$s_!s4PY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc0b098a-af83-481b-805b-c1ad861f8c9d_1280x720.gif 1272w, https://substackcdn.com/image/fetch/$s_!s4PY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc0b098a-af83-481b-805b-c1ad861f8c9d_1280x720.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!s4PY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc0b098a-af83-481b-805b-c1ad861f8c9d_1280x720.gif" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc0b098a-af83-481b-805b-c1ad861f8c9d_1280x720.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2609945,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:&quot;https://pages.awscloud.com/awsmp-gim-jqup-adhoc-aim-ent-ai-data-leader-book-1-ent.html?trk=68d21c4e-c3e8-4668-959c-cb996265ef28&amp;sc_channel=el&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/193554859?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc0b098a-af83-481b-805b-c1ad861f8c9d_1280x720.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!s4PY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc0b098a-af83-481b-805b-c1ad861f8c9d_1280x720.gif 424w, https://substackcdn.com/image/fetch/$s_!s4PY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc0b098a-af83-481b-805b-c1ad861f8c9d_1280x720.gif 848w, https://substackcdn.com/image/fetch/$s_!s4PY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc0b098a-af83-481b-805b-c1ad861f8c9d_1280x720.gif 1272w, https://substackcdn.com/image/fetch/$s_!s4PY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc0b098a-af83-481b-805b-c1ad861f8c9d_1280x720.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>AI is moving fast, but your data foundation isn&#8217;t keeping up?</p><p>That&#8217;s exactly why leaders from JPMorgan, Mercedes-Benz, Siemens, and Roche contributed to this AWS book on agentic analytics.</p><p>Here&#8217;s what you get:</p><ul><li><p>Real strategies from enterprise leaders: Learn how top companies are building data foundations for AI systems that actually scale.</p></li><li><p>Practical frameworks you can apply: From data strategy to data products, ML, and agentic AI.</p></li><li><p>Perspectives from 15+ enterprise leaders: Each chapter brings a unique view from senior leaders across industries.</p></li><li><p>Complex topics broken down clearly: Understand what matters without getting lost in theory.</p></li></ul><p>An <strong><a href="https://pages.awscloud.com/awsmp-gim-jqup-adhoc-aim-ent-ai-data-leader-book-1-ent.html?trk=68d21c4e-c3e8-4668-959c-cb996265ef28&amp;sc_channel=el">AWS book </a></strong>designed to help you build data foundations for intelligent agents that scale.</p><p>Download it now and learn how to build systems ready for intelligent agents.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://pages.awscloud.com/awsmp-gim-jqup-adhoc-aim-ent-ai-data-leader-book-1-ent.html?trk=68d21c4e-c3e8-4668-959c-cb996265ef28&amp;sc_channel=el&quot;,&quot;text&quot;:&quot;Get the AWS Data &amp; AI Book&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://pages.awscloud.com/awsmp-gim-jqup-adhoc-aim-ent-ai-data-leader-book-1-ent.html?trk=68d21c4e-c3e8-4668-959c-cb996265ef28&amp;sc_channel=el"><span>Get the AWS Data &amp; AI Book</span></a></p><p>(Thanks to <a href="https://pages.awscloud.com/awsmp-gim-jqup-adhoc-aim-ent-ai-data-leader-book-1-ent.html?trk=68d21c4e-c3e8-4668-959c-cb996265ef28&amp;sc_channel=el">AWS</a> for partnering on this post.)</p><div><hr></div><p><strong>Inside this newsletter, you&#8217;ll get:</strong></p><ul><li><p><strong>The shift from prompts to systems.</strong> Why simple LLM calls break down, and how workflows and agents introduce structured decision-making.</p></li><li><p><strong>A practical escalation ladder.</strong> When to stay simple, when to use workflows, and when agents are actually justified.</p></li><li><p><strong>Workflow patterns.</strong> chaining, routing, parallelization, and orchestrator-workers, with clear tradeoffs in cost, latency, and failure modes.</p></li><li><p><strong>Agent patterns.</strong> Reflection, tool use, ReAct, and planning, focusing on how the model makes decisions and where things go wrong.</p></li><li><p><strong>Evaluator-optimizer loop.</strong> How to add a reliability layer with evaluation, iteration limits, and cost control.</p></li><li><p><strong>Real-world case study.</strong> Combining multiple patterns in an AI code review pipeline, showing how to choose the simplest architecture that works.</p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Golden members get all posts like these!...</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2><strong>Not everything needs an agent</strong></h2><p>Most tasks don&#8217;t need an agent<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>&#8230;</p><p>Most tasks don&#8217;t even need a workflow&#8230;The default should be the simplest setup that works, and you escalate only when that setup breaks.</p><p>Start with a <strong>direct API call to an LLM</strong>. You can do more with this than most people think:</p><ul><li><p>Summarization</p></li><li><p>Classification</p></li><li><p>Extraction</p></li><li><p>Rewriting</p></li><li><p>Translation</p></li><li><p>Code generation with clear specs</p></li></ul><p>Most people skip this level too quickly!</p><p><strong>Workflow patterns</strong> are the next step up.</p><p>Escalate when a task has many steps, and when focused attention at each step improves the output. Between those steps, you define <strong>validation gates</strong>: <em>checks that verify the output before passing it forward</em>. The defining feature: <em>you can write every step down before the system runs.</em> Your code owns the control flow.</p><p>And the LLM handles the details.</p><p><strong>Agent patterns</strong> are for when the number and type of steps are <em>unknown</em> until the system runs.</p><p>The model needs to look at results, decide what to do next, and choose its own path. For example, say you built a support agent that handles order cancellations. A customer asks to cancel, but the agent doesn&#8217;t know upfront whether it needs to check the refund policy, look up the shipping status, or pass the issue to a human.</p><p>The steps depend on what it finds&#8230;</p><p>Here&#8217;s how you know it&#8217;s time to switch: you write more code handling errors and exceptions than doing the actual work. Or you keep adding special cases for situations the LLM encounters you didn&#8217;t plan for.</p><p>If you can still write down all the steps before the system runs, stick with a workflow.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SSER!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6337a830-7cfd-437c-a23d-24bf10f7dafb_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SSER!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6337a830-7cfd-437c-a23d-24bf10f7dafb_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!SSER!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6337a830-7cfd-437c-a23d-24bf10f7dafb_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!SSER!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6337a830-7cfd-437c-a23d-24bf10f7dafb_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!SSER!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6337a830-7cfd-437c-a23d-24bf10f7dafb_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SSER!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6337a830-7cfd-437c-a23d-24bf10f7dafb_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6337a830-7cfd-437c-a23d-24bf10f7dafb_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Agentic pattern complexity escalation ladder showing four levels from direct API calls to multi-agent systems with decision criteria at each step-up point&quot;,&quot;title&quot;:&quot;Agentic pattern complexity escalation ladder showing four levels from direct API calls to multi-agent systems with decision criteria at each step-up point&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Agentic pattern complexity escalation ladder showing four levels from direct API calls to multi-agent systems with decision criteria at each step-up point" title="Agentic pattern complexity escalation ladder showing four levels from direct API calls to multi-agent systems with decision criteria at each step-up point" srcset="https://substackcdn.com/image/fetch/$s_!SSER!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6337a830-7cfd-437c-a23d-24bf10f7dafb_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!SSER!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6337a830-7cfd-437c-a23d-24bf10f7dafb_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!SSER!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6337a830-7cfd-437c-a23d-24bf10f7dafb_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!SSER!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6337a830-7cfd-437c-a23d-24bf10f7dafb_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Agent patterns sit within multi-agent architectures, i.e., many agents cooperate on the same problem, each with its own role.</p><p>One common mistake: you get your system to 70-80% of a prototype and assume the architecture needs upgrading. It usually doesn&#8217;t&#8230;The real issue is usually prompt quality or missing validation gates. Move to a more complex pattern only after the simpler one has actually failed, not when it feels limiting.</p><p>With that baseline set, here are the four workflow patterns:</p><div><hr></div><h2><strong>Workflow patterns: you control the flow</strong></h2><p>With workflow patterns, your code controls what happens.</p><p>You decide the steps, the order, and the checks between them. The LLM handles each step, but your code is in charge&#8230;</p><h3><strong>1. Prompt chaining</strong></h3><p>Prompt chaining breaks a task into a series of LLM calls.</p><p>Each call takes the output from the previous step, and a check runs between them to ensure the result is correct before moving on.</p><p>If you&#8217;ve ever set up a <strong>CI/CD</strong> (Continuous Integration/Continuous Deployment) pipeline, this is the same idea. A build pipeline compiles, lints, tests, and packages. Each step must pass before the next one runs.</p><p>If tests fail, the pipeline stops.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qUdp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f8b6ec-1bc5-4670-9b25-3c2d782f1098_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qUdp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f8b6ec-1bc5-4670-9b25-3c2d782f1098_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qUdp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f8b6ec-1bc5-4670-9b25-3c2d782f1098_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qUdp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f8b6ec-1bc5-4670-9b25-3c2d782f1098_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qUdp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f8b6ec-1bc5-4670-9b25-3c2d782f1098_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qUdp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f8b6ec-1bc5-4670-9b25-3c2d782f1098_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6f8b6ec-1bc5-4670-9b25-3c2d782f1098_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Prompt chaining workflow pattern showing sequential LLM calls connected by validation gates that stop the pipeline on failure&quot;,&quot;title&quot;:&quot;Prompt chaining workflow pattern showing sequential LLM calls connected by validation gates that stop the pipeline on failure&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Prompt chaining workflow pattern showing sequential LLM calls connected by validation gates that stop the pipeline on failure" title="Prompt chaining workflow pattern showing sequential LLM calls connected by validation gates that stop the pipeline on failure" srcset="https://substackcdn.com/image/fetch/$s_!qUdp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f8b6ec-1bc5-4670-9b25-3c2d782f1098_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qUdp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f8b6ec-1bc5-4670-9b25-3c2d782f1098_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qUdp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f8b6ec-1bc5-4670-9b25-3c2d782f1098_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qUdp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f8b6ec-1bc5-4670-9b25-3c2d782f1098_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For example, in <a href="https://gradientflow.substack.com/p/ais-biggest-enterprise-test-case">legal contract review</a>, this looks like: extract all clauses from a contract, classify each clause by risk level, then generate a summary of high-risk items for human review. Products like <a href="https://legal.thomsonreuters.com/en/c/cocounsel">Thomson Reuters CoCounsel</a> and Robin AI run variations of this pipeline.</p><p>Each step gets a focused prompt with a narrow task, and a validation gate checks the output before passing it forward.</p><h4><em><strong>Tradeoffs</strong></em></h4><p>Latency grows linearly&#8230;</p><p>Five steps mean five round-trip. Errors carry forward: if step two misclassifies a clause, steps three through five process bad input. The gates between steps catch these failures early, but they only catch what the conditions you write can catch.</p><h3><strong>2. Routing</strong></h3><p>A routing pattern classifies the input and sends it to the right handler: <em>a different prompt, a different model, or a different sub-workflow.</em></p><p>Each handler is built for one type of input.</p><p>This is like a hospital front desk. A patient walks in, the front desk checks their symptoms, and then sends them to the right doctor. The front desk doesn&#8217;t treat anyone. And the doctor doesn&#8217;t check in patients.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YT5p!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36bd3215-0100-4043-9ebd-0589949a309a_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YT5p!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36bd3215-0100-4043-9ebd-0589949a309a_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!YT5p!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36bd3215-0100-4043-9ebd-0589949a309a_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!YT5p!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36bd3215-0100-4043-9ebd-0589949a309a_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!YT5p!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36bd3215-0100-4043-9ebd-0589949a309a_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YT5p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36bd3215-0100-4043-9ebd-0589949a309a_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/36bd3215-0100-4043-9ebd-0589949a309a_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Routing pattern architecture showing a classifier directing inputs to cheap, specialized, or expensive model handlers based on input type&quot;,&quot;title&quot;:&quot;Routing pattern architecture showing a classifier directing inputs to cheap, specialized, or expensive model handlers based on input type&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Routing pattern architecture showing a classifier directing inputs to cheap, specialized, or expensive model handlers based on input type" title="Routing pattern architecture showing a classifier directing inputs to cheap, specialized, or expensive model handlers based on input type" srcset="https://substackcdn.com/image/fetch/$s_!YT5p!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36bd3215-0100-4043-9ebd-0589949a309a_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!YT5p!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36bd3215-0100-4043-9ebd-0589949a309a_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!YT5p!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36bd3215-0100-4043-9ebd-0589949a309a_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!YT5p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36bd3215-0100-4043-9ebd-0589949a309a_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><a href="https://sierra.ai/blog/constellation-of-models">Sierra AI</a> routes across 15+ models this way.</p><p>Classification tasks go to one model, tool calling to another, and response generation to a third. Customer support platforms like <a href="https://www.intercom.com/help/en/articles/9929230-the-fin-ai-engine">Intercom Fin</a> route each customer request based on what the customer wants and how they feel about it, sending requests to either an AI handler or a human agent.</p><p>The cost argument is built into the pattern&#8230;</p><p>Simple queries hit cheap, fast models. While complex ones hit expensive, capable ones. You&#8217;re not paying the highest prices for work that a smaller model handles well.</p><h4><em><strong>Tradeoffs</strong></em></h4><p>In 1 word: misclassification.</p><p>A complex query routed to the cheaper model produces a poor answer. A simple query routed to the expensive model is just a waste of money.</p><p>Either way, the router is a single point of failure, and its accuracy sets the upper limit for the entire system.</p><h3><strong>3. Parallelization</strong></h3><p>In software, parallelization means running many tasks simultaneously rather than one after another. With LLMs, it has two variants that solve different problems&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8m3i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F244a234b-7085-4ac5-ae83-4a49c35d3810_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8m3i!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F244a234b-7085-4ac5-ae83-4a49c35d3810_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8m3i!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F244a234b-7085-4ac5-ae83-4a49c35d3810_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8m3i!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F244a234b-7085-4ac5-ae83-4a49c35d3810_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8m3i!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F244a234b-7085-4ac5-ae83-4a49c35d3810_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8m3i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F244a234b-7085-4ac5-ae83-4a49c35d3810_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/244a234b-7085-4ac5-ae83-4a49c35d3810_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Parallelization pattern comparing sectioning (split task into independent subtasks and merge) versus voting (run same task multiple times and aggregate for consensus)&quot;,&quot;title&quot;:&quot;Parallelization pattern comparing sectioning (split task into independent subtasks and merge) versus voting (run same task multiple times and aggregate for consensus)&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Parallelization pattern comparing sectioning (split task into independent subtasks and merge) versus voting (run same task multiple times and aggregate for consensus)" title="Parallelization pattern comparing sectioning (split task into independent subtasks and merge) versus voting (run same task multiple times and aggregate for consensus)" srcset="https://substackcdn.com/image/fetch/$s_!8m3i!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F244a234b-7085-4ac5-ae83-4a49c35d3810_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8m3i!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F244a234b-7085-4ac5-ae83-4a49c35d3810_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8m3i!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F244a234b-7085-4ac5-ae83-4a49c35d3810_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8m3i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F244a234b-7085-4ac5-ae83-4a49c35d3810_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Sectioning</strong> splits a task into independent parts, runs them in parallel, and merges the results. This is like a construction crew. Electricians, plumbers, and carpenters work in the same building at the same time. Their work gets combined at the end.</p><p><a href="https://docs.github.com/en/code-security/code-scanning/introduction-to-code-scanning/about-code-scanning">GitHub Advanced Security</a> works this way: CodeQL and third-party tools like Snyk and Semgrep scan the same pull request in parallel, each producing independent security alerts.</p><p><strong>Voting</strong> runs the same task many times with different prompts and combines the answers. This is like a jury. Each juror looks at the same evidence and gives their own opinion, and the group decides together.</p><p>You&#8217;d build this for something like a security review: three separate prompts analyze the same code change for vulnerabilities, and you flag an issue only if two or more agree. Running the task many times is more expensive, but it catches fewer false alarms.</p><p>In short, <em>sectioning</em> gives each worker a different job; <em>voting</em> gives every worker the same job.</p><h4><em><strong>Tradeoffs</strong></em></h4><p>Cost multiplies with every parallel branch.</p><p>Partial failures are a design decision you need to plan for up front:<em> if one branch fails, do you retry it, proceed with the others, or fail the whole operation?</em> There&#8217;s no universal answer, and a wrong choice is expensive.</p><h3><strong>4. Orchestrator-workers</strong></h3><p>The orchestrator-workers pattern uses one central LLM to break a task into subtasks, assign each subtask to a worker LLM, and combine the results.</p><p>The difference from the earlier patterns is that the subtasks aren&#8217;t known in advance. Orchestrator figures them out as it goes. This is like a general contractor building a house:</p><ul><li><p>They hire the right people, but don&#8217;t know every subcontractor they&#8217;ll need upfront</p></li><li><p>Plumber finds a structural problem, so the contractor brings in a structural engineer</p></li><li><p>The work plan changes as the project reveals new problems</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!c9_Q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042accd5-ff55-4030-acf1-e28ecfcb7f0f_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!c9_Q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042accd5-ff55-4030-acf1-e28ecfcb7f0f_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!c9_Q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042accd5-ff55-4030-acf1-e28ecfcb7f0f_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!c9_Q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042accd5-ff55-4030-acf1-e28ecfcb7f0f_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!c9_Q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042accd5-ff55-4030-acf1-e28ecfcb7f0f_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!c9_Q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042accd5-ff55-4030-acf1-e28ecfcb7f0f_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/042accd5-ff55-4030-acf1-e28ecfcb7f0f_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Orchestrator-workers pattern showing a central LLM dynamically delegating to worker agents including unknown workers decided at runtime&quot;,&quot;title&quot;:&quot;Orchestrator-workers pattern showing a central LLM dynamically delegating to worker agents including unknown workers decided at runtime&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Orchestrator-workers pattern showing a central LLM dynamically delegating to worker agents including unknown workers decided at runtime" title="Orchestrator-workers pattern showing a central LLM dynamically delegating to worker agents including unknown workers decided at runtime" srcset="https://substackcdn.com/image/fetch/$s_!c9_Q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042accd5-ff55-4030-acf1-e28ecfcb7f0f_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!c9_Q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042accd5-ff55-4030-acf1-e28ecfcb7f0f_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!c9_Q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042accd5-ff55-4030-acf1-e28ecfcb7f0f_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!c9_Q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042accd5-ff55-4030-acf1-e28ecfcb7f0f_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><a href="https://cursor.com/learn/agents">Cursor&#8217;s agent mode</a> works this way. It starts with different agents for different parts of the codebase:</p><ul><li><p>One adding tests</p></li><li><p>Another updating documentation</p></li><li><p>A third refactoring shared utilities</p></li></ul><p>You&#8217;re trusting the LLM to figure out what work needs to be done, not just to do the work you&#8217;ve assigned.</p><p>This uses several LLMs, but it&#8217;s NOT a multi-agent architecture: <em>a single central LLM remains in control of the entire process.</em> In multi-agent systems, agents can call each other directly without a central coordinator, and control can transfer between them.</p><h4><em><strong>Tradeoffs</strong></em></h4><p>The orchestrator can lose track of the original goal.</p><p>It breaks simple tasks into too many subtasks. And when workers finish faster than it can combine results, the orchestrator itself becomes the bottleneck.</p><div><hr></div><p>These four patterns get gradually more powerful and more expensive&#8230;</p><p>Chaining and routing are cheap and predictable: you know what will happen before it runs. Parallelization saves time but costs more. Orchestrator-workers handle problems you can&#8217;t predict upfront, but it introduces failure modes you can&#8217;t predict either. The first three patterns are simple to understand.</p><p>Most production systems shouldn&#8217;t need to go further than parallelization. When they do, the next four patterns hand control to the model itself...</p><div><hr></div><p><em><strong>Reminder: this is a teaser of the subscriber-only newsletter, exclusive to my golden members.</strong></em></p><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>High-level architecture of real-world systems.</strong></p></li><li><p>Deep dive into how popular real-world systems work.</p></li><li><p><strong>How real-world systems handle scale, reliability, and performance.</strong></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe&quot;,&quot;text&quot;:&quot;Unlock Full Access&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe"><span>Unlock Full Access</span></a></p><div><hr></div><h2><strong>Agent patterns: LLM controls the flow</strong></h2><p>With agent patterns, you STOP defining what happens and in what order.</p><p>Instead of defining steps, you define constraints: <em>what tools are available, how much the model can spend, and when to stop</em>. The model observes, reasons, and chooses what happens next.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8Y67!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05738b84-f3c1-43d4-a706-6ed0ae52f5ef_2048x1143.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8Y67!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05738b84-f3c1-43d4-a706-6ed0ae52f5ef_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8Y67!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05738b84-f3c1-43d4-a706-6ed0ae52f5ef_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8Y67!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05738b84-f3c1-43d4-a706-6ed0ae52f5ef_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8Y67!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05738b84-f3c1-43d4-a706-6ed0ae52f5ef_2048x1143.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8Y67!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05738b84-f3c1-43d4-a706-6ed0ae52f5ef_2048x1143.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/05738b84-f3c1-43d4-a706-6ed0ae52f5ef_2048x1143.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Side-by-side comparison of workflow patterns where you define the fixed execution path versus agent patterns where the LLM controls the flow and you define the boundaries&quot;,&quot;title&quot;:&quot;Side-by-side comparison of workflow patterns where you define the fixed execution path versus agent patterns where the LLM controls the flow and you define the boundaries&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Side-by-side comparison of workflow patterns where you define the fixed execution path versus agent patterns where the LLM controls the flow and you define the boundaries" title="Side-by-side comparison of workflow patterns where you define the fixed execution path versus agent patterns where the LLM controls the flow and you define the boundaries" srcset="https://substackcdn.com/image/fetch/$s_!8Y67!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05738b84-f3c1-43d4-a706-6ed0ae52f5ef_2048x1143.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8Y67!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05738b84-f3c1-43d4-a706-6ed0ae52f5ef_2048x1143.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8Y67!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05738b84-f3c1-43d4-a706-6ed0ae52f5ef_2048x1143.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8Y67!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05738b84-f3c1-43d4-a706-6ed0ae52f5ef_2048x1143.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>That trust decision appears differently in each pattern: <em>how much autonomy, over what actions, with what limits.</em></p><h3><strong>5. Reflection</strong></h3><p>Reflection is when the LLM generates output, reviews its own work, and then fixes the problems it found.</p>
      <p>
          <a href="https://newsletter.systemdesign.one/p/agentic-design-patterns">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[I rejected 1000s of resumes at Meta]]></title><description><![CDATA[#139: This is what they got wrong]]></description><link>https://newsletter.systemdesign.one/p/software-engineer-resume</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/software-engineer-resume</guid><dc:creator><![CDATA[Austen McDonald]]></dc:creator><pubDate>Wed, 15 Apr 2026 07:40:44 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/2eedd062-dbeb-4728-8592-effb8923d220_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/software-engineer-resume/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><div><hr></div><p>Most engineering resume advice you come across is junk.</p><p>At best, it comes from a tech recruiter who&#8217;s looked at 100s of resumes. At worst, it&#8217;s from yet another senior engineer cobbling together a blog post based on their experience applying to MAANGO.</p><p>This post is not like that.</p><p>I&#8217;ve made thousands of hiring and leveling decisions as mobile Hiring Committee Chair at Meta. I also led Meta&#8217;s engineering team responsible for finding and ingesting over 2M engineers&#8217; resumes into their internal <strong>ATS</strong> (Applicant Tracking System).</p><p>I&#8217;ve seen a lot of resumes :)</p><p>And because of that, I know that <strong>you should treat your resume like a product</strong>. Think of it as a landing page for your career, designed to drive each target user who touches it to take action. And yes, your resume has multiple target users&#8212;four, in fact&#8212;and you should learn what each one is looking for.</p><p>(Oh, and btw, a lot of this applies to your LinkedIn profile too, since that is literally a landing page.)</p><p>Onward.</p><div><hr></div><h2><strong><a href="https://cline.gg/neo">The tax you pay to run multiple agents (Partner)</a></strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://cline.gg/neo" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MX0h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 424w, https://substackcdn.com/image/fetch/$s_!MX0h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 848w, https://substackcdn.com/image/fetch/$s_!MX0h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 1272w, https://substackcdn.com/image/fetch/$s_!MX0h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MX0h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://cline.gg/neo&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!MX0h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 424w, https://substackcdn.com/image/fetch/$s_!MX0h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 848w, https://substackcdn.com/image/fetch/$s_!MX0h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 1272w, https://substackcdn.com/image/fetch/$s_!MX0h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If you&#8217;ve spent any time with coding agents, you know the feeling.</p><p>You start the morning with a clean plan. Spin up a few agents. One is refactoring the auth module. Another is writing tests. A third is scaffolding a new API endpoint. You&#8217;re flying.</p><p>Then, around 10:30 AM, you look up and realize you have 20 terminal windows open. One agent is blocked waiting for a decision you forgot to make. Another finished 40 minutes ago, and you never noticed. A third went sideways three commits back. You&#8217;re no longer flying. You&#8217;re drowning.</p><p>You&#8217;ve shifted from human as driver to human as director.</p><p>When running coding agents in parallel, the bottleneck isn&#8217;t just context. It&#8217;s your own attention trying to manage 10 agents across 10 terminals. You&#8217;re losing your mind to terminal chaos.</p><p>Meet <strong><a href="https://cline.gg/neo">Cline Kanban</a></strong>, a CLI-agnostic visual orchestration layer that makes multi-agent workflows usable across providers. Multiple agents, one UI. It&#8217;s the air traffic controller for the agents you&#8217;re already running, regardless of where they live.</p><ul><li><p><strong>Interoperable: </strong>Claude Code and Codex compatible, with more coming soon.</p></li><li><p><strong>Full Visibility:</strong> Confidently run multiple agents and work through your backlog faster.</p></li><li><p><strong>Smart Triage:</strong> See which agents are blocked or in review and jump in to unblock them.</p></li><li><p><strong>Chain Tasks:</strong> Set dependencies so Agent B won&#8217;t start until Agent A is complete.</p></li><li><p><strong>Familiar UI:</strong> Everything in a single Kanban view.</p></li></ul><p>Stop tracking agents and start directing them. Get a meaningful edge with the beta release.</p><p><strong>Install Cline Kanban Today: </strong><code>npm i -g cline</code></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://cline.gg/neo&quot;,&quot;text&quot;:&quot;Get Started Today&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://cline.gg/neo"><span>Get Started Today</span></a></p><p>(Thanks to <a href="https://cline.gg/neo">Cline</a> for partnering on this post.)</p><div><hr></div><p>I want to introduce you to <strong><a href="https://thebehavioral.tech/">Austen McDonald</a></strong> as a guest author.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://thebehavioral.tech/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ifOs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd926164c-c3c5-4428-91a7-b237ed8cb6ad_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!ifOs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd926164c-c3c5-4428-91a7-b237ed8cb6ad_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!ifOs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd926164c-c3c5-4428-91a7-b237ed8cb6ad_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!ifOs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd926164c-c3c5-4428-91a7-b237ed8cb6ad_1200x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ifOs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd926164c-c3c5-4428-91a7-b237ed8cb6ad_1200x630.png" width="1200" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d926164c-c3c5-4428-91a7-b237ed8cb6ad_1200x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:324054,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://thebehavioral.tech/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/193714466?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd926164c-c3c5-4428-91a7-b237ed8cb6ad_1200x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ifOs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd926164c-c3c5-4428-91a7-b237ed8cb6ad_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!ifOs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd926164c-c3c5-4428-91a7-b237ed8cb6ad_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!ifOs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd926164c-c3c5-4428-91a7-b237ed8cb6ad_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!ifOs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd926164c-c3c5-4428-91a7-b237ed8cb6ad_1200x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Austen is a former Senior Engineering Manager and Hiring Committee Chair at Meta, where he spent almost 10 years and conducted over 1,000+ interviews. He&#8217;s the industry&#8217;s resident expert on behavioral interviews and writes about them on his <strong><a href="https://thebehavioral.substack.com/welcome">Substack</a></strong> and in his book, <strong><a href="https://thebehavioral.tech/">Mastering Behavioral Interviews</a>.</strong> As AI raises the bar for ownership and independent judgment, behavioral interviews matter more than ever.</p><p>For a few days only, use code <em><strong>SDN26</strong></em> for <strong><a href="https://thebehavioral.tech/premium-access">20% off Premium Access</a></strong>, which includes a print copy of his book and an AI copilot that helps you find your best career stories, shape them into high-signal answers, and deliver them well.</p><div><hr></div><h1>Part 1: Your Resume is a Product</h1><p>Your resume is a landing page for your career search.</p><p>It&#8217;s the first thing these four personas see when they consider you for an engineering job:</p><ul><li><p>Machines (Applicant Tracking System),</p></li><li><p>Recruiters (we&#8217;ll include sourcers for all you sticklers),</p></li><li><p>Interviewers,</p></li><li><p>And Hiring Managers.</p></li></ul><p>Each of these personas looks progressively deeper into the content in the resume. Each one also extracts different, but overlapping, information.</p><p>Let&#8217;s consider the needs of each of these personas, what they look for in a resume, how they read it, and the decisions they make along the way&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CnMx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d91b52-093c-43a0-b72f-48192d75887b_1600x683.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CnMx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d91b52-093c-43a0-b72f-48192d75887b_1600x683.png 424w, https://substackcdn.com/image/fetch/$s_!CnMx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d91b52-093c-43a0-b72f-48192d75887b_1600x683.png 848w, https://substackcdn.com/image/fetch/$s_!CnMx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d91b52-093c-43a0-b72f-48192d75887b_1600x683.png 1272w, https://substackcdn.com/image/fetch/$s_!CnMx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d91b52-093c-43a0-b72f-48192d75887b_1600x683.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CnMx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d91b52-093c-43a0-b72f-48192d75887b_1600x683.png" width="1456" height="622" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/28d91b52-093c-43a0-b72f-48192d75887b_1600x683.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:622,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CnMx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d91b52-093c-43a0-b72f-48192d75887b_1600x683.png 424w, https://substackcdn.com/image/fetch/$s_!CnMx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d91b52-093c-43a0-b72f-48192d75887b_1600x683.png 848w, https://substackcdn.com/image/fetch/$s_!CnMx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d91b52-093c-43a0-b72f-48192d75887b_1600x683.png 1272w, https://substackcdn.com/image/fetch/$s_!CnMx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d91b52-093c-43a0-b72f-48192d75887b_1600x683.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Machines (ATS)</h2><p>Unless your profile is handed directly to a recruiter, your hiring process starts with software. Whether you apply online, get referred, or turn up in a sourcer&#8217;s keyword search, you need to pass this automated filter first&#8230;</p><p><strong>How ATS systems work:</strong></p><p>When you submit a resume, the ATS parses it into structured data: <em>your name, contact info, work history, education, and skills.</em></p><p>It then scores you against the job description on keyword relevance, years of experience, and other recruiter-configured criteria. High-scoring candidates float to the top of the pile; low-scoring ones may never be seen by a human.</p><p>Of course, companies use AI-powered search that goes beyond keyword matching. These systems understand that &#8220;built ML pipelines&#8221; and &#8220;machine learning infrastructure&#8221; mean the same thing. This is good news&#8212;it means you don&#8217;t need to stuff your resume with every keyword&#8212;but it also means your accomplishments need to be substantively relevant to the role and described as such.</p><p><strong>What they need:</strong></p><ul><li><p>Clean, parseable formatting. Fancy designs can break text extraction: tables, some multi-column layouts, text boxes, icons, or images. Even creative section titles like &#8220;My Journey&#8221; instead of &#8220;Professional Experience&#8221; are risky.</p></li><li><p>Easy to find minimum qualifications since an automated tool is probably filtering first on whatever is stated in the job description.</p></li><li><p>Keywords matching the job posting and role requirements.</p></li><li><p>Technologies named where they can be easily indexed.</p></li></ul><p><strong>Hot Take:</strong> I have no data for this, but numerous companies have buttons on their career sites that say, &#8220;Upload your resume to see if you&#8217;re a fit.&#8221; Seems like it would be valuable to iterate on your resume until the site tells you you&#8217;re a good fit, lol.</p><h2>Recruiters</h2><p>Recruiters (or, more properly, sourcers) are speed readers scanning hundreds of resumes daily.</p><p>After basic machine filtering for experience, education, and past companies, they&#8217;ll skim your summary and a few accomplishment bullets, then make an in/out call in 5&#8211;7 seconds.</p><p>They&#8217;re looking for two things: candidates who meet the stated requirements, and candidates who fit the hiring manager&#8217;s mental model of a great hire. Those who clear both bars move forward to the next stage.</p><p>This is either a direct screening process, such as a phone call, or a final call to the hiring manager or committee before they begin screening process.</p><p><strong>What they need:</strong></p><ul><li><p>To immediately see that your profile fits the minimum requirements. They will reject you immediately if you don&#8217;t have the years of experience or the proper degrees, or have worked in similar roles (based mostly on title) that are identified in the job description.</p></li><li><p>A quick visual path to something that excites them about you and convinces them you fit the idea of what the hiring manager is looking for. This could be a name-brand company, a project relevant to the hiring team, or frequent use of technology related to the role. Give them a reason to take action and move you to the next step.</p></li><li><p>Brevity. Obviously, with 100s of these to review, the shorter the time all this takes, the better. So keep it to one page unless you&#8217;re a CEO.</p></li></ul><h2>Interviewers</h2><p>There are a couple of places where interviewers or members of the engineering team read your resume.</p><p>Sometimes there is a layer of reviewers who approve the resume before they green-light a technical screen. It could be the hiring manager or a committee of ICs. You can see them more as a technical version of a recruiter, as they have the same needs.</p><p>Technical screeners are a different type of reader. They&#8217;re looking for conversation starters and signals that you&#8217;d contribute to the team.</p><p>Onsite interviewers (especially behavioral interviewers) are gauging your level based on the business impact you&#8217;ve driven. They&#8217;re also looking for red flags, such as performance challenges: short stints at a workplace or career gaps.</p><p><strong>What they need:</strong></p><ul><li><p>Easy to find accomplishments that look like they fit with the work of the company/team</p></li><li><p>Evidence you can defend your claims under scrutiny: those accomplishments need technical details</p></li><li><p>Signals of technical depth and leadership</p></li><li><p>A minimum of reasons to be suspicious</p></li></ul><h2>Hiring Managers</h2><p>Hiring managers are sometimes the first, as we talked about above in &#8220;Interviewers,&#8221; but almost always the last person to touch your resume.</p><p>At this stage, they&#8217;re reviewing a handful of finalists, which makes them the most thorough readers of your resume. They know what their team needs and look for specific evidence of alignment with those needs.</p><p><strong>What they need:</strong></p><ul><li><p>Evidence that your past work resembles what they&#8217;re hiring for</p></li><li><p>Measurable impact with appropriate scope</p></li><li><p>They want ownership and growth signals: can you self-direct, drive work to completion, and adapt fast, i.e., you&#8217;re ready for agentic coding.</p></li><li><p>Clear visual hierarchy so they can find what matters</p></li></ul><p>Now that you know who&#8217;s reading your resume and what they&#8217;re scanning for, there&#8217;s a natural question: <em>what do you actually put on the page?</em></p><p>The answer isn&#8217;t &#8220;everything you&#8217;ve ever done.&#8221; Instead, curate a story of the impact you&#8217;ve made in your career. The next part walks you through that step by step...</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/software-engineer-resume?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption"><em>Share this post &amp; get rewards for the referrals.</em></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/software-engineer-resume?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/software-engineer-resume?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h1>Part 2: Resume Content that Converts</h1><p>Now that we understand the users of your resume, let&#8217;s build a product that converts, moving you through the recruiting pipeline to a hire!</p><h2>The Basics: What Goes In and What Stays Out of Your Resume</h2><p>Before diving into content, we should get the basics out of the way.</p><p>Here&#8217;s the standard structure that readers expect, roughly in this order:</p><ol><li><p><strong>Contact Information</strong> &#8211; name, email, LinkedIn (maybe GitHub, if it&#8217;s good; more on that later)</p></li><li><p><strong>Summary</strong> &#8211; Who you are and why you&#8217;re right for the role in 2-3 sentences</p></li><li><p><strong>Work Experience</strong> &#8211; Your professional history with accomplishments</p></li><li><p><strong>Education</strong> &#8211; Degrees, relevant coursework, honors</p></li><li><p><strong>Projects</strong> &#8211; Side projects, open source, portfolio work</p></li><li><p><strong>Skills</strong> &#8211; The keyword soup for machines, as we&#8217;ll see</p></li></ol><p>This order should vary depending on who you are: experienced candidates can de-emphasize education; new grads should surface projects higher.</p><p>But readers expect to find things in roughly this order, so don&#8217;t make them hunt.</p><h2>Contact Information</h2><p>Of course, you should include basics like name and email.</p><p>If you want a phone call, include your phone number. Recruiters will ask for it later if you don&#8217;t give it now.</p><p>I also believe you should include a LinkedIn URL, linkified in the PDF form of the resume. Companies expect engineers to have LinkedIn profiles and may ingest that profile data automatically (don&#8217;t tell the LinkedIn lawyers). Much of the advice I give in this newsletter applies to LinkedIn profiles as well, especially the summary and bullet-point writing.</p><p>And please get a decent profile photo and cover photo for your LinkedIn profile.</p><p>Anything else you put in this contact information section should be additive to your candidacy. Like, don&#8217;t include a GitHub or personal website link if that&#8217;s there, it&#8217;s unimpressive. GitHub profiles need at least some recruiting-friendly repos that are deployed somewhere and well-documented. Personal websites should work (lol) and be a high-quality representation of yourself. Otherwise, don&#8217;t include them.</p><p>If you&#8217;re a US citizen applying to US roles and your citizenship isn&#8217;t obvious from your resume, note it in your contact section.</p><h2>Write a Summary</h2><p>Most resume advice is subjective.</p><p>Here&#8217;s my most opinionated take in this newsletter: <em>always include a summary and put it at the top.</em></p><p>Two or three sentences max. And tailor it to the role. After reading it, I should immediately come away with the idea that you&#8217;d be a good fit for the role. Again, recruiters will heavily depend on this summary to help them decide.</p><p><strong>&#9888;&#65039; Warning: Don&#8217;t let AI write your summary.</strong> AI-generated summaries are an epidemic. They all sound the same&#8212;vague, buzzword-laden, and interchangeable. Recruiters and hiring managers can spot them instantly, and they leave readers with zero understanding of who you actually are.</p><p>These all say nothing:</p><blockquote><p><em>Results-driven Front End Engineer with 4+ years of experience building scalable, user-centric web applications. Proven track record of delivering high-quality code in fast-paced environments. Passionate about creating seamless user experiences and collaborating with cross-functional teams.</em></p><p><em>Dynamic and detail-oriented Front End Engineer with expertise in modern JavaScript frameworks. Adept at translating complex requirements into elegant, performant solutions. Strong communicator with a passion for continuous learning and staying current with industry trends.</em></p><p><em>Innovative Front End Engineer bringing 4 years of hands-on experience crafting responsive, accessible interfaces. Committed to writing clean, maintainable code and driving technical excellence. Thrives in collaborative environments and passionate about mentoring junior developers.</em></p></blockquote><p>My mind just numbs over reading these.</p><p>Phrases like &#8220;results-driven,&#8221; &#8220;passionate,&#8221; and &#8220;cross-functional teams&#8221; are filler. They fit any candidate, which means they describe none of them.</p><ul><li><p>What kind of products have you built? (B2B dashboards? Consumer apps? E-commerce?)</p></li><li><p>What&#8217;s your technical specialty? (Performance optimization? Design systems? Accessibility?)</p></li><li><p>What makes you different from the other 500 front-end engineers applying?</p></li></ul><h3>A Better Summary</h3><p>Let&#8217;s build something better, remembering to tailor the summary to the role.</p><p>Suppose you were applying to some B2B SaaS company as this same front-end engineer who&#8217;s looking to replace their AI-built summary.</p><p>What about this one, that I typed with my own meat sticks:</p><blockquote><p><em>I make complex data feel simple. I&#8217;m a front-end engineer with 4 years experience building B2B decision platforms at multiple fast-moving startups.</em></p></blockquote><p>First, it&#8217;s shorter, and therefore more likely to be read.</p><p>My technical writing instructor in college told me, &#8220;Every line decreases your readership by half.&#8221; I don&#8217;t know if that&#8217;s true, but it&#8217;s a good principle to live by.</p><p>Next, it&#8217;s interesting and attractive. Who doesn&#8217;t want someone who makes complex data feel simple?</p><p>It also immediately communicates who you are: a front-end engineer with 4 years of experience at startups. Now I know whether your profile is relevant to me or not.</p><p>Finally, it connects to the role.</p><p><strong>Note:</strong> Let&#8217;s talk about the &#8220;cute summary&#8221;. I actually prefer something like &#8220;Assistant to my local Claude Code instance&#8221; over a buzzwordy AI summary. At least it communicates some creativity and humanity. But I think you can do better and should do better unless it&#8217;s really obvious your resume matches the target role&#8212;like you&#8217;re applying for a senior ML safety engineering role at Big Tech X and just came from Big Tech Y with the same title.</p><h2>Writing Bullets That Get Read</h2><p>Resume readers are hiring you to deliver business value using technology.</p><p>That&#8217;s why they need an engineer. So give them confidence that you can do that with the bullets you list under your experience.</p><p>Here&#8217;s how you can do this:</p><ul><li><p><strong>Lead with results, not activities.</strong> Start with what happened, not what you did. The activity is the supporting detail. Use a formula like: &#8220;Delivered [business value] using [technology/approach].&#8221;</p></li><li><p><strong>Quantify and show scope.</strong> Percentages, dollar amounts, user counts, latency improvements, and length of the project. Let the numbers imply how hard the work was.</p></li><li><p><strong>Your bullets become interview topics.</strong> The accomplishments you list will get asked about. Make them interesting enough to discuss and defensible under scrutiny.</p></li><li><p><strong>Include leadership signals.</strong> Highlight moments where you took initiative, managed others, or drove a project forward&#8212;even if you weren&#8217;t a manager.</p></li><li><p><strong>Lead with your best.</strong> Don&#8217;t sort accomplishments chronologically within a role. Place the most relevant and impactful ones first in each section.</p></li><li><p><strong>Embed tech in your accomplishments.</strong> Mentioning technologies within achievement bullets improves your ranking for those keywords, not just in a separate skills section.</p></li></ul><p>Here are some examples of weak vs strong bullets:</p><blockquote><p><strong>Weak:</strong> <em>Worked on the checkout system and fixed bugs.</em></p><p><strong>Strong:</strong> <em>Reduced checkout abandonment by 12% by identifying and resolving a race condition in the payment flow using React and Stripe&#8217;s API.</em></p><p><strong>Weak:</strong> <em>Responsible for migrating the database to a new system.</em></p><p><strong>Strong:</strong> <em>Led a 6-month migration of 50M records from MySQL to PostgreSQL, coordinating across 3 teams and achieving zero downtime during cutover.</em></p><p><strong>Weak:</strong><em> Helped improve the CI/CD pipeline.</em></p><p><strong>Strong:</strong><em> Cut deploy times from 45 minutes to 8 minutes by parallelizing test suites and implementing Docker layer caching, adopted by 12 engineering teams.</em></p><p><strong>Weak:</strong><em> Mentored junior engineers on the team.</em></p><p><strong>Strong:</strong><em> Mentored 3 junior engineers through their first production launches, with one promoted to mid-level within 8 months.</em></p></blockquote><h2>Projects &amp; Proof of Work</h2><p>I don&#8217;t think projects or other non-paid work are super valuable on a resume, unless you are in one of two positions:</p><ul><li><p>You&#8217;re a new grad with little paid experience, and projects are actually what you&#8217;ve done most of in your life,</p></li><li><p>Or you&#8217;re trying to land a role in a new technical area where you haven&#8217;t technically worked (*cough* AI).</p></li></ul><p>Outside of those two cases, projects are marginal. They might tip the scales if a recruiter is on the fence, but they rarely move the needle on their own.</p><p>Generally, though, prioritize contributions from paid work and don&#8217;t spend too much vertical space on projects.</p><p><strong>GitHub </strong><em><strong>can</strong></em><strong> be useful.</strong> If your GitHub profile has a lot of green squares and/or a few polished repos <em>deployed</em>&nbsp;somewhere, with good READMEs, tests, and documentation, then great. Be sure to pin repos that showcase your best work. But if your contribution chart doesn&#8217;t look like a Christmas tree and/or you don&#8217;t have meaningful projects, just leave it off.</p><p><strong>Side projects have some value.</strong> I don&#8217;t think side projects do a ton for you, since working by yourself with no expectation of production-level code is just a different environment than a real job. But they do show initiative and genuine interest. Skip the generic description (&#8221;AI-powered todo app in React Native&#8221;). Write something that shows personality: &#8220;Maximalist AI approach to running my life &#8212; a case study in build-fast-with-Claude-Code development.&#8221;</p><p><strong>Technical blogs or writing are worth considering.</strong> Publishing shows communication skills and depth of understanding. If you provide anything, make sure it&#8217;s easy for me to find the 2-3 best pieces.</p><h2>The Skills Section</h2><p>If there&#8217;s one part of the resume that&#8217;s most often overlooked by readers, it&#8217;s the skills section. Even though that&#8217;s true, it still serves a purpose.</p><p><strong>Include a skill soup section.</strong> A ranked list of technologies ensures you show up in keyword searches and reassures recruiters when hiring managers say, &#8220;They must know React.&#8221;</p><p><strong>Keep it limited to what you actually know.</strong> Don&#8217;t list everything you&#8217;ve ever touched&#8212;only what&#8217;s relevant and what you can actually discuss. Some interviewers will probe for depth if you include an unholy number of skills. Also, I feel like I should charge you for my AI tokens if you include things like &#8220;HTML.&#8221;</p><h2>Tailoring Your Resume Per Job</h2><p>Honestly, most people don&#8217;t tailor their resume to roles because it&#8217;s just too much work. That&#8217;s probably true, especially if you&#8217;re applying to dozens of jobs.</p><p>Rather than tailoring per job, tailor for big-name companies, which are worth the extra effort, and keep a small set of versions for <strong>different role types</strong>. Perhaps you have one focused on your front-end work for tech-heavy senior IC roles, and another on your leadership skills for tech lead roles. Maybe one that highlights your AI personal projects and one that focuses on your full-time data engineering experience.</p><p>If you&#8217;re going to tailor, you need to know what to tailor <em>to</em>. Leverage the job description for this:</p><ol><li><p><strong>Look for repeated terms.</strong> If &#8220;distributed systems&#8221; appears three times, it matters more than &#8220;familiarity with GraphQL,&#8221; which only appears once.</p></li><li><p><strong>Separate requirements from wish list.</strong> Most job postings have a &#8220;must have&#8221; and &#8220;nice to have&#8221; section&#8212;or they bury the real requirements in the first few bullets and pad the rest. Focus your tailoring energy on what appears at the top.</p></li><li><p><strong>Identify the core problem they&#8217;re hiring for.</strong> &#8220;Looking for someone to own our data pipeline&#8221; means they want reliability and ownership signals. &#8220;Help us scale to 10x users&#8221; means they want performance and system design experience. Mirror this kind of language back to them.</p></li><li><p><strong>Note specific technologies vs general skills vs filler.</strong> &#8220;Experience with Kafka&#8221; is a keyword you either have or don&#8217;t. &#8220;Experience leading teams&#8221; means they&#8217;re looking for some kind of people leadership and cross-functional signals. &#8220;Strong communication skills&#8221; is a filler that every job post includes. Don&#8217;t waste resume space on filler.</p></li></ol><p><em>A quick trick: </em>paste the job description into an LLM and ask, &#8220;What are the top 10 concepts in this job description and what are they looking for in this candidate?&#8221; Upload your resume and ask it to compare.</p><p>Now that you&#8217;ve identified what to focus on, here are some specific tactics to apply:</p><ul><li><p><strong>Adjust your summary.</strong> Emphasize the aspects of your experience that align with the needs of that role or company that we just discussed, identifying.</p></li><li><p><strong>Reorder the sections.</strong> We&#8217;ll talk more about formatting below, but if you&#8217;re applying to AI roles but don&#8217;t have AI experience other than personal projects, leaving those projects at the bottom is a sure way to get ignored. Lift them up, even above the professional experience section if needed.</p></li><li><p><strong>Reorder your bullets.</strong> Lead with accomplishments most relevant to the target role.</p></li><li><p><strong>Expand or condense roles.</strong> Similar to reordering bullets, you can expand or condense on certain past roles that are most relevant, or even completely remove some experience if you think it&#8217;s distracting.</p></li><li><p><strong>Match the skills.</strong> If different skills are needed, then include those or reorder them to the front of the skills listing.</p></li></ul><p>Now you have the raw materials: a summary, impact-driven bullets, and content tailored to your target roles. But even the best content fails if it&#8217;s buried in a cluttered design.</p><p>The next section is about making your resume visually effortless for each type of reader&#8230;</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/software-engineer-resume?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption"><em>Share this post &amp; get rewards for the referrals.</em></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/software-engineer-resume?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/software-engineer-resume?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h1>Part 3: Designing a Resume for Ease of Use</h1><p>Your resume doesn&#8217;t have to be pretty, but it needs to fulfill the visual needs of the four readers. Aim for a simple, easy-to-use formatting:</p><ul><li><p>Most important stuff at the top: top of the page and the top of each new section</p></li><li><p>Optimize for a clear visual flow, top to bottom</p></li><li><p>Draw the reader&#8217;s eye to the most important parts with text size and weight. Be careful with color.</p></li><li><p>Make it easy for machines to read</p></li></ul><h2>Formatting for Both Machines and Humans</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-lAH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8c5c1f-f5c6-43e4-851d-02d9259d7e6c_1600x1013.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-lAH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8c5c1f-f5c6-43e4-851d-02d9259d7e6c_1600x1013.png 424w, https://substackcdn.com/image/fetch/$s_!-lAH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8c5c1f-f5c6-43e4-851d-02d9259d7e6c_1600x1013.png 848w, https://substackcdn.com/image/fetch/$s_!-lAH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8c5c1f-f5c6-43e4-851d-02d9259d7e6c_1600x1013.png 1272w, https://substackcdn.com/image/fetch/$s_!-lAH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8c5c1f-f5c6-43e4-851d-02d9259d7e6c_1600x1013.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-lAH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8c5c1f-f5c6-43e4-851d-02d9259d7e6c_1600x1013.png" width="1456" height="922" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb8c5c1f-f5c6-43e4-851d-02d9259d7e6c_1600x1013.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:922,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-lAH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8c5c1f-f5c6-43e4-851d-02d9259d7e6c_1600x1013.png 424w, https://substackcdn.com/image/fetch/$s_!-lAH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8c5c1f-f5c6-43e4-851d-02d9259d7e6c_1600x1013.png 848w, https://substackcdn.com/image/fetch/$s_!-lAH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8c5c1f-f5c6-43e4-851d-02d9259d7e6c_1600x1013.png 1272w, https://substackcdn.com/image/fetch/$s_!-lAH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb8c5c1f-f5c6-43e4-851d-02d9259d7e6c_1600x1013.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Use simple formatting.</strong> Avoid overly creative designs. Distribute it as a PDF that exports cleanly to text. You can test this by uploading to Google Docs&#8212;if formatting breaks, the ATS may misread it too.</p><p><strong>Keep it to one page.</strong> Admittedly, part of this is convention, but having everything on one page makes it easy for your readers to review. It&#8217;s like having the key elements &#8220;above the fold&#8221; on a product page. Here are some space-saving tips:</p><ul><li><p>Adjust margins, even all the way down to 1/2&#8221; on every side.</p></li><li><p>Think about what you can fit on one line, such as the company name, title, years.</p></li><li><p>Older or shorter experiences can sometimes be summarized in one line instead of a series of bullets. The farther the experience is down your resume, the less likely a reader is to actually read it, so feel free to reduce fidelity for those roles.</p></li><li><p>Consider a two-column format, with the lesser-read parts like contact information, skills, and education in a smaller, side column.</p></li></ul><p><strong>Use formatting to guide the eye.</strong> Sub-bullets, bolding, and visual hierarchy. If every bullet looks equally weighted, eyes glaze over.</p><p><strong>Don&#8217;t overthink templates.</strong> You&#8217;re not going to get hired because of your amazing resume template. Don&#8217;t spend too much energy on this. Find something that matches the principles here and then put your energy into the content.</p><h2>Suggested Templates</h2><p>I almost didn&#8217;t include templates at all, since there are so many good ones and the final formatting choices don&#8217;t matter that much. But how can this newsletter about resumes be complete without a couple of resume templates?</p><p>Here are a couple of templates that follow our guidelines: a&nbsp;<a href="https://docs.google.com/document/d/1xlTpcmIBIaVyNZhYWJ0CqlcwUNSOW_g-bsFq9OxQ-L4/edit?usp=sharing">single-column</a>&nbsp;example and a&nbsp;<a href="https://docs.google.com/document/d/1sDY1-c2bTplkO5eJpgr-reqBC4mffTEesOVJ-95dcM8/edit?usp=sharing">two-column</a>&nbsp;example. Notice how much more vertical space you save with the two-column format.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qguF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec143f2f-1813-4655-8304-dbc71420c6c9_1120x1450.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qguF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec143f2f-1813-4655-8304-dbc71420c6c9_1120x1450.png 424w, https://substackcdn.com/image/fetch/$s_!qguF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec143f2f-1813-4655-8304-dbc71420c6c9_1120x1450.png 848w, https://substackcdn.com/image/fetch/$s_!qguF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec143f2f-1813-4655-8304-dbc71420c6c9_1120x1450.png 1272w, https://substackcdn.com/image/fetch/$s_!qguF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec143f2f-1813-4655-8304-dbc71420c6c9_1120x1450.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qguF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec143f2f-1813-4655-8304-dbc71420c6c9_1120x1450.png" width="1120" height="1450" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec143f2f-1813-4655-8304-dbc71420c6c9_1120x1450.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1450,&quot;width&quot;:1120,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qguF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec143f2f-1813-4655-8304-dbc71420c6c9_1120x1450.png 424w, https://substackcdn.com/image/fetch/$s_!qguF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec143f2f-1813-4655-8304-dbc71420c6c9_1120x1450.png 848w, https://substackcdn.com/image/fetch/$s_!qguF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec143f2f-1813-4655-8304-dbc71420c6c9_1120x1450.png 1272w, https://substackcdn.com/image/fetch/$s_!qguF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec143f2f-1813-4655-8304-dbc71420c6c9_1120x1450.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Single Column Template</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qy0C!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c169ca-3a45-4843-a3f7-a461e972e1d1_1122x1450.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qy0C!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c169ca-3a45-4843-a3f7-a461e972e1d1_1122x1450.png 424w, https://substackcdn.com/image/fetch/$s_!qy0C!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c169ca-3a45-4843-a3f7-a461e972e1d1_1122x1450.png 848w, https://substackcdn.com/image/fetch/$s_!qy0C!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c169ca-3a45-4843-a3f7-a461e972e1d1_1122x1450.png 1272w, https://substackcdn.com/image/fetch/$s_!qy0C!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c169ca-3a45-4843-a3f7-a461e972e1d1_1122x1450.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qy0C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c169ca-3a45-4843-a3f7-a461e972e1d1_1122x1450.png" width="1122" height="1450" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60c169ca-3a45-4843-a3f7-a461e972e1d1_1122x1450.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1450,&quot;width&quot;:1122,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qy0C!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c169ca-3a45-4843-a3f7-a461e972e1d1_1122x1450.png 424w, https://substackcdn.com/image/fetch/$s_!qy0C!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c169ca-3a45-4843-a3f7-a461e972e1d1_1122x1450.png 848w, https://substackcdn.com/image/fetch/$s_!qy0C!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c169ca-3a45-4843-a3f7-a461e972e1d1_1122x1450.png 1272w, https://substackcdn.com/image/fetch/$s_!qy0C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c169ca-3a45-4843-a3f7-a461e972e1d1_1122x1450.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Two-Column Template</figcaption></figure></div><h2>Making Your Experience Scannable</h2><p><strong>Make recognizable names visible.</strong> Make sure the eye jumps directly to household-name companies and high-impact projects. Recruiters notice brand recognition quickly, and readers only read the first part of each section.</p><p><strong>Surface minimum qualifications.</strong> Many companies require recruiters to verify basic criteria, such as education or years of experience. Make this information prominent, as in the summary.</p><p><strong>Prioritize your recent or flagship experience.</strong> Give more visual weight and spend more vertical space on the recent experience. Similarly, for experience at a name-brand company. Both of these will be the most-read content by reviewers. To accomplish this, compress older experience by removing bullets or shortening them.</p><p><strong>Handling sticky situations: gaps and layoffs.</strong> Minimize these if you can. You don&#8217;t want to get screened out b/c of these before you can talk to someone:</p><ul><li><p>Remove the month indicators on roles that hide an intra-year gap.</p></li><li><p>If you were recently let go, don&#8217;t update the current role with an end date yet. You can probably get away with this for 6-12 months since people expect resumes to be somewhat out of date.</p></li><li><p>If you have recent stints at startups that folded after a year, add a brief one-liner explaining what happened. Otherwise, reviewers may assume it reflects on you.</p></li></ul><p>At this point, you know what to write and how to format it.</p><p>But there are a lot of common mistakes I see in resumes. The good news is they&#8217;re fixable in five minutes.</p><div><hr></div><h1>Part 4: Common Mistakes &amp; What to Avoid</h1><p>Writing resumes can be tricky, so let&#8217;s consider some common advice I give coaching clients when we&#8217;re talking about resumes&#8230;</p><h2>Common Resume Mistakes</h2><p><strong>Too many buzzwords.</strong> &#8220;Results-driven,&#8221; &#8220;synergy,&#8221; &#8220;leverage&#8221; all mean nothing really. If you can swap your bullet with any other candidate&#8217;s and it still makes sense, it&#8217;s too generic.</p><p><strong>Zero impact.</strong> Describing activities without outcomes. &#8220;Worked on the payments team&#8221; tells me nothing. What shipped? What changed because you were there?</p><p><strong>Hard-to-read templates.</strong> Fancy designs with unusual typefaces, lots of white space, etc., that prioritize aesthetics over scannability are pretty risky unless you&#8217;re applying to a design-heavy role. A reader should be able to make a decision about your resume in about 3 seconds.</p><p><strong>Irrelevant information.</strong> Remove that summer lifeguarding job from 2015 when you&#8217;re a senior engineer in 2026. Every line that doesn&#8217;t serve the reader is a line they might stop reading at.</p><p><strong>Wall of text.</strong> If I can&#8217;t immediately see where the key information is and there&#8216;s no hierarchy, no bolding, and no visual breaks, then I just get overwhelmed and close your profile.</p><h2>What NOT to Include</h2><p><strong>Unnecessary personal info.</strong> Age, photo (in the US), full address. These create opportunities for bias and waste space.</p><p><strong>Irrelevant jobs.</strong> Unless they demonstrate transferable skills or fill a gap, cut them. Nobody hiring a backend engineer cares about your retail experience from a decade ago.</p><p><strong>Social media links.</strong> Exception: LinkedIn. Your Twitter, Instagram, or TikTok is irrelevant noise unless you&#8217;re applying to a social media company and they showcase relevant work.</p><p><strong>References available upon request.</strong> Everyone assumes this. It&#8217;s a filler.</p><p><strong>Hobbies and interests.</strong> Unless directly relevant (you&#8217;re applying to a gaming company and list game development as a hobby), these just take up space.</p><p><strong>ATS tricks.</strong> Don&#8217;t try things like packing your resume with white text to rock the ATS layer. Recruiters and engineers who write ATS systems know about these tricks and despise them.</p><div class="captioned-button-wrap" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/software-engineer-resume?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="CaptionedButtonToDOM"><div class="preamble"><p class="cta-caption"><em>Share this post &amp; get rewards for the referrals.</em></p></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/software-engineer-resume?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/software-engineer-resume?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p></div><div><hr></div><h1>Closing Thoughts</h1><p>Your resume is never done.</p><p>That might sound exhausting, but it&#8217;s actually liberating. You don&#8217;t need to craft the perfect resume before you start applying. You need a <em>good enough</em> resume that you improve over time.</p><p><strong>Treat your resume like a product.</strong> We started this newsletter with that idea, and it&#8217;s worth restating here. Products ship, get feedback, and iterate; your resume should too. If an interviewer or recruiter asks you about something on your resume, consider adding it. Did you bomb a question about an accomplishment you listed? Either cut it or prepare better.</p><p><strong>Get human feedback early and often.</strong> Have friends, mentors, or peers review your resume. If you can, get feedback from someone who&#8217;s actually hired engineers.</p><p><strong>Iterate ruthlessly.</strong> When I&#8217;ve worked with coaching clients on their resumes, it usually takes 3-4 revisions before it&#8217;s good.</p><p><strong>The takeaway:</strong> Your resume has four readers&#8212;machines, recruiters, interviewers, and hiring managers&#8212;and each one needs something different. Build for all of them. Make it easy to scan, with the information each persona needs. Quantify your impact. Make it scannable. And then iterate.</p><p>Now go get that interview.</p><div><hr></div><p>&#128075; I&#8217;d like to thank <strong><a href="https://thebehavioral.tech/">Austen</a></strong> for writing this newsletter!</p><p>Also, subscribe to his free newsletter,&nbsp;<strong><a href="https://thebehavioral.substack.com/welcome">Mastering Behavioral Interviews</a></strong><a href="https://thebehavioral.substack.com/">,</a>&nbsp;and check out his&nbsp;<strong><a href="https://thebehavioral.tech/">book by the same name</a></strong>.</p><p>It will help you prepare for the interview round you&#8217;re underinvesting in, the one that separates Senior engineers from Staff and above. He&#8217;s got a lot of practical advice similar to what he shared in this post.</p><p>Remember, you can claim <strong><a href="https://thebehavioral.tech/premium-access">20% off Premium Access</a></strong> to his interview prep content with code <em><strong>SDN26</strong></em>, but only for a few days!</p><div><hr></div><p>If you find this newsletter valuable, share it with a friend, and subscribe if you haven&#8217;t already. There are <a href="http://newsletter.systemdesign.one/subscribe?group=true">group discounts</a>, <a href="http://newsletter.systemdesign.one/subscribe?gift=true">gift options</a>, and <a href="https://newsletter.systemdesign.one/leaderboard">referral rewards</a> available.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://www.linkedin.com/in/nk-systemdesign-one/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bEFk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 424w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 848w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1272w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bEFk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png" width="152" height="152" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:320,&quot;width&quot;:320,&quot;resizeWidth&quot;:152,&quot;bytes&quot;:74009,&quot;alt&quot;:&quot;Author Neo Kim; System design case studies&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.linkedin.com/in/nk-systemdesign-one/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Author Neo Kim; System design case studies" title="Author Neo Kim; System design case studies" srcset="https://substackcdn.com/image/fetch/$s_!bEFk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 424w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 848w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1272w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"><strong>&#128075; Find me on <a href="https://www.linkedin.com/in/nk-systemdesign-one/">LinkedIn</a> | <a href="https://x.com/intent/follow?screen_name=systemdesignone">Twitter</a> | <a href="https://www.threads.net/@systemdesignone">Threads</a> | <a href="https://www.instagram.com/systemdesignone/">Instagram</a></strong></figcaption></figure></div><div><hr></div><p><strong>Want to reach 200K+ tech professionals at scale? </strong>&#128240;</p><p>If your company wants to reach 200K+ tech professionals, <a href="https://newsletter.systemdesign.one/p/sponsorship">advertise with me</a>.</p><div><hr></div><p>Thank you for supporting this newsletter.</p><p>You are now 210,001+ readers strong, very close to 210k. Let&#8217;s try to get 211k readers by 17 April. Consider sharing this post with your friends and get rewards.</p><p>Y&#8217;all are the best.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6oWl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6oWl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6oWl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png" width="590" height="368.75" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2e739087-a910-4643-be36-997b6dd5b4af_800x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:800,&quot;resizeWidth&quot;:590,&quot;bytes&quot;:87878,&quot;alt&quot;:&quot;system design newsletter&quot;,&quot;title&quot;:&quot;system design newsletter&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/163380418?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="system design newsletter" title="system design newsletter" srcset="https://substackcdn.com/image/fetch/$s_!6oWl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/software-engineer-resume?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/software-engineer-resume?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[11 AI Concepts Explained, Simply]]></title><description><![CDATA[#138: Break into AI Engineering]]></description><link>https://newsletter.systemdesign.one/p/ai-concepts</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/ai-concepts</guid><dc:creator><![CDATA[Neo Kim]]></dc:creator><pubDate>Thu, 09 Apr 2026 09:54:36 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f266f60d-51aa-416b-b477-90b1dc4c2299_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/ai-concepts/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><div><hr></div><p>I&#8217;ve built agents at Google to support their machine learning operations.</p><p>Then one thing became super clear to me: <em>AI engineering is good software engineering with an understanding of a few key AI principles.</em></p><p>Just a year ago, hardly any software engineers were building with AI.</p><p>Now, engineers are expected to understand AI engineering and integrate AI into their products. This newsletter is a primer to bridge the gap from software to AI engineering.</p><p>Onward.</p><div><hr></div><h2><a href="https://cline.gg/neo">The tax you pay to run multiple agents (Partner)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://cline.gg/neo" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MX0h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 424w, https://substackcdn.com/image/fetch/$s_!MX0h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 848w, https://substackcdn.com/image/fetch/$s_!MX0h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 1272w, https://substackcdn.com/image/fetch/$s_!MX0h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MX0h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://cline.gg/neo&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MX0h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 424w, https://substackcdn.com/image/fetch/$s_!MX0h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 848w, https://substackcdn.com/image/fetch/$s_!MX0h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 1272w, https://substackcdn.com/image/fetch/$s_!MX0h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e0ba733-c96e-461d-92e4-e2bd8c2cb2a5_1600x900.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If you&#8217;ve spent any time with coding agents, you know the feeling.</p><p>You start the morning with a clean plan. Spin up a few agents. One is refactoring the auth module. Another is writing tests. A third is scaffolding a new API endpoint. You&#8217;re flying.</p><p>Then, around 10:30 AM, you look up and realize you have 20 terminal windows open. One agent is blocked waiting for a decision you forgot to make. Another finished 40 minutes ago, and you never noticed. A third went sideways three commits back. You&#8217;re no longer flying. You&#8217;re drowning.</p><p>You&#8217;ve shifted from human as driver to human as director.</p><p>When running coding agents in parallel, the bottleneck isn&#8217;t just context. It&#8217;s your own attention trying to manage 10 agents across 10 terminals. You&#8217;re losing your mind to terminal chaos.</p><p>Meet <strong><a href="https://cline.gg/neo">Cline Kanban</a></strong>, a CLI-agnostic visual orchestration layer that makes multi-agent workflows usable across providers. Multiple agents, one UI. It&#8217;s the air traffic controller for the agents you&#8217;re already running, regardless of where they live.</p><ul><li><p><strong>Interoperable: </strong>Claude Code and Codex compatible, with more coming soon.</p></li><li><p><strong>Full Visibility:</strong> Confidently run multiple agents and work through your backlog faster.</p></li><li><p><strong>Smart Triage:</strong> See which agents are blocked or in review and jump in to unblock them.</p></li><li><p><strong>Chain Tasks:</strong> Set dependencies so Agent B won&#8217;t start until Agent A is complete.</p></li><li><p><strong>Familiar UI:</strong> Everything in a single Kanban view.</p></li></ul><p>Stop tracking agents and start directing them. Get a meaningful edge with the beta release.</p><p><strong>Install Cline Kanban Today: </strong><code>npm i -g cline</code></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://cline.gg/neo&quot;,&quot;text&quot;:&quot;Get Started Today&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://cline.gg/neo"><span>Get Started Today</span></a></p><p>(Thanks to <a href="https://cline.gg/neo">Cline</a> for partnering on this post.)</p><div><hr></div><p>I want to introduce <strong><a href="http://aiforswes.com/subscribe">Logan Thorneloe</a></strong> as the guest author.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="http://aiforswes.com/subscribe" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Zf30!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ee68530-d50b-4df7-9548-6d342b4361b2_1310x816.png 424w, https://substackcdn.com/image/fetch/$s_!Zf30!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ee68530-d50b-4df7-9548-6d342b4361b2_1310x816.png 848w, https://substackcdn.com/image/fetch/$s_!Zf30!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ee68530-d50b-4df7-9548-6d342b4361b2_1310x816.png 1272w, https://substackcdn.com/image/fetch/$s_!Zf30!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ee68530-d50b-4df7-9548-6d342b4361b2_1310x816.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Zf30!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ee68530-d50b-4df7-9548-6d342b4361b2_1310x816.png" width="1310" height="816" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6ee68530-d50b-4df7-9548-6d342b4361b2_1310x816.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:816,&quot;width&quot;:1310,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;http://aiforswes.com/subscribe&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Zf30!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ee68530-d50b-4df7-9548-6d342b4361b2_1310x816.png 424w, https://substackcdn.com/image/fetch/$s_!Zf30!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ee68530-d50b-4df7-9548-6d342b4361b2_1310x816.png 848w, https://substackcdn.com/image/fetch/$s_!Zf30!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ee68530-d50b-4df7-9548-6d342b4361b2_1310x816.png 1272w, https://substackcdn.com/image/fetch/$s_!Zf30!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ee68530-d50b-4df7-9548-6d342b4361b2_1310x816.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>He&#8217;s a software engineer at Google, building at the intersection of machine learning infrastructure, AI agents, and developer tooling.</p><p>He uses his background in machine learning research to teach software engineers the AI topics they should know via his newsletter, <strong><a href="http://aiforswes.com/subscribe">AI for Software Engineers</a></strong>.</p><p>You can also get an exclusive&nbsp;<strong><a href="https://www.aiforswes.com/67113502">25% off a paid AI for Software Engineers subscription</a></strong>&nbsp;as a reader of the System Design Newsletter.</p><div><hr></div><h2><strong>1 Software 1.0, 2.0, and 3.0</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3Ir2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeebfe95-2383-47ab-9b48-714973d2e17a_1200x630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3Ir2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeebfe95-2383-47ab-9b48-714973d2e17a_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!3Ir2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeebfe95-2383-47ab-9b48-714973d2e17a_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!3Ir2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeebfe95-2383-47ab-9b48-714973d2e17a_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!3Ir2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeebfe95-2383-47ab-9b48-714973d2e17a_1200x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3Ir2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeebfe95-2383-47ab-9b48-714973d2e17a_1200x630.png" width="1200" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/feebfe95-2383-47ab-9b48-714973d2e17a_1200x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:277568,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192186526?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeebfe95-2383-47ab-9b48-714973d2e17a_1200x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3Ir2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeebfe95-2383-47ab-9b48-714973d2e17a_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!3Ir2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeebfe95-2383-47ab-9b48-714973d2e17a_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!3Ir2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeebfe95-2383-47ab-9b48-714973d2e17a_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!3Ir2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeebfe95-2383-47ab-9b48-714973d2e17a_1200x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p>&#8220;The hottest new programming language is English.&#8221; - <a href="https://x.com/karpathy/status/1617979122625712128">Andrej Karpathy</a></p></blockquote><p>There have been two recent AI-related shifts that have fundamentally changed software development and required software engineers to learn new concepts.</p><p>Both changes have been caused by the development of <em>machine learning models</em>: programs trained on data to recognize patterns and make predictions rather than follow explicit rules.</p><p>Andrej Karpathy details these changes by describing three different eras of software engineering:</p><ul><li><p><a href="https://karpathy.medium.com/software-2-0-a64152b37c35">Software 1.0</a>: hand-written logic via writing code. This was the paradigm from the inception of programming all the way to the advent of machine learning.</p></li><li><p><a href="https://karpathy.medium.com/software-2-0-a64152b37c35">Software 2.0</a>: shift from explicitly writing logic via code to curating datasets and teaching algorithms to understand them. This shifts logic from a set of instructions to logic learned from data. A good example of this is Tesla&#8217;s Autopilot system, shifting from hand-written C++ code to neural networks<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> for vision.</p></li><li><p><a href="https://www.youtube.com/watch?v=LCEmiRjPEtQ">Software 3.0</a>: shift toward natural language<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> being our new programming interface. This enables engineers to write code by speaking English (<a href="https://x.com/karpathy/status/1886192184808149383">vibe coding</a>). It also enables engineers to build agents<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>, or software that can think and act autonomously.</p></li></ul><p>Software engineers <a href="https://www.aiforswes.com/p/why-software-engineers-need-to-understand-ml">need to understand</a> these key AI principles to be productive in the software 3.0 paradigm&#8230;</p><h2><strong>2 Large Language Models (LLMs)</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VnN-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4095f8c9-e27f-4be6-bec4-ceb6e8f2a8fc_1200x630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VnN-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4095f8c9-e27f-4be6-bec4-ceb6e8f2a8fc_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!VnN-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4095f8c9-e27f-4be6-bec4-ceb6e8f2a8fc_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!VnN-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4095f8c9-e27f-4be6-bec4-ceb6e8f2a8fc_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!VnN-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4095f8c9-e27f-4be6-bec4-ceb6e8f2a8fc_1200x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VnN-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4095f8c9-e27f-4be6-bec4-ceb6e8f2a8fc_1200x630.png" width="1200" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4095f8c9-e27f-4be6-bec4-ceb6e8f2a8fc_1200x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:402460,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192186526?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4095f8c9-e27f-4be6-bec4-ceb6e8f2a8fc_1200x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VnN-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4095f8c9-e27f-4be6-bec4-ceb6e8f2a8fc_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!VnN-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4095f8c9-e27f-4be6-bec4-ceb6e8f2a8fc_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!VnN-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4095f8c9-e27f-4be6-bec4-ceb6e8f2a8fc_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!VnN-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4095f8c9-e27f-4be6-bec4-ceb6e8f2a8fc_1200x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>LLMs are natural language AI models trained to predict the continuation of text given a natural language input.</p><p>As with any AI model, understanding how it works best comes from examining its training data. LLMs are trained on a massive dataset of text from the internet. They&#8217;re trained to learn the <em><strong>relationship between words</strong></em> <em><strong>to best predict an output</strong></em> when given a specific string of inputs.</p><p>This relationship enables LLMs to use the knowledge stored in the semantics of language to understand relationships between concepts.</p><p>LLMs are also tuned to be <em><strong>non-deterministic</strong></em>; i.e., to provide different answers when an input is sent to the LLM more than once. This ensures <a href="https://www.aiforswes.com/p/temperature-in-llms?utm_source=publication-search">an LLM&#8217;s output isn&#8217;t locked to a given input</a>.</p><p>These two concepts make LLMs intelligent.</p><p>To relate LLMs to general software engineering, we can think of an LLM as a &#8220;non-deterministic database&#8221;.</p><p>This makes it easier to understand the impact of the knowledge encoded in its weights via semantic understanding. I like to compare it to a SQL database and a query: <em>LLM is the database, and the prompt given to the LLM is the query that extracts the information in the format the user wants.</em></p><p>In AI engineering, it&#8217;s apt to think of an LLM as the central processing unit (CPU) of your application. By giving an LLM input information, it can make decisions on its own. Thus, the LLM becomes the CPU for your software, deciding what to do with the information provided.</p><p>This is also an apt comparison because choosing an LLM is like picking a CPU.</p><p>Each LLM has its own tradeoffs for latency, cost, functionality, and more. Choosing the right model (or models!) is important to meet the requirements of production software systems.</p><h2><strong>3 Context</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-Elf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195e9f44-5a2b-46e9-be38-fb9370a85ca1_1200x630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-Elf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195e9f44-5a2b-46e9-be38-fb9370a85ca1_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!-Elf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195e9f44-5a2b-46e9-be38-fb9370a85ca1_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!-Elf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195e9f44-5a2b-46e9-be38-fb9370a85ca1_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!-Elf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195e9f44-5a2b-46e9-be38-fb9370a85ca1_1200x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-Elf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195e9f44-5a2b-46e9-be38-fb9370a85ca1_1200x630.png" width="1200" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/195e9f44-5a2b-46e9-be38-fb9370a85ca1_1200x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:291234,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192186526?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195e9f44-5a2b-46e9-be38-fb9370a85ca1_1200x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-Elf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195e9f44-5a2b-46e9-be38-fb9370a85ca1_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!-Elf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195e9f44-5a2b-46e9-be38-fb9370a85ca1_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!-Elf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195e9f44-5a2b-46e9-be38-fb9370a85ca1_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!-Elf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195e9f44-5a2b-46e9-be38-fb9370a85ca1_1200x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>Context</strong></em> is the information LLMs store in their memory and functions as the working memory for a task. For an LLM to perform a task or make a decision, it must have the relevant information in its context window<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a>.</p><p>Context works by converting all input into <em><strong>tokens</strong></em>, which are small chunks of text (roughly 3-4 characters each) and feeding them into a model sequentially. The model uses an <em><strong>attention mechanism</strong></em> (a process that calculates how each token relates to every other token in the sequence) to determine which tokens are relevant to each other.</p><p>This is how LLMs &#8216;understand&#8217; the relationships within the input mentioned above.</p><p>The tokens currently part of a conversation with an LLM are considered part of that LLM&#8217;s &#8220;context window&#8221;. Each time you interact with an LLM, the model processes the entire context window from scratch. It doesn&#8217;t remember the previous turn unless you feed the entire conversation history back in.</p><p>Similar to RAM, an LLM&#8217;s context window is finite&#8230;</p><p>There is a hard limit to the number of tokens that can be stored in context at a time. Even before hitting that limit, LLMs tend to bias their knowledge toward the start and end of their context window. This means an LLM&#8217;s understanding of the information it is given can degrade before it reaches the hard limit.</p><p>The size of an LLM&#8217;s context window varies between models.</p><p>We&#8217;re at a point where frontier models can process 1 million or more tokens, whereas many open models are closer to 256,000 tokens. This isn&#8217;t enough for many applications, so it&#8217;s important to design AI systems with context limitations in mind.</p><p>Also, like RAM, an LLM&#8217;s context doesn&#8217;t persist between sessions.</p><p>LLMs will only remember the context provided during a session. This means AI systems with long-term memory require a more persistent memory storage and retrieval system.</p><p>Context is managed by only keeping the tokens a model needs for a task.</p><p>To effectively manage context, consider:</p><ul><li><p><strong>Using subagents for specific tasks</strong>. A separate agent means a separate context window. That agent can process and summarize the context, then return it to our primary agent.</p></li><li><p><strong>Summarizing all information currently within an LLM&#8217;s context window</strong>, clearing the context window, and adding the summary back in. This stores the same information in a shorter context.</p></li><li><p><strong>Using external systems to store information</strong>. Information is only needed in context when it&#8217;s being used. It should be stored in a separate, accessible datastore otherwise.</p></li></ul><p>Every AI application built with LLMs needs to properly manage context.</p><h2><strong>4 Embeddings and Vector Databases</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2EKH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b96b4e5-74fd-4de4-bbfb-4e26e996b3e2_1200x630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2EKH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b96b4e5-74fd-4de4-bbfb-4e26e996b3e2_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!2EKH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b96b4e5-74fd-4de4-bbfb-4e26e996b3e2_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!2EKH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b96b4e5-74fd-4de4-bbfb-4e26e996b3e2_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!2EKH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b96b4e5-74fd-4de4-bbfb-4e26e996b3e2_1200x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2EKH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b96b4e5-74fd-4de4-bbfb-4e26e996b3e2_1200x630.png" width="1200" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7b96b4e5-74fd-4de4-bbfb-4e26e996b3e2_1200x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:201783,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192186526?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b96b4e5-74fd-4de4-bbfb-4e26e996b3e2_1200x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2EKH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b96b4e5-74fd-4de4-bbfb-4e26e996b3e2_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!2EKH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b96b4e5-74fd-4de4-bbfb-4e26e996b3e2_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!2EKH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b96b4e5-74fd-4de4-bbfb-4e26e996b3e2_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!2EKH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b96b4e5-74fd-4de4-bbfb-4e26e996b3e2_1200x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>You can&#8217;t fit everything into a model&#8217;s context window, so you need a way to find relevant information efficiently.</p><p>An <strong>embedding</strong> represents the meaning of data as a list of numbers, called a <strong>vector</strong>. An embedding can be created from text, audio, video, or any other data format to convey the meaning of that data. Items with similar embedding vectors are considered similar.</p><p>Consider a coordinate system in which each axis corresponds to a learned semantic dimension.</p><p><em>&#8220;Deploy a microservice&#8221;</em> and <em>&#8220;ship a container to production&#8221;</em> would be plotted close together. <em>&#8220;Deploy a microservice&#8221;</em> and <em>&#8220;quarterly revenue forecast&#8221;</em> would be far apart. Embeddings can do this across thousands of dimensions. Distance in this high-dimensional space equals similarity in meaning.</p><p>Vector databases<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a> are built explicitly to store and query embeddings.</p><p>They use algorithms optimized for quickly finding the most similar vectors in the database. Those vectors are then added to a model&#8217;s context. This is comparable to retrieving data from a cache.</p><p>The process follows these steps:</p><ol><li><p><strong>Index</strong>: Convert documents into embeddings, store in a vector database</p></li><li><p><strong>Retrieve</strong>: Embed the query and find the most similar documents</p></li><li><p><strong>Augment</strong>: Pass those documents into the LLM&#8217;s context window with the user&#8217;s question</p></li><li><p><strong>Generate</strong>: LLM answers using retrieved information</p></li></ol><p>Almost every production AI engineering system will need a working memory solution.</p><p>Context will need to be augmented with a persistent data store. Embeddings and vector databases.</p><h2><strong>5 Time to First Token</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g-Ix!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff332d3e0-2df1-49e0-99a5-7486eb6ac39a_1200x630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g-Ix!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff332d3e0-2df1-49e0-99a5-7486eb6ac39a_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!g-Ix!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff332d3e0-2df1-49e0-99a5-7486eb6ac39a_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!g-Ix!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff332d3e0-2df1-49e0-99a5-7486eb6ac39a_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!g-Ix!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff332d3e0-2df1-49e0-99a5-7486eb6ac39a_1200x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g-Ix!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff332d3e0-2df1-49e0-99a5-7486eb6ac39a_1200x630.png" width="1200" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f332d3e0-2df1-49e0-99a5-7486eb6ac39a_1200x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:377887,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192186526?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff332d3e0-2df1-49e0-99a5-7486eb6ac39a_1200x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g-Ix!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff332d3e0-2df1-49e0-99a5-7486eb6ac39a_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!g-Ix!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff332d3e0-2df1-49e0-99a5-7486eb6ac39a_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!g-Ix!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff332d3e0-2df1-49e0-99a5-7486eb6ac39a_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!g-Ix!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff332d3e0-2df1-49e0-99a5-7486eb6ac39a_1200x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There are many factors to consider when choosing a specific LLM for your production AI system. One of these requirements is latency: <em>your system must function fast enough to be useful.</em></p><p>AI systems must consider the key latency metrics of traditional software systems and the LLM latency metrics, such as:</p><ul><li><p><strong>Time to First Token (TTFT)</strong>: How long until the model starts responding. This determines perceived responsiveness. Smaller models respond in under a second, but larger models can take several seconds to generate anything.</p></li><li><p><strong>Tokens Per Second (Throughput)</strong>: How fast the model generates after starting. This varies significantly across model sizes and compute units, and also affects how quickly the model's response is output to the user.</p></li></ul><p>AI applications require concurrency to stream the output of an LLM call and reduce perceived latency.</p><p><a href="https://www.aiforswes.com/p/mle-metrics?utm_source=publication-search">Latency is an incredibly important metric</a> to understand the real-world functionality of an AI system.</p><p>This is like streaming services, which play the start of a video while loading the rest in the background. This enables the viewer to watch the video before the entire video loads.</p><h2><strong>6 Evals</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FYJN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F813834d8-5814-46f2-a8f0-215141afbd65_1200x630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FYJN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F813834d8-5814-46f2-a8f0-215141afbd65_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!FYJN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F813834d8-5814-46f2-a8f0-215141afbd65_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!FYJN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F813834d8-5814-46f2-a8f0-215141afbd65_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!FYJN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F813834d8-5814-46f2-a8f0-215141afbd65_1200x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FYJN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F813834d8-5814-46f2-a8f0-215141afbd65_1200x630.png" width="1200" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/813834d8-5814-46f2-a8f0-215141afbd65_1200x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:184818,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192186526?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F813834d8-5814-46f2-a8f0-215141afbd65_1200x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FYJN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F813834d8-5814-46f2-a8f0-215141afbd65_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!FYJN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F813834d8-5814-46f2-a8f0-215141afbd65_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!FYJN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F813834d8-5814-46f2-a8f0-215141afbd65_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!FYJN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F813834d8-5814-46f2-a8f0-215141afbd65_1200x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p>&#8220;Evals are a new toolset for any and all AI engineers, and software engineers should also know about them. Move from guesswork to a systematic engineering process for improving AI quality.&#8221; - <a href="https://newsletter.pragmaticengineer.com/p/evals">Pragmatic Engineer, December 2025</a></p></blockquote><p>In traditional software, tests are written to ensure code function reliability.</p><p>In AI, traditional tests are <em>insufficient</em> for many parts of the system because of their non-deterministic nature. Instead, we write evaluations (evals for short).</p><p>Since assert statements won&#8217;t suffice for non-deterministic applications, evals are written using LLM-as-a-judge<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a>.</p><p>Essentially, a dataset of inputs and their successful outputs is curated.</p><p>This dataset will be used to tell the judge what we&#8217;re looking for in our model's output. The LLM-as-a-judge will score how closely the tested model&#8217;s outputs resemble the successful output based on given criteria such as relevance, hallucinations, and accuracy.</p><p>For example, an eval can be used to assess whether an LLM effectively pulls the important information from a document when summarizing. A set of documents and summarization pairs is gathered as a solution set. The documents are run through the LLM we&#8217;re testing, and our LLM-as-a-judge compares the solution for that document to our LLM&#8217;s output.</p><p>Evals are vital because even the smallest changes in a non-deterministic system can change the user's output. The most common of these changes is model tweaks. Even a small tweak in model performance can drastically change the output of a system.</p><p>Evals are written to catch these changes.</p><h2><strong>7 Agent Loops</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1TB1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6483a3aa-045c-4ea8-80d8-63753e903ecd_1200x630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1TB1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6483a3aa-045c-4ea8-80d8-63753e903ecd_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!1TB1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6483a3aa-045c-4ea8-80d8-63753e903ecd_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!1TB1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6483a3aa-045c-4ea8-80d8-63753e903ecd_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!1TB1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6483a3aa-045c-4ea8-80d8-63753e903ecd_1200x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1TB1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6483a3aa-045c-4ea8-80d8-63753e903ecd_1200x630.png" width="1200" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6483a3aa-045c-4ea8-80d8-63753e903ecd_1200x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:357847,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192186526?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6483a3aa-045c-4ea8-80d8-63753e903ecd_1200x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1TB1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6483a3aa-045c-4ea8-80d8-63753e903ecd_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!1TB1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6483a3aa-045c-4ea8-80d8-63753e903ecd_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!1TB1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6483a3aa-045c-4ea8-80d8-63753e903ecd_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!1TB1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6483a3aa-045c-4ea8-80d8-63753e903ecd_1200x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Agent loops<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a> represent the shift from Software 2.0 to Software 3.0.</p><p>Software 1.0 to 2.0 was the change from logical control flow to probabilistic output. In the shift to Software 3.0, we see probabilistic output used as reasoning loops in an agentic system<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a>.</p><p>Agent loops enable an LLM to continuously receive and act on information. An agent loop looks something like:</p><ol><li><p><strong>Thought</strong>: Model reasons about the current state and what it needs to do next</p></li><li><p><strong>Action</strong>: The model decides to call a tool</p></li><li><p><strong>Observation</strong>: Application executes the tool and returns the result</p></li><li><p><strong>Repeat</strong>: Model then reasons about the observation, takes another action, or provides a final answer</p></li></ol><p>This is the <a href="https://arxiv.org/abs/2210.03629">ReAct pattern</a> (Reason + Act), and it has become the default agent loop architecture.</p><p>ReAct merges Chain of Thought reasoning, where the model works through a problem step by step, with tool execution into a single loop. This means a model can not only reason to act, but also take its previous actions and their outputs into account, continuously acting until a task is complete.</p><p>For example, consider an AI travel agent tasked with booking a trip within a budget.</p><p>It first searches for flights and finds the cheapest option. It then reasons about how much budget remains and searches for hotels accordingly. The initial results are all outside the city center, so it adjusts its search criteria and finds a better option. It calculates the total, confirms it fits the budget, and presents the full itinerary.</p><p>An agent loop relies on the LLM to determine the next step continuously until it completes a task. This looping process enables an LLM to use its knowledge and context to do meaningful work.</p><p>This enables complex workflows that aren&#8217;t possible with written code alone.</p><h2><strong>8 Tool Calling</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SU62!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17803a9f-946d-4559-ac7a-e91ed81babf3_1200x630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SU62!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17803a9f-946d-4559-ac7a-e91ed81babf3_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!SU62!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17803a9f-946d-4559-ac7a-e91ed81babf3_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!SU62!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17803a9f-946d-4559-ac7a-e91ed81babf3_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!SU62!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17803a9f-946d-4559-ac7a-e91ed81babf3_1200x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SU62!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17803a9f-946d-4559-ac7a-e91ed81babf3_1200x630.png" width="1200" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/17803a9f-946d-4559-ac7a-e91ed81babf3_1200x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:185870,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192186526?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17803a9f-946d-4559-ac7a-e91ed81babf3_1200x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SU62!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17803a9f-946d-4559-ac7a-e91ed81babf3_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!SU62!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17803a9f-946d-4559-ac7a-e91ed81babf3_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!SU62!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17803a9f-946d-4559-ac7a-e91ed81babf3_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!SU62!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17803a9f-946d-4559-ac7a-e91ed81babf3_1200x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Meaningful work is further enabled by tool calling<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a> capabilities.</p><p>This gives the LLM the ability to <em>do something</em> as part of its action. While LLMs can always output natural language, tool calling enables that output to call functions that perform work.</p><p>LLMs cannot execute code, access the internet, or perform other work natively. Instead, tool calling must be enabled:</p>
      <p>
          <a href="https://newsletter.systemdesign.one/p/ai-concepts">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Everything You Need to Know to Design GenAI Systems From Scratch]]></title><description><![CDATA[#137: Part 1 - Generative AI Masterclass]]></description><link>https://newsletter.systemdesign.one/p/generative-ai-system-design</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/generative-ai-system-design</guid><dc:creator><![CDATA[Louis-François Bouchard]]></dc:creator><pubDate>Tue, 07 Apr 2026 12:33:42 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/863a9f29-e5d1-4565-8f9e-6b3a355991ef_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://newsletter.systemdesign.one/subscribe?yearly=true" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RKN7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 424w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 848w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 1272w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RKN7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png" width="1280" height="300" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3689f342-2008-4ce6-b968-16461682508b_1280x300.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:300,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24224,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://newsletter.systemdesign.one/subscribe?yearly=true&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192435842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RKN7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 424w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 848w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 1272w, https://substackcdn.com/image/fetch/$s_!RKN7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3689f342-2008-4ce6-b968-16461682508b_1280x300.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/generative-ai-system-design/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><p>I have a big big announcement for those of you who want to become good at AI engineering.</p><p>INTRODUCING:<strong> GENERATIVE AI MASTERCLASS</strong></p><p>This is a <em><strong>monthly newsletter series</strong></em> that&#8217;ll teach you Generative AI system design.</p><p>By the end of this newsletter series, you&#8217;ll walk away with:</p><ul><li><p><strong>Simple breakdown of real-world architectures</strong></p></li><li><p>Frameworks you can plug into your work or business</p></li><li><p><strong>Proven systems behind ChatGPT, Perplexity, and Copilot</strong></p></li></ul><p><em>And here&#8217;s the best part:</em></p><p>You&#8217;ll go from knowing how to &#8220;use AI&#8221; to understanding how to design, build, and ship AI.</p><p>No AI/ML background needed. Each newsletter explains concepts in plain English, with visuals and real product examples.</p><p>If you want the maximum AI career leverage, join below:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h3><a href="https://agentfield.ai/blog/posts/beyond-vibe-coding/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-060407&amp;utm_id=sys_design-060407-blog-bvc-cta&amp;utm_content=blog-bvc-cta">New Open Source: Build and Scale Agents and Harnesses Like APIs (Partner)</a></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://agentfield.ai/blog/posts/beyond-vibe-coding/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-060407&amp;utm_id=sys_design-060407-blog-bvc-cta&amp;utm_content=blog-bvc-cta" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_ibW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47cf900f-e29b-4ef8-965f-17c733bcd2d3_1920x1920.webp 424w, https://substackcdn.com/image/fetch/$s_!_ibW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47cf900f-e29b-4ef8-965f-17c733bcd2d3_1920x1920.webp 848w, https://substackcdn.com/image/fetch/$s_!_ibW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47cf900f-e29b-4ef8-965f-17c733bcd2d3_1920x1920.webp 1272w, https://substackcdn.com/image/fetch/$s_!_ibW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47cf900f-e29b-4ef8-965f-17c733bcd2d3_1920x1920.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_ibW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47cf900f-e29b-4ef8-965f-17c733bcd2d3_1920x1920.webp" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/47cf900f-e29b-4ef8-965f-17c733bcd2d3_1920x1920.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:186332,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:&quot;https://agentfield.ai/blog/posts/beyond-vibe-coding/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-060407&amp;utm_id=sys_design-060407-blog-bvc-cta&amp;utm_content=blog-bvc-cta&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/179236490?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47cf900f-e29b-4ef8-965f-17c733bcd2d3_1920x1920.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!_ibW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47cf900f-e29b-4ef8-965f-17c733bcd2d3_1920x1920.webp 424w, https://substackcdn.com/image/fetch/$s_!_ibW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47cf900f-e29b-4ef8-965f-17c733bcd2d3_1920x1920.webp 848w, https://substackcdn.com/image/fetch/$s_!_ibW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47cf900f-e29b-4ef8-965f-17c733bcd2d3_1920x1920.webp 1272w, https://substackcdn.com/image/fetch/$s_!_ibW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47cf900f-e29b-4ef8-965f-17c733bcd2d3_1920x1920.webp 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><code>One.harness()</code> call and orchestrate hundreds of Claude Code, Codex, and Gemini instances into a coordinated team. They discover each other at runtime, split work, cross-reference results, and converge autonomously. Go deeper on harness engineering &#8594; <a href="https://agentfield.ai/blog/posts/beyond-vibe-coding/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-060407&amp;utm_id=sys_design-060407-blog-bvc-cta&amp;utm_content=blog-bvc-cta">blog</a>.</p><p>No DAGs. No glue code. Agents work like APIs, Python, TypeScript &amp; Go.</p><p>We built production systems you can clone and run today:</p><p>&#8594; <a href="https://agentfield.ai/github/swe-af/?utm_source=github-readme&amp;utm_campaign=github-readme&amp;utm_id=github-readme-swe-af-repo">Autonomous Engineering Team</a>: 175 agents ship code end-to-end</p><p>&#8594; <a href="https://agentfield.ai/github/cloudsecurity/?utm_source=github-readme&amp;utm_campaign=github-readme&amp;utm_id=github-readme-cloudsec-repo">Deep Security Auditor</a>: 250 agents scan every CVE</p><p>&#8594; <a href="https://agentfield.ai/github/sec-af/?utm_source=github-readme&amp;utm_campaign=github-readme&amp;utm_id=github-readme-sec-af-repo">Adversarial Code Reviewer</a>: 50+ agents debate your PRs</p><p>Each runs on any model. Apache 2.0.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agentfield.ai/github/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-060407&amp;utm_id=sys_design-060407-github-cta&amp;utm_content=github-cta&quot;,&quot;text&quot;:&quot;Clone a Recipe&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agentfield.ai/github/?utm_source=sys_design&amp;utm_medium=newsletter&amp;utm_campaign=sys_design-060407&amp;utm_id=sys_design-060407-github-cta&amp;utm_content=github-cta"><span>Clone a Recipe</span></a></p><div><hr></div><p>We&#8217;ve all used ChatGPT.</p><p>From our user perspective, it looks simple: <em>type a question, get a response</em>. But there&#8217;s a lot more behind it than a large language model (LLM) with a chat interface.</p><p>Building a product like ChatGPT means choosing the right model from a growing landscape of options, preparing the data that trains it, running a multi-stage training pipeline, evaluating whether the outputs are actually good, adding safety controls to prevent harm, and wrapping it all in serving infrastructure that can handle millions of requests. The same is true for Perplexity, GitHub Copilot, and Midjourney.</p><p>The model is one piece of a much larger system.</p><p>Understanding these components is what separates someone who can use GenAI from someone who can design and build it. That&#8217;s what this newsletter series is for.</p><p>This first newsletter covers the foundation.</p><p>We walk through the model landscape, the data layer, the training pipeline, evaluation, safety, and the engineering that turns a model into a product.</p><p>Let&#8217;s start with a glossary of key terms that come up throughout this newsletter and the rest of the series. If you&#8217;re already comfortable with terms like tokens, embeddings, and context windows, skip ahead to <em>Part 1: The AI Model Landscape.</em></p><div><hr></div><p>I want to introduce<strong> <a href="https://louisbouchard.substack.com/">Louis-Fran&#231;ois Bouchard</a></strong> as the author of this newsletter.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://louisbouchard.substack.com/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8ezx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 424w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 848w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 1272w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8ezx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png" width="1100" height="220" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:220,&quot;width&quot;:1100,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://louisbouchard.substack.com/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8ezx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 424w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 848w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 1272w, https://substackcdn.com/image/fetch/$s_!8ezx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbacb1f-03da-4781-9a5e-b720e4af2e34_1100x220.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>He&#8217;s a best-selling author (<a href="https://amzn.to/4bqYU9b">Building LLMs for Production</a>), the co-founder of <a href="https://academy.towardsai.net/?ref=1f9b29">Towards AI</a>, and the creator of the YouTube Channel, <a href="https://www.youtube.com/@whatsai?sub_confirmation=1">What&#8217;s AI</a>, where he helps people understand AI and learn how to apply it in the real world. Through his development works with clients and his content, teaching, and AI training programs on the <strong><a href="https://academy.towardsai.net/?ref=1f9b29">Towards AI Academy</a></strong>, Louis focuses on making AI practical for builders, engineers, and curious learners alike.</p><p>At Towards AI, he and his team train AI engineers through courses built for every stage, from beginner to advanced. That educational mission and the real-world experience building for his clients are exactly why I wanted him in this newsletter series.</p><div><hr></div><h2><strong>Key Terms</strong></h2><p>You don&#8217;t need to learn these upfront.</p><p>Skim to get familiar, then refer back whenever you hit an unfamiliar term:</p><p><strong>Token</strong>: LLMs work with numbers, not words. Before processing any text, the model breaks it into small chunks, called tokens. A token can be a whole word (&#8220;the&#8221;), part of a word (&#8220;un&#8221;, &#8220;believ&#8221;, &#8220;able&#8221;), or punctuation. One token is roughly &#8776; 4 English characters, though this varies by language and model. API pricing, context limits, and cost are all measured in tokens.</p><p><strong>Embedding</strong>: A token ID alone (like 1023 for &#8220;What&#8221;) tells the model nothing about meaning. An embedding replaces that ID with a list of numbers (a vector) that captures what the word means and how it relates to other words. These vectors are what the model actually works with internally, processing and transforming them at every step to build up an understanding of the full input. Words with similar meaning end up with similar vectors: &#8220;king&#8221; and &#8220;queen&#8221; are close together, &#8220;king&#8221; and &#8220;banana&#8221; are far apart. This same property also powers search in GenAI products: when a user asks <em>&#8220;How do I reset my password?&#8221;</em>, the system can find documents on password resets, account recovery, and login issues, even if those documents never use that exact phrase.</p><p><strong>Parameter</strong>: These are internal variables inside a model, billions of them. At the start of training, they&#8217;re random. Each time the model makes an incorrect prediction, settings are nudged slightly. After trillions of such adjustments, they encode language patterns. GPT-3 had 175B parameters. Llama 4 comes in Scout (17B active) and Maverick (17B active). More parameters mean more capability, but also more memory: a large model can need 140GB or more just to load.</p><p><strong>Context Window</strong>: The maximum amount of text (in tokens) a model can see in one request. GPT-5 supports up to 400K tokens (272K input). Claude and Gemini support up to 1M tokens. Everything the model needs, system instructions, conversation history, retrieved documents, and the user&#8217;s question must fit within this window. Deciding what goes in and what gets cut is called <em>context engineering</em>.</p><p><strong>Inference</strong>: Every time someone sends a message to ChatGPT, the model runs to produce a response. That&#8217;s one inference call. At scale, this can mean millions of calls per day, and the cost adds up. Two key metrics define inference performance: <em>Time to First Token (<strong>TTFT</strong>)</em>, the time the user waits before the first word appears, and <em>Tokens per Second (<strong>TPS</strong>)</em>, the rate at which the rest of the response streams in after that.</p><p><strong>LLM (Large Language Model)</strong>: A neural network (a program that learns patterns from data) trained on one objective: to predict the next token. GPT, Claude, Llama, and Gemini are all LLMs. The objective is simple, but at a massive scale (trillions of tokens of training data), it produces the ability to follow instructions, write code, reason through problems, and hold conversations.</p><p><strong>Foundation Model</strong>: A large, general-purpose model trained on massive data that serves as the starting point for products. Both LLMs (GPT-5, Claude) and diffusion models (Stable Diffusion, Midjourney) are foundation models. These aren&#8217;t built from scratch for each product. You take one and adapt it through prompting, fine-tuning, or both.</p><p><strong>RAG (Retrieval-Augmented Generation)</strong>: A design approach where the system retrieves relevant documents at query time and includes them in the prompt, so the answer is grounded in real sources rather than what the model memorized during training. Products like Perplexity use this approach: search the web, feed the results to the model, and generate an answer with citations. It&#8217;s widely used in products that need up-to-date or domain-specific knowledge.</p><p><strong>Fine-tuning</strong>: Taking a foundation model and training it further on a smaller, task-specific dataset to specialize its behavior. For example, a model fine-tuned on customer support conversations learns the right tone, format, and escalation patterns for that domain. Fine-tuning changes <em>how</em> the model responds (style, format, tone), not <em>what</em> it knows. If the model needs to answer questions using knowledge not in its training data, retrieval-augmented generation (RAG) is the standard approach.</p><p><strong>Prompt</strong>: The full text input sent to a model for each request is a prompt. It typically includes a <em>system prompt</em> (developer-defined instructions that define the model&#8217;s role), the <em>conversation history</em> (previous messages), and the <em>user message</em> (the current question). In RAG systems, retrieved documents are also included. <em>Prompt engineering</em> is the practice of refining how this input is structured to get better results.</p><p><strong>Hallucination</strong>: When a model generates confident but factually wrong output, like citing a paper that doesn&#8217;t exist or inventing a URL. This isn&#8217;t a bug. These models are optimized to produce text that sounds plausible rather than factually correct, so hallucinations are a built-in trade-off, not a malfunction. The main ways to reduce hallucinations are <em>RAG</em> (retrieving real sources to ground the answer) and <em>evaluation</em> (testing the model&#8217;s outputs for accuracy).</p><p><strong>Transformer</strong>: The neural network architecture behind nearly all modern foundation models, introduced in 2017 (&#8220;Attention Is All You Need&#8221;). Its key innovation is the <em>attention mechanism</em>: instead of processing tokens sequentially, the model considers all tokens simultaneously and learns which are relevant to one another. In &#8220;The cat sat on the mat because it was tired,&#8221; attention connects &#8220;it&#8221; back to &#8220;cat&#8221; rather than &#8220;mat.&#8221; Some models extend this architecture with a <em>Mixture of Experts (<strong>MoE</strong>)</em>, where only a subset of the model&#8217;s parameters is used for each input rather than all of them, making the model faster and cheaper to run while maintaining high overall capability.</p><p><strong>GPU/TPU</strong>: The specialized hardware on which GenAI runs. <strong>GPUs</strong> (NVIDIA H100, A100, B200) handle the massive number of calculations required by models. A request that takes milliseconds on a GPU could take orders of magnitude longer on a regular CPU. Training a large model needs thousands of GPUs running for months. <strong>TPUs</strong> are Google&#8217;s custom chips for ML, used primarily within Google Cloud.</p><p><strong>Latency</strong>: How long the user waits for a response. Different products have different targets depending on what the user is doing. Autocomplete has to keep up with typing, so it typically needs a response in under 300ms. Chat users expect a short pause before the reply starts streaming. Image and video generation is much slower; users wait 10 to 60 seconds. These targets influence which model to use, whether to add caching, and how much the system can do before it starts to feel slow.</p><p>The first decision in designing any GenAI system is choosing the right model&#8230;</p><p>That starts with understanding what kinds of models exist and what each one is good at.</p><div><hr></div><h2><strong>Part 1: The AI Model Landscape</strong></h2><p>Not all generative models work the same way.</p><p>They differ in how they generate output, what kinds of output they produce, and how they&#8217;re built. This section breaks down the two core architectures, types of outputs they support, the internal structure of models, and how to choose between them.</p><h3><strong>Two Families of Generative Models</strong></h3><p>All generative AI models are built on one of two architectures:</p><p><strong>1 Autoregressive models</strong> generate text one token at a time.</p><p>Given the input &#8220;The capital of France is&#8221;, the model predicts the most likely next token: &#8220;Paris&#8221; in this case. It appends &#8220;Paris&#8221; to the sequence, and now uses &#8220;The capital of France is Paris&#8221; to predict the next token, i.e., &#8220;.&#8221; Then it uses the full sequence to predict the token, and so on. Each output feeds back as input. This is how GPT, Claude, Gemini, and Llama models work.</p><p>When you see a chatbot &#8220;typing&#8221; its response word by word, that&#8217;s not a UI animation; the model is actually producing tokens one after another.</p><p>Because generation is sequential, latency scales with output length: <em>a 500-token response takes roughly 500 generation steps</em>. Cost scales, too: <em>a 1,000-token response costs roughly 10x as much as a 100-token response</em>.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LNuG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea33f71-994a-449b-becd-7e5f6f110635_1600x180.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LNuG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea33f71-994a-449b-becd-7e5f6f110635_1600x180.png 424w, https://substackcdn.com/image/fetch/$s_!LNuG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea33f71-994a-449b-becd-7e5f6f110635_1600x180.png 848w, https://substackcdn.com/image/fetch/$s_!LNuG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea33f71-994a-449b-becd-7e5f6f110635_1600x180.png 1272w, https://substackcdn.com/image/fetch/$s_!LNuG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea33f71-994a-449b-becd-7e5f6f110635_1600x180.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LNuG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea33f71-994a-449b-becd-7e5f6f110635_1600x180.png" width="1456" height="164" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ea33f71-994a-449b-becd-7e5f6f110635_1600x180.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:164,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LNuG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea33f71-994a-449b-becd-7e5f6f110635_1600x180.png 424w, https://substackcdn.com/image/fetch/$s_!LNuG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea33f71-994a-449b-becd-7e5f6f110635_1600x180.png 848w, https://substackcdn.com/image/fetch/$s_!LNuG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea33f71-994a-449b-becd-7e5f6f110635_1600x180.png 1272w, https://substackcdn.com/image/fetch/$s_!LNuG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea33f71-994a-449b-becd-7e5f6f110635_1600x180.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p style="text-align: center;"><em>Autoregressive generation: each token is produced one at a time, left to right.</em></p><p><strong>2 Diffusion models</strong> work completely differently.</p><p>Instead of producing one piece at a time, they start with pure random noise and gradually remove it over many steps until a coherent image (or video) emerges.</p><p>During training, the model sees millions of images with noise added at various levels and learns to predict what the clean version looks like. At generation time, it reverses the process step by step, typically 20 to 50 steps, removing a little noise each time until a clear image forms.</p><p>This is how Midjourney, Stable Diffusion, DALL-E, and Sora work.</p><p>The serving model is very different from text: image generation can take roughly 10 to 60 seconds, and video can take minutes, so products typically run generation as batch jobs with a queue rather than streaming.</p><p>A single image generation also costs roughly 10-100x as much compute as a typical ChatGPT response.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NtsK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093fded3-ee7e-4917-871d-8803c37ae54c_1600x383.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NtsK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093fded3-ee7e-4917-871d-8803c37ae54c_1600x383.png 424w, https://substackcdn.com/image/fetch/$s_!NtsK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093fded3-ee7e-4917-871d-8803c37ae54c_1600x383.png 848w, https://substackcdn.com/image/fetch/$s_!NtsK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093fded3-ee7e-4917-871d-8803c37ae54c_1600x383.png 1272w, https://substackcdn.com/image/fetch/$s_!NtsK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093fded3-ee7e-4917-871d-8803c37ae54c_1600x383.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NtsK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093fded3-ee7e-4917-871d-8803c37ae54c_1600x383.png" width="1456" height="349" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/093fded3-ee7e-4917-871d-8803c37ae54c_1600x383.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:349,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NtsK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093fded3-ee7e-4917-871d-8803c37ae54c_1600x383.png 424w, https://substackcdn.com/image/fetch/$s_!NtsK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093fded3-ee7e-4917-871d-8803c37ae54c_1600x383.png 848w, https://substackcdn.com/image/fetch/$s_!NtsK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093fded3-ee7e-4917-871d-8803c37ae54c_1600x383.png 1272w, https://substackcdn.com/image/fetch/$s_!NtsK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093fded3-ee7e-4917-871d-8803c37ae54c_1600x383.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"><em>Diffusion models iteratively denoise random noise into a coherent image.</em></figcaption></figure></div><h3><strong>Models Outputs</strong></h3><p>These two architectures cover a range of output types:</p><p><strong>1 Text generation</strong> (autoregressive): chat responses, code, summaries, translations, structured data extraction. This powers ChatGPT, Claude, GitHub Copilot, Perplexity.</p><p><strong>2 Text-to-image</strong> (diffusion): generate images from text prompts. Midjourney, DALL-E 3, Stable Diffusion.</p><p><strong>3 Text-to-video</strong> (diffusion): generate video clips from text or image prompts. Sora, Runway Gen-3, Kling.</p><p><strong>4 Text-to-audio</strong> (autoregressive + diffusion): generate speech, music, and sound effects. ElevenLabs, Suno, Bark.</p><p><strong>5 Vision-language models</strong> (autoregressive) process images alongside text. Upload a photo and ask the model to describe it, extract data from a receipt, or analyze a chart. GPT-5, Claude, and Gemini all support this.</p><h3><strong>Multimodality</strong></h3><p>Each of the output types above used to require its own dedicated model.</p><p>That&#8217;s changing; models like GPT-5 and Gemini now handle text, images, audio, and video within a single architecture. The question is how. How does a text prompt like <em>&#8220;a photo of a cat&#8221;</em> actually guide an image model to generate the right image?</p><p>Through <strong>cross-modal embeddings</strong>.</p><p>An embedding converts content into a list of numbers that captures its meaning. Cross-modal embeddings do this across different content types, text and images, placing them on the same map.</p><p><em>CLIP</em>, developed by OpenAI, is the most well-known example. It places both <em>&#8220;a photo of a cat&#8221;</em> (text) and an actual cat photo (image) side by side on this shared map. That shared representation enables a text description to guide image generation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZFpI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa119134b-5712-46cb-992c-53942e0eab5f_1560x1074.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZFpI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa119134b-5712-46cb-992c-53942e0eab5f_1560x1074.png 424w, https://substackcdn.com/image/fetch/$s_!ZFpI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa119134b-5712-46cb-992c-53942e0eab5f_1560x1074.png 848w, https://substackcdn.com/image/fetch/$s_!ZFpI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa119134b-5712-46cb-992c-53942e0eab5f_1560x1074.png 1272w, https://substackcdn.com/image/fetch/$s_!ZFpI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa119134b-5712-46cb-992c-53942e0eab5f_1560x1074.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZFpI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa119134b-5712-46cb-992c-53942e0eab5f_1560x1074.png" width="1456" height="1002" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a119134b-5712-46cb-992c-53942e0eab5f_1560x1074.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1002,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZFpI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa119134b-5712-46cb-992c-53942e0eab5f_1560x1074.png 424w, https://substackcdn.com/image/fetch/$s_!ZFpI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa119134b-5712-46cb-992c-53942e0eab5f_1560x1074.png 848w, https://substackcdn.com/image/fetch/$s_!ZFpI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa119134b-5712-46cb-992c-53942e0eab5f_1560x1074.png 1272w, https://substackcdn.com/image/fetch/$s_!ZFpI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa119134b-5712-46cb-992c-53942e0eab5f_1560x1074.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Text and images mapped to the same vector space. Similar content lands nearby.</em></figcaption></figure></div><h3><strong>Model Architecture Variants</strong></h3><p>Not all foundation models are built the same way&#8230;</p><p>Most are based on the Transformer (covered in the glossary above), but they use it differently. The two main jobs of a language model are understanding input and generating output.</p><p>Different architectures handle these jobs in different ways:</p><p><strong>1 Decoder-only</strong> models focus entirely on generation: they take an input and produce text left-to-right, one token at a time. GPT, Claude, Llama, and Gemini are all decoder-only. This is the dominant architecture for text generation today.</p><p><strong>2 Encoder-only</strong> models (like BERT, an early Transformer&#8209;based language model) focus entirely on understanding: they read input and produce a numerical representation of its meaning, but don&#8217;t generate new text. Used for classification, sentiment analysis, and search ranking. For example, when a support system reads <em>&#8220;I&#8217;m furious about my broken order&#8221;</em> and classifies it as &#8220;negative sentiment, high urgency&#8221;, that&#8217;s an encoder-only model at work. In GenAI products, they play supporting roles, such as classifying user intent or detecting toxicity.</p><p><strong>3 Encoder-decoder</strong> models (like Google&#8217;s T5 and Meta&#8217;s BART) do both: one part reads and understands the input, the other generates the output. Suited for translation, summarization, and rewriting.</p><p><strong>4 Mixture of Experts (MoE)</strong> is a different kind of optimization.</p><p>Instead of a single large model where every parameter is used for every input, MoE splits the model into multiple specialized groups, called &#8220;experts.&#8221; For each input, a router determines which experts are relevant and activates only those. A question about Python code might activate code-specialized experts, while a question about French history might activate different experts. Models like Mistral&#8217;s Mixtral 8x7B use this approach, with 46.7 billion total parameters but only around 12.9 billion active per token. You get more capability per unit of compute, but you need more GPU memory to store all the experts.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2Gh9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7ce967-d17b-4670-80d8-7b92b5014fbe_1600x308.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2Gh9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7ce967-d17b-4670-80d8-7b92b5014fbe_1600x308.png 424w, https://substackcdn.com/image/fetch/$s_!2Gh9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7ce967-d17b-4670-80d8-7b92b5014fbe_1600x308.png 848w, https://substackcdn.com/image/fetch/$s_!2Gh9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7ce967-d17b-4670-80d8-7b92b5014fbe_1600x308.png 1272w, https://substackcdn.com/image/fetch/$s_!2Gh9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7ce967-d17b-4670-80d8-7b92b5014fbe_1600x308.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2Gh9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7ce967-d17b-4670-80d8-7b92b5014fbe_1600x308.png" width="1456" height="280" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6f7ce967-d17b-4670-80d8-7b92b5014fbe_1600x308.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:280,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2Gh9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7ce967-d17b-4670-80d8-7b92b5014fbe_1600x308.png 424w, https://substackcdn.com/image/fetch/$s_!2Gh9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7ce967-d17b-4670-80d8-7b92b5014fbe_1600x308.png 848w, https://substackcdn.com/image/fetch/$s_!2Gh9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7ce967-d17b-4670-80d8-7b92b5014fbe_1600x308.png 1272w, https://substackcdn.com/image/fetch/$s_!2Gh9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7ce967-d17b-4670-80d8-7b92b5014fbe_1600x308.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"><em>Most modern GenAI products use decoder-only or MoE architectures.</em></figcaption></figure></div><h3><strong>The Model Landscape: Proprietary, Open-Weight, and Open-Source</strong></h3><p>When choosing a model, there are three categories with different trade-offs:</p><p><strong>1 Proprietary models</strong> (GPT, Claude, Gemini) are accessed via API. They offer the highest capability and are the easiest to start with, but give users the least control. You can&#8217;t see the model&#8217;s internal parameters, modify its architecture, or run it on your own servers.</p><p><strong>2 Open-weight models</strong> (Llama, Mistral, Gemma) release their trained parameters publicly. You can download them, run them on your own GPUs, and fine-tune them. Training data and methods are typically not shared. This gives you full control over the model, but it also means you have to manage the infrastructure yourself.</p><p><strong>3 Open-source models</strong> go further by sharing not just weights but training code, data, and the full methodology behind how the model was built.</p><p>The gap between proprietary and open-source is narrowing&#8230;</p><p>By early 2025, models like Llama 4 Maverick and DeepSeek-V3 were competitive with proprietary models on many tasks. Most production systems use a mix: a proprietary model for complex tasks and an open-weight model for high-volume, simple ones.</p><h3><strong>Model Selection</strong></h3><p>Each model family offers a range of sizes (e.g., GPT-5 vs. GPT-5 mini, Claude Opus vs. Haiku, Llama 4 Maverick vs Scout).</p><p>Smaller models are faster and cheaper, but less capable. Beyond size, some models offer specialized variants. OpenAI&#8217;s GPT-5, for example, includes configurable reasoning effort, and Claude offers an extended thinking mode. These reasoning capabilities require more compute during inference to work through problems step by step, breaking complex questions into smaller parts before answering. This improves accuracy on tasks such as math, logic, and multi-step analysis, but it costs more per request and takes longer to respond.</p><p>This trade-off is another reason model routing matters: route simple questions to a faster model, and complex reasoning tasks to a model with these capabilities.</p><p>Public benchmarks help narrow down which specific model is best suited to a task. MMLU tests general knowledge across 57 subjects; HumanEval and MBPP focus on code generation; Chatbot Arena by LMSYS aggregates human preference scores from real conversations; and benchmarks like GPQA and ARC-AGI evaluate reasoning and problem-solving abilities.</p><p>But benchmarks have a well-known problem: contamination.</p><p>Models can be trained on benchmark test data, inflating scores. It&#8217;s like a student who memorized the answers to last year&#8217;s exam; they score high, but can&#8217;t solve newer problems. A model that tops MMLU may underperform on a specific task. Benchmarks are useful for narrowing the field, but final selection requires evaluating on your own data.</p><p>That covers the model landscape, what models exist, how they differ, and how to pick between them. All of these models, though, start with random parameters. What turns them into something useful is data.</p><div><hr></div><p><em><strong>Reminder: this is a teaser of the subscriber-only newsletter series, exclusive to my golden members.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://newsletter.systemdesign.one/subscribe?yearly=true" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3mfm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 424w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 848w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 1272w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3mfm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png" width="1280" height="300" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:300,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24224,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://newsletter.systemdesign.one/subscribe?yearly=true&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192435842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3mfm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 424w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 848w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 1272w, https://substackcdn.com/image/fetch/$s_!3mfm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0002d397-a8f7-4519-a51c-f020e29ce8b3_1280x300.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>Simple breakdown of real-world architectures</strong></p></li><li><p>Frameworks you can plug into your work or business</p></li><li><p><strong>Proven systems behind ChatGPT, Perplexity, and Copilot</strong></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?yearly=true&quot;,&quot;text&quot;:&quot;Unlock Full Access&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe?yearly=true"><span>Unlock Full Access</span></a></p><div><hr></div><h2><strong>Part 2: The Data Layer</strong></h2><p>A model&#8217;s capability comes entirely from the data it was trained on.</p><p>The quality of that data, how it&#8217;s collected, cleaned, and organized directly determines how well the model performs.</p><h3><strong>Training Data</strong></h3><p>Foundation models learn from massive datasets: trillions of tokens scraped from books, websites, code repositories, scientific papers, and forum posts. For diffusion models, training data consists of millions of images, each paired with a text description of its contents.</p><p>Raw data is never used directly.</p><p>It passes through a&nbsp;<strong>data pipeline</strong>&nbsp;that cleans, filters, deduplicates, and transforms the data before training begins. For example, the same news article might appear on hundreds of websites. Without deduplication, the model sees the same phrases hundreds of times and overlearns them.</p><p>The pipeline handles problems like near-duplicate web pages that inflate certain patterns, toxic or biased content the model would absorb, personally identifiable information (<strong>PII</strong>) that shouldn&#8217;t be memorized, and low-quality text that adds noise without a useful signal.</p><p>The quality of this pipeline directly determines the model's quality.</p><p><strong>Synthetic data generation</strong> is increasingly used to fill data gaps. An existing model generates additional training examples for areas where real data is scarce, such as specialized medical terminology, underrepresented languages, or edge-case code patterns. The quality of synthetic data depends on the model that generates it and how it&#8217;s validated.</p><p><strong>Bias detection</strong> starts here. If the training data overrepresents certain perspectives, demographics, or viewpoints, the model will reflect those imbalances. Data teams audit for representation gaps and offensive content before training begins, but detection is imperfect. Some biases only surface after the model is trained and tested.</p><p><strong>NSFW and harmful content</strong> requires explicit handling during this stage. Some pipelines filter it entirely. Others retain it in a controlled form so the model can later learn to recognize and refuse it. The approach depends on the product&#8217;s intended use.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pfnJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e31eb63-a6f0-4f1c-b036-26eea788218d_1999x545.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pfnJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e31eb63-a6f0-4f1c-b036-26eea788218d_1999x545.png 424w, https://substackcdn.com/image/fetch/$s_!pfnJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e31eb63-a6f0-4f1c-b036-26eea788218d_1999x545.png 848w, https://substackcdn.com/image/fetch/$s_!pfnJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e31eb63-a6f0-4f1c-b036-26eea788218d_1999x545.png 1272w, https://substackcdn.com/image/fetch/$s_!pfnJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e31eb63-a6f0-4f1c-b036-26eea788218d_1999x545.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pfnJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e31eb63-a6f0-4f1c-b036-26eea788218d_1999x545.png" width="1456" height="397" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1e31eb63-a6f0-4f1c-b036-26eea788218d_1999x545.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:397,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:186851,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/192435842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e31eb63-a6f0-4f1c-b036-26eea788218d_1999x545.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pfnJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e31eb63-a6f0-4f1c-b036-26eea788218d_1999x545.png 424w, https://substackcdn.com/image/fetch/$s_!pfnJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e31eb63-a6f0-4f1c-b036-26eea788218d_1999x545.png 848w, https://substackcdn.com/image/fetch/$s_!pfnJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e31eb63-a6f0-4f1c-b036-26eea788218d_1999x545.png 1272w, https://substackcdn.com/image/fetch/$s_!pfnJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e31eb63-a6f0-4f1c-b036-26eea788218d_1999x545.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Data pipeline: raw data is cleaned through multiple stages before training.</em></figcaption></figure></div><h3><strong>Data Storage and Retrieval</strong></h3><p>Training and inference have very different data needs.</p><p>During training, the dataset can be terabytes or more in size, containing text, images, and code. Teams typically store this in distributed file systems like <strong>HDFS</strong> (Hadoop Distributed File System), which splits large files across a cluster of machines so they can be read in parallel, or cloud object storage like <strong>S3</strong> (Amazon&#8217;s Simple Storage Service), which stores files as objects in the cloud and is widely used because it&#8217;s cheap, scales without limits, and requires no infrastructure to manage. Training pipelines stream data from these systems to GPU clusters in batches, feeding the model continuously throughout training.</p><p>At inference time, the training data is no longer needed.</p><p>What matters is the model itself. The model&#8217;s parameters need to be loaded into GPU memory before it can serve any requests, and for large models, that means hundreds of gigabytes. Once loaded, the model takes in a prompt and generates tokens one at a time, streaming them back to the user.</p><p>The challenge is keeping sufficient GPU capacity available to handle many users simultaneously without long wait times.</p><p>Some GenAI systems also retrieve data at runtime, pulling in documents, search results, or database records when a user asks a question. This is how products like Perplexity stay current beyond the model&#8217;s training data.</p><p>In Part 6, we cover how this retrieval works, embeddings, vector databases, and structured data handling when we design the full product architecture.</p><p>But having clean, well-organized data is only the starting point.</p><p>The next question is what happens to that data, and how a pile of text becomes a model that can actually follow instructions&#8230;</p><p>Let&#8217;s keep going!</p>
      <p>
          <a href="https://newsletter.systemdesign.one/p/generative-ai-system-design">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Amazon S3 - A Deep Dive]]></title><description><![CDATA[#136: How S3 Actually Works]]></description><link>https://newsletter.systemdesign.one/p/aws-s3-system-design</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/aws-s3-system-design</guid><dc:creator><![CDATA[Hayk]]></dc:creator><pubDate>Wed, 01 Apr 2026 12:04:35 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/b7180882-5e86-477b-b950-8a08d76a8ddc_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/aws-s3-system-design/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li><li><p><em>Block diagrams created using <a href="https://app.eraser.io/auth/sign-up?ref=neo">Eraser</a>.</em></p></li></ul><div><hr></div><p>Object storage isn&#8217;t just <em>&#8220;upload a file, get a URL back.&#8221;</em></p><p>That&#8217;s true for a small side project. It stops being true when you&#8217;re storing 100 trillion objects like AWS S3 does today.</p><p>Amazon Simple Storage Service (<strong>S3</strong>) launched in 2006.</p><p>In 2013, Amazon reported storing 2 trillion objects. By 2021, that number crossed 100 trillion. The system handling all of that doesn&#8217;t look anything like a file server or a relational database. It&#8217;s a fundamentally different class of infrastructure, built around a different set of tradeoffs.</p><p>This problem shows up in senior- and staff-level interviews at companies building storage-heavy products&#8230;</p><p>It tests the exact skills that separate mid-level engineers from seniors: <em>understanding why different storage types exist, designing for durability at a scale where hardware failures occur daily, and making smart trade-offs between consistency, cost, and performance.</em></p><p>We&#8217;re going to build this from scratch:</p><p>We&#8217;ll start with what object storage actually is and how it differs from other storage systems, then work through requirements, capacity estimation, the high-level architecture, disk-based data persistence, durability strategies, metadata design, object versioning<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>, large-file uploads, and garbage collection.</p><p>At each step, we&#8217;ll explain the why behind each decision&#8230;</p><div><hr></div><h3><strong><a href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system">Find out why 150K+ engineers read The Code twice a week (Partner)</a></strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w80U!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe30ba00c-653e-421b-af50-f5b48536f36b_1271x699.jpeg 424w, https://substackcdn.com/image/fetch/$s_!w80U!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe30ba00c-653e-421b-af50-f5b48536f36b_1271x699.jpeg 848w, https://substackcdn.com/image/fetch/$s_!w80U!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe30ba00c-653e-421b-af50-f5b48536f36b_1271x699.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!w80U!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe30ba00c-653e-421b-af50-f5b48536f36b_1271x699.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w80U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe30ba00c-653e-421b-af50-f5b48536f36b_1271x699.jpeg" width="1271" height="699" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e30ba00c-653e-421b-af50-f5b48536f36b_1271x699.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:699,&quot;width&quot;:1271,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:303786,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:&quot;https://codenewsletter.ai/subscribe?utm_source=nl_ad_system&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/179236490?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe30ba00c-653e-421b-af50-f5b48536f36b_1271x699.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!w80U!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe30ba00c-653e-421b-af50-f5b48536f36b_1271x699.jpeg 424w, https://substackcdn.com/image/fetch/$s_!w80U!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe30ba00c-653e-421b-af50-f5b48536f36b_1271x699.jpeg 848w, https://substackcdn.com/image/fetch/$s_!w80U!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe30ba00c-653e-421b-af50-f5b48536f36b_1271x699.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!w80U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe30ba00c-653e-421b-af50-f5b48536f36b_1271x699.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Tech moves fast, but you&#8217;re still playing catch-up?</p><p>That&#8217;s exactly why 150K+ engineers working at Google, Meta, and Apple read <a href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system">The Code</a> twice a week.</p><p>Here&#8217;s what you get:</p><ul><li><p><strong>Curated tech news that shapes your career </strong>- Filtered from thousands of sources so you know what&#8217;s coming 6 months early.</p></li><li><p><strong>Practical resources you can use immediately</strong> - Real tutorials and tools that solve actual engineering problems.</p></li><li><p><strong>Research papers and insights decoded</strong> - We break down complex tech so you understand what matters.</p></li></ul><p>All delivered twice a week in just 2 short emails.</p><p><strong>Sign up and get access to the Ultimate Claude code guide to ship 5X faster.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://codenewsletter.ai/subscribe?utm_source=nl_ad_system&quot;,&quot;text&quot;:&quot;Join 150K+ Engineers&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system"><span>Join 150K+ Engineers</span></a></p><p>(Thanks for partnering on this post and sharing the ultimate <a href="https://codenewsletter.ai/subscribe?utm_source=nl_ad_system">claude code guide</a>.)</p><div><hr></div><p>I want to reintroduce <a href="https://linkedin.com/in/hayksimonyan">Hayk Simonyan</a> as a guest author.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://youtube.com/@hayk.simonyan" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TNRZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 424w, https://substackcdn.com/image/fetch/$s_!TNRZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 848w, https://substackcdn.com/image/fetch/$s_!TNRZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 1272w, https://substackcdn.com/image/fetch/$s_!TNRZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TNRZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png" width="1456" height="600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:387657,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://youtube.com/@hayk.simonyan&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/188825279?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!TNRZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 424w, https://substackcdn.com/image/fetch/$s_!TNRZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 848w, https://substackcdn.com/image/fetch/$s_!TNRZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 1272w, https://substackcdn.com/image/fetch/$s_!TNRZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe42ce2a7-78b8-43b3-b7cf-b293d660240a_1748x720.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>He&#8217;s a senior software engineer specializing in helping developers break through their career plateaus and secure senior roles.</p><p>If you want to master the essential system design skills and land senior developer roles, I highly recommend checking out Hayk&#8217;s <strong><a href="https://youtube.com/@hayk.simonyan">YouTube channel</a></strong>.</p><p>His approach focuses on what top employers actually care about: system design expertise, advanced project experience, and elite-level interview performance.</p><div><hr></div><h2><strong>What is Object Storage?</strong></h2><p>Before we design anything, we need to understand what object storage actually is&#8230;</p><p>Engineers often confuse the three main storage categories. Each one exists for a reason, and choosing the wrong one at design time is expensive to fix later.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eRmE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe80ed2-8937-4208-b353-e12f8908ef15_1600x906.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eRmE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe80ed2-8937-4208-b353-e12f8908ef15_1600x906.png 424w, https://substackcdn.com/image/fetch/$s_!eRmE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe80ed2-8937-4208-b353-e12f8908ef15_1600x906.png 848w, https://substackcdn.com/image/fetch/$s_!eRmE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe80ed2-8937-4208-b353-e12f8908ef15_1600x906.png 1272w, https://substackcdn.com/image/fetch/$s_!eRmE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe80ed2-8937-4208-b353-e12f8908ef15_1600x906.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eRmE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe80ed2-8937-4208-b353-e12f8908ef15_1600x906.png" width="1456" height="824" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8fe80ed2-8937-4208-b353-e12f8908ef15_1600x906.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:824,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eRmE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe80ed2-8937-4208-b353-e12f8908ef15_1600x906.png 424w, https://substackcdn.com/image/fetch/$s_!eRmE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe80ed2-8937-4208-b353-e12f8908ef15_1600x906.png 848w, https://substackcdn.com/image/fetch/$s_!eRmE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe80ed2-8937-4208-b353-e12f8908ef15_1600x906.png 1272w, https://substackcdn.com/image/fetch/$s_!eRmE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe80ed2-8937-4208-b353-e12f8908ef15_1600x906.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Block Storage</strong></h3><p>Block storage is the oldest and lowest-level.</p><p>When you plug a hard drive or SSD into a server, the operating system (<strong>OS</strong>) sees it as a sequence of raw blocks, each typically 4KB in size. The OS decides how to format those blocks and build a file system on top of them. Some applications, such as databases and virtual machine engines, skip the file system entirely and manage blocks directly, which gives them maximum control and performance.</p><p>Block storage doesn&#8217;t have to be physically attached&#8230;</p><p>You can connect to block storage over a network using protocols like Fibre Channel or iSCSI<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>. The server still sees raw blocks, as if the drive were directly attached, but the data resides elsewhere on the network. This is how cloud providers like AWS offer Elastic Block Store (<strong>EBS</strong>): you attach a network drive to your VM, and it behaves like a local disk.</p><p>Block storage is fast and flexible, but it&#8217;s also expensive and doesn&#8217;t scale cheaply to petabytes.</p><h3><strong>File Storage</strong></h3><p>File storage is built on top of block storage.</p><p>It adds a layer that handles the complexity of managing blocks and gives you a familiar directory hierarchy: folders, subfolders, and files. You don&#8217;t deal with blocks at all. You just read and write files using paths like <code>/documents/report.pdf.</code></p><p>File storage becomes especially useful when many servers need to share the same files.</p><p>Protocols like NFS and SMB/CIFS<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a> allow many machines to mount the same file share and read and write to it concurrently. This is how shared drives inside organizations typically work, and it&#8217;s how many legacy enterprise applications store data.</p><p>File storage is easier to use than block storage, but it still doesn&#8217;t scale to the level object storage does. Hierarchical directory structures become slow and complex when you have billions of files.</p><h3><strong>Object Storage</strong></h3><p>Object storage is the newest of the three and the most different&#8230;</p><p>It makes a deliberate tradeoff: <em>give up performance and mutability in exchange for near-unlimited scalability, very high durability, and low cost.</em></p><p>There are no directories in object storage.</p><p>Everything lives in a flat namespace inside containers called buckets. Every object is accessed via a RESTful HTTP API using a unique key. You can&#8217;t partially update an object. If you want to change a file, you replace the entire object or create a new version. This immutability<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> constraint sounds limiting, but it&#8217;s actually what makes object storage so cheap to operate at scale, because it simplifies replication and consistency considerably.</p><p>AWS S3, Google Cloud Storage, and Azure Blob Storage are all object storage systems. They&#8217;re the foundation of most modern cloud architectures: video files, backups, data lake storage, machine learning datasets, static website assets, and more.</p><h3><strong>Comparison Table</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bk1n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe044726-8beb-4aec-bc25-468b5794d782_1600x379.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bk1n!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe044726-8beb-4aec-bc25-468b5794d782_1600x379.png 424w, https://substackcdn.com/image/fetch/$s_!bk1n!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe044726-8beb-4aec-bc25-468b5794d782_1600x379.png 848w, https://substackcdn.com/image/fetch/$s_!bk1n!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe044726-8beb-4aec-bc25-468b5794d782_1600x379.png 1272w, https://substackcdn.com/image/fetch/$s_!bk1n!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe044726-8beb-4aec-bc25-468b5794d782_1600x379.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bk1n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe044726-8beb-4aec-bc25-468b5794d782_1600x379.png" width="1456" height="345" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be044726-8beb-4aec-bc25-468b5794d782_1600x379.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:345,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bk1n!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe044726-8beb-4aec-bc25-468b5794d782_1600x379.png 424w, https://substackcdn.com/image/fetch/$s_!bk1n!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe044726-8beb-4aec-bc25-468b5794d782_1600x379.png 848w, https://substackcdn.com/image/fetch/$s_!bk1n!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe044726-8beb-4aec-bc25-468b5794d782_1600x379.png 1272w, https://substackcdn.com/image/fetch/$s_!bk1n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe044726-8beb-4aec-bc25-468b5794d782_1600x379.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The key constraint to internalize: objects are immutable.</p><p>You cannot edit part of an object. You replace the whole thing, or you version it. This constraint is not an accident. It&#8217;s a deliberate design choice that enables the durability and scale properties that make object storage useful in the first place.</p><div><hr></div><h2><strong>Key Terms</strong></h2><p>These are the concepts you need to know before we get into the design:</p><h3><strong>Bucket</strong></h3><p>A bucket is a logical container for objects.</p><p>Think of it like a top-level folder, except it&#8217;s not really a folder since there&#8217;s no hierarchy inside it. Bucket names must be globally unique across all system users, not just within your account.</p><p>You have to create a bucket before you can store anything in it.</p><h3><strong>Object</strong></h3><p>An object is an individual piece of data stored in a bucket.</p><p>It has two parts:</p><ul><li><p>Payload is the actual data bytes, which can be anything: a photo, a video, a CSV file, a binary blob.</p></li><li><p>Metadata is a set of key-value pairs that describe an object, such as content type, creation timestamp, custom tags, and anything else the application needs to store alongside the data.</p></li></ul><p>The metadata is stored separately from the payload and is much smaller.</p><h3><strong>Object Key</strong></h3><p>Every object is identified by a key, which is just a string.</p><p>In S3, that key looks like a file path: <code>photos/2024/vacation.jpg</code>. But there are no actual directories. The entire string, slashes and all, is just the key. S3 lets you use slashes as a convention to simulate folders, but under the hood, it&#8217;s still a flat namespace.</p><p>This distinction matters when we design the listing feature later.</p><h3><strong>Versioning</strong></h3><p>Versioning is a bucket-level feature that keeps all previous versions of an object instead of overwriting them.</p><p>When versioning is enabled, uploading an object with the same key as an existing object doesn&#8217;t replace it. Instead, it creates a new version alongside the old one. You can retrieve, restore, or delete any version at any time.</p><p>This protects against accidental overwrites and deletions.</p><h3><strong>Durability SLA</strong></h3><p>S3 Standard storage class is designed for 99.999999999% durability, also known as eleven nines.</p><p>In practical terms, if you store 10 million objects for 10,000 years, you&#8217;d expect to lose one. That&#8217;s not an accident. It comes from specific engineering decisions around replication and error correction, which we&#8217;ll get into in the durability section.</p><div><hr></div><h2><strong>Clarifying Requirements</strong></h2><p>Before touching any design, you need to understand what you&#8217;re actually building and at what scale. Candidates who skip this step in interviews fail immediately, because they end up designing the wrong thing at the wrong scale&#8230;</p><h3><strong>Questions to Ask</strong></h3><ul><li><p>What are the core operations? Upload, download, delete, list?</p></li><li><p>Do we need versioning?</p></li><li><p>How much data do we need to store in year one?</p></li><li><p>What durability and availability targets do we need?</p></li><li><p>Do we need to support large file uploads, such as files over multiple gigabytes?</p></li><li><p>Any access control requirements? Do different users own different buckets?</p></li></ul><h3><strong>Our Assumptions</strong></h3><p>For this design, we&#8217;ll assume:</p><ul><li><p>Core operations: bucket creation, object upload and download, object versioning, and listing objects in a bucket by prefix</p></li><li><p>100 petabytes of total data</p></li><li><p>Durability target: six nines, which is 99.9999%</p></li><li><p>Availability target: four nines, which is 99.99%</p></li><li><p>Must handle both small objects (tens of kilobytes) and large objects (several gigabytes)</p></li></ul><p>Now, let&#8217;s figure out what these requirements actually mean for infrastructure&#8230;</p><div><hr></div><h2><strong>Capacity Estimation</strong></h2><p>Math matters in system design because it tells you what kind of infrastructure you need. Vague statements like <em>&#8220;we&#8217;ll need a distributed database&#8221;</em> aren&#8217;t useful without concrete numbers.</p><p>Let&#8217;s work through the estimates:</p><h3><strong>Object Size Distribution</strong></h3><p>In practice, object storage systems see a mix of object sizes. A reasonable assumption for a general-purpose system:</p><ul><li><p>20% of objects are small, under 1MB, with a median size of about 0.5MB. These might be thumbnails, config files, or short documents.</p></li><li><p>60% of objects are medium-sized, ranging from 1MB to 64MB, with a median of about 32MB. These might be images, audio files, or compressed datasets.</p></li><li><p>20% of objects are large (&gt;64MB), with a median of about 200MB. These might be videos, database backups, or large archives.</p></li></ul><h3><strong>Total Object Count</strong></h3><p>We&#8217;re targeting 100 petabytes of stored data.</p><p>In practice, storage systems don&#8217;t fill to capacity, so let&#8217;s assume 40% utilization, meaning we provision enough storage to hold 100PB when 40% full.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;9f89fa67-a6d0-4507-9c79-287873746c8a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">100 PB = 10^11 MB

Weighted average object size: (0.2 x 0.5MB) + (0.6 x 32MB) + (0.2 x 200MB)

= 0.1 + 19.2 + 40.0

= 59.3 MB per object (average)

Total objects at 40% utilization: (10^11 x 0.4) / 59.3

= approximately 680 million objects</code></pre></div><p>680 million objects are a lot.</p><p>This tells us immediately that no single machine can store or index all of this. We need distributed storage and a distributed metadata index from day one.</p><h3><strong>Metadata Storage</strong></h3><p>Each object needs a metadata record. If we assume roughly 1KB per record (object name, bucket, timestamps, UUID, tags), then:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;c68210ce-24dd-409c-a907-a5f0d5b3a41c&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">680 million objects x 1KB = ~680GB of metadata</code></pre></div><p>680GB of metadata is manageable in a database, but the access patterns will require sharding as we scale. The metadata store is separate from the data store, which we&#8217;ll cover in the design.</p><h3><strong>IOPS Constraints</strong></h3><p>A standard SATA hard drive<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a> spinning at 7200 RPM can handle roughly 100-150 random seeks per second.</p><p>This is called IOPS<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a>, or input/output operations per second. At 680 million objects spread across many disks, the IOPS constraint becomes a real bottleneck, especially for small object workloads where you&#8217;re doing many small reads and writes rather than a few large sequential ones.</p><p>This is one reason we&#8217;ll later choose to merge many small objects into a single larger file on disk, instead of storing each object as its own file.</p><div><hr></div><h2><strong>Design Philosophy: Separating Metadata from Data</strong></h2><p>Before we look at the full architecture, there&#8217;s one core design principle that shapes everything else:<em> metadata and data are stored separately, and for good reason.</em></p><p>This idea comes from how UNIX file systems work:</p><p>In UNIX, when you save a file, the filename and the actual data bytes are not stored together. The filename and other file information (size, permissions, timestamps, disk location) live in a data structure called an inode<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a>. The actual data bytes live in separate disk blocks that the inode points to.</p><p>Object storage works the same way.</p><p>The metadata store is like the inode layer. It holds the object name, bucket, size, and a UUID that points to the location of the actual bytes. The data store is like the disk. It holds the raw bytes, and it only knows about UUIDs, not names or paths.</p><p>So why separate them?</p><p>Because they have completely different characteristics. The data is immutable. Once written, it never changes. The metadata is mutable. You can update tags, rename objects (in some implementations), or add versioning records. They need different consistency guarantees, different storage engines, and different scaling strategies.</p><p>Keeping them separate lets us optimize each independently.</p><div><hr></div><h2><strong>High-Level Architecture</strong></h2><p>With that principle in mind, here&#8217;s how the full system is structured:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BBL9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ac43646-5c7a-439d-bac8-76dad7c2ae21_1600x1553.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BBL9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ac43646-5c7a-439d-bac8-76dad7c2ae21_1600x1553.png 424w, https://substackcdn.com/image/fetch/$s_!BBL9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ac43646-5c7a-439d-bac8-76dad7c2ae21_1600x1553.png 848w, https://substackcdn.com/image/fetch/$s_!BBL9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ac43646-5c7a-439d-bac8-76dad7c2ae21_1600x1553.png 1272w, https://substackcdn.com/image/fetch/$s_!BBL9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ac43646-5c7a-439d-bac8-76dad7c2ae21_1600x1553.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BBL9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ac43646-5c7a-439d-bac8-76dad7c2ae21_1600x1553.png" width="1456" height="1413" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ac43646-5c7a-439d-bac8-76dad7c2ae21_1600x1553.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1413,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BBL9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ac43646-5c7a-439d-bac8-76dad7c2ae21_1600x1553.png 424w, https://substackcdn.com/image/fetch/$s_!BBL9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ac43646-5c7a-439d-bac8-76dad7c2ae21_1600x1553.png 848w, https://substackcdn.com/image/fetch/$s_!BBL9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ac43646-5c7a-439d-bac8-76dad7c2ae21_1600x1553.png 1272w, https://substackcdn.com/image/fetch/$s_!BBL9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ac43646-5c7a-439d-bac8-76dad7c2ae21_1600x1553.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Components</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ccAA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ba3358-bf5a-4462-9ad0-9e84be6f0fd3_1600x716.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ccAA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ba3358-bf5a-4462-9ad0-9e84be6f0fd3_1600x716.png 424w, https://substackcdn.com/image/fetch/$s_!ccAA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ba3358-bf5a-4462-9ad0-9e84be6f0fd3_1600x716.png 848w, https://substackcdn.com/image/fetch/$s_!ccAA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ba3358-bf5a-4462-9ad0-9e84be6f0fd3_1600x716.png 1272w, https://substackcdn.com/image/fetch/$s_!ccAA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ba3358-bf5a-4462-9ad0-9e84be6f0fd3_1600x716.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ccAA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ba3358-bf5a-4462-9ad0-9e84be6f0fd3_1600x716.png" width="1456" height="652" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/66ba3358-bf5a-4462-9ad0-9e84be6f0fd3_1600x716.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:652,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ccAA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ba3358-bf5a-4462-9ad0-9e84be6f0fd3_1600x716.png 424w, https://substackcdn.com/image/fetch/$s_!ccAA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ba3358-bf5a-4462-9ad0-9e84be6f0fd3_1600x716.png 848w, https://substackcdn.com/image/fetch/$s_!ccAA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ba3358-bf5a-4462-9ad0-9e84be6f0fd3_1600x716.png 1272w, https://substackcdn.com/image/fetch/$s_!ccAA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ba3358-bf5a-4462-9ad0-9e84be6f0fd3_1600x716.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Upload Flow</strong></h3><p>Let&#8217;s trace exactly what happens when a user uploads a file named <code>report.pdf</code> to a bucket called <code>company-docs</code>:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6Ae2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ba7f28-260d-4063-b9e0-64dabd429ea1_1600x1262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6Ae2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ba7f28-260d-4063-b9e0-64dabd429ea1_1600x1262.png 424w, https://substackcdn.com/image/fetch/$s_!6Ae2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ba7f28-260d-4063-b9e0-64dabd429ea1_1600x1262.png 848w, https://substackcdn.com/image/fetch/$s_!6Ae2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ba7f28-260d-4063-b9e0-64dabd429ea1_1600x1262.png 1272w, https://substackcdn.com/image/fetch/$s_!6Ae2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ba7f28-260d-4063-b9e0-64dabd429ea1_1600x1262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6Ae2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ba7f28-260d-4063-b9e0-64dabd429ea1_1600x1262.png" width="1456" height="1148" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d8ba7f28-260d-4063-b9e0-64dabd429ea1_1600x1262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1148,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6Ae2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ba7f28-260d-4063-b9e0-64dabd429ea1_1600x1262.png 424w, https://substackcdn.com/image/fetch/$s_!6Ae2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ba7f28-260d-4063-b9e0-64dabd429ea1_1600x1262.png 848w, https://substackcdn.com/image/fetch/$s_!6Ae2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ba7f28-260d-4063-b9e0-64dabd429ea1_1600x1262.png 1272w, https://substackcdn.com/image/fetch/$s_!6Ae2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8ba7f28-260d-4063-b9e0-64dabd429ea1_1600x1262.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ol><li><p>Client sends a <code>HTTP PUT /company-docs/report.pdf</code> request with the file bytes in the request body.</p></li><li><p>Request hits the load balancer and gets routed to one of the API service instances.</p></li><li><p>API service calls the IAM service<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a> to confirm the user has WRITE permission on the <code>company-docs</code> bucket. If not, the request is rejected immediately with a 403 Forbidden.</p></li><li><p>API service forwards the file bytes to the data store. The data store persists the bytes and returns a UUID, a unique identifier for this specific object.</p></li><li><p>API service then writes a metadata record to the metadata store. This record contains the object name (<code>report.pdf</code>), bucket ID, UUID returned from the data store, file size, creation timestamp, and any metadata tags the user provided.</p></li><li><p>A <code>200 OK</code> response is returned to the client.</p></li></ol><p>The metadata record now serves as the bridge between the human-readable path (<code>company-docs/report.pdf</code>) and the actual bytes stored under the UUID in the data store.</p><h3><strong>Download Flow</strong></h3><p>Now, let&#8217;s trace what happens when someone requests the same file:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ySqR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01ea95e6-c6ae-4944-b89b-80817ef6c827_1600x1282.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ySqR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01ea95e6-c6ae-4944-b89b-80817ef6c827_1600x1282.png 424w, https://substackcdn.com/image/fetch/$s_!ySqR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01ea95e6-c6ae-4944-b89b-80817ef6c827_1600x1282.png 848w, https://substackcdn.com/image/fetch/$s_!ySqR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01ea95e6-c6ae-4944-b89b-80817ef6c827_1600x1282.png 1272w, https://substackcdn.com/image/fetch/$s_!ySqR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01ea95e6-c6ae-4944-b89b-80817ef6c827_1600x1282.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ySqR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01ea95e6-c6ae-4944-b89b-80817ef6c827_1600x1282.png" width="1456" height="1167" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/01ea95e6-c6ae-4944-b89b-80817ef6c827_1600x1282.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1167,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ySqR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01ea95e6-c6ae-4944-b89b-80817ef6c827_1600x1282.png 424w, https://substackcdn.com/image/fetch/$s_!ySqR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01ea95e6-c6ae-4944-b89b-80817ef6c827_1600x1282.png 848w, https://substackcdn.com/image/fetch/$s_!ySqR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01ea95e6-c6ae-4944-b89b-80817ef6c827_1600x1282.png 1272w, https://substackcdn.com/image/fetch/$s_!ySqR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01ea95e6-c6ae-4944-b89b-80817ef6c827_1600x1282.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ol><li><p>Client sends a <code>GET /company-docs/report.pdf</code> request.</p></li><li><p>API service calls IAM to verify READ permission.</p></li><li><p>API service queries the metadata store: <em>&#8220;What is the UUID for the object named report.pdf in the bucket company-docs?&#8221;</em></p></li><li><p>API service then uses that UUID to fetch the actual bytes from the data store.</p></li><li><p>The bytes get returned to the client.</p></li></ol><p>Notice that the data store never knows the file was called <code>report.pdf</code>.</p><p>From its perspective, someone asked for the object with a specific UUID, and it returned the bytes. The translation from name to UUID always happens in the metadata store.</p><div><hr></div><p><em><strong>Reminder: this is a teaser of the subscriber-only newsletter series, exclusive to my golden members.</strong></em></p><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>High-level architecture of real-world systems.</strong></p></li><li><p>Deep dive into how popular real-world systems actually work.</p></li><li><p><strong>How real-world systems handle scale, reliability, and performance.</strong></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2><strong>Deep Dive: Data Store</strong></h2><p>The data store is where most of the interesting engineering happens&#8230;</p><p>Let&#8217;s break down its internal architecture:</p><h3><strong>Internal Components</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5Oiu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5faa566c-99a2-4140-8087-253c6a27e6ca_1600x765.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5Oiu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5faa566c-99a2-4140-8087-253c6a27e6ca_1600x765.png 424w, https://substackcdn.com/image/fetch/$s_!5Oiu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5faa566c-99a2-4140-8087-253c6a27e6ca_1600x765.png 848w, https://substackcdn.com/image/fetch/$s_!5Oiu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5faa566c-99a2-4140-8087-253c6a27e6ca_1600x765.png 1272w, https://substackcdn.com/image/fetch/$s_!5Oiu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5faa566c-99a2-4140-8087-253c6a27e6ca_1600x765.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5Oiu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5faa566c-99a2-4140-8087-253c6a27e6ca_1600x765.png" width="1456" height="696" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5faa566c-99a2-4140-8087-253c6a27e6ca_1600x765.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:696,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5Oiu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5faa566c-99a2-4140-8087-253c6a27e6ca_1600x765.png 424w, https://substackcdn.com/image/fetch/$s_!5Oiu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5faa566c-99a2-4140-8087-253c6a27e6ca_1600x765.png 848w, https://substackcdn.com/image/fetch/$s_!5Oiu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5faa566c-99a2-4140-8087-253c6a27e6ca_1600x765.png 1272w, https://substackcdn.com/image/fetch/$s_!5Oiu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5faa566c-99a2-4140-8087-253c6a27e6ca_1600x765.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Data Routing Service</strong></h4><p>The data routing service is the entry point into the data store&#8230;</p><p>It&#8217;s stateless, meaning it holds no state itself, so you can scale it horizontally by adding more instances. When a <em>write</em> comes in, the placement service determines which data node should receive the data, then sends the data there.</p><p>When a <em>read</em> comes in, it asks the placement service where the data lives, then fetches it.</p><h4><strong>Placement Service</strong></h4><p>The placement service<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a> is responsible for knowing the physical layout of the entire storage cluster&#8230;</p><p>It maintains a virtual cluster map<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-10" href="#footnote-10" target="_self">10</a>, which is essentially a registry of every data node, including the rack and availability zone it&#8217;s in, how many disks it has, and how much space is used on each disk.</p><p>Placement service continuously receives heartbeat messages from every data node.</p><p>A heartbeat is a small message a node sends every few seconds, saying, <em>&#8220;I&#8217;m alive, here&#8217;s my current state.&#8221;</em> If the placement service doesn&#8217;t hear from a node within a configurable grace period (typically 15 seconds), it marks that node as down and stops sending new data to it.</p><p>Because the placement service is so critical, you run it as a cluster of 5 or 7 nodes using a consensus algorithm<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-11" href="#footnote-11" target="_self">11</a> like Raft or Paxos.</p><p>A consensus algorithm ensures the cluster agrees on a single consistent view of the world, even if some nodes fail. With a 7-node cluster, you can lose 3 nodes simultaneously, and the service keeps running. With a 5-node cluster, you can lose 2. You never run this as a single instance, because if it goes down, the entire storage cluster becomes unavailable for writes.</p><h4><strong>Data Nodes</strong></h4><p>Data nodes are where the actual bytes live&#8230;</p><p>Each data node manages one or more physical disks. Each node runs a daemon process that sends heartbeats to the placement service with information about disk count and available space. When the placement service receives a heartbeat from a new node it hasn&#8217;t seen before, it assigns that node an ID, adds it to the virtual cluster map, and tells the node where to replicate data.</p><p>Durability is the central promise of object storage and the hardest engineering problem in this design.</p><p>We'll get there.</p><p>But before we can talk about how data survives hardware failures and entire node outages, we need to understand exactly how a write moves through this system.</p><p>The decisions made in the next few steps determine whether durability can be guaranteed&#8230;</p>
      <p>
          <a href="https://newsletter.systemdesign.one/p/aws-s3-system-design">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The 53 Concepts for Highly Effective Mobile System Design]]></title><description><![CDATA[#135: Part 1 - Client-Server Architecture, Push Notifications, Offline-First, and 16 others.]]></description><link>https://newsletter.systemdesign.one/p/mobile-system-design</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/mobile-system-design</guid><dc:creator><![CDATA[Shefali Jangid]]></dc:creator><pubDate>Wed, 25 Mar 2026 15:28:41 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/6c0254e3-929f-4131-9c21-1ba491cc0bb3_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/mobile-system-design/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><div><hr></div><p>Since people are always asking me for more &#8220;practical&#8221; system design content:</p><p>Following is the free edition of a <em>premium</em> 3-part newsletter series... This covers how mobile apps work under the hood. And it&#8217;s explained in a simple, realistic, and useful way.</p><p>On with part 1 of the newsletter:</p><div><hr></div><h3><strong><a href="https://blog.sentry.io/monitoring-microservices-distributed-systems-with-sentry/?utm_source=systemdesign&amp;utm_medium=paid-community&amp;utm_campaign=debugging-fy27q1-microservices&amp;utm_content=newsletter-primary-microservices-blog-learnmore">Trying to debug a request that touched 5 services? (Partner)</a></strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://blog.sentry.io/monitoring-microservices-distributed-systems-with-sentry/?utm_source=systemdesign&amp;utm_medium=paid-community&amp;utm_campaign=debugging-fy27q1-microservices&amp;utm_content=newsletter-primary-microservices-blog-learnmore" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_dcB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44a91a4e-b8d3-4c6c-84f3-4acab1b4eddf_2520x945.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_dcB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44a91a4e-b8d3-4c6c-84f3-4acab1b4eddf_2520x945.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_dcB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44a91a4e-b8d3-4c6c-84f3-4acab1b4eddf_2520x945.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_dcB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44a91a4e-b8d3-4c6c-84f3-4acab1b4eddf_2520x945.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_dcB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44a91a4e-b8d3-4c6c-84f3-4acab1b4eddf_2520x945.jpeg" width="1456" height="546" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/44a91a4e-b8d3-4c6c-84f3-4acab1b4eddf_2520x945.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:546,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:181803,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:&quot;https://blog.sentry.io/monitoring-microservices-distributed-systems-with-sentry/?utm_source=systemdesign&amp;utm_medium=paid-community&amp;utm_campaign=debugging-fy27q1-microservices&amp;utm_content=newsletter-primary-microservices-blog-learnmore&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/191673121?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44a91a4e-b8d3-4c6c-84f3-4acab1b4eddf_2520x945.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_dcB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44a91a4e-b8d3-4c6c-84f3-4acab1b4eddf_2520x945.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_dcB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44a91a4e-b8d3-4c6c-84f3-4acab1b4eddf_2520x945.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_dcB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44a91a4e-b8d3-4c6c-84f3-4acab1b4eddf_2520x945.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_dcB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44a91a4e-b8d3-4c6c-84f3-4acab1b4eddf_2520x945.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Distributed systems help you move faster&#8230;until you have to debug them.</p><p>This <a href="https://blog.sentry.io/monitoring-microservices-distributed-systems-with-sentry/?utm_source=systemdesign&amp;utm_medium=paid-community&amp;utm_campaign=debugging-fy27q1-microservices&amp;utm_content=newsletter-primary-microservices-blog-learnmore">blog</a> shows how to use Sentry tracing and logging to follow a request end-to-end. You don&#8217;t need prior microservices experience to follow these steps.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.sentry.io/monitoring-microservices-distributed-systems-with-sentry/?utm_source=systemdesign&amp;utm_medium=paid-community&amp;utm_campaign=debugging-fy27q1-microservices&amp;utm_content=newsletter-primary-microservices-blog-learnmore&quot;,&quot;text&quot;:&quot;Read the blog and start fixing&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://blog.sentry.io/monitoring-microservices-distributed-systems-with-sentry/?utm_source=systemdesign&amp;utm_medium=paid-community&amp;utm_campaign=debugging-fy27q1-microservices&amp;utm_content=newsletter-primary-microservices-blog-learnmore"><span>Read the blog and start fixing</span></a></p><p>(Thanks to <a href="https://blog.sentry.io/monitoring-microservices-distributed-systems-with-sentry/?utm_source=systemdesign&amp;utm_medium=paid-community&amp;utm_campaign=debugging-fy27q1-microservices&amp;utm_content=newsletter-primary-microservices-blog-learnmore">Sentry</a> for partnering on this post.)</p><div><hr></div><p>I want to reintroduce <a href="https://x.com/Shefali__J">Shefali Jangid</a> as a guest author.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://x.com/Shefali__J" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9raK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd58707-3a96-4850-a5f0-6e3bd8429d06_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!9raK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd58707-3a96-4850-a5f0-6e3bd8429d06_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!9raK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd58707-3a96-4850-a5f0-6e3bd8429d06_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!9raK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd58707-3a96-4850-a5f0-6e3bd8429d06_1200x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9raK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd58707-3a96-4850-a5f0-6e3bd8429d06_1200x630.png" width="1200" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/acd58707-3a96-4850-a5f0-6e3bd8429d06_1200x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:152918,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://x.com/Shefali__J&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/178020218?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd58707-3a96-4850-a5f0-6e3bd8429d06_1200x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!9raK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd58707-3a96-4850-a5f0-6e3bd8429d06_1200x630.png 424w, https://substackcdn.com/image/fetch/$s_!9raK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd58707-3a96-4850-a5f0-6e3bd8429d06_1200x630.png 848w, https://substackcdn.com/image/fetch/$s_!9raK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd58707-3a96-4850-a5f0-6e3bd8429d06_1200x630.png 1272w, https://substackcdn.com/image/fetch/$s_!9raK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd58707-3a96-4850-a5f0-6e3bd8429d06_1200x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>She&#8217;s a web developer, technical writer, and content creator with a love for frontend architecture and building things that scale.</p><p>Check out her work and socials:</p><ul><li><p><a href="https://shefali.dev/">Shefali.dev</a></p></li><li><p><a href="https://github.com/WebdevShefali">GitHub</a></p></li><li><p><a href="https://x.com/Shefali__J">Twitter</a></p></li></ul><p>You&#8217;ll often find her writing about web development, sharing UI tips, and building tools that make developers&#8217; lives easier.</p><div><hr></div><p>Building &#8220;scalable mobile apps&#8221; is not just about writing frontend code and calling APIs&#8230;</p><p>It&#8217;s actually one of the toughest parts of software engineering!</p><p>Mobile apps don&#8217;t run in ideal environments. The network can go down, batteries run low, the operating system can close your app in the background, devices vary, and users still expect everything to work instantly, even offline.</p><p>When building mobile apps, you always have to think about limits. Every decision affects speed, stability, data safety, security, and user experience simultaneously. And making the right choices turns a simple app into a strong, production-ready package.</p><p>In this newsletter, we&#8217;ll focus on 19 concepts as the foundation:</p><ol><li><p>Client-Server Architecture,</p></li><li><p>WebSockets &amp; Persistent Connections,</p></li><li><p>Push Notifications,</p></li><li><p>Polling, Long Polling &amp; SSE,</p></li><li><p>REST vs GraphQL vs gRPC,</p></li><li><p>Network Resilience,</p></li><li><p>Idempotency in APIs,</p></li><li><p>Request Batching &amp; Payload Optimisation,</p></li><li><p>Resumable Uploads,</p></li><li><p>Handling Intermittent Connectivity,</p></li><li><p>On-Device Caching,</p></li><li><p>HTTP Caching (ETag, Cache-Control),</p></li><li><p>Offline-First Architecture,</p></li><li><p>CDN Strategy &amp; Media Optimisation,</p></li><li><p>Cache Invalidation Strategies,</p></li><li><p>Local Database Design,</p></li><li><p>Schema Migration Strategy,</p></li><li><p>Pagination (Cursor vs Offset vs Page Number),</p></li><li><p>Data Modeling for Mobile Constraints.</p></li></ol><p>(&#8230;and much more in parts 2 &amp; 3!)</p><p>For each concept, I&#8217;ll cover:</p><ul><li><p>What it is and how it works</p></li><li><p>Real-world example</p></li><li><p>The tradeoffs</p></li><li><p>Why it matters for mobile</p></li></ul><p>But before we get into the concepts, let&#8217;s first understand what mobile system design really means&#8230;</p><h4><strong>What is Mobile System Design?</strong></h4><p>Mobile system design is about planning how a mobile app works in the background.</p><p>It includes how the app connects to servers, saves and reads data, stays fast, keeps user data safe, and handles errors.</p><p>It&#8217;s not just about designing screens. It&#8217;s about understanding how data flows within the app, what happens when the network goes down, how the app runs in the background, and how it works across different devices.</p><p>Now that we understand what mobile system design means, let&#8217;s start with one of the most important parts: <em>how your app connects and communicates with the outside world.</em></p><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the full premium newsletter series and max your system design career leverage:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h1><strong>Networking &amp; Real-Time Communication</strong></h1><p>This section explains how a mobile app connects to servers and handles real-time data&#8230;</p><p>You&#8217;ll learn how apps send and receive data, choose the right APIs, and handle slow or unstable networks so the app stays fast and reliable.</p><h3><strong>1. Client-Server Architecture (Thick vs Thin Clients)</strong></h3><p>When building a mobile app, an important question to ask is: <em>how much work should the phone handle, and how much should the server do?</em></p><p>There are two main approaches:</p><p><strong>a. Thick Client</strong></p><p>A thick client means the app does most of the work on the device.</p><p>It can:</p><ul><li><p>Save data on the phone</p></li><li><p>Handle some logic by itself</p></li><li><p>Sync data later in the background</p></li><li><p>Work without network connectivity</p></li></ul><p>Most modern apps use this approach because:</p><ul><li><p>The network is not always stable</p></li><li><p>Phones are powerful now</p></li><li><p>Users expect apps to work offline</p></li></ul><p><strong>b. Thin Client</strong></p><p>A thin client means the app does very little on the device.</p><p>It:</p><ul><li><p>Sends requests to the server</p></li><li><p>Waits for the server to process everything</p></li><li><p>Shows whatever the server returns</p></li></ul><p>If the network fails, the app mostly stops working.</p><h4><strong>Why It Matters</strong></h4><p>This decision affects your entire app design:</p><ul><li><p>Whether your app can work offline</p></li><li><p>How difficult syncing will be</p></li><li><p>How easily you can update features</p></li><li><p>How much important logic stays on the server</p></li></ul><h4><strong>Real-World Example</strong></h4><ul><li><p>Gmail is a &#8220;thick&#8221; client. You can read emails and write drafts even without network connectivity. It syncs later.</p></li><li><p>A simple app that only loads a website is a thin client. It&#8217;s useless without the network.</p></li></ul><h4><strong>Trade-offs</strong></h4><ul><li><p><strong>Thick client:</strong> Faster and works offline, but syncing and updates can be harder.</p></li><li><p><strong>Thin client:</strong> Easy to update and simpler to build, but it doesn&#8217;t work well with a bad network.</p></li></ul><h4><strong>Practical Advice</strong></h4><p>For most apps:</p><ul><li><p>Keep the main app logic thick (use local storage and background sync).</p></li><li><p>Keep sensitive logic on the server (like payments, login, pricing, fraud detection).</p></li></ul><p>This gives you the best balance between speed, reliability, and security.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-Iek!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0be310ca-6d34-4958-ac62-e0634722213b_1600x716.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-Iek!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0be310ca-6d34-4958-ac62-e0634722213b_1600x716.png 424w, https://substackcdn.com/image/fetch/$s_!-Iek!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0be310ca-6d34-4958-ac62-e0634722213b_1600x716.png 848w, https://substackcdn.com/image/fetch/$s_!-Iek!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0be310ca-6d34-4958-ac62-e0634722213b_1600x716.png 1272w, https://substackcdn.com/image/fetch/$s_!-Iek!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0be310ca-6d34-4958-ac62-e0634722213b_1600x716.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-Iek!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0be310ca-6d34-4958-ac62-e0634722213b_1600x716.png" width="1456" height="652" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0be310ca-6d34-4958-ac62-e0634722213b_1600x716.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:652,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-Iek!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0be310ca-6d34-4958-ac62-e0634722213b_1600x716.png 424w, https://substackcdn.com/image/fetch/$s_!-Iek!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0be310ca-6d34-4958-ac62-e0634722213b_1600x716.png 848w, https://substackcdn.com/image/fetch/$s_!-Iek!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0be310ca-6d34-4958-ac62-e0634722213b_1600x716.png 1272w, https://substackcdn.com/image/fetch/$s_!-Iek!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0be310ca-6d34-4958-ac62-e0634722213b_1600x716.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Once you decide what the app and the server should handle, the next question to ask is: <em>How they should communicate, especially when you need real-time updates?</em></p><h3><strong>2. WebSockets &amp; Persistent Connections</strong></h3><p>Usually, mobile apps use HTTP requests to talk to a server.</p><p>That means:</p><ol><li><p>App asks for data</p></li><li><p>Server sends a response</p></li><li><p>Connection closes</p></li></ol><p>If the app needs new data, it has to ask again.</p><p><em>WebSocket</em> <em>works differently&#8230;</em></p><p>It creates a long-lasting connection between the app and the server.</p><p>Once connected:</p><ul><li><p>App can send data anytime</p></li><li><p>Server can also send data anytime</p></li><li><p>Connection stays open</p></li></ul><p>This is called <strong>real-time communication</strong>.</p><h4><strong>Why It Matters</strong></h4><p>With normal HTTP, the app must keep asking:</p><ul><li><p><em>&#8220;Any new messages?&#8221;</em></p></li><li><p><em>&#8220;Any updates?&#8221;</em></p></li></ul><p>With WebSockets, the server sends updates immediately when something happens.</p><p>This is very important for:</p><ul><li><p>Chat apps</p></li><li><p>Live sports scores</p></li><li><p>Stock price updates</p></li><li><p>Multiplayer games</p></li><li><p>Collaborative tools</p></li></ul><p>If updates are slow, users feel like the app is broken.</p><h4><strong>Real-World Example</strong></h4><p>Apps like Slack, WhatsApp, and Figma use WebSockets.</p><p>When someone sends a message, you see it instantly because the server pushes it directly to your app.</p><h4><strong>Important Things to Remember (For Mobile)</strong></h4><p>On mobile devices, keeping a connection open all the time can drain the battery, use more data, and cause the operating system to close it.</p><p>Instead:</p><ul><li><p>Send small &#8216;heartbeat&#8217; signals to keep it alive</p></li><li><p>Reconnect if the network drops</p></li><li><p>Close the connection when the app goes to the background</p></li></ul><p>Otherwise, the connection may appear active but stop working.</p><h4><strong>How It Works</strong></h4><ol><li><p>App starts with a normal HTTP request.</p></li><li><p>Server upgrades it to a WebSocket connection.</p></li><li><p>After that, both sides can send messages at any time.</p></li><li><p>Connection stays open until one side closes it.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5DET!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f987c94-15ac-4024-9396-64f7a879cdad_1600x706.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5DET!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f987c94-15ac-4024-9396-64f7a879cdad_1600x706.png 424w, https://substackcdn.com/image/fetch/$s_!5DET!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f987c94-15ac-4024-9396-64f7a879cdad_1600x706.png 848w, https://substackcdn.com/image/fetch/$s_!5DET!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f987c94-15ac-4024-9396-64f7a879cdad_1600x706.png 1272w, https://substackcdn.com/image/fetch/$s_!5DET!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f987c94-15ac-4024-9396-64f7a879cdad_1600x706.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5DET!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f987c94-15ac-4024-9396-64f7a879cdad_1600x706.png" width="1456" height="642" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2f987c94-15ac-4024-9396-64f7a879cdad_1600x706.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:642,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5DET!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f987c94-15ac-4024-9396-64f7a879cdad_1600x706.png 424w, https://substackcdn.com/image/fetch/$s_!5DET!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f987c94-15ac-4024-9396-64f7a879cdad_1600x706.png 848w, https://substackcdn.com/image/fetch/$s_!5DET!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f987c94-15ac-4024-9396-64f7a879cdad_1600x706.png 1272w, https://substackcdn.com/image/fetch/$s_!5DET!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f987c94-15ac-4024-9396-64f7a879cdad_1600x706.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Trade-offs</strong></h4><p><strong>Pros</strong></p><ul><li><p>Very fast (low latency)</p></li><li><p>Instant updates</p></li><li><p>Two-way communication</p></li></ul><p><strong>Cons</strong></p><ul><li><p>Harder to manage</p></li><li><p>Needs reconnect logic</p></li><li><p>Can affect the battery if not handled properly</p></li></ul><p>In simple words, use WebSockets when your app needs instant updates. Just manage the connection carefully on mobile devices.</p><p>But not all updates require a constant open connection.</p><p><em>What if the app is completely closed?</em></p><p>That&#8217;s when push notifications become useful&#8230;</p><h3><strong>3. Push Notifications (APNs &amp; FCM)</strong></h3><p>Push notifications let your app send updates, such as <em>&#8220;Your order has been picked up&#8221;, &#8220;You got a new message&#8221;, &#8220;Your payment was successful&#8221;,</em> to users even when the app is closed.</p><p>But here&#8217;s the important part: Your app does not stay connected to your server all the time.</p><p>Instead, special services handle this for you:</p><ul><li><p><a href="https://en.wikipedia.org/wiki/Apple_Push_Notification_service">APNs (Apple Push Notification Service)</a> for iPhones</p></li><li><p><a href="https://firebase.google.com/docs/cloud-messaging">FCM (Firebase Cloud Messaging)</a> for Android</p></li></ul><p>These services act like messengers between your server and the user&#8217;s phone.</p><h4><strong>How It Works</strong></h4><ol><li><p>Something happens on the server (e.g., a new message).</p></li><li><p>Server sends the notification to APNs (iOS) or FCM (Android).</p></li><li><p>APNs/FCM deliver it to the user&#8217;s device.</p></li><li><p>The mobile phone&#8217;s operating system shows the notification.</p></li><li><p>If needed, the app wakes up in the background.</p></li></ol><p>Your app does NOT keep a permanent connection.</p><p>The phone&#8217;s operating system manages a &#8220;single&#8221; shared connection for all apps to conserve battery.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!G0e8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0af50997-a93d-4bdf-bf8c-eacb6bb6ee82_1600x781.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!G0e8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0af50997-a93d-4bdf-bf8c-eacb6bb6ee82_1600x781.png 424w, https://substackcdn.com/image/fetch/$s_!G0e8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0af50997-a93d-4bdf-bf8c-eacb6bb6ee82_1600x781.png 848w, https://substackcdn.com/image/fetch/$s_!G0e8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0af50997-a93d-4bdf-bf8c-eacb6bb6ee82_1600x781.png 1272w, https://substackcdn.com/image/fetch/$s_!G0e8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0af50997-a93d-4bdf-bf8c-eacb6bb6ee82_1600x781.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!G0e8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0af50997-a93d-4bdf-bf8c-eacb6bb6ee82_1600x781.png" width="1456" height="711" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0af50997-a93d-4bdf-bf8c-eacb6bb6ee82_1600x781.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:711,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!G0e8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0af50997-a93d-4bdf-bf8c-eacb6bb6ee82_1600x781.png 424w, https://substackcdn.com/image/fetch/$s_!G0e8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0af50997-a93d-4bdf-bf8c-eacb6bb6ee82_1600x781.png 848w, https://substackcdn.com/image/fetch/$s_!G0e8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0af50997-a93d-4bdf-bf8c-eacb6bb6ee82_1600x781.png 1272w, https://substackcdn.com/image/fetch/$s_!G0e8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0af50997-a93d-4bdf-bf8c-eacb6bb6ee82_1600x781.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Why It Matters</strong></h4><p>Without push notifications, your app would need to stay connected all the time and keep checking for new updates. Neither iOS nor Android allows this because it drains battery.</p><p>Push services solve this problem in a battery-efficient way.</p><h4><strong>Real-World Example</strong></h4><p>Imagine a food delivery app. When your order is picked up:</p><ol><li><p>The backend sends a push notification using FCM or APNs.</p></li><li><p>Your phone receives it instantly.</p></li><li><p>Even if the app is fully closed, you still get the update.</p></li></ol><h4><strong>Trade-offs</strong></h4><p><strong>Pros</strong></p><ul><li><p>Saves battery</p></li><li><p>Works even if the app is closed</p></li><li><p>Delivers updates quickly</p></li><li><p>No need for a constant background connection</p></li></ul><p><strong>Cons</strong></p><ul><li><p>Delivery is not 100% guaranteed</p></li><li><p>Can fail on a poor network</p></li><li><p>Limited message size</p></li><li><p>Should not be used for very critical data</p></li></ul><p>You should never assume the user received the notification. Always refresh or sync data when the app opens.</p><h4><strong>Pro Tip</strong></h4><p>You can send a <strong>silent push notification</strong> (background push).</p><p>This shows nothing to the user. It simply wakes the app in the background to fetch fresh data. So when the user opens the app, everything updates without a loading spinner.</p><p>Push notifications are useful for sending updates to users even when the app is closed. But in many cases, the app itself needs to fetch updates from the server while it&#8217;s running.</p><p>In such situations, use techniques such as polling, long polling, or Server-Sent Events.</p><h3><strong>4. Polling, Long Polling &amp; SSE</strong></h3><p>Sometimes apps need server updates, but using WebSockets isn&#8217;t always necessary or practical. In those cases, use simpler methods such as polling, Long Polling, or Server-Sent Events (<strong>SSE</strong>) to receive updates from the server.</p><p>Each method works a little differently.</p><h4><strong>Why It Matters</strong></h4><p>Not every feature needs instant real-time updates:</p><ul><li><p>Metrics dashboard might refresh every 30 seconds.</p></li><li><p>Notification count might update every few seconds.</p></li></ul><p>Using WebSockets for these cases can be unnecessary and complex. Choosing the right approach can save server resources and keep the system simple.</p><p><strong>a. Polling (Short Polling)</strong></p><p>Polling is the simplest method.</p><p>The app requests updates from the server at fixed intervals:</p><ol><li><p>Every 10 seconds, the app sends a request.</p></li><li><p>The server responds with new data (if any).</p></li></ol><p>So the process looks like this:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dyjt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd3aa39-f851-4348-97a6-7e3a90216265_1600x406.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dyjt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd3aa39-f851-4348-97a6-7e3a90216265_1600x406.png 424w, https://substackcdn.com/image/fetch/$s_!dyjt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd3aa39-f851-4348-97a6-7e3a90216265_1600x406.png 848w, https://substackcdn.com/image/fetch/$s_!dyjt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd3aa39-f851-4348-97a6-7e3a90216265_1600x406.png 1272w, https://substackcdn.com/image/fetch/$s_!dyjt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd3aa39-f851-4348-97a6-7e3a90216265_1600x406.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dyjt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd3aa39-f851-4348-97a6-7e3a90216265_1600x406.png" width="1456" height="369" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/efd3aa39-f851-4348-97a6-7e3a90216265_1600x406.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:369,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dyjt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd3aa39-f851-4348-97a6-7e3a90216265_1600x406.png 424w, https://substackcdn.com/image/fetch/$s_!dyjt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd3aa39-f851-4348-97a6-7e3a90216265_1600x406.png 848w, https://substackcdn.com/image/fetch/$s_!dyjt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd3aa39-f851-4348-97a6-7e3a90216265_1600x406.png 1272w, https://substackcdn.com/image/fetch/$s_!dyjt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd3aa39-f851-4348-97a6-7e3a90216265_1600x406.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Problem:</strong></p><p>Most of the time, there's no new data, yet the app keeps asking the server. This wastes network and server resources.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k5p9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc571bff7-e3b3-4f8f-a376-7608d59bf66a_1094x641.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k5p9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc571bff7-e3b3-4f8f-a376-7608d59bf66a_1094x641.png 424w, https://substackcdn.com/image/fetch/$s_!k5p9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc571bff7-e3b3-4f8f-a376-7608d59bf66a_1094x641.png 848w, https://substackcdn.com/image/fetch/$s_!k5p9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc571bff7-e3b3-4f8f-a376-7608d59bf66a_1094x641.png 1272w, https://substackcdn.com/image/fetch/$s_!k5p9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc571bff7-e3b3-4f8f-a376-7608d59bf66a_1094x641.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k5p9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc571bff7-e3b3-4f8f-a376-7608d59bf66a_1094x641.png" width="1094" height="641" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c571bff7-e3b3-4f8f-a376-7608d59bf66a_1094x641.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:641,&quot;width&quot;:1094,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!k5p9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc571bff7-e3b3-4f8f-a376-7608d59bf66a_1094x641.png 424w, https://substackcdn.com/image/fetch/$s_!k5p9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc571bff7-e3b3-4f8f-a376-7608d59bf66a_1094x641.png 848w, https://substackcdn.com/image/fetch/$s_!k5p9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc571bff7-e3b3-4f8f-a376-7608d59bf66a_1094x641.png 1272w, https://substackcdn.com/image/fetch/$s_!k5p9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc571bff7-e3b3-4f8f-a376-7608d59bf66a_1094x641.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>b. Long Polling</strong></p><p>Long polling improves on basic polling.</p><p>Instead of responding immediately, the server keeps the request open until new data is available.</p><p>How it works:</p><ol><li><p>App sends a request.</p></li><li><p>Server waits until something changes.</p></li><li><p>When new data arrives, the server sends the response.</p></li><li><p>The app immediately sends another request.</p></li></ol><p>This reduces unnecessary requests compared to short polling.</p><p>Yet it still requires managing many open connections on the server.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZJ5Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2847992c-09d8-4d49-9a9b-e32b07c95e38_1404x661.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZJ5Y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2847992c-09d8-4d49-9a9b-e32b07c95e38_1404x661.png 424w, https://substackcdn.com/image/fetch/$s_!ZJ5Y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2847992c-09d8-4d49-9a9b-e32b07c95e38_1404x661.png 848w, https://substackcdn.com/image/fetch/$s_!ZJ5Y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2847992c-09d8-4d49-9a9b-e32b07c95e38_1404x661.png 1272w, https://substackcdn.com/image/fetch/$s_!ZJ5Y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2847992c-09d8-4d49-9a9b-e32b07c95e38_1404x661.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZJ5Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2847992c-09d8-4d49-9a9b-e32b07c95e38_1404x661.png" width="1404" height="661" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2847992c-09d8-4d49-9a9b-e32b07c95e38_1404x661.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:661,&quot;width&quot;:1404,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZJ5Y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2847992c-09d8-4d49-9a9b-e32b07c95e38_1404x661.png 424w, https://substackcdn.com/image/fetch/$s_!ZJ5Y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2847992c-09d8-4d49-9a9b-e32b07c95e38_1404x661.png 848w, https://substackcdn.com/image/fetch/$s_!ZJ5Y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2847992c-09d8-4d49-9a9b-e32b07c95e38_1404x661.png 1272w, https://substackcdn.com/image/fetch/$s_!ZJ5Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2847992c-09d8-4d49-9a9b-e32b07c95e38_1404x661.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>c. Server-Sent Events (SSE)</strong></p><p>SSE allows the server to push updates to the client over a single open HTTP connection.</p><p>How it works:</p><ol><li><p>Client sends a request.</p></li><li><p>Connection stays open.</p></li><li><p>Server sends updates whenever something happens.</p></li></ol><p>The client keeps listening for these updates.</p><p>SSE also has automatic reconnection built into browsers through the <a href="https://developer.mozilla.org/en-US/docs/Web/API/EventSource">EventSource API</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y2ru!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf6b5441-56db-461c-83cd-c59b0f82aa93_1600x681.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y2ru!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf6b5441-56db-461c-83cd-c59b0f82aa93_1600x681.png 424w, https://substackcdn.com/image/fetch/$s_!Y2ru!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf6b5441-56db-461c-83cd-c59b0f82aa93_1600x681.png 848w, https://substackcdn.com/image/fetch/$s_!Y2ru!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf6b5441-56db-461c-83cd-c59b0f82aa93_1600x681.png 1272w, https://substackcdn.com/image/fetch/$s_!Y2ru!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf6b5441-56db-461c-83cd-c59b0f82aa93_1600x681.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y2ru!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf6b5441-56db-461c-83cd-c59b0f82aa93_1600x681.png" width="1456" height="620" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/af6b5441-56db-461c-83cd-c59b0f82aa93_1600x681.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:620,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y2ru!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf6b5441-56db-461c-83cd-c59b0f82aa93_1600x681.png 424w, https://substackcdn.com/image/fetch/$s_!Y2ru!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf6b5441-56db-461c-83cd-c59b0f82aa93_1600x681.png 848w, https://substackcdn.com/image/fetch/$s_!Y2ru!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf6b5441-56db-461c-83cd-c59b0f82aa93_1600x681.png 1272w, https://substackcdn.com/image/fetch/$s_!Y2ru!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf6b5441-56db-461c-83cd-c59b0f82aa93_1600x681.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Real-World Example</strong></h4><ul><li><p>GitHub uses SSE to stream live action logs.</p></li><li><p>Many analytics dashboards use polling every few seconds.</p></li><li><p>Early versions of Twitter used long polling for real-time updates.</p></li></ul><h4><strong>Trade-offs</strong></h4><ul><li><p>Short Polling: Simple to implement, but wastes resources if there are no updates.</p></li><li><p>Long Polling: More efficient than polling, but harder to manage on the server.</p></li><li><p>SSE: Efficient for real-time updates and simple compared to WebSockets, but communication is one-way (server &#8594; client only).</p></li></ul><p><strong>The bottom line:</strong></p><ul><li><p><strong>Polling:</strong> Client repeatedly asks for updates.</p></li><li><p><strong>Long Polling:</strong> Server waits until new data appears.</p></li><li><p><strong>SSE:</strong> Server pushes updates over a single open connection.</p></li></ul><p>These approaches are useful when WebSockets are unnecessary or unavailable, which helps you keep the system simple and efficient.</p><p>Once you understand how apps receive updates from the server, the next step is deciding how the app requests data from the server.</p><p>This is where API design becomes important.</p><p>APIs define how the mobile app and server communicate, what data is requested, and how that data is returned.</p><h3><strong>5. REST vs GraphQL vs gRPC</strong></h3><p>There are different ways to design APIs.</p><p>Three common approaches are <em>REST, GraphQL, and gRPC</em>. Each one has its own advantages depending on the type of app you&#8217;re building.</p><h4><strong>REST</strong></h4><p>REST (Representational State Transfer) is the most common and widely used API style:</p><ol><li><p>App calls specific endpoints like <code>/users, /orders, /products</code>.</p></li><li><p>Server returns predefined data.</p></li></ol><p>It is simple and easy to understand, which is why many mobile apps use it.</p><p>But REST can sometimes return too much or too little data.</p><p>For example, a screen needs only the user&#8217;s name and profile photo, but the API returns email, address, preferences, and more. This is called <strong>over-fetching</strong>.</p><p>Sometimes the opposite happens: the app needs multiple API requests to load one screen. This is called <strong>under-fetching</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zVqQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd964e467-df7d-4b63-b1b1-90c1b78c4eed_1344x459.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zVqQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd964e467-df7d-4b63-b1b1-90c1b78c4eed_1344x459.png 424w, https://substackcdn.com/image/fetch/$s_!zVqQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd964e467-df7d-4b63-b1b1-90c1b78c4eed_1344x459.png 848w, https://substackcdn.com/image/fetch/$s_!zVqQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd964e467-df7d-4b63-b1b1-90c1b78c4eed_1344x459.png 1272w, https://substackcdn.com/image/fetch/$s_!zVqQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd964e467-df7d-4b63-b1b1-90c1b78c4eed_1344x459.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zVqQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd964e467-df7d-4b63-b1b1-90c1b78c4eed_1344x459.png" width="1344" height="459" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d964e467-df7d-4b63-b1b1-90c1b78c4eed_1344x459.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:459,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zVqQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd964e467-df7d-4b63-b1b1-90c1b78c4eed_1344x459.png 424w, https://substackcdn.com/image/fetch/$s_!zVqQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd964e467-df7d-4b63-b1b1-90c1b78c4eed_1344x459.png 848w, https://substackcdn.com/image/fetch/$s_!zVqQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd964e467-df7d-4b63-b1b1-90c1b78c4eed_1344x459.png 1272w, https://substackcdn.com/image/fetch/$s_!zVqQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd964e467-df7d-4b63-b1b1-90c1b78c4eed_1344x459.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>GraphQL</strong></h4><p>GraphQL was designed to address the problems of over-fetching and under-fetching.</p><p>With GraphQL:</p><ol><li><p>Client asks for exactly the data it needs.</p></li><li><p>Server returns only that data.</p></li></ol><p>Example request:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;6fe40942-3070-44e3-bff0-d2742e7a0486&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">GetUser {
  name
  profilePhoto
}</code></pre></div><p>This makes GraphQL very useful for complex mobile interfaces where different screens need different types of data. Yet GraphQL can be harder to cache and secure compared to REST.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jSpL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f14610-031e-4746-af5c-e674daa7a769_1464x450.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jSpL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f14610-031e-4746-af5c-e674daa7a769_1464x450.png 424w, https://substackcdn.com/image/fetch/$s_!jSpL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f14610-031e-4746-af5c-e674daa7a769_1464x450.png 848w, https://substackcdn.com/image/fetch/$s_!jSpL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f14610-031e-4746-af5c-e674daa7a769_1464x450.png 1272w, https://substackcdn.com/image/fetch/$s_!jSpL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f14610-031e-4746-af5c-e674daa7a769_1464x450.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jSpL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f14610-031e-4746-af5c-e674daa7a769_1464x450.png" width="1456" height="448" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/66f14610-031e-4746-af5c-e674daa7a769_1464x450.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:448,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jSpL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f14610-031e-4746-af5c-e674daa7a769_1464x450.png 424w, https://substackcdn.com/image/fetch/$s_!jSpL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f14610-031e-4746-af5c-e674daa7a769_1464x450.png 848w, https://substackcdn.com/image/fetch/$s_!jSpL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f14610-031e-4746-af5c-e674daa7a769_1464x450.png 1272w, https://substackcdn.com/image/fetch/$s_!jSpL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66f14610-031e-4746-af5c-e674daa7a769_1464x450.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>gRPC</strong></h4><p>gRPC is another approach focused on high performance.</p><p>Instead of sending JSON data like REST or GraphQL, gRPC uses a binary format called <strong>Protocol Buffers</strong>, which is smaller and faster.</p><p>It runs over HTTP/2, which allows faster communication between services. Because of this, gRPC is often used for internal microservices, backend communication, and high-performance mobile systems.</p><p>But it requires special tooling and is less common for public APIs.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SnKU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3bfc60-15f7-4db0-82d6-c0b0fa0390df_1600x438.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SnKU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3bfc60-15f7-4db0-82d6-c0b0fa0390df_1600x438.png 424w, https://substackcdn.com/image/fetch/$s_!SnKU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3bfc60-15f7-4db0-82d6-c0b0fa0390df_1600x438.png 848w, https://substackcdn.com/image/fetch/$s_!SnKU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3bfc60-15f7-4db0-82d6-c0b0fa0390df_1600x438.png 1272w, https://substackcdn.com/image/fetch/$s_!SnKU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3bfc60-15f7-4db0-82d6-c0b0fa0390df_1600x438.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SnKU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3bfc60-15f7-4db0-82d6-c0b0fa0390df_1600x438.png" width="1456" height="399" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0b3bfc60-15f7-4db0-82d6-c0b0fa0390df_1600x438.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:399,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SnKU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3bfc60-15f7-4db0-82d6-c0b0fa0390df_1600x438.png 424w, https://substackcdn.com/image/fetch/$s_!SnKU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3bfc60-15f7-4db0-82d6-c0b0fa0390df_1600x438.png 848w, https://substackcdn.com/image/fetch/$s_!SnKU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3bfc60-15f7-4db0-82d6-c0b0fa0390df_1600x438.png 1272w, https://substackcdn.com/image/fetch/$s_!SnKU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3bfc60-15f7-4db0-82d6-c0b0fa0390df_1600x438.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Why It Matters</strong></h4><p>The API style you choose affects network speed, battery usage, number of requests, and app performance.</p><p>If an API returns too much data, it wastes bandwidth and battery. If it returns too little, the app needs more requests, which increases latency. Choosing the right approach helps keep the app fast and efficient.</p><h4><strong>Real-World Example</strong></h4><ul><li><p>Facebook created GraphQL because its mobile apps needed many REST requests to load a single screen.</p></li><li><p>GitHub moved its public API to GraphQL.</p></li><li><p>Google uses gRPC internally for inter-service communication.</p></li></ul><h4><strong>Trade-offs</strong></h4><ul><li><p>REST: Simple, widely supported, and easy to cache, but sometimes returns too much or too little data.</p></li><li><p>GraphQL:  Clients request exactly the data they need and great for complex UIs, but caching and security can be more complicated.</p></li><li><p>gRPC: Very fast, efficient, and great for internal services, but requires HTTP/2 and special tooling.</p></li></ul><h4><strong>Practical Advice</strong></h4><p>A common approach in real systems is:</p><ul><li><p>REST for simple CRUD APIs.</p></li><li><p>GraphQL for complex UI data requirements.</p></li><li><p>gRPC for internal communication between services where performance matters most.</p></li></ul><p>Once you decide how the app will request data from the server, the next step is making sure those requests still work properly when the internet connection is unstable.</p><h3><strong>6. Network Resilience (Exponential Backoff &amp; Retry Strategy)</strong></h3><p>Mobile networks can be slow, unstable, or unavailable.</p><p>As a result, requests may fail due to weak signals, timeouts, or temporary server issues. Instead of failing immediately, most apps retry the request after a short delay.</p><p>Yet retrying requests often can cause serious problems.</p><h4><strong>Why It Matters</strong></h4><p>Imagine thousands of mobile apps trying to reconnect at the same time after a server outage.</p><p>If every app retries instantly, the server suddenly receives a huge number of requests at once. This can overload the server again, causing another failure. This situation is called the <strong>thundering herd problem,</strong> when many clients try at the same time and overwhelm the system.</p><p>A good retry strategy spreads these retries over time so the server can recover&#8230;</p><h4><strong>Exponential Backoff (The Common Solution)</strong></h4><p>To avoid retrying too aggressively, apps use exponential backoff. This means the app waits longer between each retry:</p><ul><li><p>First retry &#8594; wait 1 second</p></li><li><p>Second retry &#8594; wait 2 seconds</p></li><li><p>Third retry &#8594; wait 4 seconds</p></li><li><p>Fourth retry &#8594; wait 8 seconds</p></li></ul><p>Each retry waits longer than the previous one. Thus reducing pressure on the server.</p><h4><strong>Adding Jitter</strong></h4><p>If every app waited the exact same time, they would still retry together.</p><p>To avoid that, add <strong>jitter</strong>, which is a small random delay. For example, retry after 4 seconds + random delay.</p><p>This spreads the requests over time instead of sending them all at once.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y_wh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d787a8f-d96c-4f3f-a961-e803393f16ef_1600x453.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y_wh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d787a8f-d96c-4f3f-a961-e803393f16ef_1600x453.png 424w, https://substackcdn.com/image/fetch/$s_!Y_wh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d787a8f-d96c-4f3f-a961-e803393f16ef_1600x453.png 848w, https://substackcdn.com/image/fetch/$s_!Y_wh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d787a8f-d96c-4f3f-a961-e803393f16ef_1600x453.png 1272w, https://substackcdn.com/image/fetch/$s_!Y_wh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d787a8f-d96c-4f3f-a961-e803393f16ef_1600x453.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y_wh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d787a8f-d96c-4f3f-a961-e803393f16ef_1600x453.png" width="1456" height="412" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7d787a8f-d96c-4f3f-a961-e803393f16ef_1600x453.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:412,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y_wh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d787a8f-d96c-4f3f-a961-e803393f16ef_1600x453.png 424w, https://substackcdn.com/image/fetch/$s_!Y_wh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d787a8f-d96c-4f3f-a961-e803393f16ef_1600x453.png 848w, https://substackcdn.com/image/fetch/$s_!Y_wh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d787a8f-d96c-4f3f-a961-e803393f16ef_1600x453.png 1272w, https://substackcdn.com/image/fetch/$s_!Y_wh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d787a8f-d96c-4f3f-a961-e803393f16ef_1600x453.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Real-World Example</strong></h4><p>Many popular services, such as AWS SDK, Google Cloud client libraries, and Stripe API clients, use this strategy by default. This approach is considered the standard for reliable network communication.</p><h4><strong>Trade-offs</strong></h4><p>Retry strategies must balance reliability and user experience.</p><p>If the app takes too long to retry, the user may wait too long for a response. If it retries too quickly, it might overload the server.</p><p>To handle this, many systems use a <strong>circuit breaker</strong>.</p><p>A circuit breaker stops retrying after several failures and shows a clear error message instead of retrying forever.</p><p>After adding retries to handle network failures, another challenge arises: <em>what happens if the same request is sent many times?</em></p><p>Sometimes a request reaches the server successfully, but the response never reaches the app because the network drops.</p><p>If the app retries the request, it could perform the same action again&#8230;</p><h3><strong>7. Idempotency in APIs (Safe Retries)</strong></h3><p>In mobile apps, network requests can fail halfway through.</p><p>For example:</p><ol><li><p>App sends a request to the server</p></li><li><p>Server successfully processes it</p></li><li><p>But the response never reaches the app because the internet connection drops</p></li></ol><p>From the app&#8217;s perspective, the request appears to have failed, so it retries it. If the server processes the request again, the action may happen twice.</p><p>This is where <em>idempotency</em> becomes important.</p><p>Idempotency means sending the same request many times produces the same result, rather than repeating the action.</p><h4><strong>How Idempotency Works</strong></h4><p>To make retries safe, the client sends a unique identifier with each request, called an <strong>idempotency key</strong>. This key is usually a <strong>unique ID (UUID)</strong> generated by the app.</p><p>Example request header:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;9d58a35d-7e89-4dec-89db-42f1d512a615&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Idempotency-Key: 8f3c2c2e-92f1-4f9b-b8b0-1b0f1b1a3e3f</code></pre></div><p>When the server receives a request:</p><ol><li><p>It checks whether this key was used before.</p></li><li><p>If the key is new, the server processes the request normally.</p></li><li><p>If the same key appears again, the server <strong>does NOT repeat the action</strong>. Instead, it returns the previous response.</p></li></ol><p>This prevents duplicate operations.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6ZuB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc6cc48-71c4-4b91-8c65-9ef16db57d6b_1600x367.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6ZuB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc6cc48-71c4-4b91-8c65-9ef16db57d6b_1600x367.png 424w, https://substackcdn.com/image/fetch/$s_!6ZuB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc6cc48-71c4-4b91-8c65-9ef16db57d6b_1600x367.png 848w, https://substackcdn.com/image/fetch/$s_!6ZuB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc6cc48-71c4-4b91-8c65-9ef16db57d6b_1600x367.png 1272w, https://substackcdn.com/image/fetch/$s_!6ZuB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc6cc48-71c4-4b91-8c65-9ef16db57d6b_1600x367.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6ZuB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc6cc48-71c4-4b91-8c65-9ef16db57d6b_1600x367.png" width="1456" height="334" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4bc6cc48-71c4-4b91-8c65-9ef16db57d6b_1600x367.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:334,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6ZuB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc6cc48-71c4-4b91-8c65-9ef16db57d6b_1600x367.png 424w, https://substackcdn.com/image/fetch/$s_!6ZuB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc6cc48-71c4-4b91-8c65-9ef16db57d6b_1600x367.png 848w, https://substackcdn.com/image/fetch/$s_!6ZuB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc6cc48-71c4-4b91-8c65-9ef16db57d6b_1600x367.png 1272w, https://substackcdn.com/image/fetch/$s_!6ZuB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc6cc48-71c4-4b91-8c65-9ef16db57d6b_1600x367.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4><strong>Why It Matters</strong></h4><p>Imagine a user taps the <strong>&#8220;</strong>Pay<strong>&#8221;</strong> button.</p><ol><li><p>The payment request reaches the server.</p></li><li><p>Server processes the payment successfully.</p></li><li><p>But the response never reaches the phone because the network drops.</p></li></ol><p>So the user taps <strong>&#8220;</strong>Pay<strong>&#8221;</strong> again!</p><p>Without idempotency, the user might be charged twice.</p><p>This is why idempotency is extremely important for operations that change data.</p><h4><strong>Real-World Example</strong></h4><p>Many large systems rely on idempotency:</p><ul><li><p>Stripe requires an Idempotency-Key for payment API calls.</p></li><li><p>Uber uses idempotency keys for ride requests.</p></li><li><p>Most financial systems use this approach to prevent duplicate transactions.</p></li></ul><h4><strong>Trade-offs</strong></h4><p>To support idempotency, the server must store previously used keys and their responses for a period of time (typically 24 hours).</p><p>This requires some storage, but the cost is small compared to the problems caused by duplicate operations.</p><p>It&#8217;s also important to design the key carefully, usually combining:</p><ul><li><p>user ID</p></li><li><p>operation type</p></li><li><p>unique request ID</p></li></ul><h4><strong>Practical Tip</strong></h4><p>The idempotency key should be generated on the client before the request is sent. The app should also save the key locally.</p><p>If the app crashes or restarts, it can retry the request using the same key, ensuring the action is not repeated.</p><p>After making retries safe, the next step is to improve the app's efficiency when communicating with the server. One important way to do this is by reducing the number of network requests the app sends.</p><h3><strong>8. Request Batching &amp; Payload Optimisation</strong></h3><p>Every time a mobile app sends a request to the server, it adds latency and uses battery.</p><p>The total time for a request to reach the server and its response to return is called a <strong>network round-trip</strong>.</p><p>If an app sends many small requests separately, it can slow down the app and drain the device&#8217;s battery.</p><p>To solve this, use these two techniques:</p><ul><li><p>Request batching</p></li><li><p>Payload optimisation</p></li></ul><p><strong>a. Request Batching</strong></p><p>Request batching combines multiple requests into a single request.</p><p>Instead of doing this:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;380c69b6-60a6-4e09-99e8-dd2271ea4219&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">App &#8594; Request 1
App &#8594; Request 2
App &#8594; Request 3</code></pre></div><p>The app sends one combined request:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;8c637ab1-4ce4-4d80-ba70-f2aa1d355925&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">App &#8594; Batch Request (Request 1 + Request 2 + Request 3)</code></pre></div><p>This reduces the number of times the app needs to communicate with the server.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SqRP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff752d8ca-cbff-4fce-ba31-53c2b264c996_1341x617.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SqRP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff752d8ca-cbff-4fce-ba31-53c2b264c996_1341x617.png 424w, https://substackcdn.com/image/fetch/$s_!SqRP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff752d8ca-cbff-4fce-ba31-53c2b264c996_1341x617.png 848w, https://substackcdn.com/image/fetch/$s_!SqRP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff752d8ca-cbff-4fce-ba31-53c2b264c996_1341x617.png 1272w, https://substackcdn.com/image/fetch/$s_!SqRP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff752d8ca-cbff-4fce-ba31-53c2b264c996_1341x617.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SqRP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff752d8ca-cbff-4fce-ba31-53c2b264c996_1341x617.png" width="1341" height="617" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f752d8ca-cbff-4fce-ba31-53c2b264c996_1341x617.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:617,&quot;width&quot;:1341,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SqRP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff752d8ca-cbff-4fce-ba31-53c2b264c996_1341x617.png 424w, https://substackcdn.com/image/fetch/$s_!SqRP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff752d8ca-cbff-4fce-ba31-53c2b264c996_1341x617.png 848w, https://substackcdn.com/image/fetch/$s_!SqRP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff752d8ca-cbff-4fce-ba31-53c2b264c996_1341x617.png 1272w, https://substackcdn.com/image/fetch/$s_!SqRP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff752d8ca-cbff-4fce-ba31-53c2b264c996_1341x617.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>b. Payload Optimisation</strong></p><p>Payload is the data sent between the app and the server.</p><p>Payload optimisation focuses on reducing the amount of data being transferred.</p><p>This can be achieved by:</p><ul><li><p>Compression (like gzip or Brotli)</p></li><li><p>Sending only the needed fields</p></li><li><p>Using efficient formats like binary data instead of large text formats</p></li></ul><p>Reducing the app's payload size helps it load faster and use less network data.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZpSR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bdad54-c059-43c4-83cf-171fb9ec77de_1494x550.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZpSR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bdad54-c059-43c4-83cf-171fb9ec77de_1494x550.png 424w, https://substackcdn.com/image/fetch/$s_!ZpSR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bdad54-c059-43c4-83cf-171fb9ec77de_1494x550.png 848w, https://substackcdn.com/image/fetch/$s_!ZpSR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bdad54-c059-43c4-83cf-171fb9ec77de_1494x550.png 1272w, https://substackcdn.com/image/fetch/$s_!ZpSR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bdad54-c059-43c4-83cf-171fb9ec77de_1494x550.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZpSR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bdad54-c059-43c4-83cf-171fb9ec77de_1494x550.png" width="1456" height="536" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60bdad54-c059-43c4-83cf-171fb9ec77de_1494x550.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:536,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZpSR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bdad54-c059-43c4-83cf-171fb9ec77de_1494x550.png 424w, https://substackcdn.com/image/fetch/$s_!ZpSR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bdad54-c059-43c4-83cf-171fb9ec77de_1494x550.png 848w, https://substackcdn.com/image/fetch/$s_!ZpSR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bdad54-c059-43c4-83cf-171fb9ec77de_1494x550.png 1272w, https://substackcdn.com/image/fetch/$s_!ZpSR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bdad54-c059-43c4-83cf-171fb9ec77de_1494x550.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Why It Matters</strong></h4><p>On mobile devices, sending a request wakes up the device&#8217;s <strong>network radio</strong> (the hardware responsible for communication).</p><p>This wake-up process takes time and consumes battery.</p><p>For example:</p><ul><li><p>Sending 10 separate requests means the radio wakes up 10 times.</p></li><li><p>Sending 1 batched request wakes the radio only once.</p></li></ul><p>So batching requests can significantly reduce both latency and battery usage.</p><h4><strong>Real-World Example</strong></h4><p>Many popular apps use batching to improve performance.</p><ul><li><p>Facebook batches many GraphQL queries into a single HTTP request.</p></li><li><p>Instagram loads images gradually but fetches data for the next screen in a single batched request.</p></li></ul><p>This makes the app feel faster and smoother.</p><h4><strong>Trade-offs</strong></h4><p>Batching improves efficiency but also introduces challenges.</p><ul><li><p>If the app waits too long to collect requests before batching them, it can add extra delay.</p></li><li><p>Another challenge is handling errors. For example, if a batched request contains 10 operations and 3 fail, the system must handle those failures properly.</p></li></ul><p>To balance this, many systems use a short batching window, usually around 50&#8211;100 milliseconds, to collect requests before sending them together.</p><p>After reducing the number of network requests, another challenge appears when apps need to upload large files, such as photos, videos, or documents.</p><p>Large uploads can easily fail on mobile networks because connections may drop or become unstable. To handle this, apps use a technique called <strong>resumable uploads</strong>.</p><h3><strong>9. Resumable Uploads (Chunked Uploads)</strong></h3><p>When a mobile app uploads a large file, such as a 100MB video, sending the entire file in a single request is risky.</p><p>If the network disconnects halfway through the upload, the whole upload may fail. The app would have to start again from the beginning.</p><p>To avoid this problem, use resumable uploads, also called <em>chunked uploads</em>.</p><h4><strong>How Chunked Uploads Work</strong></h4><p>Instead of uploading the entire file at once, it&#8217;s split into smaller parts, called <strong>chunks</strong>.</p><p>For example, a 100MB file might be divided into 10 chunks of 10MB each, and each chunk is uploaded separately.</p><p>The process usually works like this:</p><ol><li><p>App starts the upload and creates an <strong>upload session</strong>.</p></li><li><p>Server returns a <strong>session ID </strong>or<strong> upload URL</strong>.</p></li><li><p>App uploads the file <strong>chunk by chunk</strong>.</p></li><li><p>Server keeps track of how much data it has already received.</p></li><li><p>If the network fails, the app asks the server <strong>where the upload stopped</strong>.</p></li><li><p>The upload continues from that point instead of starting from the beginning.</p></li></ol><p>This makes large uploads much more reliable on mobile networks.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w3W4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c356ff-1dbd-44f9-801a-f9b4299c6c04_1600x303.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w3W4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c356ff-1dbd-44f9-801a-f9b4299c6c04_1600x303.png 424w, https://substackcdn.com/image/fetch/$s_!w3W4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c356ff-1dbd-44f9-801a-f9b4299c6c04_1600x303.png 848w, https://substackcdn.com/image/fetch/$s_!w3W4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c356ff-1dbd-44f9-801a-f9b4299c6c04_1600x303.png 1272w, https://substackcdn.com/image/fetch/$s_!w3W4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c356ff-1dbd-44f9-801a-f9b4299c6c04_1600x303.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w3W4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c356ff-1dbd-44f9-801a-f9b4299c6c04_1600x303.png" width="1456" height="276" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/93c356ff-1dbd-44f9-801a-f9b4299c6c04_1600x303.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:276,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!w3W4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c356ff-1dbd-44f9-801a-f9b4299c6c04_1600x303.png 424w, https://substackcdn.com/image/fetch/$s_!w3W4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c356ff-1dbd-44f9-801a-f9b4299c6c04_1600x303.png 848w, https://substackcdn.com/image/fetch/$s_!w3W4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c356ff-1dbd-44f9-801a-f9b4299c6c04_1600x303.png 1272w, https://substackcdn.com/image/fetch/$s_!w3W4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93c356ff-1dbd-44f9-801a-f9b4299c6c04_1600x303.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4><strong>Why It Matters</strong></h4><p>Mobile networks are often unstable.</p><p>Imagine uploading a 200MB video that reaches 99% when the network suddenly disconnects, causing the upload to fail&#8230;</p><p>Without resumable uploads, the app would have to restart the entire upload from the beginning, which wastes time, data, and battery.</p><p>With resumable uploads, the app can resume uploading from where it left off instead of starting over.</p><h4><strong>Real-World Example</strong></h4><p>Many popular apps use resumable uploads:</p><ul><li><p>YouTube for video uploads</p></li><li><p>Google Drive for file uploads</p></li><li><p>Dropbox for syncing files</p></li></ul><p>Any app that allows users to upload media usually uses this approach.</p><p>Even with reliable upload systems, mobile apps still face a major challenge: <em>network connections can drop at any time.</em></p><p>Users may lose connectivity when entering a subway tunnel, switching to airplane mode, or moving to an area with poor network coverage. Because of this, mobile apps must be designed to handle unstable or missing network connections without breaking.</p><h3><strong>10. Handling Intermittent Connectivity</strong></h3><p>Mobile apps should continue working as smoothly as possible even when the internet connection is weak or temporarily unavailable. Instead of failing immediately, the app should adapt to the network situation and recover when the connection returns.</p><p>Use these techniques to handle this:</p><p><strong>a. Local Request Queuing</strong></p><p>When the device is offline, the app can store user actions locally rather than sending them to the server immediately.</p><p>For example:</p><ol><li><p>A user sends a message.</p></li><li><p>The app saves the message locally.</p></li><li><p>When the internet connection returns, the app sends it to the server.</p></li></ol><p>This is called <strong>request queuing</strong>. The app keeps a list of pending actions and processes them once the network is available again.</p><p><strong>b. Optimistic UI</strong></p><p>Another common technique is optimistic UI.</p><p>With optimistic UI, the app updates the interface immediately, assuming the request will succeed.</p><p>For example:</p><ul><li><p>A user likes a post.</p></li><li><p>App shows the post as liked instantly.</p></li><li><p>Request is sent to the server in the background.</p></li></ul><p>If the request fails later, the app can correct the state.</p><p>This makes the app feel fast and responsive, even when the network is slow.</p><p><strong>c. Network State Awareness</strong></p><p>Apps can also monitor the device&#8217;s network status.</p><p>This allows the app to know when it is online, offline, or on a slow connection.</p><p>Based on this information, the app can adjust its behaviour. For example, it might delay data syncing while offline and resume syncing when connectivity returns.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P2_3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb05560eb-49fb-42ea-b215-6bb5dc1a60e4_1600x524.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P2_3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb05560eb-49fb-42ea-b215-6bb5dc1a60e4_1600x524.png 424w, https://substackcdn.com/image/fetch/$s_!P2_3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb05560eb-49fb-42ea-b215-6bb5dc1a60e4_1600x524.png 848w, https://substackcdn.com/image/fetch/$s_!P2_3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb05560eb-49fb-42ea-b215-6bb5dc1a60e4_1600x524.png 1272w, https://substackcdn.com/image/fetch/$s_!P2_3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb05560eb-49fb-42ea-b215-6bb5dc1a60e4_1600x524.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P2_3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb05560eb-49fb-42ea-b215-6bb5dc1a60e4_1600x524.png" width="1456" height="477" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b05560eb-49fb-42ea-b215-6bb5dc1a60e4_1600x524.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:477,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P2_3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb05560eb-49fb-42ea-b215-6bb5dc1a60e4_1600x524.png 424w, https://substackcdn.com/image/fetch/$s_!P2_3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb05560eb-49fb-42ea-b215-6bb5dc1a60e4_1600x524.png 848w, https://substackcdn.com/image/fetch/$s_!P2_3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb05560eb-49fb-42ea-b215-6bb5dc1a60e4_1600x524.png 1272w, https://substackcdn.com/image/fetch/$s_!P2_3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb05560eb-49fb-42ea-b215-6bb5dc1a60e4_1600x524.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Why It Matters</strong></h4><p>Users frequently lose internet access during normal usage.</p><p>For example, in subway tunnels, on airplanes, or in areas with weak network coverage. If an app crashes, freezes, or loses user data, it feels unreliable.</p><p>Properly handling connectivity problems makes the app feel stable and trustworthy.</p><h4><strong>Real-World Example</strong></h4><p>Many modern apps handle intermittent connectivity this way:</p><ul><li><p>Notion queues edits locally and syncs them later.</p></li><li><p>Google Docs allows you to continue editing even when offline.</p></li><li><p>Messaging apps show a &#8220;sending&#8221; or &#8220;pending&#8221; status until the message is delivered.</p></li></ul><h4><strong>Practical Tip</strong></h4><p>Mobile operating systems provide built-in tools to monitor network connectivity:</p><ul><li><p><a href="https://developer.apple.com/documentation/network/nwpathmonitor">NWPathMonitor</a> on iOS</p></li><li><p><a href="https://developer.android.com/reference/android/net/ConnectivityManager">ConnectivityManager</a> on Android</p></li></ul><p>These tools notify the app when the network status changes.</p><p>This approach is event-driven and battery-efficient, unlike constantly checking the network in a loop.</p><p>Even with reliable networking strategies, mobile apps cannot always depend on an internet connection. To keep the app fast and usable even without internet access, developers use caching.</p><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the full premium newsletter series and max your system design career leverage:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h1><strong>Caching &amp; Offline</strong></h1><p>This section focuses on how mobile apps store and reuse data to reduce network requests and improve performance.</p><p>You&#8217;ll learn how caching strategies and offline-first approaches help apps stay fast and usable even when the internet is slow or unavailable.</p><h3><strong>11. On-Device Caching (Memory vs Disk)</strong></h3><p>Mobile apps usually store cached data in two places on the device:</p><ul><li><p>Memory (RAM)</p></li><li><p>Disk (device storage)</p></li></ul><p>Using both together creates a two-layer caching system.</p><p><strong>a. Memory Cache (RAM)</strong></p><p>Memory cache stores data in device RAM, which is extremely fast to access.</p><p>When data is stored in memory:</p><ul><li><p>App can load it almost instantly.</p></li><li><p>It avoids making a network request.</p></li></ul><p>Yet memory has a limitation:</p><ul><li><p>Data disappears when the app closes.</p></li><li><p>Operating system may remove cached data if the device runs low on memory.</p></li></ul><p>So memory caching is fast but temporary.</p><p><strong>b. Disk Cache (Storage)</strong></p><p>Disk cache stores data on the device&#8217;s permanent storage.</p><p>This means:</p><ul><li><p>Data remains available even after the app restarts.</p></li><li><p>App can load previously downloaded data without contacting the server again.</p></li></ul><p>But disk access is slower than memory access.</p><p>So disk caching is slower than RAM but more persistent.</p><h4><strong>Why It Matters</strong></h4><p>Fetching data from the server can take 50&#8211;500 milliseconds, depending on the network. But reading data from memory happens in nanoseconds, which is almost instantaneous.</p><p>So if the app can load data from the cache instead of the network, the user interface feels much faster. For example, profile pictures, app configurations, or recently viewed content.</p><p>These are perfect candidates for caching.</p><h4><strong>Real-World Example</strong></h4><p>Many image-loading libraries use this two-layer caching approach. For example, SDWebImage and Kingfisher.</p><p>When an image is requested:</p><ol><li><p>App first checks the <em>memory cache.</em></p></li><li><p>If it&#8217;s not there, it checks the <em>disk cache.</em></p></li><li><p>Only if both are missing does it requests image from the network.</p></li></ol><p>This makes images load almost instantly.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jshU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9ecfb5-c5f3-4b64-b8c0-fc88de3a1201_1600x515.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jshU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9ecfb5-c5f3-4b64-b8c0-fc88de3a1201_1600x515.png 424w, https://substackcdn.com/image/fetch/$s_!jshU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9ecfb5-c5f3-4b64-b8c0-fc88de3a1201_1600x515.png 848w, https://substackcdn.com/image/fetch/$s_!jshU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9ecfb5-c5f3-4b64-b8c0-fc88de3a1201_1600x515.png 1272w, https://substackcdn.com/image/fetch/$s_!jshU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9ecfb5-c5f3-4b64-b8c0-fc88de3a1201_1600x515.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jshU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9ecfb5-c5f3-4b64-b8c0-fc88de3a1201_1600x515.png" width="1456" height="469" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aa9ecfb5-c5f3-4b64-b8c0-fc88de3a1201_1600x515.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:469,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jshU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9ecfb5-c5f3-4b64-b8c0-fc88de3a1201_1600x515.png 424w, https://substackcdn.com/image/fetch/$s_!jshU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9ecfb5-c5f3-4b64-b8c0-fc88de3a1201_1600x515.png 848w, https://substackcdn.com/image/fetch/$s_!jshU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9ecfb5-c5f3-4b64-b8c0-fc88de3a1201_1600x515.png 1272w, https://substackcdn.com/image/fetch/$s_!jshU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa9ecfb5-c5f3-4b64-b8c0-fc88de3a1201_1600x515.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Trade-offs</strong></h4><p>Both types of caching have advantages and limitations:</p><ul><li><p><strong>Memory Cache:</strong> Extremely fast, but cleared when the app closes.</p></li><li><p><strong>Disk Cache:</strong> Data remains available after restarting the app, but reading from disk is slower than memory.</p></li></ul><p>Cache size also needs to be controlled carefully.</p><p>If too much memory is used for caching, the operating system may remove the cached data or close the app to free memory.</p><h4><strong>Practical Tip</strong></h4><p>Mobile platforms provide built-in caching tools that automatically manage memory. For example, NSCache on iOS and LruCache on Android.</p><p>These systems automatically remove older items from memory when the device is running low on resources. This helps prevent the app from using too much memory.</p><p>Using simple data structures like dictionaries or hash maps for caching is NOT ideal, because they do not automatically respond to memory pressure. As a result, they may keep too much data in memory, increasing the risk that the app will be terminated by the operating system.</p><p>Local caching stores data on the device, which helps apps load faster.</p><p>But the internet itself also has built-in caching mechanisms that can reduce unnecessary network requests. This is where HTTP caching becomes useful.</p><h3><strong>12. HTTP Caching (ETag, Cache-Control)</strong></h3><p>HTTP caching allows apps and browsers to reuse previously downloaded responses instead of requesting the same data again.</p><p>This is controlled using special HTTP headers, <code>Cache-Control</code>, and <code>ETag</code>. These headers tell the app two important things:</p><ul><li><p>How long can cached data be reused</p></li><li><p>How to check if the data has changed</p></li></ul><p><strong>a. Cache-Control</strong></p><p>This header tells the app how long it can reuse the cached data before asking the server again.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;30915655-115f-46f2-88f3-06395de74a23&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Cache-Control: max-age=300</code></pre></div><p>This means the response can be reused for 300 seconds (5 minutes).</p><p>During those 5 minutes, the app can load the data directly from cache instead of making a new request to the server.</p><p>This improves speed and reduces network usage.</p><p><strong>b. ETag</strong></p><p>An ETag is a unique identifier that represents a specific version of the data.</p><p>When the server sends a response, it also includes an ETag value:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;e492e2de-6397-4716-b569-d25c84598b74&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">ETag: &#8220;abc123&#8221;</code></pre></div><p>When the app requests the same data again, it sends that value back to the server:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;f4246bd4-52c8-43bc-a34f-fc2a1fe458d6&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">If-None-Match: &#8220;abc123&#8221;</code></pre></div><p>The server checks whether the data has changed:</p><ul><li><p>If the data has NOT changed, the server returns <code>304 Not Modified</code> with no response body.</p></li><li><p>If the data has changed, the server sends the new data.</p></li></ul><p>This saves bandwidth because the server does NOT need to resend the entire response.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XC3I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f9ce961-68e7-4111-a302-aa7b1dc50f55_1600x699.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XC3I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f9ce961-68e7-4111-a302-aa7b1dc50f55_1600x699.png 424w, https://substackcdn.com/image/fetch/$s_!XC3I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f9ce961-68e7-4111-a302-aa7b1dc50f55_1600x699.png 848w, https://substackcdn.com/image/fetch/$s_!XC3I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f9ce961-68e7-4111-a302-aa7b1dc50f55_1600x699.png 1272w, https://substackcdn.com/image/fetch/$s_!XC3I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f9ce961-68e7-4111-a302-aa7b1dc50f55_1600x699.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XC3I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f9ce961-68e7-4111-a302-aa7b1dc50f55_1600x699.png" width="1456" height="636" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f9ce961-68e7-4111-a302-aa7b1dc50f55_1600x699.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:636,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XC3I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f9ce961-68e7-4111-a302-aa7b1dc50f55_1600x699.png 424w, https://substackcdn.com/image/fetch/$s_!XC3I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f9ce961-68e7-4111-a302-aa7b1dc50f55_1600x699.png 848w, https://substackcdn.com/image/fetch/$s_!XC3I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f9ce961-68e7-4111-a302-aa7b1dc50f55_1600x699.png 1272w, https://substackcdn.com/image/fetch/$s_!XC3I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f9ce961-68e7-4111-a302-aa7b1dc50f55_1600x699.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Why It Matters</strong></h4><p>HTTP caching reduces unnecessary network requests.</p><p>For example, if an API response is 200 KB, returning a <code>304 Not Modified</code> response avoids downloading that 200 KB again.</p><p>This helps improve app performance, network efficiency, and battery usage. The best part is that HTTP caching works automatically when the correct headers are set.</p><h4><strong>Real-World Example</strong></h4><p>A news app might load a list of articles using an API. The server might return a header like this:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;ccec6c50-4ba7-4d51-980b-6aec6e4b0f70&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">Cache-Control: max-age=60, stale-while-revalidate=300</code></pre></div><p>This means cached response can be used for 60 seconds. After that, the app can still show the cached content while checking for updates in the background.</p><p>As a result, the articles appear instantly without showing a loading spinner.</p><h4><strong>Trade-offs</strong></h4><p>Different caching strategies have different advantages:</p><ul><li><p><strong>Strong caching (long max-age): </strong>Faster loading and fewer network requests, but the data may become outdated.</p></li><li><p><strong>Weak caching (ETag validation): </strong>Always checks if data is fresh, but still requires a small network request.</p></li></ul><p>Because of this, cache durations should match the frequency of data changes.</p><p>Caching helps apps load data faster, but it still depends on the network at some point. When the network connection disappears completely, caching alone cannot solve the problem.</p><p>To keep apps usable even without a network connection, many apps use a design approach called <em>offline-first architecture.</em></p><h3><strong>13. Offline-First Architecture</strong></h3><p>Offline-first architecture means the app is designed to work normally even without a network connection.</p><p>Instead of always asking the server for data, the app mainly works with data stored locally on the device. The app shows and updates local data first, and then synchronises changes with the server when the network becomes available.</p><p>This approach helps the app stay fast and responsive, even when the network is slow or unavailable.</p><h4><strong>How It Works</strong></h4><p>In an offline-first system, the app reads and writes data from local storage first.</p><p>The process usually works like this:</p><ol><li><p>The app reads data from the local database instead of requesting it from the server.</p></li><li><p>When the user makes a change, the app updates the local database immediately.</p></li><li><p>A background process then sends those changes to the server when a connection is available.</p></li><li><p>If the server has new updates, the app downloads them and updates the local database.</p></li><li><p>When the local data changes, the app automatically updates the user interface.</p></li></ol><p>Because the UI always reads from local data, the app feels fast and responsive.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L_ER!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96e35bc1-ca02-4cc4-975c-1e1ad1f8f505_1411x662.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L_ER!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96e35bc1-ca02-4cc4-975c-1e1ad1f8f505_1411x662.png 424w, https://substackcdn.com/image/fetch/$s_!L_ER!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96e35bc1-ca02-4cc4-975c-1e1ad1f8f505_1411x662.png 848w, https://substackcdn.com/image/fetch/$s_!L_ER!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96e35bc1-ca02-4cc4-975c-1e1ad1f8f505_1411x662.png 1272w, https://substackcdn.com/image/fetch/$s_!L_ER!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96e35bc1-ca02-4cc4-975c-1e1ad1f8f505_1411x662.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L_ER!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96e35bc1-ca02-4cc4-975c-1e1ad1f8f505_1411x662.png" width="1411" height="662" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/96e35bc1-ca02-4cc4-975c-1e1ad1f8f505_1411x662.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:662,&quot;width&quot;:1411,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L_ER!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96e35bc1-ca02-4cc4-975c-1e1ad1f8f505_1411x662.png 424w, https://substackcdn.com/image/fetch/$s_!L_ER!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96e35bc1-ca02-4cc4-975c-1e1ad1f8f505_1411x662.png 848w, https://substackcdn.com/image/fetch/$s_!L_ER!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96e35bc1-ca02-4cc4-975c-1e1ad1f8f505_1411x662.png 1272w, https://substackcdn.com/image/fetch/$s_!L_ER!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96e35bc1-ca02-4cc4-975c-1e1ad1f8f505_1411x662.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Why It Matters</strong></h4><p>Apps that depend entirely on the network often feel slow.</p><p>Every action requires a request to the server, which can take 100&#8211;300 milliseconds or more. This delay often appears as loading spinners.</p><p>Offline-first apps avoid this problem.</p><p>Since the UI reads from local data, interactions feel instant, while the network syncing happens quietly in the background.</p><h4><strong>Real-World Example</strong></h4><p>Many popular apps use an offline-first architecture:</p><ul><li><p>Spotify downloads playlists so they can be played without internet.</p></li><li><p>Notion saves edits locally and syncs them later.</p></li><li><p>Google Maps stores map tiles for offline navigation.</p></li></ul><p>Apps dealing with media, productivity, or travel benefit from this approach.</p><h4><strong>Trade-offs</strong></h4><p>Offline-first systems provide a better user experience, but they are complex to build.</p><p>They require:</p><ul><li><p>Local database to store data</p></li><li><p>Sync system to send and receive updates</p></li><li><p>A way to handle conflicts when local and server data differ</p></li><li><p>Plan for updating the data structure over time</p></li></ul><p>Because of this complexity, an offline-first design usually needs to be planned early in app development.</p><p>Caching helps apps reuse data locally, but another challenge is delivering large files like images and videos efficiently. These files often make up most of the data in app downloads.</p><p>To improve speed and reduce data usage, use techniques like CDNs and media optimisation.</p><h3><strong>14. CDN Strategy &amp; Media Optimisation</strong></h3><p>Images and videos usually make up a large part of the data used by mobile apps.</p><p>To deliver this content quickly and efficiently, apps often rely on Content Delivery Networks (<strong>CDNs</strong>) and various optimisation techniques. A CDN is a network of servers distributed around the world. Instead of downloading content from a central server, the app downloads it from the nearest server, reducing load time.</p><p>While media files can be optimised, the app only downloads what it actually needs.</p><h4><strong>What a CDN Does</strong></h4><p>When a user opens an app and requests an image/video:</p><ol><li><p>The request is sent to the closest CDN server.</p></li><li><p>CDN returns the file from a nearby location instead of a distant server.</p></li><li><p>The content loads faster because the distance and network delay are reduced.</p></li></ol><p>This speeds up and improves the reliability of content delivery, especially for users in different parts of the world.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bpIB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6788c34d-b98c-4d80-a1a8-aef1f92338ad_1451x575.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bpIB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6788c34d-b98c-4d80-a1a8-aef1f92338ad_1451x575.png 424w, https://substackcdn.com/image/fetch/$s_!bpIB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6788c34d-b98c-4d80-a1a8-aef1f92338ad_1451x575.png 848w, https://substackcdn.com/image/fetch/$s_!bpIB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6788c34d-b98c-4d80-a1a8-aef1f92338ad_1451x575.png 1272w, https://substackcdn.com/image/fetch/$s_!bpIB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6788c34d-b98c-4d80-a1a8-aef1f92338ad_1451x575.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bpIB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6788c34d-b98c-4d80-a1a8-aef1f92338ad_1451x575.png" width="1451" height="575" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6788c34d-b98c-4d80-a1a8-aef1f92338ad_1451x575.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:575,&quot;width&quot;:1451,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bpIB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6788c34d-b98c-4d80-a1a8-aef1f92338ad_1451x575.png 424w, https://substackcdn.com/image/fetch/$s_!bpIB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6788c34d-b98c-4d80-a1a8-aef1f92338ad_1451x575.png 848w, https://substackcdn.com/image/fetch/$s_!bpIB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6788c34d-b98c-4d80-a1a8-aef1f92338ad_1451x575.png 1272w, https://substackcdn.com/image/fetch/$s_!bpIB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6788c34d-b98c-4d80-a1a8-aef1f92338ad_1451x575.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Media Optimisation</strong></h4><p>Another important improvement is optimising the size and format of images and videos.</p><p>Some common techniques include:</p><ul><li><p><strong>Image resizing:</strong> Instead of sending a large image (e.g., 4K), the server sends a smaller version that fits the device screen, such as 200&#215;200 pixels.</p></li><li><p><strong>Modern image formats:</strong> Newer formats like WebP or AVIF store images more efficiently than older formats like JPEG or PNG. They provide similar image quality with smaller file sizes.</p></li><li><p><strong>Lazy loading:</strong> Images are loaded only when they are about to appear on the screen. This avoids downloading content that the user might never see.</p></li></ul><p>Together, these techniques significantly reduce data usage and improve loading speed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WM5e!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3509288-42cf-43d4-9180-acb2aa16f469_1600x478.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WM5e!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3509288-42cf-43d4-9180-acb2aa16f469_1600x478.png 424w, https://substackcdn.com/image/fetch/$s_!WM5e!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3509288-42cf-43d4-9180-acb2aa16f469_1600x478.png 848w, https://substackcdn.com/image/fetch/$s_!WM5e!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3509288-42cf-43d4-9180-acb2aa16f469_1600x478.png 1272w, https://substackcdn.com/image/fetch/$s_!WM5e!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3509288-42cf-43d4-9180-acb2aa16f469_1600x478.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WM5e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3509288-42cf-43d4-9180-acb2aa16f469_1600x478.png" width="1456" height="435" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f3509288-42cf-43d4-9180-acb2aa16f469_1600x478.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:435,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WM5e!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3509288-42cf-43d4-9180-acb2aa16f469_1600x478.png 424w, https://substackcdn.com/image/fetch/$s_!WM5e!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3509288-42cf-43d4-9180-acb2aa16f469_1600x478.png 848w, https://substackcdn.com/image/fetch/$s_!WM5e!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3509288-42cf-43d4-9180-acb2aa16f469_1600x478.png 1272w, https://substackcdn.com/image/fetch/$s_!WM5e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3509288-42cf-43d4-9180-acb2aa16f469_1600x478.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Why It Matters</strong></h4><p>Large media files can slow down apps and consume a lot of mobile data.</p><p>For example, loading a 4MB image on a small phone screen wastes most of its pixels because the device only displays a small portion of the image&#8217;s resolution.</p><p>Serving images that match the device size <a href="https://imageengine.io/improve-pagespeed-image-cdn-vs-traditional-cdn/">can reduce total data usage by up to 80%</a>, making the app faster and more efficient.</p><h4><strong>Real-World Example</strong></h4><p>Many popular apps rely on CDN and media optimisation:</p><ul><li><p>Cloudinary and Imgix dynamically resize images based on URL parameters.</p></li><li><p>Instagram serves images that match the exact resolution of the user&#8217;s device.</p></li><li><p>On modern Android devices, Instagram also uses the AVIF format to reduce image size.</p></li></ul><h4><strong>Trade-offs</strong></h4><p>Using a CDN improves performance but introduces additional cost, as CDN providers charge based on data transfer.</p><p>As a result, CDN assets are usually cached for long periods. Apps often use versioned or hashed URLs to update assets. This process ensures that the new version gets downloaded when needed.</p><h4><strong>Practical Tip</strong></h4><p>To make images appear faster, apps often show a temporary low-quality preview while the full image loads.</p><p>Techniques like <a href="https://blurha.sh/">BlurHash</a> or <a href="https://cloudinary.com/blog/low_quality_image_placeholders_lqip_explained">LQIP (Low Quality Image Placeholder)</a> display a blurred or low-resolution version first. This makes the page feel fast, even before the high-quality image finishes loading.</p><p>Even with strong caching and efficient content delivery, one important challenge remains: <em>keeping cached data accurate and up to date.</em></p><h3><strong>15. Cache Invalidation Strategies</strong></h3><p>Caching helps apps load data faster by storing it locally.</p><p>But cached data can become outdated when the original data on the server changes. Because of this, apps need a way to update or remove old cached data. This process is called <strong>cache invalidation</strong>.</p><p>In simple terms, cache invalidation means deciding when cached data should no longer be used and needs to be refreshed.</p><h4><strong>Why It Matters</strong></h4><p>If outdated data stays in the cache for too long, the app may show incorrect information.</p><p>For example, a user might see an old product price, a deleted message, or an outdated address. In these situations, showing stale cached data can be worse than not using a cache at all.</p><p>That&#8217;s why managing cached data correctly is very important!</p><h4><strong>Common Cache Invalidation Strategies</strong></h4><p>There are several common ways to control when cached data should expire or update:</p><p><strong>a. TTL (Time-To-Live)</strong></p><p>TTL means cached data is only valid for a certain amount of time.</p><p>For example, if a cache has a TTL of 5 minutes, the app will reuse the cached data for five minutes. After that, it must fetch fresh data from the server.</p><p>This approach is simple but may briefly show outdated data.</p><p><strong>b. Event-Driven Invalidation</strong></p><p>In this approach, the server actively tells the app when cached data is no longer valid.</p><p>For example, a product price changes, and the server sends an event to clear or update the cache.</p><p>This ensures the app always shows the latest data, but it requires more complex systems to send these updates.</p><p><strong>c. Stale-While-Revalidate</strong></p><p>This strategy allows the app to show cached data immediately, even if it is slightly outdated.</p><p>The app quietly requests fresh data in the background. When the new data arrives, the cache is updated.</p><p>This approach improves the user experience by loading the app instantly while keeping data up to date.</p><p><strong>d. Versioned URLs</strong></p><p>Sometimes cached assets like images or style sheets are stored for a very long time.</p><p>To update them safely, the filename or URL includes a version or hash.</p><p>Example:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;36849cbe-157e-4e80-82ff-5b71caac2b92&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">style.abc123.css</code></pre></div><p>When the file changes, the version in the URL changes. Because the URL is different, the cache automatically downloads the new file.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eQk5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7bb3ca-e3b2-4196-929f-f0a1cf9a9173_1471x843.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eQk5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7bb3ca-e3b2-4196-929f-f0a1cf9a9173_1471x843.png 424w, https://substackcdn.com/image/fetch/$s_!eQk5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7bb3ca-e3b2-4196-929f-f0a1cf9a9173_1471x843.png 848w, https://substackcdn.com/image/fetch/$s_!eQk5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7bb3ca-e3b2-4196-929f-f0a1cf9a9173_1471x843.png 1272w, https://substackcdn.com/image/fetch/$s_!eQk5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7bb3ca-e3b2-4196-929f-f0a1cf9a9173_1471x843.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eQk5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7bb3ca-e3b2-4196-929f-f0a1cf9a9173_1471x843.png" width="1456" height="834" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d7bb3ca-e3b2-4196-929f-f0a1cf9a9173_1471x843.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:834,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eQk5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7bb3ca-e3b2-4196-929f-f0a1cf9a9173_1471x843.png 424w, https://substackcdn.com/image/fetch/$s_!eQk5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7bb3ca-e3b2-4196-929f-f0a1cf9a9173_1471x843.png 848w, https://substackcdn.com/image/fetch/$s_!eQk5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7bb3ca-e3b2-4196-929f-f0a1cf9a9173_1471x843.png 1272w, https://substackcdn.com/image/fetch/$s_!eQk5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7bb3ca-e3b2-4196-929f-f0a1cf9a9173_1471x843.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Real-World Example</strong></h4><p>An e-commerce app might use different caching strategies depending on the data.</p><p>For example:</p><ul><li><p>Product listings may use a TTL of 5 minutes, since a short delay in updates is acceptable.</p></li><li><p>Inventory or pricing data may use event-driven invalidation because it must always be accurate.</p></li><li><p>Static assets, such as images or stylesheets, can use versioned URLs, allowing them to be cached for a long time.</p></li></ul><h4><strong>Trade-offs</strong></h4><p>Each caching strategy has its advantages and limitations:</p><ul><li><p><strong>TTL-based caching:</strong> Simple to implement, but cached data may briefly become outdated.</p></li><li><p><strong>Event-driven invalidation:</strong> Keeps data accurate but requires more complex infrastructure.</p></li><li><p><strong>Stale-while-revalidate:</strong> Makes the app feel fast, but temporarily shows slightly outdated data.</p></li></ul><h4><strong>Important Tip</strong></h4><p>Caches should never have infinite expiration times for data that changes frequently.</p><p>If cached data never expires, the app may continue showing outdated information, making it difficult to detect and debug.</p><p>Once data is cached or fetched, the app also needs a reliable way to store and organise that data locally, which leads to the next part: <em>storage and database design.</em></p><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get the full premium newsletter series and max your system design career leverage:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h1><strong>Storage &amp; Data</strong></h1><p>Mobile apps need efficient ways to store and manage data on the device.</p><p>This section explains how local databases, schema design, and pagination help apps handle large datasets while maintaining smooth performance on resource-limited devices.</p><h3><strong>16. Local Database Design (Schema Modeling)</strong></h3><p>Mobile apps often store data locally on the device so they can load information quickly and continue working even when the network is slow or unavailable.</p><p>To do this, apps use local databases such as SQLite, which is commonly used on both Android and iOS. A database schema simply describes how data is organised in the database. It defines tables, columns, and how different pieces of data are connected.</p><p>Good schema design is important because it helps the app load and display data quickly.</p><h4><strong>How Data Is Often Structured on Mobile</strong></h4><p>On servers, databases are usually designed to avoid duplicating data across many places. This approach is called <strong>normalisation</strong>.</p><p>But mobile apps often use a slightly different approach called <strong>denormalisation</strong>.</p><p>Denormalisation involves storing related data in the same table, enabling quick reads without complex queries.</p><p>This helps mobile apps load data faster because the app does NOT need to combine multiple tables every time it displays information.</p><h4><strong>Why It Matters</strong></h4><p>Mobile apps need to display data quickly.</p><p>For example, imagine an app showing a list of 50,000 messages. If the database is poorly designed, finding the right messages might take much longer than expected.</p><p>A missing <strong>index</strong> (a structure that helps the database locate data quickly) can cause the database to scan every row in a table. This can turn a 1-millisecond query into a 500-millisecond query, causing visible delays or lag in the user interface.</p><p>So the way a database is designed directly affects how smooth the app feels.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KAJK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c075758-73c7-42b9-a870-b96a1d14dc2a_1600x539.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KAJK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c075758-73c7-42b9-a870-b96a1d14dc2a_1600x539.png 424w, https://substackcdn.com/image/fetch/$s_!KAJK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c075758-73c7-42b9-a870-b96a1d14dc2a_1600x539.png 848w, https://substackcdn.com/image/fetch/$s_!KAJK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c075758-73c7-42b9-a870-b96a1d14dc2a_1600x539.png 1272w, https://substackcdn.com/image/fetch/$s_!KAJK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c075758-73c7-42b9-a870-b96a1d14dc2a_1600x539.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KAJK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c075758-73c7-42b9-a870-b96a1d14dc2a_1600x539.png" width="1456" height="490" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6c075758-73c7-42b9-a870-b96a1d14dc2a_1600x539.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:490,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KAJK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c075758-73c7-42b9-a870-b96a1d14dc2a_1600x539.png 424w, https://substackcdn.com/image/fetch/$s_!KAJK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c075758-73c7-42b9-a870-b96a1d14dc2a_1600x539.png 848w, https://substackcdn.com/image/fetch/$s_!KAJK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c075758-73c7-42b9-a870-b96a1d14dc2a_1600x539.png 1272w, https://substackcdn.com/image/fetch/$s_!KAJK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c075758-73c7-42b9-a870-b96a1d14dc2a_1600x539.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Real-World Example</strong></h4><p>Consider a messaging app.</p><p>Instead of storing only a user ID with each message, the app may also store the sender&#8217;s name and profile image directly inside the message record.</p><p>This means the app can display messages immediately without looking up additional data from another table.</p><p>This approach speeds up loading when thousands of messages need to be shown quickly.</p><h4><strong>Trade-offs</strong></h4><p>Each database design approach has advantages and disadvantages:</p><ul><li><p><strong>Denormalised structure: </strong>Faster data reading and works well for displaying lists in the UI, but updating repeated data can be harder. For example, if a user changes their name, multiple records may need to be updated.</p></li><li><p><strong>Normalised structure: </strong>Cleaner, more organised, and easier to maintain consistent data, but requires more complex queries, which may slow down UI rendering.</p></li></ul><p>Because mobile apps mostly read data to display it on the screen, they often prefer denormalised structures for faster reads.</p><h4><strong>Practical Tip</strong></h4><p>When writing many records to the database, it&#8217;s usually faster to group them into a single transaction.</p><p>Instead of saving each item one by one, the app saves many items together in a single operation. This reduces overhead and makes database writes much more efficient.</p><p>As apps grow and evolve over time, the database structure may need to change as well. Because of this, apps must handle database schema updates carefully, which leads to the next topic: <em>schema migration strategies.</em></p><h3><strong>17. Schema Migration Strategy</strong></h3><p>Mobile apps often store data in a local database on the device.</p><p>Over time, the app may change how this data is structured. For example, a new version of the app might add a new column, change a table, or introduce new data fields.</p><p>When this happens, the database structure must be updated safely. This process is called <strong>schema migration</strong>.</p><p>A schema migration ensures that the existing database on a user&#8217;s device is updated to match the new version of the app.</p><h4><strong>Why It Matters</strong></h4><p>Unlike a web server, mobile databases exist on millions of individual devices that developers cannot directly control.</p><p>Some users may update the app immediately, while others may skip several versions before updating. For example, a user installs version 1 of an app and later updates directly to version 5.</p><p>The database must correctly apply every change that happened between those versions. So the migrations must run step-by-step:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;b17fbb8a-048a-4455-a49f-9b445d37fb20&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">1 &#8594; 2 &#8594; 3 &#8594; 4 &#8594; 5</code></pre></div><p>If a migration fails, the app may crash on launch. Because the database exists on the user&#8217;s device, fixing it later can be difficult&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qDff!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e41d26f-d812-4eb4-9d94-78b3ae1c39db_1600x599.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qDff!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e41d26f-d812-4eb4-9d94-78b3ae1c39db_1600x599.png 424w, https://substackcdn.com/image/fetch/$s_!qDff!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e41d26f-d812-4eb4-9d94-78b3ae1c39db_1600x599.png 848w, https://substackcdn.com/image/fetch/$s_!qDff!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e41d26f-d812-4eb4-9d94-78b3ae1c39db_1600x599.png 1272w, https://substackcdn.com/image/fetch/$s_!qDff!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e41d26f-d812-4eb4-9d94-78b3ae1c39db_1600x599.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qDff!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e41d26f-d812-4eb4-9d94-78b3ae1c39db_1600x599.png" width="1456" height="545" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7e41d26f-d812-4eb4-9d94-78b3ae1c39db_1600x599.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:545,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qDff!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e41d26f-d812-4eb4-9d94-78b3ae1c39db_1600x599.png 424w, https://substackcdn.com/image/fetch/$s_!qDff!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e41d26f-d812-4eb4-9d94-78b3ae1c39db_1600x599.png 848w, https://substackcdn.com/image/fetch/$s_!qDff!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e41d26f-d812-4eb4-9d94-78b3ae1c39db_1600x599.png 1272w, https://substackcdn.com/image/fetch/$s_!qDff!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e41d26f-d812-4eb4-9d94-78b3ae1c39db_1600x599.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Real-World Example</strong></h4><p>Many mobile frameworks provide tools to manage database migrations.</p><p>For example, Room on Android requires every database version to define how the schema should be updated.</p><p>This helps prevent mistakes where the app changes the database structure without providing a safe upgrade path.</p><h4><strong>Trade-offs</strong></h4><p>There are two main ways to handle schema changes:</p><p><strong>a. Destructive Migration</strong></p><p>In this approach, the app deletes the existing database and creates a new one.</p><p>This is simple and safe for the app structure, but it removes all locally stored data.</p><p><strong>b. Additive Migration</strong></p><p>In this approach, the database is updated without deleting existing data.</p><p>For example, adding new columns, adding new tables, and keeping old data intact.</p><p>This method is usually preferred because it preserves user data.</p><h4><strong>Practical Tip</strong></h4><p>Sometimes migrations may fail due to issues like a corrupted database.</p><p>In these situations, it is better for the app to reset the database and re-download the data from the server rather than crash. Losing cached data can be recovered, but an app that keeps crashing when it starts creates a much worse experience.</p><p>As apps handle larger amounts of stored data, another important challenge appears: loading and displaying large datasets efficiently, which is where pagination becomes useful.</p><h3><strong>18. Pagination (Cursor vs Offset vs Page Number)</strong></h3><p>Mobile apps often display long lists of data, such as messages, posts, products, or search results.</p><p>Sometimes these lists can contain thousands of items.</p><p>Loading all of that data at once would be slow and would use a lot of memory. Because of this, apps usually load data in smaller chunks called <strong>pages</strong>. This technique is called <strong>pagination</strong>.</p><p>Pagination helps apps load content gradually as users scroll, keeping the app fast and responsive.</p><h4><strong>How It Works</strong></h4><p>Instead of loading everything at once, the app requests a small set of items at a time.</p><p>For example:</p><ol><li><p>App requests page 1 of data.</p></li><li><p>The server returns the first group of items.</p></li><li><p>When the user scrolls further, the app requests the next page.</p></li></ol><p>This process continues until the user reaches the end of the list&#8230;</p><h4><strong>Common Pagination Methods</strong></h4><p>There are three common ways to implement pagination:</p><p><strong>a. Page Number Pagination</strong></p><p>This is the simplest method. The client requests data using page numbers.</p><p>For example:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;0d9e305f-49d1-40ac-ba44-a2ae93b7ab5d&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">?page=1
?page=2
?page=3</code></pre></div><p>Each page contains a fixed number of items.</p><p>This approach is easy to understand and works well for things like search results or product listings.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5J9w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41040c41-f786-4d9f-848a-d25bead94c58_1376x530.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5J9w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41040c41-f786-4d9f-848a-d25bead94c58_1376x530.png 424w, https://substackcdn.com/image/fetch/$s_!5J9w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41040c41-f786-4d9f-848a-d25bead94c58_1376x530.png 848w, https://substackcdn.com/image/fetch/$s_!5J9w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41040c41-f786-4d9f-848a-d25bead94c58_1376x530.png 1272w, https://substackcdn.com/image/fetch/$s_!5J9w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41040c41-f786-4d9f-848a-d25bead94c58_1376x530.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5J9w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41040c41-f786-4d9f-848a-d25bead94c58_1376x530.png" width="1376" height="530" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/41040c41-f786-4d9f-848a-d25bead94c58_1376x530.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:530,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5J9w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41040c41-f786-4d9f-848a-d25bead94c58_1376x530.png 424w, https://substackcdn.com/image/fetch/$s_!5J9w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41040c41-f786-4d9f-848a-d25bead94c58_1376x530.png 848w, https://substackcdn.com/image/fetch/$s_!5J9w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41040c41-f786-4d9f-848a-d25bead94c58_1376x530.png 1272w, https://substackcdn.com/image/fetch/$s_!5J9w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41040c41-f786-4d9f-848a-d25bead94c58_1376x530.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>b. Offset Pagination</strong></p><p>Offset pagination retrieves data starting from a specific position.</p><p>For example:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;56c6dff5-cfe3-422a-b9e7-8f2273a3da26&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">?limit=20&amp;offset=40</code></pre></div><p>This means skip the first 40 items and return the next 20 items.</p><p>This method works well for static data, but dynamically adding new items while the user is browsing can cause problems.</p><p>For example, if new posts appear at the top of a feed, the list may show duplicate items or skip items.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L_0v!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F465bb9eb-0156-4845-a4bb-1cbcc2d4d375_1204x646.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L_0v!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F465bb9eb-0156-4845-a4bb-1cbcc2d4d375_1204x646.png 424w, https://substackcdn.com/image/fetch/$s_!L_0v!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F465bb9eb-0156-4845-a4bb-1cbcc2d4d375_1204x646.png 848w, https://substackcdn.com/image/fetch/$s_!L_0v!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F465bb9eb-0156-4845-a4bb-1cbcc2d4d375_1204x646.png 1272w, https://substackcdn.com/image/fetch/$s_!L_0v!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F465bb9eb-0156-4845-a4bb-1cbcc2d4d375_1204x646.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L_0v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F465bb9eb-0156-4845-a4bb-1cbcc2d4d375_1204x646.png" width="1204" height="646" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/465bb9eb-0156-4845-a4bb-1cbcc2d4d375_1204x646.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:646,&quot;width&quot;:1204,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L_0v!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F465bb9eb-0156-4845-a4bb-1cbcc2d4d375_1204x646.png 424w, https://substackcdn.com/image/fetch/$s_!L_0v!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F465bb9eb-0156-4845-a4bb-1cbcc2d4d375_1204x646.png 848w, https://substackcdn.com/image/fetch/$s_!L_0v!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F465bb9eb-0156-4845-a4bb-1cbcc2d4d375_1204x646.png 1272w, https://substackcdn.com/image/fetch/$s_!L_0v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F465bb9eb-0156-4845-a4bb-1cbcc2d4d375_1204x646.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>c. Cursor Pagination</strong></p><p>Cursor pagination uses a <strong>cursor token</strong> that represents the last item the user has seen. Instead of requesting a page number, the app requests the next items after a specific position.</p><p>For example:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;49268bda-9d4c-4dd7-8d9e-abc8bfcb380a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">next_cursor = &#8220;abc123&#8221;</code></pre></div><p>Then the next request might look like:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;100a9965-3647-4caa-9c40-b67138417ae6&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">?after=abc123</code></pre></div><p>The cursor usually represents a stable value, such as the ID or timestamp of the last item. This approach works well for data that constantly changes, such as social media feeds.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!27_3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b12a6d8-c01a-4f7b-967c-a714b152ce3a_1146x513.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!27_3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b12a6d8-c01a-4f7b-967c-a714b152ce3a_1146x513.png 424w, https://substackcdn.com/image/fetch/$s_!27_3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b12a6d8-c01a-4f7b-967c-a714b152ce3a_1146x513.png 848w, https://substackcdn.com/image/fetch/$s_!27_3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b12a6d8-c01a-4f7b-967c-a714b152ce3a_1146x513.png 1272w, https://substackcdn.com/image/fetch/$s_!27_3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b12a6d8-c01a-4f7b-967c-a714b152ce3a_1146x513.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!27_3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b12a6d8-c01a-4f7b-967c-a714b152ce3a_1146x513.png" width="1146" height="513" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b12a6d8-c01a-4f7b-967c-a714b152ce3a_1146x513.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:513,&quot;width&quot;:1146,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!27_3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b12a6d8-c01a-4f7b-967c-a714b152ce3a_1146x513.png 424w, https://substackcdn.com/image/fetch/$s_!27_3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b12a6d8-c01a-4f7b-967c-a714b152ce3a_1146x513.png 848w, https://substackcdn.com/image/fetch/$s_!27_3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b12a6d8-c01a-4f7b-967c-a714b152ce3a_1146x513.png 1272w, https://substackcdn.com/image/fetch/$s_!27_3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b12a6d8-c01a-4f7b-967c-a714b152ce3a_1146x513.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Why It Matters</strong></h4><p>Imagine a social feed with 10,000 posts. If the app tried to load everything at once:</p><ul><li><p>The request would take a long time</p></li><li><p>Rendering the UI would be slow</p></li><li><p>Device might run out of memory</p></li></ul><p>Pagination solves this by loading only a small portion of the data at a time, which keeps scrolling smooth.</p><h4><strong>Real-World Example</strong></h4><p>Many popular apps, such as Twitter, Instagram, and TikTok, rely on cursor-based pagination.</p><p>These apps load new posts as the user scrolls, using a cursor that points to the last item that was displayed.</p><p>This ensures the feed remains stable even when new posts appear&#8230;</p><h4><strong>Trade-offs</strong></h4><p>Each pagination method has advantages and limitations:</p><ul><li><p><strong>Page number pagination: </strong>Simple to understand and works well for search results, but less efficient for large datasets.</p></li><li><p><strong>Offset pagination: </strong>Easy to implement and allows jumping to any position, but it can break when new data is inserted.</p></li><li><p><strong>Cursor pagination: </strong>Stable and efficient for live feeds and prevents duplicates and skipped items, but does not easily support jumping directly to a specific page.</p></li></ul><p>Because mobile devices have limited memory and storage, loading and storing data must be done carefully! Therefore, it&#8217;s important to design efficient data models that work well within these constraints.</p><h3><strong>19. Data Modeling for Mobile Constraints</strong></h3><p>Mobile apps often store data locally on the device so they can load information quickly and continue working even when the network is slow.</p><p>Yet mobile devices have limited storage, CPU power, and battery, so the way data is structured must be carefully designed.</p><p>Because of these limits, mobile apps usually store and organise data to help the UI load quickly, even if it means repeating some information.</p><h4><strong>Things Mobile Data Models Must Consider</strong></h4><p>When designing data models for mobile apps, several important constraints should be considered:</p><p><strong>a. Limited storage</strong></p><p>Mobile devices cannot store unlimited data.</p><p>Apps should avoid caching everything and instead remove old or unused data after a period of time. For example, an app might store cached data for a few hours or days, then automatically delete it.</p><p><strong>b. CPU performance</strong></p><p>Complex database queries can slow down the app.</p><p>Queries that join many tables may take longer to process. To keep the app fast, store related data together so the app can read it quickly without performing complicated queries.</p><p><strong>c. Battery usage</strong></p><p>Every time an app writes data to disk, it uses battery power.</p><p>Frequent database writes can drain the battery over time. Because of this, apps try to minimise unnecessary writes and batch operations when possible.</p><h4><strong>Why It Matters</strong></h4><p>Data models used on backend servers are usually designed to save storage space and keep data perfectly structured.</p><p>But mobile apps have a different goal: <em>fast rendering on the screen</em>.</p><p>This means the data model on mobile may look different from the backend data model, because it&#8217;s optimised for speed and user experience rather than strict structure.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!T5xu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66021ef4-e515-4be6-aa4d-eb034dc30976_1450x726.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!T5xu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66021ef4-e515-4be6-aa4d-eb034dc30976_1450x726.png 424w, https://substackcdn.com/image/fetch/$s_!T5xu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66021ef4-e515-4be6-aa4d-eb034dc30976_1450x726.png 848w, https://substackcdn.com/image/fetch/$s_!T5xu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66021ef4-e515-4be6-aa4d-eb034dc30976_1450x726.png 1272w, https://substackcdn.com/image/fetch/$s_!T5xu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66021ef4-e515-4be6-aa4d-eb034dc30976_1450x726.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!T5xu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66021ef4-e515-4be6-aa4d-eb034dc30976_1450x726.png" width="1450" height="726" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/66021ef4-e515-4be6-aa4d-eb034dc30976_1450x726.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:726,&quot;width&quot;:1450,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!T5xu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66021ef4-e515-4be6-aa4d-eb034dc30976_1450x726.png 424w, https://substackcdn.com/image/fetch/$s_!T5xu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66021ef4-e515-4be6-aa4d-eb034dc30976_1450x726.png 848w, https://substackcdn.com/image/fetch/$s_!T5xu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66021ef4-e515-4be6-aa4d-eb034dc30976_1450x726.png 1272w, https://substackcdn.com/image/fetch/$s_!T5xu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66021ef4-e515-4be6-aa4d-eb034dc30976_1450x726.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4><strong>Real-World Example</strong></h4><p>Imagine a contacts app that shows a list of people.</p><p>Instead of calculating everything each time the list loads, the app may store some information in advance, such as display name, profile picture URL, and the first letter used for alphabetical sections.</p><p>Because this information is already stored, the app can render the list instantly without performing extra calculations.</p><h4><strong>Trade-offs</strong></h4><p>Storing extra or pre-computed data improves speed, but it also has some downsides.</p><p>Pre-computed data uses more storage space and must be updated whenever the original data changes.</p><p>Yet this trade-off is usually acceptable because storage is relatively cheap, while slow or laggy interfaces are immediately noticeable to users.</p><div><hr></div><h2><strong>Final Words</strong></h2><p>None of these concepts works on its own.</p><p>The choices you make in one area affect everything else. Networking affects caching, caching affects sync, and sync affects how conflicts are handled.</p><p>It&#8217;s all connected.</p><p>The best engineers aren&#8217;t the ones who know the most concepts. They&#8217;re the ones who understand the tradeoffs and know when to use what.</p><p>So remember this:</p><p><em>Start simple. Add complexity only when you really need it. And when that time comes, you&#8217;ll know what to use.</em></p><div><hr></div><p>&#128075; I&#8217;d like to thank <strong><a href="https://x.com/Shefali__J">Shefali</a></strong> for writing this newsletter!</p><p>Plus, don&#8217;t forget to check out her work and socials:</p><ul><li><p><a href="https://shefali.dev/">Shefali.dev</a></p></li><li><p><a href="https://github.com/WebdevShefali">GitHub</a></p></li><li><p><a href="https://x.com/Shefali__J">Twitter</a></p></li></ul><p>You&#8217;ll often find her writing about web development, sharing UI tips, and building tools that make developers&#8217; lives easier.</p><div><hr></div><p>Get the full <em>premium</em> newsletter series and max your system design career leverage:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe?"><span>Subscribe now</span></a></p><p>Plus, there are <a href="http://newsletter.systemdesign.one/subscribe?group=true">group discounts</a>, <a href="http://newsletter.systemdesign.one/subscribe?gift=true">gift options</a>, and <a href="https://newsletter.systemdesign.one/leaderboard">referral rewards</a> available.</p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://www.linkedin.com/in/nk-systemdesign-one/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bEFk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 424w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 848w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1272w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bEFk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png" width="152" height="152" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:320,&quot;width&quot;:320,&quot;resizeWidth&quot;:152,&quot;bytes&quot;:74009,&quot;alt&quot;:&quot;Author Neo Kim; System design case studies&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.linkedin.com/in/nk-systemdesign-one/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Author Neo Kim; System design case studies" title="Author Neo Kim; System design case studies" srcset="https://substackcdn.com/image/fetch/$s_!bEFk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 424w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 848w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1272w, https://substackcdn.com/image/fetch/$s_!bEFk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f94ab8c-0d67-4775-992e-05e09ab710db_320x320.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"><strong>&#128075; Find me on <a href="https://www.linkedin.com/in/nk-systemdesign-one/">LinkedIn</a> | <a href="https://x.com/intent/follow?screen_name=systemdesignone">Twitter</a> | <a href="https://www.threads.net/@systemdesignone">Threads</a> | <a href="https://www.instagram.com/systemdesignone/">Instagram</a></strong></figcaption></figure></div><div><hr></div><p><strong>Want to reach 210K+ tech professionals at scale? </strong>&#128240;</p><p>If your company wants to reach 210K+ tech professionals, <a href="https://newsletter.systemdesign.one/p/sponsorship">advertise with me</a>.</p><div><hr></div><p>Thank you for supporting this newsletter.</p><p>You are now 210,001+ readers strong, very close to 210k. Let&#8217;s try to get 211k readers by 27 March. Consider sharing this post with your friends and get rewards.</p><p>Y&#8217;all are the best.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6oWl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6oWl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6oWl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png" width="590" height="368.75" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2e739087-a910-4643-be36-997b6dd5b4af_800x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:800,&quot;resizeWidth&quot;:590,&quot;bytes&quot;:87878,&quot;alt&quot;:&quot;system design newsletter&quot;,&quot;title&quot;:&quot;system design newsletter&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.systemdesign.one/i/163380418?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="system design newsletter" title="system design newsletter" srcset="https://substackcdn.com/image/fetch/$s_!6oWl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!6oWl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e739087-a910-4643-be36-997b6dd5b4af_800x500.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/p/mobile-system-design?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/p/mobile-system-design?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><div><hr></div><ul><li><p>Block diagrams created using <a href="https://app.eraser.io/auth/sign-up?ref=neo">Eraser</a>.</p></li></ul>]]></content:encoded></item><item><title><![CDATA[The Mobile Engineer's Guide to System Design Interviews]]></title><description><![CDATA[#134: Mobile System Design Interview]]></description><link>https://newsletter.systemdesign.one/p/mobile-system-design-interview</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/mobile-system-design-interview</guid><dc:creator><![CDATA[Neo Kim]]></dc:creator><pubDate>Tue, 24 Mar 2026 13:40:42 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/6968c1a0-7187-4c26-ab1b-c0c731466554_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/mobile-system-design-interview/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><div><hr></div><p>You&#8217;ve probably been there: You&#8217;re sitting across from two engineers (or staring at them through a webcam), and they&#8217;ve just asked you to design Instagram&#8217;s feed. Or a ride-sharing app. Or some chat system with offline support and end-to-end encryption.</p><p>The clock is ticking&#8230;</p><p>You have 45 minutes to turn a vague prompt into something that looks like you know what you&#8217;re doing.</p><p>Your mind races: <em>Where do I even start? Do I draw boxes first? Ask about the user numbers? Talk about databases? What if I forget something critical?</em></p><p>Most mobile engineers don&#8217;t realize this about system design interviews: <strong>They&#8217;re NOT testing whether you can build Instagram.</strong> They&#8217;re testing whether you can take something fuzzy and turn it into something concrete.</p><p>This is the same skill you use every day when your PM says, <em>&#8220;We need push notifications,&#8221;</em> or your designer drops a Figma file in Slack with the message <em>&#8220;Thoughts?</em>&#8221; in your day job.</p><div><hr></div><h4 style="text-align: justify;"><strong><a href="https://platform.minimax.io/">12% OFF MiniMax M2.7 - The SOTA Cowork Agent Model That Just Outranked Opus and Gemini 3.1. (Partner)</a></strong></h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://platform.minimax.io" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4CuL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61da77e5-6a17-4176-9714-ed285e414a98_1600x795.png 424w, https://substackcdn.com/image/fetch/$s_!4CuL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61da77e5-6a17-4176-9714-ed285e414a98_1600x795.png 848w, https://substackcdn.com/image/fetch/$s_!4CuL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61da77e5-6a17-4176-9714-ed285e414a98_1600x795.png 1272w, https://substackcdn.com/image/fetch/$s_!4CuL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61da77e5-6a17-4176-9714-ed285e414a98_1600x795.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4CuL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61da77e5-6a17-4176-9714-ed285e414a98_1600x795.png" width="1456" height="723" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/61da77e5-6a17-4176-9714-ed285e414a98_1600x795.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:723,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://platform.minimax.io&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!4CuL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61da77e5-6a17-4176-9714-ed285e414a98_1600x795.png 424w, https://substackcdn.com/image/fetch/$s_!4CuL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61da77e5-6a17-4176-9714-ed285e414a98_1600x795.png 848w, https://substackcdn.com/image/fetch/$s_!4CuL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61da77e5-6a17-4176-9714-ed285e414a98_1600x795.png 1272w, https://substackcdn.com/image/fetch/$s_!4CuL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61da77e5-6a17-4176-9714-ed285e414a98_1600x795.png 1456w" sizes="100vw" loading="lazy" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;"><strong><a href="https://platform.minimax.io/subscribe/coding-plan?code=KGlGYHegnw&amp;source=link">MiniMax M2.7</a></strong> is here, and it&#8217;s unlike any agent model released before it.</p><p style="text-align: justify;">Built on their proprietary SOTA Cowork Agent Model architecture, M2.7 shows early echoes of self-evolution: the ability to follow complex, layered instructions across long workflows without losing the thread.</p><p style="text-align: justify;">At 5 tools, every model looks sharp.</p><p style="text-align: justify;">At 40, almost all fall apart silently.</p><p style="text-align: justify;">Wrong tool invoked. Step missed. No error thrown. You find out 3 days post-deploy.</p><p style="text-align: justify;">M2.7 was tested across 40 complex skills, each exceeding 2,000 tokens.<br>Skill adherence rate: 97%.</p><p style="text-align: justify;">Here&#8217;s what that looks like in production:</p><ul><li><p>Alert fires &#8594; metrics correlated &#8594; root cause traced across systems</p></li><li><p>DB queried to verify &#8594; fix submitted</p></li><li><p>Recovery time: under 3 minutes. Manual process: 4+ hours.</p></li></ul><p style="text-align: justify;">Most models are built to write code.</p><p style="text-align: justify;">M2.7 is built to understand your production systems.</p><p style="text-align: justify;">And it ships inside something new: the world&#8217;s first subscription plan for an all-modality-capable model. One API key covers M2.7, Speech, Image, Video, and Music generation; the full stack, no provider switching.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://platform.minimax.io/subscribe/coding-plan?code=KGlGYHegnw&amp;source=link&quot;,&quot;text&quot;:&quot;MiniMax Token Plan 12% OFF&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://platform.minimax.io/subscribe/coding-plan?code=KGlGYHegnw&amp;source=link"><span>MiniMax Token Plan 12% OFF</span></a></p><div><hr></div><p>I want to introduce <strong><a href="https://www.linkedin.com/in/tjeerdintveen/">Tjeerd in &#8216;t Veen</a></strong> as the guest author.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.mobilesystemdesign.com/book" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cuNb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb1c7e4b-6898-4d5c-98c1-79db5b9a9675_1238x581.png 424w, https://substackcdn.com/image/fetch/$s_!cuNb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb1c7e4b-6898-4d5c-98c1-79db5b9a9675_1238x581.png 848w, https://substackcdn.com/image/fetch/$s_!cuNb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb1c7e4b-6898-4d5c-98c1-79db5b9a9675_1238x581.png 1272w, https://substackcdn.com/image/fetch/$s_!cuNb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb1c7e4b-6898-4d5c-98c1-79db5b9a9675_1238x581.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cuNb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb1c7e4b-6898-4d5c-98c1-79db5b9a9675_1238x581.png" width="1238" height="581" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db1c7e4b-6898-4d5c-98c1-79db5b9a9675_1238x581.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:581,&quot;width&quot;:1238,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://www.mobilesystemdesign.com/book&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cuNb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb1c7e4b-6898-4d5c-98c1-79db5b9a9675_1238x581.png 424w, https://substackcdn.com/image/fetch/$s_!cuNb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb1c7e4b-6898-4d5c-98c1-79db5b9a9675_1238x581.png 848w, https://substackcdn.com/image/fetch/$s_!cuNb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb1c7e4b-6898-4d5c-98c1-79db5b9a9675_1238x581.png 1272w, https://substackcdn.com/image/fetch/$s_!cuNb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb1c7e4b-6898-4d5c-98c1-79db5b9a9675_1238x581.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>He recently released the <strong><a href="https://www.mobilesystemdesign.com/book">Mobile System Design book bundle</a></strong> based on years of interviewing mobile engineers and building large-scale mobile apps. This newsletter distills the core process (framework) from the books into something you can use to prep for interviews or plan features at work.</p><p>If you want to go deeper, the <a href="https://www.mobilesystemdesign.com/book">full book bundle</a> has more examples, detailed architecture patterns, and code-level implementation strategies. </p><div><hr></div><h2>What this newsletter covers</h2><p>There are plenty of great general system design guides. This one is about mobile constraints and how they shape the architecture.</p><p>This newsletter walks through the actual process:</p><ul><li><p>How to take a vague prompt and figure out what the interviewer <em>actually</em> wants</p></li><li><p>How to uncover the hidden requirements nobody mentioned (because they&#8217;re testing if <em>you&#8217;ll</em> think of them)</p></li><li><p>How to structure your thinking so you don&#8217;t forget the mobile-specific stuff (offline state, battery drain, app lifecycle, memory constraints)</p></li><li><p>How to turn abstract architecture talk into concrete implementation details</p></li></ul><p>This works whether you&#8217;re in an interview or planning a real feature at work.</p><p>The process is the same.</p><div><hr></div><h2>Who this is for</h2><p>If you&#8217;re a mobile engineer (iOS, Android, Flutter, React Native, etc.) and you need to:</p><ul><li><p>Prepare for system design interviews</p></li><li><p>Stop feeling like you&#8217;re fumbling when asked to &#8220;design something.&#8221;</p></li><li><p>Learn how to think through mobile architecture in a structured way</p></li><li><p>Get good at turning vague requirements into actual plans</p></li></ul><p>Then this is for you&#8230;</p><p>I&#8217;m assuming you already know how to develop mobile apps. You understand the basics of networking, data persistence, and why you shouldn&#8217;t do heavy work on the main thread. You&#8217;ve shipped something to production and dealt with the consequences.</p><p>What you might not have is a structured way to <em>think</em> about system design. That&#8217;s what we&#8217;re covering here&#8230;</p><div><hr></div><h1>Interview Structure</h1><p>Let&#8217;s talk about what actually happens in a mobile system design interview:</p><p>The format can vary wildly between companies. Some give you 45 minutes, others give you 60. Some want you on a whiteboard, others use tools like <a href="https://codesignal.com">CodeSignal</a> or <a href="https://www.hackerrank.com">HackerRank</a>. Some companies even ask backend system design questions in mobile interviews, which is a separate discussion entirely.</p><p>But most mobile-focused interviews follow a similar arc.</p><p>Here&#8217;s what you can typically expect&#8230;</p><div><hr></div><h2>Typical flow</h2><p>Most interviews break down roughly like this:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mmwm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F961017fd-3581-4a7e-8f46-d141979a981a_1052x181.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mmwm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F961017fd-3581-4a7e-8f46-d141979a981a_1052x181.png 424w, https://substackcdn.com/image/fetch/$s_!mmwm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F961017fd-3581-4a7e-8f46-d141979a981a_1052x181.png 848w, https://substackcdn.com/image/fetch/$s_!mmwm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F961017fd-3581-4a7e-8f46-d141979a981a_1052x181.png 1272w, https://substackcdn.com/image/fetch/$s_!mmwm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F961017fd-3581-4a7e-8f46-d141979a981a_1052x181.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mmwm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F961017fd-3581-4a7e-8f46-d141979a981a_1052x181.png" width="1052" height="181" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/961017fd-3581-4a7e-8f46-d141979a981a_1052x181.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:181,&quot;width&quot;:1052,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mmwm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F961017fd-3581-4a7e-8f46-d141979a981a_1052x181.png 424w, https://substackcdn.com/image/fetch/$s_!mmwm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F961017fd-3581-4a7e-8f46-d141979a981a_1052x181.png 848w, https://substackcdn.com/image/fetch/$s_!mmwm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F961017fd-3581-4a7e-8f46-d141979a981a_1052x181.png 1272w, https://substackcdn.com/image/fetch/$s_!mmwm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F961017fd-3581-4a7e-8f46-d141979a981a_1052x181.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>At a high level, the flow is:</p><ul><li><p>Requirements and scope (5-10 minutes): clarify what you&#8217;re designing and the constraints</p></li><li><p>API and data needs (5-10 minutes): define what the app needs from the server</p></li><li><p>Architecture (10-15 minutes): outline client components and data flow</p></li><li><p>Deep dive (15-20 minutes): walk one critical flow or adjust for a new constraint</p></li><li><p>Wrap-up (1-5 minutes): summarize decisions and tradeoffs</p></li></ul><p>I&#8217;ll go deep on how to execute each phase in the next section&#8230;</p><div><hr></div><h2>What&#8217;s expected at different levels</h2><p>Interviewers don&#8217;t judge all mobile engineers the same way.</p><p>Here&#8217;s what interviewers are actually looking for based on your level:</p><h3>Junior to mid-level engineers</h3><p>You&#8217;re expected to:</p><ul><li><p>Ask clarifying questions (even if they seem obvious)</p></li><li><p>Draw a reasonable architecture diagram without major gaps</p></li><li><p>Explain common mobile patterns (caching, background tasks, handling lifecycle)</p></li><li><p>Discuss basic tradeoffs - <em>&#8221;This approach is simpler but uses more memory.&#8221;</em></p></li></ul><p>You&#8217;re <em>not</em> expected to:</p><ul><li><p>Know every edge case off the top of your head</p></li><li><p>Have opinions on monorepo versus multi-repo</p></li><li><p>Design for 100 million users on day one</p></li></ul><p><strong>If you get stuck on something complex (like offline conflict resolution), just say so. </strong><em>&#8220;I haven&#8217;t built a system that merges offline edits before, but I&#8217;d start with last-write-wins and ask about requirements,&#8221;</em> is fine.</p><p>Honesty beats bullshitting.</p><h3>Senior engineers</h3><p>Everything from junior/mid, plus:</p><ul><li><p>Proactively uncover hidden requirements without prompting - <em>&#8221;What about users on limited data plans?&#8221;</em></p></li><li><p>Propose architecture with clear reasoning for your choices - <em>&#8221;I&#8217;m using a repository pattern here because we need to support offline-first&#8221;</em></p></li><li><p>Demonstrate deep knowledge of mobile-specific challenges (app lifecycle, memory pressure, battery drain, offline sync)</p></li><li><p>Adapt when requirements change mid-interview - <em>&#8221;Okay, if we need to support 50MB video uploads, we&#8217;ll need background upload with chunking&#8230;&#8221;</em></p></li></ul><p><em><strong>You should drive the conversation.</strong></em></p><p>The interviewer shouldn&#8217;t have to pull answers out of you.</p><h3>Staff and principal engineers</h3><p>Everything from senior, plus:</p><ul><li><p>Frame the problem: <em>&#8220;Before we design this, let&#8217;s talk about whether we even need offline support for this use case.&#8221;</em></p></li><li><p>Consider team structure: <em>&#8220;If we structure this as a separate module, we can have one team own it independently.&#8221;</em></p></li><li><p>Think long-term: <em>&#8220;This works for v1, but if we add feature X later, we&#8217;ll need to refactor. Let&#8217;s design for that now.&#8221;</em></p></li><li><p>Make architectural tradeoffs at the system level: modularization strategy, build times, deployment, backward compatibility</p></li></ul><p><strong>You&#8217;re expected to think like a tech lead or architect</strong>, not just an IC implementing a feature. The interviewer wants to see if you can make decisions that affect the entire team or org.</p><p>Now that you know what the interview <em>structure</em> looks like, let&#8217;s talk about the actual <em>process</em> you&#8217;ll use during those 45 minutes.</p><div><hr></div><h1>Core Process</h1><p>The interview structure gives you the timeline. But what do you actually <em>do</em> in each phase?</p><p>This is the process you&#8217;ll follow, roughly in order, during the first 30-40 minutes of the interview. Think of it as a checklist for making sure you forget nothing critical.</p><p>You won&#8217;t always do these steps in exactly this order. Sometimes you&#8217;ll jump back and forth. Sometimes the interviewer will steer you in a different direction. That&#8217;s fine.</p><p>But this is the general flow that works:</p><div><hr></div><h2>A reusable pattern (applies to every step)</h2><p>Most steps look different, but the rhythm is the same:</p><ul><li><p>Ask 2-3 high-impact questions before designing.</p></li><li><p>Restate what you heard and get a clear &#8220;<em>yes</em>.&#8221;</p></li><li><p>Write down decisions: constraints, must-haves versus nice-to-haves, and open questions.</p></li><li><p>Call out red flags early instead of waiting.</p></li><li><p>You don&#8217;t have to solve everything on the spot. Show you see the issue, pick a reasonable tradeoff, and move on.</p></li></ul><div><hr></div><h2>Capturing the briefing</h2><p>Here&#8217;s how most system design interviews start:</p><p>The interviewer says something like, <em>&#8220;Design a feed for a social media app,&#8221; </em>or<em> &#8220;Build a way for users to share photos with friends.&#8221;</em></p><p>And your brain immediately starts racing: <em>Should I ask about the backend? Should I start drawing? What if I forget to mention caching?</em></p><p>Before you do any of that, pause.</p><p><em><strong>Your first job is not to design anything. It&#8217;s understanding what you&#8217;re being asked to design.</strong></em></p><p>This sounds obvious, but most candidates blow past this step in 60 seconds and then spend 40 minutes designing the wrong thing&#8230;</p><div><hr></div><h3>What you&#8217;re actually trying to do</h3><p>In the first 5-10 minutes, you&#8217;re trying to turn a vague prompt into something concrete enough that you know what problem you&#8217;re solving.</p><p>You&#8217;re not trying to get <em>every</em> detail. You&#8217;re trying to get <em>enough</em> clarity that you&#8217;re not completely guessing.</p><p>Think of it like this: If someone asks you to &#8220;<em>design a messaging feature</em>,&#8221; do they want:</p><ul><li><p>WhatsApp (end-to-end encrypted, real-time, requires phone number)</p></li><li><p>Slack (threaded, searchable, integrates with other tools)</p></li><li><p>Twitter/X DMs (simple, text-only, tied to social graph)</p></li></ul><p>All of those are &#8220;messaging.&#8221; They&#8217;re also completely different architectures.</p><p><em><strong>So your job is to ask questions that reveal which one they actually want.</strong></em></p><div><hr></div><h3>Questions that actually matter</h3><p>Here&#8217;s what you should ask:</p><p><strong>About success criteria:</strong></p><ul><li><p>&#8220;What does success look like for this feature?&#8221; (Are we optimizing for speed? Reliability? Simplicity?)</p></li><li><p>&#8220;What&#8217;s the most important thing this needs to do well?&#8221; (If we could only do one thing right, what would it be?)</p></li></ul><p><strong>About scope:</strong></p><ul><li><p>&#8220;Is this the whole app, or are we focusing on one part?&#8221; (Don&#8217;t design all of Instagram if they just want the feed)</p></li><li><p>&#8220;What platforms are we targeting?&#8221; (iOS only? Android? Cross-platform?)</p></li></ul><p><strong>About constraints:</strong></p><ul><li><p>&#8220;What network conditions should we plan for?&#8221; (Always online? Intermittent? Fully offline?)</p></li><li><p>&#8220;Are there any technical constraints I should know about?&#8221; (Existing systems we need to integrate with? Legacy code?)</p></li></ul><p><strong>About users:</strong></p><ul><li><p>&#8220;Who are the primary users?&#8221; (Consumers? Enterprise? Internal tools?)</p></li><li><p>&#8220;What devices are they on?&#8221; (Flagship phones? Low-end Android? Tablets?)</p></li></ul><p><em><strong>Don&#8217;t ask everything.</strong></em></p><p>Pick the questions that will most change your design. You have limited time.</p><div><hr></div><h3>A bad and a good example</h3><p><strong>Bad example:</strong></p><blockquote><p>Interviewer: &#8220;Design a feed for a social media app.&#8221;</p><p>You: &#8220;Okay, so I&#8217;ll use MVVM for the architecture. We&#8217;ll have a ViewModel that fetches data from a Repository, which talks to both a local database and a REST API. UI will be built with&#8212;&#8221;</p><p>Interviewer (internally): <em>They didn&#8217;t ask a single question. Do they even know what they&#8217;re designing?</em></p></blockquote><p>You just spent 3 minutes designing an architecture without knowing:</p><ul><li><p>Whether it needs to work offline</p></li><li><p>Whether it&#8217;s a text-based feed or heavy on images and video</p></li><li><p>Whether it requires pre-caching for a smoother scrolling experience</p></li></ul><p>That&#8217;s a red flag. Interviewers are <em>explicitly</em> looking for whether you ask questions.</p><p><strong>Good example:</strong></p><blockquote><p>Interviewer: &#8220;Design a feed for a social media app.&#8221;</p><p>You: &#8220;Okay, let me make sure I understand what we&#8217;re designing. Is this feed mostly images and video, or mostly text? That affects caching and pre-fetching on mobile.&#8221;</p><p>Interviewer: &#8220;Mostly images, some video.&#8221;</p><p>You: &#8220;Got it. For mobile, I&#8217;d plan on disk caching images (and maybe short videos) and pre-fetching upcoming content so the feed loads fast and works offline. Is that aligned with what you want?&#8221;</p><p>Interviewer: &#8220;Yes, basic offline support for previously loaded content is important.&#8221;</p><p>You: &#8220;Makes sense. One more thing: do we need live updates, or is pull-to-refresh enough for v1?&#8221;</p><p>Interviewer: &#8220;Pull-to-refresh is fine.&#8221;</p><p>You: &#8220;Perfect. So, to summarize: we&#8217;re designing a media-heavy feed with basic offline support. We&#8217;ll cache images (and maybe short videos), pre-fetch upcoming content for fast loads, and start with pull-to-refresh for updates. Does that sound right?&#8221;</p><p>Interviewer: &#8220;Yep, that&#8217;s it.&#8221;</p></blockquote><div><hr></div><p><em><strong>Reminder: this is a teaser of the subscriber-only newsletter series, exclusive to my golden members.</strong></em></p><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>High-level architecture of real-world systems.</strong></p></li><li><p>Deep dive into how popular real-world systems actually work.</p></li><li><p><strong>How real-world systems handle scale, reliability, and performance.</strong></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.systemdesign.one/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Briefing wrap-up (red flags + next steps)</h2><p>Sometimes the briefing is incomplete or contradictory. Call it out early instead of guessing&#8230;</p><p><strong>Common red flags:</strong></p><ul><li><p>Requirements that are vague or overly broad - <em>&#8221;make it fast.&#8221;</em></p></li><li><p>Contradictory statements -<em> &#8221;we want it simple, but also highly customizable.&#8221;</em></p></li><li><p>Missing critical info (no mention of offline, auth, or data sources)</p></li></ul><p><strong>When you spot these, call them out:</strong></p><p><em>&#8220;You mentioned we want it to be &#8216;fast.&#8217; Can you be more specific? Are we talking about load time under 1 second, smooth scrolling, or something else?&#8221;</em></p><p><em>&#8220;I&#8217;m noticing we haven&#8217;t talked about where the data comes from. Should I assume we&#8217;re pulling from a REST API, or is there a different data source?&#8221;</em></p><p>Don&#8217;t rush ahead, hoping it&#8217;ll become clear later. It won&#8217;t. You&#8217;ll end up designing something that doesn&#8217;t match what they wanted&#8230;</p><p><strong>Next steps once the briefing is clear:</strong></p><ul><li><p>Write down key constraints in the shared doc or on the whiteboard (&#8221;Offline support required&#8221;, &#8220;100k users&#8221;, &#8220;Existing auth system&#8221;)</p></li><li><p>Mentally categorize what is a must-have versus a nice-to-have</p></li><li><p>Identify the biggest unknowns you&#8217;ll need to address</p></li></ul><p><em><strong>Lesson: Don&#8217;t start designing until you know what you&#8217;re designing.</strong></em></p><p>This can feel slow, but it&#8217;s faster than designing the wrong thing&#8230;</p><div><hr></div><h2>Defining scope and requirements</h2><p>Okay, so you&#8217;ve asked your questions, and you have a basic understanding of what you&#8217;re designing. Now comes the hard part: figuring out what you&#8217;re <em>actually</em> going to design.</p><p>Here&#8217;s the problem: If someone asks you to <em>&#8220;design a messaging feature,&#8221;</em> there are about 47 different things they could mean. Do they want read receipts? Typing indicators? Group chats? Voice messages? End-to-end encryption? Reactions? Threads?</p><p>You can&#8217;t fit all of that into v1. You probably can&#8217;t even fit half of it.</p><p><em><strong>So your job is to separate what&#8217;s essential from what&#8217;s nice-to-have.</strong></em></p><p>This is where most people mess up. They either:</p><ol><li><p>Try to include everything at once and run out of time</p></li><li><p>Design the wrong things and miss what actually matters</p></li><li><p>Don&#8217;t ask at all and just guess</p></li></ol><p>All three are bad. Let&#8217;s talk about how to do this right:</p><div><hr></div><h3>What you&#8217;re trying to figure out</h3><p>You need to answer these questions:</p>
      <p>
          <a href="https://newsletter.systemdesign.one/p/mobile-system-design-interview">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[RAG - A Deep Dive]]></title><description><![CDATA[#133: Understanding Retrieval-Augmented Generation]]></description><link>https://newsletter.systemdesign.one/p/how-rag-works</link><guid isPermaLink="false">https://newsletter.systemdesign.one/p/how-rag-works</guid><dc:creator><![CDATA[Neo Kim]]></dc:creator><pubDate>Mon, 23 Mar 2026 11:45:50 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/8d630c6b-2576-4b38-8561-e3b88cf47d0e_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Get my system design playbook for FREE on newsletter signup:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ul><li><p><em><a href="https://newsletter.systemdesign.one/p/how-rag-works/?action=share">Share this post</a> &amp; I'll send you some rewards for the referrals.</em></p></li></ul><div><hr></div><p>Every large language model<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> (<strong>LLM</strong>) you use has lied to you with confidence, fluency, and frequency&#8230;</p><p>Ask any model about something that happened last week. It doesn&#8217;t know. It can&#8217;t know. Its knowledge was frozen months ago. They might try, and if they do, they will hallucinate.</p><p>This isn&#8217;t a bug.</p><p>It&#8217;s a fundamental architectural limitation. LLMs keep knowledge in their parameters<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>. These are billions of numerical weights learned during training. Once training ends, the knowledge is locked. The model doesn&#8217;t know what it doesn&#8217;t know, so it fills the gaps with confident fabrication. Studies show that hallucination<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a> rates can be as low as 1% for simple summarization tasks. However, they can exceed 58% for complex work.</p><p>This is exactly what Retrieval-Augmented Generation<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> (<strong>RAG</strong>) solves.</p><p>RAG doesn&#8217;t bake knowledge into the model. Instead, it pulls in relevant context when you ask a question. The model stays smart, your data stays current, and every answer is traceable back to its source.</p><p>This is the architectural pattern that makes AI actually useful.</p><p>Here&#8217;s how it works, why it works, and how to build it:</p><div><hr></div><h2><a href="https://link.outskill.com/NEOKIMMAR4">Still juggling 10 different tools? Learn AI workflows and replace 80% of them (Partner)</a></h2><p>Perplexity&#8217;s new computer thinks, designs, codes, and manages projects &#8212; all without you lifting a finger to switch tools.</p><p>This is where AI is headed: one place to get everything done. Smarter, faster, and built to make you 10x more productive.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://link.outskill.com/NEOKIMMAR4" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!knkf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3234eb1b-ca33-4b13-bc85-4e599170e1d4_1600x902.png 424w, https://substackcdn.com/image/fetch/$s_!knkf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3234eb1b-ca33-4b13-bc85-4e599170e1d4_1600x902.png 848w, https://substackcdn.com/image/fetch/$s_!knkf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3234eb1b-ca33-4b13-bc85-4e599170e1d4_1600x902.png 1272w, https://substackcdn.com/image/fetch/$s_!knkf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3234eb1b-ca33-4b13-bc85-4e599170e1d4_1600x902.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!knkf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3234eb1b-ca33-4b13-bc85-4e599170e1d4_1600x902.png" width="1456" height="821" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3234eb1b-ca33-4b13-bc85-4e599170e1d4_1600x902.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:821,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://link.outskill.com/NEOKIMMAR4&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!knkf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3234eb1b-ca33-4b13-bc85-4e599170e1d4_1600x902.png 424w, https://substackcdn.com/image/fetch/$s_!knkf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3234eb1b-ca33-4b13-bc85-4e599170e1d4_1600x902.png 848w, https://substackcdn.com/image/fetch/$s_!knkf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3234eb1b-ca33-4b13-bc85-4e599170e1d4_1600x902.png 1272w, https://substackcdn.com/image/fetch/$s_!knkf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3234eb1b-ca33-4b13-bc85-4e599170e1d4_1600x902.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Now is your chance to enter into your most productive era!<br><strong><br></strong>That&#8217;s why we recommend joining <strong><a href="https://link.outskill.com/NEOKIMMAR4">Outskill</a></strong>- the world&#8217;s first AI learning platform where over 10+ Million Learners have learnt from top industry leaders like Microsoft, NVIDIA, and Google.</p><p style="text-align: justify;">They are hosting a 2-day LIVE AI Mastermind where you&#8217;ll build automations, create personalized agents, and learn to turn AI into your ultimate competitive edge.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://link.outskill.com/NEOKIMMAR4&quot;,&quot;text&quot;:&quot;START LEARNING NOW&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://link.outskill.com/NEOKIMMAR4"><span>START LEARNING NOW</span></a></p><p style="text-align: justify;">You will also unlock exclusive bonuses for free when you show up: A Prompt Bible, AI monetization roadmap, and a personalized toolkit builder - for which you would have to pay $1000+ outside. <br><br><strong>&#129504; Happening LIVE- Saturday and Sunday<br>&#128348; 10 AM EST to 7PM EST</strong></p><p><strong><a href="https://link.outskill.com/NEOKIMMAR4">Register here</a> </strong>before they run out of seats. (free for the next 72 hours only!)</p><div><hr></div><p>I want to introduce <strong><a href="https://www.linkedin.com/in/codingwithroby/">Eric Roby</a></strong> as a guest author.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="http://cwroby.com/M9xR2bK7tQs" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1zG8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3bb9b58-3c53-4e9d-bf40-fb19cadb4850_1600x811.png 424w, https://substackcdn.com/image/fetch/$s_!1zG8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3bb9b58-3c53-4e9d-bf40-fb19cadb4850_1600x811.png 848w, https://substackcdn.com/image/fetch/$s_!1zG8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3bb9b58-3c53-4e9d-bf40-fb19cadb4850_1600x811.png 1272w, https://substackcdn.com/image/fetch/$s_!1zG8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3bb9b58-3c53-4e9d-bf40-fb19cadb4850_1600x811.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1zG8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3bb9b58-3c53-4e9d-bf40-fb19cadb4850_1600x811.png" width="1456" height="738" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c3bb9b58-3c53-4e9d-bf40-fb19cadb4850_1600x811.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:738,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;http://cwroby.com/M9xR2bK7tQs&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!1zG8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3bb9b58-3c53-4e9d-bf40-fb19cadb4850_1600x811.png 424w, https://substackcdn.com/image/fetch/$s_!1zG8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3bb9b58-3c53-4e9d-bf40-fb19cadb4850_1600x811.png 848w, https://substackcdn.com/image/fetch/$s_!1zG8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3bb9b58-3c53-4e9d-bf40-fb19cadb4850_1600x811.png 1272w, https://substackcdn.com/image/fetch/$s_!1zG8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3bb9b58-3c53-4e9d-bf40-fb19cadb4850_1600x811.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>He&#8217;s a senior backend and AI engineer focused on building real-world systems and teaching developers how to do the same. He runs the YouTube channel <a href="https://www.youtube.com/@codingwithroby">@codingwithroby</a>, where he focuses on backend engineering, and created the platform <a href="http://cwroby.com/M9xR2bK7tQs">The Backend OS</a>.</p><p>Through his content and courses, he helps engineers go beyond tutorials, think in systems, and develop the skills that actually matter for real backend roles.</p><p>Check out <strong><a href="http://cwroby.com/M9xR2bK7tQs">The Backend OS</a></strong>, built to close the knowledge gaps you don&#8217;t even know you have.</p><div><hr></div><h2><strong>The Knowledge Problem</strong></h2><p>We need to understand why LLMs struggle with real-world knowledge before we fix them. The problem has <em>three</em> layers, and none of the obvious solutions work&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Knyq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee59d528-eb06-4bac-b934-4eabe279e7e6_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Knyq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee59d528-eb06-4bac-b934-4eabe279e7e6_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!Knyq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee59d528-eb06-4bac-b934-4eabe279e7e6_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!Knyq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee59d528-eb06-4bac-b934-4eabe279e7e6_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!Knyq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee59d528-eb06-4bac-b934-4eabe279e7e6_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Knyq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee59d528-eb06-4bac-b934-4eabe279e7e6_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ee59d528-eb06-4bac-b934-4eabe279e7e6_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Knyq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee59d528-eb06-4bac-b934-4eabe279e7e6_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!Knyq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee59d528-eb06-4bac-b934-4eabe279e7e6_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!Knyq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee59d528-eb06-4bac-b934-4eabe279e7e6_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!Knyq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee59d528-eb06-4bac-b934-4eabe279e7e6_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>LLM knowledge is frozen at training time</strong></h3><p>Every LLM has a cutoff date for its knowledge.</p><p>Everything the model knows was scraped, processed, and compressed into parameters during training. After that date, the model is blind. It isn&#8217;t aware of your new product launch, yesterday&#8217;s security issue, or this morning&#8217;s HR policy change.</p><p>This isn&#8217;t a minor inconvenience.</p><p>In business, accuracy is crucial. This includes areas like customer support, legal analysis, internal search, and compliance. A model that can&#8217;t access current information is a liability. It doesn&#8217;t help the business at all.</p><h3><strong>Context Windows Aren&#8217;t the Answer</strong></h3><p>The brute-force approach is tempting: <em>dump everything into the prompt.</em></p><p>Context windows<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a> have increased a lot. Some models can now handle over a million tokens<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a>. But this approach has three fatal problems.</p><p><strong>First, it&#8217;s expensive</strong>.</p><p>You pay for each token. Sending your entire knowledge base with every query will wipe out your budget.</p><p><strong>Second, there are hard limits.</strong></p><p>A million-token window can&#8217;t fit all of a large enterprise&#8217;s documents or its databases.</p><p><strong>Third, and this is the one most people miss, models get </strong><em><strong>worse</strong></em><strong> with more context.</strong></p><p>Research from Stanford and UC Berkeley shows LLM performance follows a U-shaped curve. Models do best when key information is at the start or end, but accuracy drops sharply when important facts are buried in the middle. In Liu et al.&#8217;s multi-document QA experiments, accuracy for some models dropped to roughly 25% when key information was placed in the middle of a 20-document context.</p><p>It&#8217;s clear: adding more context to the prompt doesn&#8217;t guarantee the model will use it.</p><div><hr></div><h2><strong>Fine-tuning isn&#8217;t the answer either</strong></h2><p>Fine-tuning<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a> is when you take a pre-trained model and continue training it on your own specific data, so it learns new knowledge or behavior.</p><p>Fine-tuning changes the model&#8217;s weights to incorporate new knowledge. Think of it like sending someone back to school for a specialized course. In theory, this lets you teach the model about your domain.</p><p>In practice, it creates more problems than it solves&#8230;</p><p>It requires GPU compute, machine learning expertise, and carefully prepared training data. It takes days or weeks to complete. The result is a snapshot. Once your underlying data changes, your fine-tuned model becomes stale.</p><h3><strong>What&#8217;s Actually Needed</strong></h3><p>The key is to provide the AI with the right information at the right time for each query. And it should be done without changing the model itself.</p><p>That&#8217;s RAG.</p><p>Now let&#8217;s investigate RAG deeper:</p><div><hr></div><h2><strong>What RAG actually is</strong></h2><p>Retrieval-Augmented Generation is an architectural pattern, not a product. The concept is straightforward, and the best analogy is an open-book exam.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WV5q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50ac6e9d-d48d-403a-aafa-40092392d779_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WV5q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50ac6e9d-d48d-403a-aafa-40092392d779_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!WV5q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50ac6e9d-d48d-403a-aafa-40092392d779_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!WV5q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50ac6e9d-d48d-403a-aafa-40092392d779_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!WV5q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50ac6e9d-d48d-403a-aafa-40092392d779_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WV5q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50ac6e9d-d48d-403a-aafa-40092392d779_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/50ac6e9d-d48d-403a-aafa-40092392d779_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WV5q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50ac6e9d-d48d-403a-aafa-40092392d779_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!WV5q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50ac6e9d-d48d-403a-aafa-40092392d779_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!WV5q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50ac6e9d-d48d-403a-aafa-40092392d779_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!WV5q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50ac6e9d-d48d-403a-aafa-40092392d779_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A student taking a closed-book exam relies on what they have memorized.</p><p>That&#8217;s a standard LLM. It&#8217;s smart and can reason, but it&#8217;s limited to what&#8217;s in its parameters. A student with an open-book exam has the same reasoning skills. They can check relevant pages before answering.</p><p>That&#8217;s RAG.</p><p>The formal definition comes from a 2020 paper by Patrick Lewis and his team at Facebook AI Research and University College London. RAG models combine parametric memory<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a> with a pre-trained language model and non-parametric memory<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a>. This non-parametric memory uses an external knowledge index. It&#8217;s accessed by a neural retriever<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-10" href="#footnote-10" target="_self">10</a>.</p><p>You can still update the model&#8217;s knowledge by swapping the retrieval index. No one needs to retrain.</p><p>In practice, RAG follows a three-step loop:</p><ol><li><p>Query comes in: a user asks a question.</p></li><li><p>Retrieve: System looks in an external knowledge base for the best information chunks.</p></li><li><p>Question and context go to the LLM. It then creates an answer based on the retrieved documents.</p></li></ol><p>You&#8217;re not changing the model. You&#8217;re changing what it sees. That distinction makes RAG so powerful and so practical.</p><p>The open-book analogy makes sense at a high level. But understanding why it works so well means diving deeper into the topic&#8230;</p><div><hr></div><h2><strong>How RAG Works Under the Hood</strong></h2><p>The three-step loop sounds simple. The engineering that makes it work is where things get interesting.</p><p>A RAG system has two main parts:</p><ul><li><p>An <strong>offline ingestion pipeline</strong> that gets your data ready.</p></li><li><p>An <strong>online retrieval pipeline</strong> that answers queries.</p></li></ul><p>Let&#8217;s walk through each one&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!y9LI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F821e9ca3-44bb-4aae-87c9-26d932f2f67a_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y9LI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F821e9ca3-44bb-4aae-87c9-26d932f2f67a_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!y9LI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F821e9ca3-44bb-4aae-87c9-26d932f2f67a_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!y9LI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F821e9ca3-44bb-4aae-87c9-26d932f2f67a_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!y9LI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F821e9ca3-44bb-4aae-87c9-26d932f2f67a_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!y9LI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F821e9ca3-44bb-4aae-87c9-26d932f2f67a_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/821e9ca3-44bb-4aae-87c9-26d932f2f67a_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y9LI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F821e9ca3-44bb-4aae-87c9-26d932f2f67a_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!y9LI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F821e9ca3-44bb-4aae-87c9-26d932f2f67a_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!y9LI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F821e9ca3-44bb-4aae-87c9-26d932f2f67a_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!y9LI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F821e9ca3-44bb-4aae-87c9-26d932f2f67a_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Embeddings: Core Mechanism</strong></h3><p>Before anything else, you need to understand embeddings<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-11" href="#footnote-11" target="_self">11</a>. This is the concept that enables semantic search<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-12" href="#footnote-12" target="_self">12</a>.</p><p>Text embeddings turn text into dense numerical vectors<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-13" href="#footnote-13" target="_self">13</a>. These are arrays of floating-point numbers, usually with 1,536 or 3,072 dimensions. They capture the text meaning. The magic is in what &#8220;capture meaning&#8221; means. Words and sentences with similar intent are close together in vector space. This happens even with different words.</p><p>Consider this: <em>&#8220;How do I reset my password?&#8221;</em> and <em>&#8220;I can&#8217;t log into my account&#8221;</em> use completely different words.</p><p>But when converted to embeddings, they produce nearly identical vectors. The distance between them is tiny because their meanings are similar. This is measured using cosine similarity<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-14" href="#footnote-14" target="_self">14</a>, dot product, or Euclidean distance.</p><p>The key insight for RAG is clear.</p><p>System finds relevant content. It does this even if the user&#8217;s question doesn&#8217;t use the exact words from the source document. It searches by meaning, not by keywords.</p><h3><strong>Data Ingestion Pipeline (Offline Phase)</strong></h3><p>First, process your knowledge base. Then, index it. Only then can your RAG system answer questions.</p><p>This happens in five steps:</p><p><strong>What triggers the offline phase?</strong></p><p>In traditional workflows, this pipeline runs when new data is available. This can happen when documents are added or updated. It can also occur when a database changes or on a regular schedule, like nightly or weekly. Some teams trigger re-ingestion if retrieval quality drops.</p><p>They also do this when a new data source connects:</p><ol><li><p><strong>Load documents from anywhere</strong>: PDFs, databases, APIs, wikis, Slack channels, and Confluence pages. Frameworks like LangChain<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-15" href="#footnote-15" target="_self">15</a> and LlamaIndex<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-16" href="#footnote-16" target="_self">16</a> offer ready-made connectors for many common sources.</p></li><li><p><strong>Chunking</strong><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-17" href="#footnote-17" target="_self">17</a><strong>: </strong>Split documents into semantically meaningful pieces. This is the single highest-leverage step to get right, and we will cover it in depth later.</p></li><li><p><strong>Embedding:</strong> Convert each chunk into a vector using an embedding model.</p></li><li><p><strong>Storage:</strong> Store the vectors in a vector database<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-18" href="#footnote-18" target="_self">18</a> optimized for similarity search.</p></li><li><p><strong>Metadata tagging:</strong> Tag each chunk with source, timestamp, category, and access control information. This metadata becomes critical for filtering, attribution, and security later.</p></li></ol><h3><strong>Retrieval Pipeline (Online Phase)</strong></h3><p>When a user asks a question, the retrieval pipeline kicks in:</p><ol><li><p><strong>Query embedding</strong>: User&#8217;s question turns into a vector. This uses the same embedding model from ingestion. This is critical: the query and the documents must live in the same vector space.</p></li><li><p><strong>Similarity search</strong>: The system finds the top-K<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-19" href="#footnote-19" target="_self">19</a> similar chunks. It compares the query vector to each vector in the database.</p></li><li><p><strong>Retrieval strategy</strong>: This is where the real engineering decisions happen. Three primary approaches exist:</p><ol><li><p>Sparse retrieval<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-20" href="#footnote-20" target="_self">20</a> uses a statistical method that matches exact keywords. It weighs these matches using term frequency and inverse document frequency.</p></li><li><p>Dense retrieval (embeddings):<strong> </strong>Semantic search via vectors. Finds conceptually relevant content even when the wording differs.</p></li><li><p>Hybrid search (combining both): Use both sparse and dense retrieval. Then, merge the results with Reciprocal Rank Fusion<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-21" href="#footnote-21" target="_self">21</a> (RRF). This boosts documents that rank high in both systems.</p></li></ol></li><li><p><strong>Re-ranking</strong>: After retrieving results, a cross-encoder model rescans them. It processes the query and each document as one input. This captures fine-grained relevance that bi-encoders miss.</p></li></ol><h3><strong>Generation Phase</strong></h3><p>With the most relevant chunks in hand, the system assembles the final prompt:</p><ol><li><p><strong>Prompt Construction:</strong></p><ol><li><p><strong>System Prompt:</strong> Combine the system prompt (the instructions that tell the LLM how to behave, like <em>&#8220;You are a helpful customer support agent&#8221;</em>) with the user&#8217;s request.</p></li><li><p><strong>Retrieved Context Chunks:</strong> Integrate the relevant pieces of text pulled from your knowledge base. For instance, if someone asks, <em>&#8220;What&#8217;s your refund policy?&#8221;</em> You might pull two paragraphs from your company&#8217;s policy document.</p></li><li><p><strong>User&#8217;s Original Query:</strong> Include the user&#8217;s actual question or task &#8212; exactly as they typed it.</p></li></ol></li><li><p><strong>LLM Call:</strong> The model generates an answer grounded in the retrieved documents. The key facts are in the prompt. The model uses these facts to reason instead of relying on its trained memory. Think of it like giving someone an open-book exam instead of asking them to answer from memory.</p></li><li><p><strong>Citation and Attribution:</strong> The system shows which source documents were used and provides verifiable citations. For example, the response might say, <em>&#8220;Based on Section 3.2 of the Employee Handbook...&#8221;</em> so the user knows exactly where the answer came from. This is one of RAG&#8217;s biggest advantages over fine-tuning: transparency.</p></li></ol><p>Now that you know how RAG works mechanically, the real question is whether it&#8217;s the right tool. That depends on what you&#8217;re comparing it to&#8230;</p><div><hr></div><h2><strong>Why RAG over alternatives?</strong></h2><p>RAG isn&#8217;t the only approach to giving AI access to knowledge.</p><p>You need to know when to use it and when not to. This means comparing it to other options.</p><h3><strong>RAG vs. Fine-Tuning</strong></h3><p>This is the comparison most teams face first. Here&#8217;s how they stack up:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1-2N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80bd91be-a383-4f21-a05e-1a2acadabb96_1164x814.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1-2N!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80bd91be-a383-4f21-a05e-1a2acadabb96_1164x814.png 424w, https://substackcdn.com/image/fetch/$s_!1-2N!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80bd91be-a383-4f21-a05e-1a2acadabb96_1164x814.png 848w, https://substackcdn.com/image/fetch/$s_!1-2N!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80bd91be-a383-4f21-a05e-1a2acadabb96_1164x814.png 1272w, https://substackcdn.com/image/fetch/$s_!1-2N!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80bd91be-a383-4f21-a05e-1a2acadabb96_1164x814.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1-2N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80bd91be-a383-4f21-a05e-1a2acadabb96_1164x814.png" width="1164" height="814" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/80bd91be-a383-4f21-a05e-1a2acadabb96_1164x814.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:814,&quot;width&quot;:1164,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1-2N!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80bd91be-a383-4f21-a05e-1a2acadabb96_1164x814.png 424w, https://substackcdn.com/image/fetch/$s_!1-2N!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80bd91be-a383-4f21-a05e-1a2acadabb96_1164x814.png 848w, https://substackcdn.com/image/fetch/$s_!1-2N!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80bd91be-a383-4f21-a05e-1a2acadabb96_1164x814.png 1272w, https://substackcdn.com/image/fetch/$s_!1-2N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80bd91be-a383-4f21-a05e-1a2acadabb96_1164x814.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The key nuance: fine-tuning and RAG are complementary, not competing.</p><p>Fine-tuning is the best option when you want to adjust the model&#8217;s behavior, style, or output format, not just its knowledge. The best production systems do two things: they fine-tune style and behavior, and then they use RAG for knowledge. You can adjust a model to produce structured JSON in a certain format.</p><p>Then, use RAG to fill it with up-to-date data.</p><h3><strong>RAG vs. Long Context Windows</strong></h3><p>Context windows keep growing. Why not dump everything in the prompt?</p><p>Cost and precision. RAG retrieves only the 5-20 most relevant chunks. Long context stuffs everything in and hopes the model finds the needle. RAG is cheaper per query. It&#8217;s also more precise in what it finds. Plus, it can search millions of documents, unlike others that have a token limit.</p><p>They work well together: first, use RAG to get the right 20 chunks.</p><p>Then, use long context to process them all at once. The &#8220;lost in the middle&#8221; research shows that retrieving small, relevant chunks works better than using large context windows.</p><div><hr></div><p><em><strong>Reminder: this is a teaser of the subscriber-only newsletter series, exclusive to my golden members.</strong></em></p><p>When you upgrade, you&#8217;ll get:</p><ul><li><p><strong>High-level architecture of real-world systems.</strong></p></li><li><p>Deep dive into how popular real-world systems actually work.</p></li><li><p><strong>How real-world systems handle scale, reliability, and performance.</strong></p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.systemdesign.one/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://newsletter.systemdesign.one/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2><strong>Traditional RAG vs. Agentic RAG</strong></h2><p>Traditional RAG follows the straightforward three-step loop described above: retrieve, then generate. It works well for simple question-answering but has limitations. What happens when the first retrieval doesn&#8217;t return good results? Traditional RAG just pushes forward with whatever it finds.</p><p>Agentic RAG adds a reasoning layer on top.</p><p>An AI agent decides how to handle each query. It can change the search, link several retrievals, pick a knowledge base, or skip retrieval completely. Think of Traditional RAG as a student who looks up one page and writes their answer. Agentic RAG is like a student. They check several sources. They re-read sections that seem off and cross-reference information before writing.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WJ3i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb15a5d41-dd18-4a64-bd28-79621c669aec_1404x570.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WJ3i!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb15a5d41-dd18-4a64-bd28-79621c669aec_1404x570.png 424w, https://substackcdn.com/image/fetch/$s_!WJ3i!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb15a5d41-dd18-4a64-bd28-79621c669aec_1404x570.png 848w, https://substackcdn.com/image/fetch/$s_!WJ3i!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb15a5d41-dd18-4a64-bd28-79621c669aec_1404x570.png 1272w, https://substackcdn.com/image/fetch/$s_!WJ3i!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb15a5d41-dd18-4a64-bd28-79621c669aec_1404x570.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WJ3i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb15a5d41-dd18-4a64-bd28-79621c669aec_1404x570.png" width="1404" height="570" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b15a5d41-dd18-4a64-bd28-79621c669aec_1404x570.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:570,&quot;width&quot;:1404,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WJ3i!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb15a5d41-dd18-4a64-bd28-79621c669aec_1404x570.png 424w, https://substackcdn.com/image/fetch/$s_!WJ3i!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb15a5d41-dd18-4a64-bd28-79621c669aec_1404x570.png 848w, https://substackcdn.com/image/fetch/$s_!WJ3i!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb15a5d41-dd18-4a64-bd28-79621c669aec_1404x570.png 1272w, https://substackcdn.com/image/fetch/$s_!WJ3i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb15a5d41-dd18-4a64-bd28-79621c669aec_1404x570.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2><strong>Multi-Step RAG Pipeline</strong></h2><p>Production RAG systems often go beyond the simple retrieve-and-generate loop. A multi-step pipeline adds intelligence before and after retrieval:</p><ol><li><p>Query Intent Parsing: Before searching, the system analyzes what the user actually wants. Is it a factual question? A comparison? A request for a summary? Understanding intent helps the system choose the right retrieval strategy and knowledge base.</p></li><li><p>Query Reformulation: The system may rewrite the user&#8217;s question to improve retrieval quality. For example, <em>&#8220;Why is my app slow?&#8221;</em> might become <em>&#8220;application performance bottleneck causes and solutions.&#8221;</em></p></li><li><p>Retrieval: System searches for relevant chunks (as described above).</p></li><li><p>Live Web Search: If internal documents aren&#8217;t enough, some RAG systems can do live web searches. This helps them get current information from the internet.</p></li><li><p>Reranking<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-22" href="#footnote-22" target="_self">22</a> and Filtering: Results are scored, filtered, and reranked for relevance.</p></li><li><p>Generation: LLM produces an answer grounded in all the gathered context.</p></li></ol><p>This multi-step approach is what separates demo-quality RAG from production-quality RAG.</p><div><hr></div><h2><strong>RAG Limitations</strong></h2><p>RAG is powerful, but it&#8217;s not a silver bullet&#8230;</p><p>Understanding its limitations helps you decide when to use it and when to look elsewhere:</p>
      <p>
          <a href="https://newsletter.systemdesign.one/p/how-rag-works">
              Read more
          </a>
      </p>
   ]]></content:encoded></item></channel></rss>