The System Design Newsletter

This Is How Airbnb Adopted HTTP Streaming to Save 84 Million USD in Costs

#7: Read Now - Awesome Web Optimisation Technique (6 minutes)

Neo Kim
Sep 21, 2023

A 100 ms delay in website load time can decrease sales by 1%. For Airbnb, that would be an 84 million USD loss per year [1].

Airbnb switched to HTTP streaming to improve its page load performance. It reduced the first contentful paint (FCP) metric by 100 milliseconds on every page.

FCP measures the time from the start of the page load until the browser renders the first piece of content on the screen.
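
As a rough illustration, browsers report FCP through the Paint Timing API. The sketch below assumes a browser that supports `PerformanceObserver` with the `paint` entry type. The helper name `findFcp` is made up for this example.

```javascript
// Picks the first-contentful-paint entry out of a list of paint entries.
// Returns the time in milliseconds since the navigation started, or null.
function findFcp(entries) {
  const entry = entries.find((e) => e.name === "first-contentful-paint");
  return entry ? entry.startTime : null;
}

// Browser-only part: guarded so the file also loads outside a browser.
if (typeof window !== "undefined" && typeof PerformanceObserver !== "undefined") {
  const observer = new PerformanceObserver((list) => {
    const fcp = findFcp(list.getEntries());
    if (fcp !== null) {
      console.log(`FCP: ${fcp.toFixed(0)} ms`);
      observer.disconnect();
    }
  });
  // buffered: true also delivers paint entries recorded before this call.
  observer.observe({ type: "paint", buffered: true });
}
```

Pasting this into a page's script logs the FCP once the browser has painted the first content.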

This post outlines how Airbnb reduced latency by 100 ms with HTTP streaming. If you want to learn more, scroll to the bottom and find the references.


What Is Critical Rendering Path?

I will give you an overview of the critical rendering path.

A browser performs a sequence of events before the screen render. This sequence of events is called the critical rendering path.

Critical rendering path

The browser receives HTML and CSS from the server. Then it translates HTML and CSS into DOM and CSSOM. This is called parsing.

Document Object Model (DOM) is the browser's internal representation of markup. CSS Object Model (CSSOM) is the browser's internal representation of styles. And they are independent tree data structures.

The browser parses HTML incrementally. So, DOM construction is incremental.

But the browser blocks rendering until it has downloaded and parsed the CSS files. Also, parsing CSS is not incremental.

The browser then combines the DOM and CSSOM to create the render tree.

The browser then calculates the dimensions and location of each node. This is called layout.

The browser finally renders the individual nodes to the screen. This is called painting.

What Is HTTP Streaming?

I will teach you HTTP streaming with an analogy. Imagine there is a water tap and a tube. And you want to fill the tube with water. There are 2 options:

Buffering: Take a cup and fill it with water from the tap. And then pour it down the tube. This is called buffering. Everything happens in sequence. In computing, the server writes the entire response into a buffer. And then sends it to the browser.

Streaming: Connect the tap to the tube. And fill it with water. This is called streaming. Many steps happen in parallel. In computing, the server breaks the response into chunks. And sends each chunk as soon as it is ready. The browser handles the chunks as they arrive. This speeds things up. A popular method for HTTP streaming in HTTP/1.1 is chunked transfer encoding.
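
To make the difference concrete, here is a minimal sketch of how chunked transfer encoding frames a body in HTTP/1.1. Each chunk is its size in hexadecimal, a CRLF, the data, and another CRLF. A zero-length chunk ends the body. The function names are made up for this example. In practice a server such as Node.js does this framing for you when you write to a response without a Content-Length.

```javascript
// Frames one piece of the body as an HTTP/1.1 chunk:
// <size in hex>\r\n<data>\r\n
// The size counts bytes, not characters.
function encodeChunk(data) {
  const bytes = Buffer.from(data, "utf8");
  return Buffer.concat([
    Buffer.from(bytes.length.toString(16) + "\r\n", "ascii"),
    bytes,
    Buffer.from("\r\n", "ascii"),
  ]);
}

// A zero-length chunk marks the end of the body.
const LAST_CHUNK = Buffer.from("0\r\n\r\n", "ascii");

// Buffering: the whole body is known up front, so it goes out in one
// piece (typically with a Content-Length header).
function bufferedBody(parts) {
  return Buffer.from(parts.join(""), "utf8");
}

// Streaming: each part is framed on its own, so it can be sent as soon
// as it is ready.
function streamedBody(parts) {
  return Buffer.concat([...parts.map(encodeChunk), LAST_CHUNK]);
}

// Show the framed bytes with the CRLFs made visible.
console.log(JSON.stringify(streamedBody(["<head>", "<body>"]).toString("utf8")));
```

The streamed version costs a few extra bytes per chunk. But the server no longer needs to know the total length before it sends the first byte.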

How Airbnb Optimised Critical Rendering Path?

According to the performance golden rule, about 90% of the load time of most websites is spent on the front end. Yet many websites today rely on buffering. Because they want to avoid extra engineering effort to enable streaming.

A switch to streaming might not be worth it if the page content depends on a slow backend query. Because nothing gets rendered until that query finishes. But Airbnb found a universal use case for HTTP streaming to improve performance.

And they reduced their network waterfall by switching to streaming. A waterfall happens when a network request triggers another request.

Buffering; How a browser fetches resources

Here is what they found. The browser sat idle until the server created the entire HTML. Also their website needed to load external files (fonts, CSS, JavaScript) to render. But the browser downloaded the external files only after it received the entire HTML. So, this resulted in a cascade of requests (waterfall). And poor performance.

Streaming; How a browser fetches resources

The browser parses HTML incrementally. So, they used early flush to stream HTML to the browser.

Here's how they did it. They split the HTML document into 2 chunks and sent them separately. They sent the HTML <head> in the first chunk. And it allowed the browser to download external files as soon as it parsed the first chunk.

In the meantime, the server generated the remaining HTML. And this reduced the network waterfall.

<html>
    <head>
        <title>Your Page</title>
        <link rel="stylesheet" href="your.css" />
        <script src="script.js" defer></script>
    </head>
    
    <!-- Flush: the server sends everything above this line as the first chunk -->

    <body>
    ...
    </body>
</html>
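
Here is a minimal sketch of early flush on the server. It uses Node.js's built-in `http` module instead of Airbnb's Express and React setup. `renderBody`, `handlePage`, `startServer`, and the 200 ms delay are assumptions made up for this example.

```javascript
const http = require("http");

// First chunk: everything the browser needs to start fetching CSS and JS.
const HEAD_CHUNK = `<!DOCTYPE html>
<html>
    <head>
        <title>Your Page</title>
        <link rel="stylesheet" href="your.css" />
        <script src="script.js" defer></script>
    </head>
`;

// Hypothetical stand-in for slow server work (backend queries, rendering).
function renderBody() {
  return new Promise((resolve) =>
    setTimeout(() => resolve("    <body>\n    ...\n    </body>\n"), 200)
  );
}

// Sends the <head> right away, then the <body> once it is ready.
async function handlePage(res, render = renderBody) {
  res.writeHead(200, { "Content-Type": "text/html; charset=utf-8" });
  res.write(HEAD_CHUNK); // early flush: goes out before the body exists
  res.write(await render());
  res.end("</html>\n");
}

// Not called here. Call startServer(3000) to try it locally.
function startServer(port) {
  return http.createServer((req, res) => handlePage(res)).listen(port);
}
```

With `startServer(3000)` running, the browser can request `your.css` and `script.js` while the server is still producing the body.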

But a problem remained - the user saw a blank page until the HTML <body> tag arrived. So, they rendered a loading state if the data was not ready yet. And the browser fetched the data in a separate request.

But this created a new problem - they sent an extra network request to fetch data. So, they streamed the data as a third chunk instead. It is sent after all the visible content, so it doesn't block rendering. This eliminated the extra network request.
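
The three-chunk idea can be sketched on the server like this. It is an illustration under assumptions, not Airbnb's code. The element id `deferred-data`, the function names, and the shape of the `options` object are all made up. The `res` object is a Node.js HTTP response.

```javascript
// Serializes data for embedding in HTML. Escaping "<" keeps a string such
// as "</script>" inside the data from closing the script element early.
function renderDataChunk(data, elementId = "deferred-data") {
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return `<script type="application/json" id="${elementId}">${json}</script>\n`;
}

// Writes the page in three chunks: head, visible content, then data.
// renderVisible is expected to return markup that opens <body>.
async function streamPage(res, { head, renderVisible, loadData }) {
  res.setHeader("Content-Type", "text/html; charset=utf-8");
  res.write(head);                              // chunk 1: <head>
  res.write(await renderVisible());             // chunk 2: visible content
  res.write(renderDataChunk(await loadData())); // chunk 3: deferred data
  res.end("</body></html>\n");
}
```

A script of type `application/json` is not executed by the browser. So the data chunk can sit at the end of the body without blocking the render.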

They used JavaScript to detect the arrival of the data chunk, with a MutationObserver to observe DOM changes. After that, they inserted the data into the application's network data store.
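
The client-side step could look roughly like the sketch below. It assumes the data arrives as JSON text inside an element with the made-up id `deferred-data`, and that the payload is a JSON object. The `store` in the usage comment is hypothetical too. The document and the observer constructor are passed in as parameters so the logic can also run outside a browser.

```javascript
// Hypothetical id of the <script type="application/json"> data element.
const DATA_ELEMENT_ID = "deferred-data";

// Returns the parsed payload, or null while the element is missing or
// its JSON text has only partly arrived.
function readDeferredData(doc) {
  const el = doc.getElementById(DATA_ELEMENT_ID);
  if (!el) return null;
  try {
    return JSON.parse(el.textContent);
  } catch {
    return null;
  }
}

// Calls onData once, as soon as the complete data element is in the DOM.
function onDeferredData(doc, ObserverCtor, onData) {
  const early = readDeferredData(doc);
  if (early !== null) {
    onData(early);
    return;
  }
  const observer = new ObserverCtor(() => {
    const data = readDeferredData(doc);
    if (data !== null) {
      observer.disconnect();
      onData(data);
    }
  });
  // Watch the whole document for new nodes and for text changes.
  observer.observe(doc.documentElement, {
    childList: true,
    subtree: true,
    characterData: true,
  });
}

// In a browser:
// onDeferredData(document, MutationObserver, (data) => store.insert(data));
```

The try/catch matters here. The observer can fire while the parser has inserted the element but not yet all of its text.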

Node.js (Express framework) ran on the server. They rewrote their React component to enable streaming. And rendered the entire HTML document as 3 separate React components.

But they ran into a few problems with enabling HTTP streaming. And resolved the problems by:

  • Disabling response buffering in nginx

  • Disabling Nagle's algorithm in the HAProxy load balancer. Nagle's algorithm holds back small packets to combine them, so turning it off allowed small chunks to reach the browser without extra delay
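
Airbnb made these changes in the proxy configuration itself. As an application-level analogue, and not what the post describes, a Node.js server can ask for the same behavior per response. The sketch assumes nginx sits in front and honors the `X-Accel-Buffering` response header. The function name is made up. Note that `setNoDelay` only affects the connection between the app and whatever is directly in front of it.

```javascript
// Prepares a Node.js HTTP response for streaming through a reverse proxy.
function prepareForStreaming(res) {
  // nginx reads this response header and skips buffering for the response
  // (unless the proxy is configured to ignore it).
  res.setHeader("X-Accel-Buffering", "no");
  // Turns off Nagle's algorithm on the socket, so small chunks are sent
  // immediately instead of waiting to be combined with later data.
  if (res.socket) {
    res.socket.setNoDelay(true);
  }
}
```

Call `prepareForStreaming(res)` before the first `res.write`, because headers cannot change once the first chunk is sent.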

Takeaways

The takeaways from this case study are:

  • Stream HTML. It allows incremental construction of the page. And enables the early discovery of external files

  • Code split CSS files. And inline critical styles. This improves performance

  • Move non-critical JavaScript out of the critical rendering path. And download JavaScript after the initial render. Because synchronous JavaScript blocks the HTML parser

  • Stream HTML. And use the request-response approach for the remaining content. This keeps it simple
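
The JavaScript takeaway can be sketched as a small helper that loads a non-critical script only after the page has finished loading. This is one possible approach, not something the Airbnb post prescribes. The function name and the `/analytics.js` path are made up. `win` and `doc` stand for the browser's `window` and `document`.

```javascript
// Injects a script tag once the page has finished loading, so the
// download stays out of the critical rendering path.
function loadScriptAfterRender(win, doc, src) {
  const inject = () => {
    const script = doc.createElement("script");
    script.src = src;
    script.async = true;
    doc.body.appendChild(script);
  };
  if (doc.readyState === "complete") {
    inject(); // the page has already loaded
  } else {
    win.addEventListener("load", inject, { once: true });
  }
}

// In a browser:
// loadScriptAfterRender(window, document, "/analytics.js");
```

For scripts the page needs early, the `defer` attribute from the HTML example above is usually the simpler choice.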

Google Search uses HTTP streaming to load the headers even before the user types in a full query. This speeds up their render times.

Do you know any websites that use HTTP streaming? Leave a comment.


References

  • Victor (2023). Improving Performance with HTTP Streaming. [online] The Airbnb Tech Blog. Available at: https://medium.com/airbnb-engineering/improving-performance-with-http-streaming-ba9e72c66408.

  • www.stevesouders.com. (n.d.). Flushing the Document Early | High-Performance Web Sites. [online] Available at: https://www.stevesouders.com/blog/2009/05/18/flushing-the-document-early/ [Accessed 9 Sep. 2023].

  • Optimization, W. (2013). Flush HTML Early and Often - flushing HTML to speed up start render times and page rendering. [online] WebSiteOptimization.com. Available at: https://www.websiteoptimization.com/speed/tweak/flush/ [Accessed 9 Sep. 2023].

  • developer.mozilla.org. (n.d.). Populating the page: how browsers work - Web Performance | MDN. [online] Available at: https://developer.mozilla.org/en-US/docs/Web/Performance/How_browsers_work.

  • www.youtube.com. (n.d.). Critical rendering path - Crash course on web performance (Fluent 2013). [online] Available at: youtube.com [Accessed 9 Sep. 2023].

1

Tests done at Amazon in 2007 revealed that for every 100 ms increase in load time, sales would decrease by 1%. Airbnb's annual revenue is $8.4B, and 1% of that is $84M. So, a 100 ms delay would cost Airbnb 84 million USD per year. Actual user behavior may differ from the study, so treat these numbers as an estimate.

