The System Design Newsletter

How Shopify Handles Flash Sales at 32 Million Requests per Minute

#16: Learn More - Awesome Shopify Engineering (4 minutes)

Neo Kim
Oct 19, 2023

This post outlines flash sales at Shopify. If you want to learn more, scroll to the bottom and find the references.


Once upon a time, there lived a Shopify server.

He had a single purpose in life: to help people sell products online.


He was the sweet child of Shopify engineers - and built with a simple architecture.

He used Nginx for load balancing because of its fast event loop. And Redis for caching expensive operations.

He was living a Happy life.

Cartoon Architecture of Shopify

Until one day.

An Evil social media influencer, Pam, entered the picture.

And ruined it all.

She wanted to sell a limited edition of shoes in a short time at a discount price - a Flash sale.


So she opened the Bird app and Instagram - and hyped it up.

Shopify server knew what was coming for him: massive traffic.

He got Scared and called Shopify engineers for help.


So they thought hard.

And came up with 10 simple ideas to handle the Flash sale.



Here are 10 ideas that Shopify used to handle Flash sales at 32 million requests per minute:

1. Low Timeout

A user wouldn’t wait more than a few seconds for an action to finish.

Besides, a client that waits on an unresponsive server wastes computing resources and increases costs.

So they lowered timeouts wherever they could in the system.
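
Here is a minimal sketch of a low timeout in Python. It is an illustration, not Shopify's code. The helper name `call_with_timeout` and the timeout values are assumptions.

```python
import concurrent.futures
import time


def call_with_timeout(func, timeout_seconds):
    """Run func in a worker thread and stop waiting after timeout_seconds.

    Returns the result of func, or None if the timeout is hit.
    Note: the worker thread itself keeps running in the background.
    A real client would usually set the timeout on its socket or
    HTTP library instead.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        return None  # the caller can fall back or show an error
    finally:
        pool.shutdown(wait=False)  # do not block on the slow call
```

A caller could wrap a slow dependency like `call_with_timeout(fetch_cart, 0.5)` and treat `None` as a failed request. `fetch_cart` is a hypothetical function here.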

2. Circuit Breaker

A degraded service is better than a completely down service.

And connecting to a server that failed many times in a short time is bad because it prevents the server from becoming healthy again.

So they implemented the circuit breaker pattern. When the circuit breaker is open, it stops requests and protects the database and API server.
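
The pattern can be sketched in a few lines of Python. This is a simplified illustration, not Shopify's implementation. The class names, the failure threshold, and the cool-down period are all assumptions.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_seconds=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_seconds = reset_after_seconds
        self.clock = clock  # injectable so tests can control time
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_seconds:
                # Open: fail fast and give the server time to recover.
                raise CircuitOpenError("circuit is open")
            # Cool-down is over: let one trial call through (half-open).
        try:
            result = func()
        except Exception:
            self.failures += 1
            if (self.opened_at is not None
                    or self.failures >= self.failure_threshold):
                self.opened_at = self.clock()  # open or re-open
            raise
        # Success: close the circuit and reset the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

The breaker fails fast while it is open, so callers don't pile more work onto a struggling server.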

3. Rate Limit

Performance degrades if the number of requests exceeds system capacity.

So they used rate limiting and load shedding to reject excess requests.

Also, they implemented the back pressure pattern. It tells callers to slow down, so the service handles as many requests as it can without breaking.

Besides, they set up fair queueing to give priority to users who came first.
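
A token bucket is one common way to rate limit. The source doesn't say which algorithm Shopify used, so treat this Python sketch as an assumption. The class name and parameters are hypothetical.

```python
import time


class TokenBucket:
    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second  # tokens added per second
        self.capacity = capacity     # maximum burst size
        self.clock = clock           # injectable so tests can control time
        self.tokens = float(capacity)
        self.last_refill = clock()

    def allow(self):
        """Return True if the request may pass, False if it should be shed."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Refill based on elapsed time, but never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A server would call `allow()` once per incoming request and reject the request when it returns `False`.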

4. Monitoring

Monitoring and alerting on the important metrics helps to identify server failure risks quickly.

So they monitored the following metrics:

  • Latency: time it takes to process a unit of work

  • Traffic: the rate at which new work comes into the system, in requests per minute

  • Errors: the rate at which unexpected things occur

  • Saturation: system load relative to its total capacity
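
The four metrics above can be computed from raw request records. Here is a small Python sketch under simple assumptions. The record fields, the function name, and the fixed capacity number are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RequestRecord:
    latency_ms: float
    ok: bool


def summarize(records, window_minutes, capacity_per_minute):
    """Compute latency, traffic, errors, and saturation for one time window."""
    total = len(records)
    traffic = total / window_minutes  # requests per minute
    return {
        "latency_ms_avg": sum(r.latency_ms for r in records) / total if total else 0.0,
        "traffic_per_minute": traffic,
        "error_rate": sum(1 for r in records if not r.ok) / total if total else 0.0,
        "saturation": traffic / capacity_per_minute,  # 1.0 means full capacity
    }
```

An alert could then fire when `saturation` or `error_rate` crosses a chosen threshold.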

5. Structured Logging

They needed logs to debug and understand what happened in a web request.

So they stored logs in a central place because many servers were creating logs. And they wanted to keep them searchable.

They structured logs in a machine-readable format to parse and index quickly.

Besides, they included a correlation ID to allow tracing.
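
A structured log line could look like this in Python. JSON is an assumed format, and the helper `log_event` and its field names are hypothetical. The source only says the logs were machine-readable and carried a correlation ID.

```python
import json
import uuid


def log_event(correlation_id, event, **fields):
    """Print one log entry as a single JSON line and return it."""
    record = {"correlation_id": correlation_id, "event": event, **fields}
    line = json.dumps(record, sort_keys=True)
    print(line)
    return line


# One correlation ID is created per web request and reused in every entry,
# so all entries for that request can be found with a single search.
correlation_id = str(uuid.uuid4())
log_event(correlation_id, "checkout.started", shop_id=42)
log_event(correlation_id, "payment.charged", amount_cents=1999)
```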

6. Idempotency Keys

Failures are likely in a distributed system, especially with high traffic.

And a failed request should be safe to retry. For example, they didn’t want to double charge a customer’s card.

So they sent a unique idempotency key with each request. A retry carries the same key, so the server can recognize it and skip the duplicate work.
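
Here is a minimal Python sketch of the idea. It is not Shopify's code. The `PaymentService` class is hypothetical, and the in-memory dictionary stands in for a durable store.

```python
import uuid


class PaymentService:
    def __init__(self):
        self.responses = {}  # idempotency key -> stored response
        self.charges = []    # every real charge that was made

    def charge(self, idempotency_key, customer, amount_cents):
        # A retry with a known key replays the stored response
        # instead of charging the card again.
        if idempotency_key in self.responses:
            return self.responses[idempotency_key]
        self.charges.append((customer, amount_cents))
        response = {"status": "charged", "customer": customer,
                    "amount_cents": amount_cents}
        self.responses[idempotency_key] = response
        return response


service = PaymentService()
key = str(uuid.uuid4())  # the client creates the key once per payment
first = service.charge(key, "pam", 5000)
retry = service.charge(key, "pam", 5000)  # e.g. after a network timeout
```

Here `first` and `retry` are the same response, and the card is charged only once. A real service would also need to make the key check and the charge atomic.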

7. Consistent Reconciliation

They relied on financial partners. And reconciled their financial data with the partners’ data to achieve data consistency.

They tracked any data mismatch and automated the fix.
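
Finding mismatches can be sketched as a comparison of two ledgers. This Python example is an illustration under assumptions. The function name and the shape of the records (payment ID mapped to amount in cents) are hypothetical.

```python
def find_mismatches(our_records, partner_records):
    """Return payments where our data and the partner's data disagree.

    Both arguments map a payment ID to an amount in cents.
    A missing payment shows up as None on the side that lacks it.
    """
    mismatches = {}
    for payment_id in sorted(our_records.keys() | partner_records.keys()):
        ours = our_records.get(payment_id)
        theirs = partner_records.get(payment_id)
        if ours != theirs:
            mismatches[payment_id] = (ours, theirs)
    return mismatches
```

Each mismatch can then be logged and sent to an automated fix.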

8. Load Testing

They used load testing to find system bottlenecks and to set up protection mechanisms.

They did load testing by simulating a large volume of traffic.
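
A tiny load test can be sketched with a thread pool in Python. This is not Shopify's tooling. The function `run_load_test` and its parameters are assumptions, and `handler` stands in for a call to the system under test.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(handler, total_requests, concurrency):
    """Call handler(i) for each request, many at a time, and report results."""

    def timed_call(i):
        start = time.perf_counter()
        try:
            handler(i)
            ok = True
        except Exception:
            ok = False
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))

    failures = sum(1 for ok, _ in results if not ok)
    slowest = max((elapsed for _, elapsed in results), default=0.0)
    return {"requests": len(results), "failures": failures,
            "slowest_seconds": slowest}
```

Raising `concurrency` step by step shows where failures and slow responses start to appear.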

9. Incident Management and Retrospective

They set up proper incident management to improve service reliability.

And involved 3 roles in incident management:

  • Incident Manager on Call (IMOC): coordinates the incident

  • Support Response Manager (SRM): responsible for public communication

  • Service Owner: responsible for restoring stability

Also, they did incident retrospectives after an incident occurred. This helped them to:

  • Dig deep into what happened

  • Identify wrong assumptions about the system

Besides, they came up with action items based on the discussion. And implemented them to prevent the same failure from occurring again.

10. Data Isolation

They kept people’s data in different database shards. So if a database shard crashed, it wouldn’t affect another.

But they shared the stateless workers between shards to get the best performance.

And replicated the database across many data centers for high availability.
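
Routing data to a shard can be sketched in Python. The source doesn't say how Shopify maps data to shards, so the hash-based mapping below is an assumption. The function name `shard_for` is hypothetical.

```python
import hashlib


def shard_for(shop_id, shard_count):
    """Map a shop to one shard, always the same one for the same shop."""
    digest = hashlib.sha256(str(shop_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Any stateless worker can compute the shard and serve any shop.
# A crash of one shard only affects the shops mapped to it.
shard = shard_for(shop_id=42, shard_count=4)
```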


The probability of failures still exists, but they reduced the risk of downtime. And limited the scope of impact.

And everybody lived Happily ever after.



References

  • https://shopify.engineering/building-resilient-payment-systems

  • https://www.usenix.org/conference/srecon16europe/program/presentation/stolarsky

  • https://www.infoq.com/presentations/shopify-architecture-flash-sale/

  • Photo by Roberto Cortese on Unsplash
