The System Design Newsletter

How Razorpay Scaled to Handle Flash Sales at 1500 Requests per Second

#46: A Case Study on Payment Gateway Scalability (5 min read)

Neo Kim
May 17, 2024

2020 - India.

The IPL, the most famous cricket league in the world, is about to start.

And more than 23 million raving fans of cricket in India will stream it.

Meanwhile, companies sell food at a discount via flash sales minutes before the game starts.

And they accept online payments via Razorpay, a shiny payment gateway service.

Flash sales create a traffic spike with transactions reaching 1500 requests per second.

Although it’s possible to serve this traffic, scaling the infrastructure quickly to handle it could be difficult.

This post outlines how Razorpay scales to handle flash sales. If you want to learn more, scroll to the bottom and find the references.

Note: This post is based on my research and may differ from real-world implementation.

Payment Gateway Architecture

Here are their scalability techniques for flash sales:

1. Rate Limit the Traffic

They rate limit the traffic to prevent server overload.

Rate Limiting Unwanted Traffic

They use an Nginx proxy server as the rate limiter. It runs as a sidecar with a dedicated cache for rate limiting. Think of the sidecar pattern as extending a service by attaching an extra container to it.

Besides, they use the fixed window algorithm for efficient rate limiting. It keeps a single atomic counter per key with an expiry time (TTL), and rejects requests once the counter exceeds the limit within the window.
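Here is a minimal sketch of the fixed window idea in Python. It isn't Razorpay's implementation: the class name, the in-memory dictionary (standing in for the dedicated cache), and the lock (standing in for an atomic increment) are all assumptions for illustration.

```python
import threading
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Hypothetical in-memory limiter: one counter per key, reset when its TTL expires."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._lock = threading.Lock()  # stands in for the cache's atomic increment
        # key -> (requests counted in the current window, time the window expires)
        self._counters: Dict[str, Tuple[int, float]] = {}

    def allow(self, key: str) -> bool:
        """Return True if the request fits in the current window for this key."""
        with self._lock:
            now = self.clock()
            count, expires_at = self._counters.get(key, (0, 0.0))
            if now >= expires_at:
                # TTL elapsed: start a fresh window with a zeroed counter.
                count, expires_at = 0, now + self.window_seconds
            if count >= self.limit:
                self._counters[key] = (count, expires_at)
                return False
            self._counters[key] = (count + 1, expires_at)
            return True
```

For example, `FixedWindowRateLimiter(limit=100, window_seconds=1.0).allow("merchant-42")` would let through at most 100 requests per second for that (made-up) key. The tradeoff of a fixed window is that a burst at the edge of two windows can briefly exceed the limit.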

2. Connection Pooling

They use MySQL as the main database. And clients compete with each other for database connections during flash sales.

They run PHP on the application layer. But PHP uses a process model for execution, so processes can’t share resources with each other. Hence PHP doesn’t support database connection pooling natively.

This means the application layer holds database connections while waiting for results. So connection starvation is likely if the queries get expensive.

Also, database performance degrades as the number of idle connections increases.

Database Proxy for Connection Pooling

So they use ProxySQL as a database proxy. It holds a pool of connections to the database.

A tenant connects to ProxySQL instead of MySQL directly. This limits the number of MySQL connections and avoids connection starvation.

Think of a tenant as an isolated data space for a specific user.

Also, constantly opening and closing database connections is expensive. ProxySQL avoids it by keeping persistent connections.
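To make the pooling idea concrete, here is a small Python sketch of a bounded pool that opens its connections once and reuses them. It only illustrates the concept and says nothing about ProxySQL's internals; `ConnectionPool` and the `factory` callable are hypothetical names.

```python
import queue
from contextlib import contextmanager
from typing import Any, Callable, Iterator


class ConnectionPool:
    """Hypothetical bounded pool: opens `size` connections once, then reuses them."""

    def __init__(self, factory: Callable[[], Any], size: int) -> None:
        self._idle: "queue.Queue[Any]" = queue.Queue(maxsize=size)
        for _ in range(size):
            # Connections are opened up front and kept alive (persistent).
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout: float = 1.0) -> Iterator[Any]:
        """Borrow a connection; raises queue.Empty if none frees up in time."""
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Return the connection to the pool instead of closing it.
            self._idle.put(conn)
```

A caller would write `with pool.connection() as conn: ...`. The database never sees more than `size` connections, no matter how many callers are waiting.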

Besides, ProxySQL caches query results for low latency.

They deploy ProxySQL as a sidecar to keep the application layer stateless. And for high availability, they set up a fallback to connect directly to MySQL if ProxySQL fails.
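The fallback could look roughly like this in Python. It's a sketch under assumptions: `connect_with_fallback` and the connector callables are hypothetical, and a real setup would also need timeouts and health checks.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def connect_with_fallback(connectors: Sequence[Callable[[], T]]) -> T:
    """Try each connector in order and return the first connection that opens."""
    last_error: Optional[Exception] = None
    for connect in connectors:
        try:
            return connect()
        except ConnectionError as exc:
            # This endpoint is unavailable: remember why and try the next one.
            last_error = exc
    raise ConnectionError("all database endpoints failed") from last_error
```

The order of the list encodes the preference: the proxy connector goes first, and the direct MySQL connector goes last.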

3. Avoid the Thundering Herd

The thundering herd problem occurs when many clients query the server concurrently during flash sales. It degrades performance and can cause downtime.

Thundering Herd

So they use these techniques to prevent the thundering herd problem:

  • Throttle incoming traffic

  • Add exponential backoff on the client

  • Include caching

Besides, they use ProxySQL to throttle tenants that issue expensive database queries.
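Client-side exponential backoff from the list above can be sketched like this in Python. The function name and default values are assumptions. The random jitter is also an addition that the source doesn't mention; it spreads retries out so clients don't all retry at the same instant.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(operation: Callable[[], T], attempts: int = 5,
                      base: float = 0.1, cap: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep,
                      rng: Callable[[], float] = random.random) -> T:
    """Retry `operation` on ConnectionError, doubling the wait ceiling each time."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            # The ceiling grows as base * 2^attempt, but never beyond `cap`.
            ceiling = min(cap, base * 2 ** attempt)
            # Jitter (an assumption): wait a random fraction of the ceiling.
            sleep(rng() * ceiling)
    raise AssertionError("unreachable")
```

With the defaults, the wait ceilings are 0.1 s, 0.2 s, 0.4 s, and 0.8 s before the fifth and final attempt.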

4. Autoscaling Isn’t Enough

It takes around 4 minutes for a newly provisioned server to become healthy. So they don't rely only on autoscaling to handle traffic spikes.

Autoscaling

Instead, they prewarm their infrastructure. And they run baked container images to reduce the deployment time.

They do capacity planning based on estimated transactions and scale their servers horizontally. Also, they use autoscaling to scale down the infrastructure after flash sales.
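The capacity planning arithmetic can be sketched in a few lines of Python. Only the 1500 requests per second peak comes from this post. The per-server throughput and the headroom percentage are made-up numbers for illustration.

```python
def servers_needed(peak_rps: int, per_server_rps: int, headroom_percent: int = 30) -> int:
    """Servers to prewarm so the peak load plus a safety margin fits."""
    if per_server_rps <= 0:
        raise ValueError("per_server_rps must be positive")
    # Integer math avoids floating-point rounding surprises.
    target = peak_rps * (100 + headroom_percent)
    capacity = per_server_rps * 100
    return -(-target // capacity)  # ceiling division


# Hypothetical figures: each server handles 100 requests per second, 30% headroom.
print(servers_needed(peak_rps=1500, per_server_rps=100))  # prints 20
```

Because a new server takes around 4 minutes to become healthy, these servers need to be running before the flash sale starts, not provisioned during it.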

5. Smart Routing

They should forward the traffic only to external bank gateways that are operational.

Routing Traffic to an Operational Gateway

So they use routing rules based on machine learning. The model considers the success and failure events from payments. And then it predicts the success probability of each external gateway.
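The source doesn't describe the model itself. As a stand-in, here is a much simpler Python sketch that estimates each gateway's success probability from a smoothed success rate and routes to the best one. It is not Razorpay's machine learning model, and every name in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class GatewayStats:
    successes: int = 0
    failures: int = 0

    def success_probability(self) -> float:
        # Laplace smoothing: a gateway with no history starts at 0.5.
        return (self.successes + 1) / (self.successes + self.failures + 2)


class GatewayRouter:
    """Hypothetical router: picks the gateway with the best estimated success rate."""

    def __init__(self, gateways: Iterable[str]) -> None:
        self._stats: Dict[str, GatewayStats] = {name: GatewayStats() for name in gateways}
        if not self._stats:
            raise ValueError("at least one gateway is required")

    def record(self, gateway: str, success: bool) -> None:
        """Feed a payment outcome back into the gateway's statistics."""
        stats = self._stats[gateway]
        if success:
            stats.successes += 1
        else:
            stats.failures += 1

    def choose(self) -> str:
        """Return the gateway with the highest estimated success probability."""
        return max(self._stats, key=lambda name: self._stats[name].success_probability())
```

A real system would likely weight recent events more heavily, so a gateway that just went down loses traffic quickly.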

6. Testing

They must resolve system bottlenecks for better performance.

Load Testing

So they do load testing using an open-source tool called k6. It checks the system's performance under an expected load and reports latency and throughput.
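To show what those two numbers mean, here is a small Python sketch that summarizes raw load test samples. It is not k6 and doesn't use its API; the helper names are assumptions, and the percentile uses the simple nearest-rank method.

```python
from typing import Sequence


def percentile(latencies_ms: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not latencies_ms:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in 1..100")
    ordered = sorted(latencies_ms)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct/100 * n
    return ordered[rank - 1]


def throughput(request_count: int, duration_seconds: float) -> float:
    """Completed requests per second over the test run."""
    return request_count / duration_seconds
```

Tail percentiles such as p95 matter more than the average here, because a flash sale exposes the slowest requests first.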

7. Flywheel Effect

They profile the system for bottlenecks.

Feedback Loop

And they feed what they learn back into the system in a constant loop. It helped improve their performance.

They run critical services like payments and orders in separate microservices. That lets them scale each service independently.

This case study shows that simple and proven techniques can solve most scalability problems.


References

  • IPL: Razorpay’s second innings

  • Auto Incident Management for improved Systems Availability and Developer Productivity

  • Razorpay’s Authentication Revamp: Turbocharging Performance

  • Never Have I Ever - Gone Live without Perf

  • ProxySQL Website

  • k6 documentation

  • Load Testing vs Stress Testing: What's the Difference and Why It Matters?

  • How to Implement Fixed Window Rate Limiting using Redis

  • Why does PHP not support a database connection pool?
