The System Design Newsletter

8 Reasons Why WhatsApp Was Able to Support 50 Billion Messages a Day With Only 32 Engineers

#1: Learn More - Awesome WhatsApp Engineering (6 minutes)

Neo Kim
Aug 27, 2023

This post outlines the story of WhatsApp co-founder Jan Koum and the engineering techniques used to scale WhatsApp. If you want to learn more, the references are at the bottom.

January 2008 - California, United States.

Jan Koum, a former Yahoo engineer, applied for work at Facebook and was rejected.

It was not the end - he moved on with his life.

He bought an iPhone the following year and immediately recognized the huge potential of the new App Store.

So he decided to build an instant messenger with some of his former coworkers from Yahoo and named it WhatsApp. The vision behind WhatsApp was to replace expensive SMS.

With 1 million people signing up each day, the growth rate of WhatsApp was mind-boggling.

WhatsApp was able to support 50 billion messages a day from 450 million active users. And they did it with only 32 engineers.
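To get a feel for these numbers, here is a small back-of-the-envelope calculation. It only uses the figures quoted above and assumes, unrealistically, that traffic is spread evenly over the day, so real peaks would be higher.

```python
# Figures quoted in the post.
MESSAGES_PER_DAY = 50_000_000_000
ACTIVE_USERS = 450_000_000
ENGINEERS = 32

SECONDS_PER_DAY = 24 * 60 * 60


def average_messages_per_second(messages_per_day: int) -> float:
    """Average rate, assuming traffic were spread evenly over the day."""
    return messages_per_day / SECONDS_PER_DAY


def users_per_engineer(users: int, engineers: int) -> float:
    """How many users each engineer supports on average."""
    return users / engineers


if __name__ == "__main__":
    rate = average_messages_per_second(MESSAGES_PER_DAY)
    print(f"{rate:,.0f} messages per second on average")
    print(f"{users_per_engineer(ACTIVE_USERS, ENGINEERS):,.0f} users per engineer")
```

That works out to roughly 580 thousand messages per second and about 14 million users per engineer.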

Although explosive product growth is a good problem to have, Jan Koum and the WhatsApp team had to adopt the best engineering practices to overcome the challenges.


WhatsApp Engineering

The engineering practices that let WhatsApp meet extreme scalability were:

1. Single Responsibility Principle

They focused the product on a single core feature - messaging.

And they didn’t bother to build an advertising network or a social media platform.

[Figure: Single Responsibility Principle]

They also avoided feature creep at all costs.

Feature creep occurs when excessive features get added to a product and make it difficult to use.

Besides, they prioritized the reliability of WhatsApp over everything else.

2. Technology Stack

They used Erlang to build the core functionality of the WhatsApp servers because it:

  • Provides high scalability with a tiny footprint

  • Supports hot loading

Lightweight processes are a native feature of Erlang, and its virtual machine schedules them. But in Java or C++, threads typically belong to the operating system. So a switch between Erlang processes doesn't need to save the entire CPU state, and this makes context switching cheaper.
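The idea of runtime-scheduled units of concurrency can be sketched in Python with asyncio tasks. This is only an analogy: asyncio schedules cooperatively, while Erlang's scheduler is preemptive, and the `connection` and `run` names are made up for illustration.

```python
import asyncio


async def connection(conn_id: int, inbox: asyncio.Queue) -> int:
    """Stand-in for one client connection: wait for a message, return its length."""
    message = await inbox.get()
    return len(message)


async def run(num_connections: int) -> int:
    # One cheap task per connection, scheduled in user space by the event loop
    # instead of one operating-system thread per connection.
    inboxes = [asyncio.Queue() for _ in range(num_connections)]
    tasks = [asyncio.create_task(connection(i, q)) for i, q in enumerate(inboxes)]
    for q in inboxes:
        q.put_nowait("hello")
    results = await asyncio.gather(*tasks)
    return sum(results)


if __name__ == "__main__":
    # 10,000 concurrent tasks inside a single OS thread.
    print(asyncio.run(run(10_000)))
```

Starting ten thousand operating-system threads for the same job would cost far more memory and scheduling overhead.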

Hot loading makes it possible to deploy code changes without a server restart or traffic redirection. In simple words, hot loading helps keep availability high.
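A rough way to picture hot loading is a running program that looks up its handlers by name on every call, so new code takes effect on the next call. The sketch below imitates that idea in Python. It is not how Erlang implements it (Erlang swaps whole modules inside its virtual machine), and `HANDLERS`, `load_code`, and `format_message` are hypothetical names.

```python
from typing import Callable, Dict

# Hypothetical registry: the running server looks handlers up by name on every call.
HANDLERS: Dict[str, Callable[[str], str]] = {}


def load_code(name: str, source: str) -> None:
    """Compile new source and swap it in while the process keeps running."""
    namespace: dict = {}
    exec(source, namespace)
    HANDLERS[name] = namespace[name]


def handle(name: str, message: str) -> str:
    # Late lookup means the next call picks up whichever version is loaded.
    return HANDLERS[name](message)


V1 = "def format_message(text):\n    return text\n"
V2 = "def format_message(text):\n    return text.strip()\n"

if __name__ == "__main__":
    load_code("format_message", V1)
    print(repr(handle("format_message", " hi ")))
    load_code("format_message", V2)  # deployed without restarting the process
    print(repr(handle("format_message", " hi ")))
```

The process never stops between the two versions, which is the property that matters for availability.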

3. Why Reinvent the Wheel?

Don’t reinvent the wheel - either use open source or buy a commercial solution.

[Figure: Don’t Reinvent the Wheel]

Ejabberd is an open-source real-time messaging server written in Erlang.

They built WhatsApp on top of ejabberd and rewrote some of its core components to meet their needs.

Besides, WhatsApp leveraged third-party services such as Google Push to provide push notifications.

4. Cross-Cutting Concerns

They put huge emphasis on cross-cutting concerns to improve product quality.

Cross-cutting concerns are things that affect many parts of a product and are hard to separate. For example, monitoring the health of the services and alerting when something goes wrong.
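One common way to handle a cross-cutting concern in code is to wrap functions instead of editing each one. Here is a minimal sketch with a Python decorator. The in-memory stores, `send_message`, and `should_alert` are assumptions for illustration, and nothing here reflects WhatsApp's actual monitoring.

```python
import functools
import time
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical in-memory stores; a real system would ship these to a monitoring service.
CALL_DURATIONS: Dict[str, List[float]] = defaultdict(list)
ERROR_COUNTS: Dict[str, int] = defaultdict(int)


def monitored(func: Callable) -> Callable:
    """Record duration and failures for any function without touching its body."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            ERROR_COUNTS[func.__name__] += 1
            raise
        finally:
            CALL_DURATIONS[func.__name__].append(time.perf_counter() - start)
    return wrapper


@monitored
def send_message(sender: str, recipient: str, text: str) -> str:
    if not text:
        raise ValueError("empty message")
    return f"{sender} -> {recipient}: {text}"


def should_alert(name: str, max_errors: int) -> bool:
    """Alert once a function has failed more than `max_errors` times."""
    return ERROR_COUNTS[name] > max_errors
```

The business logic in `send_message` stays free of monitoring code, and any other function gets the same treatment by adding one line.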

[Figure: Cross-Cutting Concerns]

They also improved the software development process with continuous integration and continuous delivery.

Continuous integration is the process of merging the code changes regularly into a central repository.

Continuous delivery is the process of code deployment to a testing or production environment.

5. Scalability

WhatsApp used diagonal scaling to keep the costs and operational complexity low.

Horizontal scaling is the process of increasing the number of machines in the resource pool.

Vertical scaling is the process of increasing the capacity of an existing machine, such as the CPU or memory.

And diagonal scaling is a hybrid of horizontal and vertical scaling. The computing resources get added both vertically and horizontally.
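One possible diagonal scaling policy is to grow each server until it hits a ceiling and only then add more servers. The sketch below assumes that policy. The function name and all capacity numbers are hypothetical and not taken from WhatsApp.

```python
import math
from typing import Tuple


def diagonal_plan(required_capacity: int, base_per_server: int,
                  max_per_server: int, current_servers: int) -> Tuple[int, int]:
    """Return (capacity_per_server, server_count).

    Grow each server first (vertical); add servers only once the
    per-server ceiling is reached (horizontal).
    """
    needed_per_server = math.ceil(required_capacity / current_servers)
    if needed_per_server <= max_per_server:
        return max(needed_per_server, base_per_server), current_servers
    return max_per_server, math.ceil(required_capacity / max_per_server)


if __name__ == "__main__":
    # Fits by growing the two existing servers.
    print(diagonal_plan(3_000_000, 1_000_000, 2_000_000, 2))
    # Exceeds the ceiling, so servers are maxed out and more get added.
    print(diagonal_plan(10_000_000, 1_000_000, 2_000_000, 2))
```

Fewer, larger machines mean fewer things to operate, which matches the goal of keeping operational complexity low.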

[Figure: Scalability]

They ran WhatsApp servers on the FreeBSD operating system because they had previous experience with it from working at Yahoo. Besides, FreeBSD offered a reliable network stack.

They also fine-tuned FreeBSD to accommodate 2 million+ connections per server by modifying kernel parameters, such as the limits on open files and sockets.

They overprovisioned servers to handle sudden traffic spikes and keep headroom for failures such as network partitions or hardware faults.
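The effect of headroom on fleet size is simple arithmetic. The sketch below uses the 2 million connections per server mentioned above. The peak of 100 million connections and the 50% headroom are made-up values for illustration.

```python
import math


def servers_needed(peak_connections: int, connections_per_server: int,
                   headroom: float) -> int:
    """Servers required when each one is filled only to (1 - headroom) of its limit."""
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable = connections_per_server * (1 - headroom)
    return math.ceil(peak_connections / usable)


if __name__ == "__main__":
    # Hypothetical peak of 100 million concurrent connections.
    print(servers_needed(100_000_000, 2_000_000, headroom=0.0))  # no spare capacity
    print(servers_needed(100_000_000, 2_000_000, headroom=0.5))  # half of each server kept free
```

With half of every server kept free, the fleet doubles, but a failed server's connections can move elsewhere without overloading the rest.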

6. Flywheel Effect

They measured metrics such as CPU usage, context switches, and system calls, then identified and eliminated the bottlenecks. And they repeated this at regular intervals.
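As a small-scale illustration of this kind of measurement, Python's standard `resource` module (Unix only) reports CPU time and context switches for the current process. The `measure` helper is a hypothetical name. Counting system calls needs operating-system tracing tools and is not covered here.

```python
import resource
import time
from typing import Callable, Dict


def measure(work: Callable[[], object]) -> Dict[str, float]:
    """Run `work` once and report CPU time and context switches for this process."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    start = time.perf_counter()
    work()
    elapsed = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "wall_seconds": elapsed,
        "user_cpu_seconds": after.ru_utime - before.ru_utime,
        "system_cpu_seconds": after.ru_stime - before.ru_stime,
        "voluntary_context_switches": after.ru_nvcsw - before.ru_nvcsw,
        "involuntary_context_switches": after.ru_nivcsw - before.ru_nivcsw,
    }


if __name__ == "__main__":
    stats = measure(lambda: sum(i * i for i in range(1_000_000)))
    for name, value in stats.items():
        print(f"{name}: {value}")
```

Running the same measurement before and after each change is what turns it into a feedback cycle.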

[Figure: Continuous Feedback Cycle]

The continuous feedback cycle tremendously improved the performance of WhatsApp.

7. Quality

They used load testing to identify single points of failure.

Load testing is the process of measuring the performance of the system under the anticipated load.

[Figure: Load Testing]

They generated the load with artificial production traffic and with DNS configuration changes.
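A tiny load test can be sketched as synthetic requests fired concurrently at a handler while latencies get recorded. This sketch covers only the artificial-traffic part, against a stand-in `fake_handler`. All names and numbers are assumptions, and it does not reproduce WhatsApp's tooling.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def timed_call(handler: Callable[[int], object], request_id: int) -> float:
    """Return how long one request took, in seconds."""
    start = time.perf_counter()
    handler(request_id)
    return time.perf_counter() - start


def load_test(handler: Callable[[int], object], total_requests: int,
              concurrency: int) -> List[float]:
    """Fire synthetic requests with `concurrency` workers; return sorted latencies."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda i: timed_call(handler, i),
                                  range(total_requests)))
    return sorted(latencies)


def percentile(sorted_values: List[float], pct: float) -> float:
    """Pick the value at roughly the given percentile of an already sorted list."""
    index = min(len(sorted_values) - 1, int(len(sorted_values) * pct / 100))
    return sorted_values[index]


def fake_handler(request_id: int) -> None:
    time.sleep(0.001)  # stand-in for real work


if __name__ == "__main__":
    results = load_test(fake_handler, total_requests=200, concurrency=20)
    print(f"p50={percentile(results, 50):.4f}s p99={percentile(results, 99):.4f}s")
```

Raising `concurrency` until the tail latency degrades is one way to find the component that gives out first.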

8. Small Team Size

The communication paths between engineers increase quadratically as the team size grows. This is a recipe for degraded productivity.

[Figure: Communication Paths Between Engineers]

So they kept the team size small - 32 engineers.
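The quadratic growth follows from counting pairs: a team of n people has n * (n - 1) / 2 possible one-to-one communication paths. A short sketch, with team sizes other than 32 chosen only for comparison:

```python
def communication_paths(team_size: int) -> int:
    """Number of distinct pairs in a team: n * (n - 1) / 2."""
    return team_size * (team_size - 1) // 2


if __name__ == "__main__":
    for size in (8, 32, 64, 100):
        print(f"{size} engineers -> {communication_paths(size)} paths")
```

A team of 32 has 496 paths, while doubling it to 64 roughly quadruples that to 2016.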


WhatsApp is one of the most successful instant messengers in the market.

In 2014, the same Facebook that rejected Jan Koum acquired WhatsApp for a whopping 19 billion USD.

According to Forbes, Jan Koum had a net worth of 14 billion USD as of 2023.



References

  • http://highscalability.com/blog/2014/2/26/the-whatsapp-architecture-facebook-bought-for-19-billion.html

  • https://www.shopify.com/partners/blog/feature-creep

  • https://stackoverflow.com/questions/2708033/technically-why-are-processes-in-erlang-more-efficient-than-os-threads

  • https://www.ejabberd.im/index.html

  • https://en.wikipedia.org/wiki/Jan_Koum

  • https://www.atlassian.com/continuous-delivery/principles/continuous-integration-vs-delivery-vs-deployment

  • https://www.nops.io/blog/horizontal-vs-vertical-scaling/

  • https://www.javatpoint.com/scaling-in-cloud-computing

  • https://www.businessinsider.com/whatsapp-built-using-erlang-and-freebsd-2015-10

  • https://www.blazemeter.com/blog/performance-testing-vs-load-testing-vs-stress-testing

  • Thumbnail Photo by Anton from Pexels

  • Rick Reed. WhatsApp: Half a billion unsuspecting FreeBSD users

  • Anton Lavrik. (2018) A Reflection on Building the WhatsApp Server - Code BEAM

