The System Design Newsletter

5 Reasons Why Zoom Was Able to Support 300 Million Video Calls a Day

#28: Learn More - Awesome Zoom Architecture (4 minutes)

Neo Kim
Dec 11, 2023
This post outlines how Zoom's architecture supports 300 million video calls per day. If you want to learn more, scroll to the bottom for the references.

March 2020 - Berlin, Germany.

Annika moved to a new apartment during lockdown.

She had to attend video calls for work.

Yet she had only mobile internet.

So she was sad.

The next day, she received a Zoom meeting invitation from a coworker.

Zoom Scalability

She installed the Zoom app and was mind-blown by its video call quality.



Zoom Architecture

Here’s how Zoom supports 300 million video calls a day:

1. Video Streaming

They do adaptive streaming because each device type needs a different video resolution.

Adjusting resolution based on device type and bandwidth is called adaptive streaming.

And the number of pixels in a video frame is called resolution.

But sending many video streams for different resolutions isn’t scalable.

So they use Scalable Video Coding (SVC) to stream videos.

Imagine SVC as a video built from Lego blocks. The lower blocks contain the basic picture, while the upper blocks contain extra details.

Scalable Video Coding

Put another way, SVC sends a single video stream that is divided into hierarchical layers. Each layer corresponds to a different resolution.

The lower layers contain the basic information, while the upper layers add the extra information needed for higher resolutions.

The receiving client decodes only the layers that match its device type and bandwidth.

SVC reduces bandwidth usage because there is only a single video stream.

It also reduces server CPU usage by avoiding the need to encode and decode many video streams.

So SVC video streaming scales well and provides low latency.
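
Here is a minimal sketch of how a client might pick which layers to decode. It is not Zoom's actual algorithm. The layer names, resolutions, bitrates, and the `select_layers` function are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One SVC layer; every value below is hypothetical."""
    name: str
    height: int  # vertical resolution reached once this layer is decoded
    kbps: int    # extra bitrate this layer adds on top of the lower layers


# Ordered from the base layer upward, like stacked Lego blocks.
LAYERS = [
    Layer("base", 180, 150),
    Layer("enhancement-1", 360, 350),
    Layer("enhancement-2", 720, 1000),
]


def select_layers(bandwidth_kbps: int, screen_height: int) -> list[Layer]:
    """Pick the base layer plus every upper layer the client can use."""
    chosen: list[Layer] = []
    used_kbps = 0
    for layer in LAYERS:
        fits_bandwidth = used_kbps + layer.kbps <= bandwidth_kbps
        # An upper layer is only useful if the screen can show more detail.
        useful = not chosen or chosen[-1].height < screen_height
        if chosen and not (fits_bandwidth and useful):
            break  # upper layers depend on lower ones, so stop at the first miss
        chosen.append(layer)
        used_kbps += layer.kbps
    return chosen


if __name__ == "__main__":
    # A phone on mobile internet: limited bandwidth, large enough screen.
    print([layer.name for layer in select_layers(600, 1080)])
```

In this sketch the base layer is always kept, so a client on a weak connection still gets a basic picture.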

2. Video Processing

They separate video stream processing from routing.

They also don’t process video streams on the server because it isn’t scalable.

Stream Routing vs Processing

Instead, the server only routes the video streams, while the client processes them.

3. Video Routing

They don’t combine video streams from a video call’s participants on the server.

Instead they send separate video streams from each participant to the client. The client then decodes them.

This avoids the need for transcoding on the server.

Converting a video to a different format is called transcoding.

Each Participant Sends a Separate Video Stream

They do multimedia routing to send video streams with low latency.

The multimedia router finds the best network paths to send video between participants in a video call.
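
Here is a minimal sketch of a server that routes streams without processing them. The `MediaRouter` class and its methods are hypothetical names, not Zoom's API. Dropping layers a receiver did not ask for is an assumption on my part, and the sketch leaves out network path selection entirely.

```python
from collections import defaultdict


class MediaRouter:
    """Forwards encoded packets between participants without decoding them."""

    def __init__(self) -> None:
        # participant -> highest SVC layer index that participant wants
        self.max_layer: dict[str, int] = {}
        # participant -> packets queued for delivery as (sender, layer, payload)
        self.outbox: dict[str, list[tuple[str, int, bytes]]] = defaultdict(list)

    def join(self, participant: str, max_layer: int) -> None:
        self.max_layer[participant] = max_layer

    def route(self, sender: str, layer: int, payload: bytes) -> None:
        """Copy one packet to every other participant that wants this layer."""
        for receiver, wanted in self.max_layer.items():
            if receiver != sender and layer <= wanted:
                # The payload passes through untouched: no decode, no mix, no transcode.
                self.outbox[receiver].append((sender, layer, payload))


if __name__ == "__main__":
    router = MediaRouter()
    router.join("annika", max_layer=0)  # mobile internet, base layer only
    router.join("ben", max_layer=2)
    router.route("ben", 0, b"base")
    router.route("ben", 2, b"detail")
    print(router.outbox["annika"])
```

The server work per packet is a lookup and a copy. Each client keeps a separate stream per participant and decodes them itself.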

4. Monitoring Quality of Service

The network can be unreliable, especially if the user is on mobile internet.

Client Monitors Quality of Service

So the Zoom client monitors the Quality of Service (QoS). It does that by measuring data packet loss and latency.

The client then optimizes the video stream using proprietary algorithms to provide the best user experience.
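
Zoom's optimization algorithms are proprietary, so the following is only a sketch of the two measurements named above. The `PacketReport` type, the thresholds, and the `should_downgrade` rule are assumptions. It also assumes the sender and receiver clocks are synchronized, which real clients cannot take for granted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketReport:
    seq: int            # sequence number assigned by the sender
    sent_ms: float      # send timestamp in milliseconds
    received_ms: float  # receive timestamp in milliseconds


def packet_loss(reports: list[PacketReport]) -> float:
    """Fraction of packets missing between the lowest and highest sequence seen."""
    if not reports:
        return 0.0
    seqs = {r.seq for r in reports}
    expected = max(seqs) - min(seqs) + 1
    return 1 - len(seqs) / expected


def mean_latency_ms(reports: list[PacketReport]) -> float:
    """Average one-way delay over the received packets."""
    if not reports:
        return 0.0
    return sum(r.received_ms - r.sent_ms for r in reports) / len(reports)


def should_downgrade(
    reports: list[PacketReport],
    max_loss: float = 0.05,         # hypothetical threshold
    max_latency_ms: float = 150.0,  # hypothetical threshold
) -> bool:
    """Suggest a lower video quality when either metric crosses its threshold."""
    return (
        packet_loss(reports) > max_loss
        or mean_latency_ms(reports) > max_latency_ms
    )


if __name__ == "__main__":
    # Packet 3 never arrived, so one of five expected packets is lost.
    window = [PacketReport(s, 0.0, 40.0) for s in (1, 2, 4, 5)]
    print(should_downgrade(window))
```

A client could run a check like this over a short window of recent packets and drop to a lower layer when it returns true.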

5. Network Awareness

Video calls need fast delivery of data.

So they use User Datagram Protocol (UDP).

Imagine UDP as a person sending postcards without asking for confirmation of receipt.

It's a lightweight and connectionless protocol.

They also set up the client to fall back to TCP, HTTPS, and HTTP for a consistent user experience.
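
Here is a minimal sketch of the fallback idea. The `pick_transport` function and the `can_connect` probe are hypothetical. The order among the three fallbacks is an assumption: the sources only say that UDP is preferred.

```python
from typing import Callable, Optional

# UDP first, then the fallbacks. The order of the fallbacks is assumed.
TRANSPORT_ORDER = ["udp", "tcp", "https", "http"]


def pick_transport(can_connect: Callable[[str], bool]) -> Optional[str]:
    """Return the first transport in the preferred order that connects."""
    for transport in TRANSPORT_ORDER:
        if can_connect(transport):
            return transport
    return None  # nothing worked, so the call cannot start


if __name__ == "__main__":
    # Pretend a firewall blocks UDP and plain TCP.
    blocked = {"udp", "tcp"}
    print(pick_transport(lambda transport: transport not in blocked))
```

The probe is passed in as a function so the selection logic stays separate from the real connection attempts.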

Peer-to-Peer Connection Between Two Participants in a Video Call

Zoom uses a peer-to-peer connection if there are only 2 participants in the video call because it reduces server load and provides low latency.


They use a client-server architecture and run microservices on Amazon Web Services (AWS).

The Zoom client connects to the closest data center for low latency.
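
Here is a minimal sketch of picking a data center. Treating "closest" as "lowest measured round-trip time" is an assumption, and the data center names and numbers are made up.

```python
def closest_data_center(rtt_ms: dict[str, float]) -> str:
    """Return the data center with the lowest measured round-trip time."""
    if not rtt_ms:
        raise ValueError("no data centers were measured")
    return min(rtt_ms, key=rtt_ms.get)


if __name__ == "__main__":
    # Hypothetical measurements taken from a client in Berlin.
    measurements = {"frankfurt": 18.0, "amsterdam": 25.5, "new-york": 95.0}
    print(closest_data_center(measurements))  # prints "frankfurt"
```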

Zoom Architecture

They use meeting zones to group servers, while a zone controller manages every activity that occurs within a meeting zone.

They engineered Zoom for video streaming by keeping its architecture simple.



References

  • Here’s How Zoom Provides Industry-Leading Video Capacity

  • How Zoom's Unique Architecture Powers Your Video First UC Future

  • Zoom Expands with Equinix to Future-Proof and Scale Its Video-First, Cloud-Native Architecture

  • Scalable video coding (SVC)
