How Disney+ Scaled to 11 Million Users on Launch Day
#42: A Simple Introduction to Disney+ Architecture (5 minutes)
This post outlines how Disney+ scaled to 11 million users on launch day. If you want to learn more, scroll to the bottom and find the references.
Once upon a time, there lived a teenager named Olivia.
She’s a big fan of Marvel and Pixar movies.
But she couldn’t find all the movies she wanted to watch on her streaming service.
So she was unhappy.
She hears about the launch of a new video streaming service called Disney+.
And she gets hooked, as it has everything she ever wished for.
She buys a Disney+ subscription on the very first day.
Disney+ Architecture
Here’s a simplified version of Disney+ architecture:
1. A Tuesday Evening
Olivia streams a movie named Dark Phoenix on Android TV.
Disney+ runs its infrastructure across many regions. Because it helps with failover when the infrastructure in one region fails. And it lets them route each user to the nearest region.
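The routing idea above can be sketched in a few lines. This is a minimal illustration, not Disney+'s actual routing logic: the `Region` type, the region names, and the latency numbers are all hypothetical, and a real setup would typically rely on DNS or a global load balancer rather than application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    latency_ms: float  # hypothetical measured latency from the user to this region
    healthy: bool = True


def pick_region(regions):
    """Return the lowest-latency healthy region, or raise if none is healthy."""
    candidates = [r for r in regions if r.healthy]
    if not candidates:
        raise RuntimeError("no healthy region available")
    return min(candidates, key=lambda r: r.latency_ms)


regions = [
    Region("us-east", 20.0),
    Region("us-west", 70.0),
    Region("eu-west", 110.0),
]
normal = pick_region(regions)

# Simulate a regional outage: the nearest region becomes unhealthy.
degraded = [
    Region(r.name, r.latency_ms, healthy=(r.name != "us-east")) for r in regions
]
failover = pick_region(degraded)
```

In the normal case the user lands on the nearest region. When that region is marked unhealthy, the same function falls back to the next nearest one.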
Besides, they serve videos from a CDN for low latency. The content delivery network (CDN) improves performance by caching videos closer to users. And the videos themselves get stored in object storage to lower costs.
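Here is a small sketch of how an edge cache sits in front of object storage. The class names and the object key are made up for illustration, and a real CDN also handles expiry, eviction, and many edge locations.

```python
class ObjectStorage:
    """Stand-in for the origin: cheap and durable, but far from the user."""

    def __init__(self, objects):
        self._objects = dict(objects)
        self.reads = 0  # counts how often the origin is actually hit

    def get(self, key):
        self.reads += 1
        return self._objects[key]


class EdgeCache:
    """Stand-in for a single CDN edge location."""

    def __init__(self, origin):
        self._origin = origin
        self._cache = {}

    def get(self, key):
        if key not in self._cache:
            # Cache miss: fetch from the origin once and keep a copy.
            self._cache[key] = self._origin.get(key)
        # Cache hit (or freshly filled entry): served from the edge.
        return self._cache[key]


origin = ObjectStorage({"dark-phoenix/segment-001": b"video-bytes"})
edge = EdgeCache(origin)
first_fetch = edge.get("dark-phoenix/segment-001")
second_fetch = edge.get("dark-phoenix/segment-001")
```

Only the first request reaches object storage. Every later request for the same segment is answered from the edge.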
2. An Hour Later
Olivia takes a dinner break before finishing the movie.
They send the video timestamp as a data stream while the user watches the movie. And the server stores it in a key-value database.
They use Kinesis to stream data. Think of a Kinesis stream as a data streaming service for processing large amounts of data.
And a global table in DynamoDB stores the video timestamp for flexibility. A global table is a DynamoDB feature that replicates data across many regions automatically. It offers high availability.
Olivia resumes the movie on her mobile phone.
So they look up the video timestamp where she left off. And stream the movie from there.
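The resume flow above can be sketched with plain Python data structures. A `deque` stands in for the data stream and a dictionary stands in for the key-value table. The event shape, the function names, and the rule of keeping the newest event per key are assumptions made for this sketch, not details from Disney+.

```python
from collections import deque

# Hypothetical event shape: (user_id, video_id, position_seconds, sent_at)
stream = deque()   # stands in for the data stream
positions = {}     # stands in for the key-value table


def send_heartbeat(user_id, video_id, position_seconds, sent_at):
    """Called by the player while the user watches."""
    stream.append((user_id, video_id, position_seconds, sent_at))


def drain_stream():
    """Consume all pending events, keeping only the newest one per key."""
    while stream:
        user_id, video_id, position, sent_at = stream.popleft()
        key = (user_id, video_id)
        stored = positions.get(key)
        if stored is None or sent_at >= stored["sent_at"]:
            positions[key] = {"position": position, "sent_at": sent_at}


def resume_position(user_id, video_id):
    """Return where to resume playback, or 0 for a video never started."""
    entry = positions.get((user_id, video_id))
    return entry["position"] if entry else 0


send_heartbeat("olivia", "dark-phoenix", 3590, sent_at=2)
send_heartbeat("olivia", "dark-phoenix", 3600, sent_at=3)
send_heartbeat("olivia", "dark-phoenix", 3580, sent_at=1)  # late, out-of-order event
drain_stream()
```

Because the position is keyed by user and video, and not by device, the phone can ask for the same key the TV wrote to.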
3. Fifty Minutes Later
Olivia finishes watching the movie.
But she is a little disappointed with it.
So she searches the movie catalog for something more interesting.
They store the movie catalog in a document data store. Because it offers a flexible schema for movie metadata and user reviews.
Also, they cache the popular queries in a key-value database. The cache absorbs repeated requests, thus reducing the load on the document data store.
Although DynamoDB isn't the best choice for a cache, they use it for caching. Because they couldn't predict the user traffic during launch. And scaling an in-memory cache would be difficult and take extra effort.
Besides, DynamoDB keeps things simple as they already used it in their architecture. That means reduced operational complexity.
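The caching of popular queries follows the cache-aside pattern, sketched below. The `QueryCache` class, the 60-second expiry, and the tiny catalog are assumptions for illustration. A dictionary plays the role of the key-value database, and a plain function plays the role of the document data store.

```python
import time


class QueryCache:
    """Cache-aside: check the key-value store first, then the document store."""

    def __init__(self, search_fn, ttl_seconds=60.0, clock=time.monotonic):
        self._search_fn = search_fn
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # query -> (expires_at, results)

    def search(self, query):
        now = self._clock()
        entry = self._entries.get(query)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh cached result
        results = self._search_fn(query)  # miss or expired: hit the document store
        self._entries[query] = (now + self._ttl, results)
        return results


calls = []  # records every query that reaches the document store


def search_document_store(query):
    calls.append(query)
    catalog = ["Lilo & Stitch", "Toy Story", "Toy Story 2"]
    return [title for title in catalog if query.lower() in title.lower()]


fake_now = [0.0]  # controllable clock so the example is deterministic
cache = QueryCache(search_document_store, ttl_seconds=60.0, clock=lambda: fake_now[0])
first_results = cache.search("toy")
second_results = cache.search("toy")   # served from the cache
fake_now[0] = 61.0
third_results = cache.search("toy")    # entry expired, document store queried again
```

Three searches reach the document store only twice. The expiry keeps cached results from going stale forever, at the cost of an occasional extra read.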
Yet Olivia can’t find any interesting movies to watch.
4. Some Minutes Later
So she starts exploring the Disney+ home page.
And sees the recommendations list.
They use machine learning to recommend movies. It's based on various factors like location and watch history. Imagine machine learning as studying data patterns to make better predictions.
They send this list as a data stream and do some extra processing on it for correctness. After that, they store the result in a key-value database.
Olivia finds an interesting movie named Lilo & Stitch.
But she has little time left in the day to watch it.
5. Time for Bed
So she adds that movie to her watchlist.
They store the movies put on the watchlist in a key-value database.
And they use a global table in DynamoDB to keep it simple. This means the watchlist is automatically synchronized across many regions. So a failover wouldn’t show an outdated watchlist to the user.
Olivia is excited about the movie days to come.
And goes happily to bed.
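A toy model of a watchlist replicated across regions might look like this. It is only a sketch of the idea: the `GlobalWatchlist` class and the explicit `replicate()` step are invented here to make the propagation visible, while a managed global table does the replication in the background.

```python
class GlobalWatchlist:
    """Toy multi-region table: each region holds its own copy of the data."""

    def __init__(self, regions):
        self._replicas = {region: {} for region in regions}
        self._pending = []  # writes not yet copied to the other regions

    def add(self, region, user_id, movie):
        # Write to the local region first, then queue the change for the others.
        self._replicas[region].setdefault(user_id, set()).add(movie)
        self._pending.append((region, user_id, movie))

    def replicate(self):
        """Copy every pending write to all regions except its source."""
        for source, user_id, movie in self._pending:
            for region, table in self._replicas.items():
                if region != source:
                    table.setdefault(user_id, set()).add(movie)
        self._pending.clear()

    def get(self, region, user_id):
        return sorted(self._replicas[region].get(user_id, set()))


watchlist = GlobalWatchlist(["us-east", "eu-west"])
watchlist.add("us-east", "olivia", "Lilo & Stitch")
before = watchlist.get("eu-west", "olivia")   # replication has not run yet
watchlist.replicate()
after = watchlist.get("eu-west", "olivia")    # what a failover to eu-west would see
```

Once the write has propagated, a user who fails over to another region reads the same watchlist from the local copy.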
DynamoDB automatically partitions a table as traffic grows. Yet partitioning takes time. And the traffic gets throttled if it exceeds a specific limit before partitioning occurs. So they pre-partitioned the tables before launch to avoid throttling. And autoscaled the database to handle growing traffic.
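One rough way to reason about pre-partitioning is to estimate how many partitions the expected peak traffic needs. The sketch below is an assumption-heavy estimate, not Disney+'s method. The per-partition limits are passed in as parameters, with defaults of 3,000 read units and 1,000 write units per second taken from commonly cited DynamoDB figures. The peak traffic numbers are hypothetical.

```python
import math


def partitions_needed(
    read_units,
    write_units,
    max_read_per_partition=3000,   # assumed per-partition read limit
    max_write_per_partition=1000,  # assumed per-partition write limit
):
    """Estimate the partitions needed to serve the given throughput."""
    if read_units < 0 or write_units < 0:
        raise ValueError("throughput must be non-negative")
    by_reads = math.ceil(read_units / max_read_per_partition)
    by_writes = math.ceil(write_units / max_write_per_partition)
    # A table always has at least one partition.
    return max(1, by_reads, by_writes)


# Hypothetical launch-day peak: 90,000 reads and 20,000 writes per second.
estimate = partitions_needed(read_units=90_000, write_units=20_000)
```

With these numbers, reads are the bottleneck. So the table would need to be split ahead of time until it can absorb the read peak without throttling.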
Besides, the metadata of a popular movie gets more read traffic than others. So they replicate the metadata of a video across many database partitions. And route each read to a randomly chosen copy to prevent the hot shard problem.
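The hot shard fix above can be sketched as follows. The idea is to store several copies of the same item under different keys and pick one copy at random on each read. The `ShardedMetadataStore` class, the `#` key suffix, and the shard count are assumptions for illustration. A dictionary stands in for the database.

```python
import random


class ShardedMetadataStore:
    """Keeps several copies of each item and spreads reads across them."""

    def __init__(self, shard_count, rng=None):
        self._shard_count = shard_count
        self._rng = rng or random.Random()
        self._items = {}
        self.reads_per_key = {}  # physical key -> number of reads served

    def _shard_key(self, video_id, shard):
        # A different suffix gives a different key, so copies can land on different partitions.
        return f"{video_id}#{shard}"

    def put(self, video_id, metadata):
        # Write the same metadata under every shard key.
        for shard in range(self._shard_count):
            self._items[self._shard_key(video_id, shard)] = metadata

    def get(self, video_id):
        # Read from one randomly chosen copy.
        shard = self._rng.randrange(self._shard_count)
        key = self._shard_key(video_id, shard)
        self.reads_per_key[key] = self.reads_per_key.get(key, 0) + 1
        return self._items[key]


store = ShardedMetadataStore(shard_count=4, rng=random.Random(42))
store.put("lilo-and-stitch", {"title": "Lilo & Stitch"})
results = [store.get("lilo-and-stitch") for _ in range(1000)]
```

The trade-off is more writes and more storage, since every update touches all copies. That is usually acceptable for movie metadata, which is read far more often than it changes.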
Disney+ has grown to around 149 million users. And remains one of the biggest video streaming services in the market.