How Reddit Works 🔥

#76: Break Into Reddit Architecture for 100M Daily Users (5 Minutes)

Jul 03, 2025

∙ Paid

Get my system design playbook for FREE on newsletter signup:

This post outlines Reddit's architecture. You will find references at the bottom of this page if you want to go deeper.

Share this post & I'll send you some rewards for the referrals.

Note: This post is based on my research and may differ from real-world implementation.

June 2005.

Two university students decide to build a mobile based food ordering service.

And pitched their startup idea for seed capital.

Yet their idea got rejected as smartphones didn’t exist back then.

So they pivoted to create a social network and called it Reddit.

They ran the entire site on a single machine.

And stored accounts, posts, subreddits, and comments information in Postgres.

As more users joined, they installed separate machines for app server and database.

Although it temporarily solved their performance issues, there were new problems.

Here are some of them:

1. Scalability

Their growth rate was explosive.

Yet a single database and monolithic codebase became a bottleneck.

2. Latency

Popular posts received many votes and comments.

But there was a massive delay in showing the latest data.

Onward.

CodeRabbit: Free AI Code Reviews in VS Code - Sponsor

CodeRabbit brings real-time, AI-powered code reviews straight into VS Code, Cursor, and Windsurf. It lets you:

Get contextual feedback on every commit, not just at the PR stage
Catch bugs, security flaws, and performance issues as you code
Apply AI-driven suggestions instantly to implement code changes
Do code reviews in your IDE for free and in your PR for a paid subscription

Install in VS Code for FREE

Reddit Architecture

Here’s how Reddit works:

1. Post Submission

They added more app servers and set up a load balancer in front for scalability.

Also they scaled the database vertically to handle extra I/O operations. Yet there’s a hardware limit with vertical scaling. So they partitioned the database to scale writes.

But processing new posts is an expensive operation. Because it must update different tables, such as posts, subreddits, and accounts. Besides each submission goes through rate limit checks and spam filters.

So they use a job queue.

Processing New Posts Asynchronously Using Job Queue — Processing a New Post Asynchronously Using Job Queue

Here’s how it works:

They add a message to the job queue for each new post
The processor then pulls the job from the queue asynchronously
It handles the post and updates the databases

Ready for the best part?

2. Lists

They show each subreddit and front page as an ordered list of posts.

Here’s the query to fetch a list:

SELECT * FROM posts ORDER BY hot (upvotes, downvotes);

It selects all columns from the posts table
It then orders the posts by a custom function named ‘hot’, which takes upvotes and downvotes as inputs

Yet querying the database often for an ordered list is expensive and adds extra latency.

So they store the list of post IDs along with their rank on the cache server. Think of the cache as a denormalized index of posts. This makes it easy to look up the posts by primary key.

Also they installed replica databases for read scalability.

A single Reddit post is part of many lists.

So they pre-compute each list using a job queue and store the results on the cache server for performance. Besides they store the popular lists in Cassandra for durability.

New post submissions and voting will change the list order. But voting is an expensive operation. Because it must invalidate the cache and update different tables, such as posts, comments, and accounts. So they use a job queue.

Here’s how voting works:

They add a message to the job queue
The processor pulls the job from the queue asynchronously
It then invalidates the cache and updates the databases

They update the cache through an atomic 'read-mutate-write' operation.

Yet there’s a risk of a race condition when many processors handle votes on the same post at the same time. So they use locks. It ensures correctness when many processors access the same queue.

They use Zookeeper for locking as it offers high availability.

But the time spent waiting for locks skyrocketed during peak traffic. It means a user’s vote waits in the queue for hours before processing, thus affecting the user experience.

While more processors mean extra lock contention. It happens because different processors try to acquire the lock for a popular post. And it further slows down the queue.

So they partitioned the queue. And put votes into different queues based on the subreddit the voted post is in.

They do a modulo N operation on the subreddit ID to find the correct queue. For example, if there are 3 queues, they do a modulo 3 on the subreddit ID.

The processors then handle different queues. Thus fewer processors ask for the same lock at once. It allowed them to scale well with the same number of processors without waiting for locks.

Let’s keep going!

The System Design Newsletter