The System Design Newsletter

How Hashnode Generates Feed at Scale

#33: Learn More - Awesome Personalized Feed Architecture (5 minutes)

Neo Kim
Jan 21, 2024

This post outlines how Hashnode generates its feed at scale. If you want to learn more, scroll to the bottom and find the references.

  • Share this post & I'll send you some rewards for the referrals.

June 2020 - Bangalore, India.

Two engineers created a blogging platform for software developers to share knowledge. And they called it Hashnode.

Yet they faced difficulties with user engagement and content discovery on their platform.

So they created a feed.

A User Scrolling Through the Feed to Find New Articles

A feed is a personalized list of articles a user sees when they log into the platform. It’s created based on a user’s interests, preferences, and activities within the platform. Put another way, a feed is a collection of articles from authors that a user follows.


Feed Architecture

Here’s how Hashnode generates the feed at scale:

1. Ranking Articles

Everyone gets a unique feed because each user follows different authors and tags.

Yet they don’t use machine learning to compute personalized feeds.

Instead they use a ranking approach to keep it simple. Put another way, a score gets assigned to each article based on different factors.

Factors Considered by Hashnode in Ranking Articles

They normalize the likes, views, and comments on each article to keep the feed diverse.

Besides, they reduce the rank of an article if it was already shown in a user's feed, to keep the feed fresh.
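
Here’s a minimal sketch of how such a ranking score might look in Python. The weights, the log normalization, and the freshness decay are assumptions for illustration; Hashnode hasn’t published its exact formula.

```python
import math
from datetime import datetime, timezone

def score_article(article: dict, seen_article_ids: set) -> float:
    """Assign a rank score to an article; the weights here are illustrative, not Hashnode's."""
    # Normalize engagement counts with a log so one viral article doesn't dominate the feed.
    likes = math.log1p(article["likes"])
    views = math.log1p(article["views"])
    comments = math.log1p(article["comments"])

    # Decay the score as the article ages to keep the feed fresh.
    age_hours = (datetime.now(timezone.utc) - article["published_at"]).total_seconds() / 3600
    freshness = 1 / (1 + age_hours / 24)

    score = (3 * likes + 1 * views + 2 * comments) * freshness

    # Reduce the rank of an article that was already shown in this user's feed.
    if article["id"] in seen_article_ids:
        score *= 0.5

    return score
```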

2. Feed Generation

Their main page displays the feed. So feed generation should be fast.

Yet it’s expensive to create a personalized feed because a user might follow many authors. And articles from each author must be fetched.

Also expensive database queries should be avoided at scale for performance. And fetching the same data many times without reuse wastes computing resources.

So they precompute the feed for active users and store it in the Redis cache. Put another way, the feed of a user gets computed before they visit it and gets served from memory.

An active user is someone who has logged into Hashnode in the last few days.

They compute the feed only for active users to reduce memory usage on Redis.

An event-driven architecture is used to keep the services decoupled and flexible. Put another way, an event gets emitted for each user action like publishing or liking an article.
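
As a sketch, the publish flow could emit an event to EventBridge like this. The bus name, event source, and detail type are assumptions; only the put_events call itself is the real EventBridge API.

```python
import json
import boto3

events = boto3.client("events")

def emit_article_published(article_id: str, author_id: str) -> None:
    """Emit an event when an article gets published so downstream feed services can react."""
    events.put_events(
        Entries=[
            {
                "EventBusName": "feed-events",     # hypothetical bus name
                "Source": "hashnode.articles",     # hypothetical event source
                "DetailType": "ArticlePublished",  # hypothetical detail type
                "Detail": json.dumps({"articleId": article_id, "authorId": author_id}),
            }
        ]
    )
```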

Hashnode Feed Architecture

They use a serverless architecture to precompute the feed because it’s easy to scale and avoids the need for server management.

AWS EventBridge collects events like publishing or liking an article.

EventBridge is a serverless service that connects application components via events.

They store the articles on disk in MongoDB. It’s a NoSQL database.

And Upstash is their serverless Redis cache provider.

The metadata cache stores the author's data and their list of active followers.

A Lambda function queries the metadata cache to generate the feed. It queries the database if the cached data is stale.

Lambda is an event-driven computing service.

They use another Lambda function to compute the feed and store it in the Redis cache.

And each computation gets its own Lambda function for low latency.

They use the distributed map feature of AWS Step Functions to generate the feeds in parallel.

AWS Step Functions is a serverless orchestration service. It keeps the Lambda functions free of extra logic by triggering and tracking them. Put another way, it’s used to coordinate many Lambda functions.
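
Below is a rough sketch of the worker Lambda that a distributed map could invoke for each batch of active users. The input shape, the key scheme, the Redis endpoint, and the feed_lib helpers are all assumptions for illustration.

```python
import redis

# Hypothetical helpers; in practice these would hit the metadata cache and fall back to MongoDB.
from feed_lib import get_followed_authors, get_recent_articles, score_article

r = redis.Redis(host="my-upstash-endpoint", port=6379)  # placeholder Redis endpoint

def handler(event, context):
    """Invoked once per batch of active users by the Step Functions distributed map."""
    for user_id in event["userIds"]:  # assumed input shape
        authors = get_followed_authors(user_id)
        candidates = get_recent_articles(authors)
        ranked = sorted(candidates, key=score_article, reverse=True)

        # Store only the ranked article IDs; the article bodies live in a separate cache.
        key = f"feed:{user_id}"  # hypothetical key scheme
        r.delete(key)
        if ranked:
            r.rpush(key, *[article["id"] for article in ranked[:300]])

    return {"processed": len(event["userIds"])}
```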

Feed Cache

The details of their feed cache implementation are unknown. But a Redis list is an option to store a user's feed. Put another way, each user gets a separate Redis list.

A Lambda function computes the feed when an article gets published. It then inserts the article ID and author ID into each follower’s Redis list.

They could store a few hundred article IDs in the Redis list to avoid frequent re-computation.
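
A minimal sketch of that write path, assuming a redis-py client and a feed:{userId} key scheme (both assumptions): LPUSH adds the new article to the front of each follower's list and LTRIM caps the list at a few hundred entries.

```python
import redis

r = redis.Redis(host="my-upstash-endpoint", port=6379)  # placeholder Redis endpoint

FEED_CAP = 300  # keep only a few hundred entries per user (assumed cap)

def fan_out_article(article_id: str, author_id: str, follower_ids: list[str]) -> None:
    """Push a newly published article onto each follower's feed list."""
    pipe = r.pipeline()
    for user_id in follower_ids:
        key = f"feed:{user_id}"                       # hypothetical key scheme
        pipe.lpush(key, f"{article_id}:{author_id}")  # newest article goes to the front
        pipe.ltrim(key, 0, FEED_CAP - 1)              # cap the list so it doesn't grow unbounded
    pipe.execute()
```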

Also only article IDs get stored in the feed cache to prevent duplicating the article data. And a separate cache could store the article data.

The MGET command in Redis can be used to fetch many keys in a single operation when displaying the feed, because it’s more efficient than separate GET operations. Put another way, they could fetch data for many articles with a single MGET request for speed.
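
For example, the read path could look like this, assuming an article:{articleId} key scheme for the article cache (an assumption): one LRANGE gets the IDs, one MGET gets all the article documents.

```python
import json
import redis

r = redis.Redis(host="my-upstash-endpoint", port=6379)  # placeholder Redis endpoint

def read_feed(user_id: str, page_size: int = 20) -> list[dict]:
    """Fetch one page of a user's feed with a single LRANGE plus a single MGET."""
    entries = r.lrange(f"feed:{user_id}", 0, page_size - 1)
    article_ids = [entry.decode().split(":")[0] for entry in entries]

    # One MGET fetches every article document in a single round trip.
    docs = r.mget([f"article:{article_id}" for article_id in article_ids])
    return [json.loads(doc) for doc in docs if doc is not None]
```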

3. Performance

Hashnode is a read-heavy system because articles get read more often than written.

Write vs Read Performance of Feed Cache

When an author publishes an article, they insert the article ID into each follower's feed. So it’s a linear-time operation, O(n), where n is the number of followers.

Meanwhile, a user's feed gets fetched with a single Redis request. So it’s a constant-time operation, O(1).

So the feed generation can be slow for famous authors with many followers.

A possible solution is to merge the article ID only when the follower accesses their feed. Put another way, the feed of a famous author's followers doesn’t get immediately populated.
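
A rough sketch of that hybrid read path, assuming a per-author list for high-follower authors and a hypothetical key scheme (both assumptions):

```python
import redis

r = redis.Redis(host="my-upstash-endpoint", port=6379)  # placeholder Redis endpoint

def read_feed_with_lazy_merge(user_id: str, followed_famous_authors: list[str]) -> list[str]:
    """Merge the precomputed feed with recent articles from famous authors at read time."""
    feed = [entry.decode() for entry in r.lrange(f"feed:{user_id}", 0, 99)]

    for author_id in followed_famous_authors:
        # Famous authors' articles live in a per-author list instead of being fanned out.
        recent = r.lrange(f"author_articles:{author_id}", 0, 9)  # hypothetical key scheme
        feed.extend(entry.decode() for entry in recent)

    return feed
```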


Hundreds of articles get published on Hashnode every day. And the feed delivers the newest articles to interested users.

Now they serve 2.3 million software developers across the world.


Follow me on LinkedIn | YouTube | Threads | Twitter | Instagram

Thank you for supporting this newsletter. Consider sharing this post with your friends and get rewards. Y’all are the best.

References

  • Hashnode's Feed Architecture

  • The art of feed curating: Our approach to generating personalized feeds that match users' interests

  • Hashnode's Overall Architecture

  • Twitter: Timelines at Scale

  • How is Reddit and Hacker News Ranking Algorithms Used?

  • AWS Step Functions vs AWS Lambda

  • Blogging Start-Up Hashnode Raises $6.7 Million in Series A Funding to Power the Creator Economy for Software Developers and Global Tech Community

  • How did Hashnode give a voice to the developer community

  • Giphy Feed GIF
