The System Design Newsletter

The System Design Newsletter

How Google Ads Was Able to Support 4.77 Billion Users With a SQL Database 🔥

#60: Break Into Google Spanner Architecture (5 Minutes)

Neo Kim's avatar
Neo Kim
Nov 09, 2024
∙ Paid

Get my system design playbook for FREE on newsletter signup:


This post outlines Google Spanner architecture. You will find references at the bottom of this page if you want to go deeper.

  • Share this post & I'll send you some rewards for the referrals.

Note: This post is based on my research and may differ from real-world implementation.

Once upon a time, 2 students decided to sell their university project for a million dollars.

Yet they failed to make the sale.

cloud spanner database

So they decided to develop it further and named it Google.

The main source of their income was from advertisement slots (Ads).

And their growth rate was explosive.

Yet they stored Ads data in MySQL for simplicity and reliability.

As more users joined, they partitioned MySQL for scale.

Key-Range Partitioning of Mysql to Scale
Key-Range Partitioning of MySQL to Scale

Although partitioning temporarily solved their scalability issues, there were newer problems.

Here are some of them:

1. Scalability

Their storage needs skyrocketed with growth.

And re-partitioning MySQL takes time as data must be moved between servers.

Yet they had extremely minimal downtime requirements.

So it became hard.

2. Transactions

They need ACID compliance for Ads data.

But transactions became difficult after partitioning.

Think of a transaction as a series of writes and reads.


System Design Guided Practice - Sponsor

Hello Interview

Preparing for system design interviews? Work through common questions with personalized feedback to help you improve and own your next interview. Start practicing with System Design Guided Practice by Hello Interview today.

Try Now


Cloud Spanner Database

They need massive scalability of NoSQL.

And ACID properties of MySQL.

So they created Spanner - a distributed SQL database.

Here’s how:

1. Atomicity

Atomicity means a transaction is all or nothing. Put simply, a transaction must update data in different partitions at once.

Two-Phase Commit for Atomic Transactions
Two-Phase Commit for Atomic Transactions

Yet it’s difficult to ensure every partition will commit.

So they use the two-phase commit (2PC) protocol.

Here’s how it works:

  • Prepare phase: coordinator asks relevant partitions if they’re ready to commit

  • Commit phase: coordinator tells relevant partitions to commit if everyone agreed

The transaction gets aborted if any partition isn't ready. Thus achieving atomicity.

Ready for the best part?

2. Consistency

They provide strong consistency at a global level.

This means if data gets updated in Europe, the same data would be shown in Asia. Put simply, a read will always return the latest write.

Deploying Partitions Across Zones
Deploying Partitions Across Zones

A database partition gets replicated across different zones for scalability. Think of a zone as a geographical location.

Yet it’s important to coordinate writes among replicas to avoid data conflicts.

So they use the Paxos algorithm to find a partition leader. A partition leader is responsible for managing writes, while followers handle reads. Also different partition leaders might get deployed in separate zones.

Imagine Paxos as a technique to ensure consensus across distributed systems.

TrueTime Architecture
TrueTime Architecture

A simple way to achieve strong consistency is by ordering writes with timestamps. It allows a consistent view of data across servers.

Yet it’s difficult to maintain the same time across every server in the world.

So they use TrueTime. Think of TrueTime as a combination of GPS receivers and atomic clocks. It finds the current time in each data center with high accuracy.

And each server synchronizes its quartz clock with TrueTime every 30 seconds.

Comparing Timestamps for Strong Consistency During Reads
Comparing Timestamps for Strong Consistency During Reads

Here’s how they perform reads:

  1. The request gets routed to the nearest zone even if it has a follower

  2. The follower asks the leader for the latest timestamp of the requested data

  3. The follower compares the leader’s response with its timestamp

  4. The follower responds to the client

Also the follower will wait for the leader to synchronize in case its data is outdated.

Ready for the next technique?

3. Isolation

Isolation means transactions don’t interfere with each other.

Two-Phase Locking for Data Isolation During Writes
Two-Phase Locking for Data Isolation During Writes

Yet it’s difficult to avoid data conflicts due to concurrent transactions.

So they use two-phase locking (2PL) for data isolation during writes.

Here’s how it works:

  • Growing phase: transaction performs reads or writes once it acquires locks on data

  • Shrinking phase: transaction releases locks one at a time after it’s complete

Thus avoiding data conflicts on writes.

Snapshot Isolation During Reads
Snapshot Isolation During Reads

Besides they use snapshot isolation.

It’s a multi-version concurrency control (MVCC) technique for lock-free reads. Think of snapshot isolation as a way to look at the database from a point in time.

It returns a specific version of data during reads and doesn't affect ongoing writes. Thus providing data isolation.

Here’s how it works:

  • The previous data isn’t overwritten

  • Instead a new value gets written along with a TrueTime timestamp

Also older versions get automatically removed after a specific period to save storage.

Onward.

4. Durability

This post is for paid subscribers

Already a paid subscriber? Sign in
© 2025 Neo Kim · Publisher Privacy
Substack · Privacy ∙ Terms ∙ Collection notice
Start your SubstackGet the app
Substack is the home for great culture