The System Design Newsletter

The System Design Newsletter

How Uber Finds Nearby Drivers at 1 Million Requests per Second

#31: And How Proximity Service Works Explained Like You’re Twelve (5 minutes)

Neo Kim's avatar
Neo Kim
Jan 04, 2024
∙ Paid

Get my system design playbook for FREE on newsletter signup:


This post outlines how Uber finds nearby drivers at 1 million requests per second. If you want to learn more, scroll to the bottom and find the references.

  • Share this post & I'll send you some rewards for the referrals.

March 2016 - Istanbul, Turkey.

Charles and his family are on a tourist visit.

Yet it was difficult for them to travel around because the nearest taxi stand was 3 km away from the hotel.

So they were sad.

How Does Uber Find Nearby Drivers

They hear about the ride-sharing app called Uber from the hotel receptionist.

They installed it immediately and were stunned by the easiness of finding a driver.


How Does Uber Find Nearby Drivers?

Finding nearby drivers with accuracy and scalability is a hard problem.

Here’s how Uber solves it:

1. Location Indexing

It’s difficult to find nearby drivers only based on latitude and longitude.

So they index locations to find nearby drivers efficiently.

Uber; Finding Nearby Drivers
Finding Nearby Drivers

They use the H3 library for location indexing.

H3 is a hexagonal-shaped hierarchical geospatial indexing system created at Uber.

H3 divides Earth’s surface into cells on a flat grid, giving each cell a unique identifier with a 64-bit integer.

An H3 index is a reference to a specific cell on the grid.

An H3 Index Referencing a Specific Cell
An H3 Index Referencing a Specific Cell

They find nearby drivers by identifying the relevant cells covering the rider's area. And then listing the drivers in those cells sorted by the estimated time of arrival (ETA).

H3 offers the benefits of a hierarchical indexing system and a hexagonal grid system.

Here’s how H3 works:

i) Hierarchical Index

They need the index of the smallest possible area to find the nearby drivers accurately. Put another way, a high data resolution is necessary.

A small cell provides high data resolution. Yet storage costs increase if there are many smaller-sized cells.

While bigger-sized cells hide details in data.

Data Resolution Based on Grid Size
Data Resolution Based on Grid Size

So they created a hierarchical grid. And defined the smaller hexagonal cells as subdivisions of bigger hexagonal cells.

A hierarchical grid allows data compression. Because the areas with lower details are represented with fewer cells. While areas with higher details are represented by many cells.

Fewer Cells Representing Large Areas Without Features in Hierarchical Grid
Fewer Cells Representing Large Areas Without Details in Hierarchical Grid

Also changing data resolution became easier with a hierarchical grid. Because the identifier of child cells could be truncated to find ancestor cells.

They created a global grid on H3 with 122 base cells. Yet 12 out of 122 base cells are pentagons because it’s not possible to tile a sphere (Earth) only with hexagons. H3 treats the pentagon as a hexagon without a part.

Besides hexagons don’t subdivide perfectly. So they subdivide a hexagon into 7 hexagons with a known amount of error. And the child hexagons get rotated to fit the shape of the parent hexagon.

H3 Subdividing Areas Into Smaller Hexagons
H3 Subdividing Areas Into Smaller Hexagons

The H3 library supports 16 data resolution types. And can index an area as small as 1 square meter.

They use the cell identifier as the sharding key to partition the H3.

ii) Hexagonal Grid

They need distance between cells to find nearby drivers.

It's possible to create a grid with triangle, square, or hexagon-shaped cells.

Distance Measurement in Grid
Distance Measurement in Grid

But H3 uses a hexagon as the cell shape to make it easier to measure the distance between 2 cells. Because each neighboring cell is the same distance away from a cell's center. It simplifies distance measurement.

While triangles have 3 distanced neighbors and squares have 2 distanced neighbors. Because some neighbors share an edge and others share a vertex.

So distance measurement between cells needs extra coefficients in triangles and squares.

iii) Hierarchical Algorithms

H3 supports the indexing function to find a cell in the grid efficiently. It accepts latitude, longitude, and resolution to return the cell identifier.

Finding a Cell Identifier Using Indexing Function
Finding a Cell Identifier Using Indexing Function

H3 finds the neighboring cells around a specific cell using the kRing function.

Besides H3 does constant-time bitwise operations to truncate the cell index. It makes switching between data resolutions easy.


2. Location Storage

Each driver app sends its location to the server every few seconds. But the GPS signals can get noisy and sparse due to a poor network.

The exact location is necessary to find nearby drivers. So they do map matching.

Map Matching
Map Matching

Map matching transforms raw GPS signals into actual road segments.

They use Apache Cassandra to store raw locations for long-term durability.

Apache Cassandra is a distributed database. Yet Cassandra is optimized for write operations.

So they add a Redis cache layer on top of it to shed the read operations.

Redis Buffering Data Points for Map Matching
Redis Buffering Data Points for Map Matching

Redis stores the recent locations of each driver. Also it buffers enough data points to do map matching.

The map-matched data then gets stored in a separate schema on Cassandra.


3. Scalability

This post is for paid subscribers

Already a paid subscriber? Sign in
© 2025 Neo Kim · Publisher Privacy
Substack · Privacy ∙ Terms ∙ Collection notice
Start your SubstackGet the app
Substack is the home for great culture