Virtual Waiting Room Architecture That Handles High-Demand Ticket Sales at SeatGeek
#29: How Virtual Waiting Room Handles Ticket Sales (5 minutes)
Get the powerful template to approach system design for FREE on newsletter sign-up:
This post outlines the architecture of the virtual waiting room at SeatGeek. If you want to learn more, scroll to the bottom and find the references.
Consider sharing this post with somebody who wants to study system design.
May 2019 - Paris, France.
Eva wants a ticket to the Taylor Swift concert.
Yet she has never had luck buying tickets.
The tickets will be on sale at a specific time on SeatGeek, an online ticket platform.
Also only a limited number of tickets are available.
She sees a virtual waiting room page on SeatGeek and then gets to buy the tickets.
She is dazzled.
SeatGeek lets the users who arrive earlier buy the tickets first. And limits the number of users who can access the sales page.
A virtual waiting room is like a queue.
Besides, auto scaling isn’t fast enough to provide a good user experience under high traffic. So a virtual waiting room is necessary.
Here’s how the virtual waiting room handles high-demand ticket sales:
It absorbs traffic spikes and pipes them to the infrastructure as constant traffic
It provides a fair way to sell tickets using the First-in-First-out (FIFO) principle
It prevents service outages due to high traffic
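To make the idea concrete, here’s a minimal sketch (not SeatGeek’s code) of a FIFO queue drained at a fixed rate, which is what turns a spike of arrivals into constant traffic toward the backend:

```python
from collections import deque
import time

# A burst of users arrives almost at once.
waiting_room = deque(f"user-{i}" for i in range(1_000))

ADMIT_PER_SECOND = 50  # what the protected infrastructure can safely absorb

def drain_once():
    """Admit a fixed-size batch in arrival (FIFO) order."""
    count = min(ADMIT_PER_SECOND, len(waiting_room))
    return [waiting_room.popleft() for _ in range(count)]

while waiting_room:
    admitted = drain_once()
    print(f"admitted {len(admitted)} users, {len(waiting_room)} still waiting")
    time.sleep(1)  # outflow stays constant no matter how big the inbound spike is
```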
Virtual Waiting Room Tech Stack
They run a backend and a content delivery network (CDN).
They use a static HTML page to create the virtual waiting room and serve it from the CDN.
The backend consists of AWS Lambda and DynamoDB.
The CDN also provides an in-memory data store; they use Fastly as the CDN.
AWS Lambda is a serverless computing service, while DynamoDB is a fully managed NoSQL database.
They store the queue in DynamoDB and keep a cache of it in the CDN.
Put another way, DynamoDB is set up as the primary data store. And the CDN as an edge cache.
The user access validation and routing logic runs on the CDN because it avoids a request round trip to the backend.
Yet state synchronization between DynamoDB and the CDN is necessary.
They use DynamoDB because it supports DynamoDB Streams for data synchronization.
Also it offers garbage collection using the time-to-live (TTL) attribute. Put another way, the queue gets automatically removed after the ticket sales.
DynamoDB Streams captures changes to DynamoDB tables and triggers events for them.
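For illustration only (the table name and attributes are assumptions, not SeatGeek’s schema), enabling the stream and the TTL attribute with boto3 looks roughly like this:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Hypothetical queue table: one item per waiting user, keyed by event and arrival time.
dynamodb.create_table(
    TableName="waiting-room-queue",
    AttributeDefinitions=[
        {"AttributeName": "event_id", "AttributeType": "S"},
        {"AttributeName": "arrival_ts", "AttributeType": "N"},
    ],
    KeySchema=[
        {"AttributeName": "event_id", "KeyType": "HASH"},
        {"AttributeName": "arrival_ts", "KeyType": "RANGE"},
    ],
    BillingMode="PAY_PER_REQUEST",
    # Emit every change as an ordered stream so other Lambdas can react to it.
    StreamSpecification={"StreamEnabled": True, "StreamViewType": "NEW_AND_OLD_IMAGES"},
)
dynamodb.get_waiter("table_exists").wait(TableName="waiting-room-queue")

# Items carrying an "expires_at" epoch timestamp get garbage-collected by DynamoDB.
dynamodb.update_time_to_live(
    TableName="waiting-room-queue",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expires_at"},
)
```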
But DynamoDB throttles if a single partition receives more requests than it supports.
So they partition DynamoDB and use the scatter-gather pattern for data queries.
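One common way to do that, sketched below with assumed names, is write sharding: spread writes over N sub-partitions and fan the query back out over all of them:

```python
import random
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("waiting-room-queue")  # assumed table
SHARDS = 10  # sub-partitions per event; tune to stay under per-partition limits

def put_visitor(event_id: str, arrival_ts: int, visitor_token: str) -> None:
    # Scatter: a random shard suffix spreads writes across many partitions.
    shard_key = f"{event_id}#{random.randrange(SHARDS)}"
    table.put_item(Item={
        "event_id": shard_key,
        "arrival_ts": arrival_ts,
        "visitor_token": visitor_token,
    })

def list_visitors(event_id: str) -> list:
    # Gather: query every shard, then merge the results by arrival time.
    items = []
    for shard in range(SHARDS):
        resp = table.query(KeyConditionExpression=Key("event_id").eq(f"{event_id}#{shard}"))
        items.extend(resp["Items"])
    return sorted(items, key=lambda item: item["arrival_ts"])
```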
They use Lambda because it simplifies infrastructure management and prevents cascading failures.
Also it’s easy to scale and provides great support for concurrent executions.
Virtual Waiting Room Workflow
The protected zone is the web page where people buy tickets.
The system prevents traffic to the protected zone before the sales start.
Instead the users get routed to the virtual waiting room.
The user gets entry to the protected zone only if they have an access token. And access tokens are handed out only once the sales start.
Also the CDN stores the protected zone state in its cache. Thus it’s possible to route the user to the virtual waiting room without talking to the backend.
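The decision the edge makes is roughly the following (a Python sketch of the logic only; in reality it runs inside the CDN, for example as Fastly VCL, against the edge cache):

```python
def route_request(cookies: dict, edge_cache: dict) -> str:
    """Decide where to send a request using only state already cached at the edge."""
    zone = edge_cache.get("protected_zone_state", {})

    # Before sales start, everyone goes to the waiting room.
    if not zone.get("sales_open"):
        return "waiting_room"

    token = cookies.get("access_token")
    if token and edge_cache.get(f"access_token:{token}") == "valid":
        return "protected_zone"  # the user may proceed to the sales page

    return "waiting_room"  # no valid access token yet

# Example: sales are open but the user has no token yet.
print(route_request({}, {"protected_zone_state": {"sales_open": True}}))  # waiting_room
```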
Here’s the workflow during sales:
The client traffic gets routed to CDN
The users in the virtual waiting room get a visitor token with an associated timestamp that records their arrival time
The virtual waiting room (CDN) opens a websocket connection to the API Gateway. It sends the user's visitor token to the backend
API Gateway talks to Lambda functions
The Lambda function registers the timestamp in DynamoDB. It guarantees the user's position in the queue
The exchanger Lambda function runs periodically. It swaps the visitor token for an access token (both Lambda functions are sketched after this list)
DynamoDB Streams updates the CDN cache to synchronize it with DynamoDB
The notifier Lambda function consumes the DynamoDB stream. It gives the access token to the user via websocket
The access token gets validated and the user gets routed to the protected zone
The user buys the ticket from the protected zone
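For concreteness, here’s a rough sketch of steps 5 and 6 (handler and attribute names are assumptions, not SeatGeek’s actual Lambdas):

```python
import time
import uuid
import boto3
from boto3.dynamodb.conditions import Key

# Assumed table; the write-sharding from the earlier sketch is omitted for brevity.
table = boto3.resource("dynamodb").Table("waiting-room-queue")

def register_visitor(event, context):
    """Step 5: record the visitor token with its arrival timestamp."""
    table.put_item(Item={
        "event_id": event["event_id"],
        "arrival_ts": event["arrival_ts"],           # fixes the position in the queue
        "visitor_token": event["visitor_token"],
        "state": "waiting",
        "expires_at": int(time.time()) + 24 * 3600,  # TTL cleanup after the sale
    })

def exchange_tokens(event, context):
    """Step 6: runs periodically and promotes the oldest waiting visitors."""
    resp = table.query(
        KeyConditionExpression=Key("event_id").eq(event["event_id"]),
        ScanIndexForward=True,                       # oldest arrival first (FIFO)
        Limit=event.get("batch_size", 100),
    )
    for item in resp["Items"]:
        if item["state"] != "waiting":
            continue
        table.update_item(
            Key={"event_id": item["event_id"], "arrival_ts": item["arrival_ts"]},
            UpdateExpression="SET #s = :granted, #t = :tok",
            ExpressionAttributeNames={"#s": "state", "#t": "access_token"},
            ExpressionAttributeValues={":granted": "granted", ":tok": uuid.uuid4().hex},
        )
```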
They use a leaky bucket to control the number of users that enter the protected zone at any point. The leaky bucket is implemented in the backend.
The leaky bucket stores access tokens of the users that enter the protected zone.
Put another way, the leaky bucket size equals the protected zone capacity.
So a user gets routed to the virtual waiting room if the bucket is full. And a 429 HTTP status code gets returned.
But the user gets routed to the protected zone if there is a space in the bucket. Also an access token gets added to the bucket.
They cache the returned status code in CDN to avoid extra requests to the backend if the bucket is full.
Besides, each music concert is mapped to a different protected zone. And gets a separate leaky bucket.
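A minimal in-memory sketch of that check (not SeatGeek’s implementation; the real one runs in the backend with one bucket per protected zone) could look like this:

```python
class LeakyBucket:
    """Caps how many access tokens are active in a protected zone at once."""

    def __init__(self, capacity: int):
        self.capacity = capacity          # equals the protected zone capacity
        self.active_tokens = set()

    def try_enter(self, access_token: str) -> int:
        """Return an HTTP-style status: 200 to proceed, 429 to go back to the waiting room."""
        if access_token in self.active_tokens:
            return 200                    # already admitted
        if len(self.active_tokens) >= self.capacity:
            return 429                    # bucket full: back to the waiting room
        self.active_tokens.add(access_token)
        return 200

    def leave(self, access_token: str):
        """Free a slot when the user finishes (or abandons) the purchase."""
        self.active_tokens.discard(access_token)

# One bucket per concert / protected zone.
buckets = {"taylor-swift-paris": LeakyBucket(capacity=5000)}
```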
They use the transactional outbox pattern to update DynamoDB and CDN.
The transactional outbox pattern offers reliable messaging in distributed systems.
Any change to the protected zone table in DynamoDB is streamed using DynamoDB Streams. And a message relay Lambda applies the change to the CDN cache.
They use DynamoDB Streams because it provides an ordered flow of changes.
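The relay can be a small Lambda attached to the stream. In the sketch below, push_to_cdn is a hypothetical stand-in for the real edge-cache update (for example writing a Fastly edge dictionary item), and the attribute names are assumptions:

```python
def push_to_cdn(key: str, value: str) -> None:
    """Hypothetical placeholder for the real edge-cache write (e.g. a Fastly dictionary update)."""
    print(f"cdn[{key}] = {value}")

def relay_handler(event, context):
    """Consumes DynamoDB Streams records in order and mirrors them into the CDN cache."""
    for record in event["Records"]:
        if record["eventName"] not in ("INSERT", "MODIFY"):
            continue
        image = record["dynamodb"]["NewImage"]        # attribute-typed values, e.g. {"S": "..."}
        zone_id = image["event_id"]["S"]
        sales_open = image["sales_open"]["BOOL"]
        push_to_cdn(f"protected_zone:{zone_id}", "open" if sales_open else "closed")
```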
Consider subscribing to get simplified case studies delivered straight to your inbox:
Thank you for supporting this newsletter. Consider sharing this post with your friends and get rewards. Y’all are the best.
References
https://www.infoq.com/presentations/ticketing-system-virtual-waiting-room/
https://developer.fastly.com/solutions/tutorials/waiting-room/
https://blog.carbonfive.com/using-redis-sorted-sets-to-build-a-scalable-real-time-web-waiting-list/
https://aws.amazon.com/solutions/implementations/virtual-waiting-room-on-aws/