How Canva Supports Real-Time Collaboration for 135 Million Monthly Users
#36: Learn More - Awesome RSocket (5 minutes)
This post outlines how Canva does real-time collaboration for millions of users. If you want to learn more, scroll to the bottom and find the references.
December 2020 - Sydney, Australia.
Two UX designers want to edit the same file at once.
And they use a design app called Hooli.
One of them changes the heading, while the other deletes it, unaware of the first change.
The two edits conflict in the file.
So they're frustrated.
Until one day they hear about Canva at a design conference.
They try it and are dazzled by the real-time collaboration feature.
A request-response model over HTTP isn't enough for real-time collaboration, because the server can't push changes to a client that hasn't asked for them.
Instead, a bidirectional protocol for pushing data to the clients in real time is needed.
Yet it becomes difficult at a high scale because many connections must be maintained simultaneously. And it's complex to keep them reliable.
Long polling sends a new request after each response. So it increases latency. Also state management and concurrency become difficult.
SSE on HTTP/2 only pushes data from the server to the client. So it needs an extra protocol on top of it to become bidirectional.
And WebSockets in a microservices architecture increases the server load, because every open connection needs some memory. So it wouldn't scale well.
Besides, WebSockets only provides a raw message channel. It's an application layer protocol in the OSI model that runs over a TCP connection, but it says nothing about how messages should be routed. So it's hard to tell which backend a message is meant for using WebSockets alone.
Most network protocols don’t satisfy the use cases needed for real-time collaboration.
Here are some of these use cases:
Message-driven communication to avoid extra processing on the sender-receiver side
High-performance communication to reduce system costs
Support various communication patterns for flexibility
Stable and simple protocol for resilience
Support many platforms and programming languages
RSocket
So Canva uses RSocket for real-time collaboration.
RSocket is an application layer protocol and it supports Reactive Streams semantics.
Reactive Streams is an initiative to standardize asynchronous stream processing. Its principles are based on the Reactive Manifesto.
Also RSocket supports client-server and server-server communication.
A group of companies created RSocket to build resilient and reactive microservices.
And here’s how RSocket offers extreme scalability:
1. Performance
WebSockets has no built-in multiplexing. One connection carries one stream of messages.
RSocket solves the multiplexing problem by creating many channels within a network connection. Besides, it can split a large message into fragments, so the frames of one message can be interleaved with the frames of another on the wire.
Multiplexing divides the capacity of a network connection into many logical channels. And each logical channel transfers a separate data stream.
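To make the idea concrete, here is a minimal sketch of multiplexing in Python. It is not the RSocket wire format. The frame layout (stream id, last-frame flag, chunk) and the function names are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import zip_longest


def split_into_frames(stream_id, message, chunk_size):
    """Cut one message into (stream_id, is_last, chunk) frames."""
    chunks = [message[i:i + chunk_size] for i in range(0, len(message), chunk_size)]
    return [(stream_id, i == len(chunks) - 1, chunk) for i, chunk in enumerate(chunks)]


def interleave(*frame_lists):
    """Take frames from several streams in round-robin order onto one 'wire'."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def demultiplex(wire):
    """Reassemble each stream's message from the interleaved frames."""
    buffers = defaultdict(bytes)
    complete = {}
    for stream_id, is_last, chunk in wire:
        buffers[stream_id] += chunk
        if is_last:
            complete[stream_id] = buffers.pop(stream_id)
    return complete


# Two logical streams share one connection (hypothetical messages).
wire = interleave(
    split_into_frames(1, b"move heading to the top", 8),
    split_into_frames(3, b"delete image", 8),
)
messages = demultiplex(wire)
```

The stream id on every frame is what lets the receiver sort the interleaved frames back into separate messages, so a large message on one stream doesn't block a small one on another.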
RSocket communication is asynchronous.
And libraries that implement Reactive Streams, like RxJava, can be used to build on RSocket.
2. Data Format
RSocket is a binary protocol that runs on top of TCP, WebSockets, or Aeron. Put another way, it's transport agnostic.
For example, a JSON payload gets carried inside a binary frame.
An RSocket message consists of data and metadata. And the data and metadata can be in different data formats.
Besides, RSocket supports RPC and event-based messaging. And it sends framed messages instead of a raw stream of bytes.
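Here is a rough sketch of a message that carries data and metadata in different formats. It is a simplification, not the actual RSocket frame layout. The 3-byte length prefix, the function names, and the route name `design.update` are assumptions for illustration.

```python
import json


def encode_payload(metadata: bytes, data: bytes) -> bytes:
    """Prefix the metadata with its length so the receiver can split the two parts."""
    if len(metadata) >= 1 << 24:
        raise ValueError("metadata too large for a 3-byte length prefix")
    return len(metadata).to_bytes(3, "big") + metadata + data


def decode_payload(payload: bytes):
    """Split a payload back into (metadata, data)."""
    metadata_length = int.from_bytes(payload[:3], "big")
    metadata = payload[3:3 + metadata_length]
    data = payload[3 + metadata_length:]
    return metadata, data


# Metadata is plain text, data is JSON: the two parts use different formats.
metadata = b"design.update"  # hypothetical route name
data = json.dumps({"element": "heading", "text": "Hello"}).encode("utf-8")

payload = encode_payload(metadata, data)
decoded_metadata, decoded_data = decode_payload(payload)
```

Because the metadata is separate from the data, an intermediary could read the route without parsing the JSON body.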
3. Flexibility
RSocket has implementations in many programming languages, such as Java, JavaScript, C++, Go, and Kotlin.
And it offers application-level flow control. So there's no need to implement buffering or windowing in the application. Instead, the receiver specifies how many messages it can consume.
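The flow control idea can be sketched as a credit system: the consumer grants credits, and the producer sends at most that many messages. This is a toy model in plain Python, not an RSocket client. The `Stream` class and its `request` method are hypothetical names that mirror the Reactive Streams `request(n)` signal.

```python
class Stream:
    """Producer side of one logical stream with credit-based flow control."""

    def __init__(self, items):
        self._items = iter(items)
        self._credits = 0
        self.delivered = []  # stands in for messages sent to the consumer

    def request(self, n):
        """Consumer grants n more credits; the producer sends at most that many."""
        self._credits += n
        while self._credits > 0:
            try:
                item = next(self._items)
            except StopIteration:
                return  # nothing left to send
            self._credits -= 1
            self.delivered.append(item)


stream = Stream(range(10))
stream.request(3)              # consumer can handle 3 messages
first_batch = list(stream.delivered)
stream.request(2)              # consumer is ready for 2 more
second_batch = list(stream.delivered)
```

The producer never sends more than the consumer asked for, so the consumer doesn't need its own buffer to absorb bursts.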
Besides, it allows attaching extra data to messages for logical stream identification and for use by the infrastructure.
4. Resilience
Each channel within an RSocket connection does backpressure independently. Put another way, there’s data flow control on each logical stream.
Slowing down data flow to prevent overwhelming the receiver is called Backpressure.
Without backpressure, pending requests might pile up and bring down the entire system. So backpressure on each stream reduces the blast radius.
Besides, RSocket allows a different backpressure strategy on each channel within a connection. For example, it makes sense to buffer data for services like analytics, but to drop data for critical services to have strong consistency.
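Here is a small sketch of two channels on one connection that treat overflow differently when the consumer falls behind. The class, the policy names `buffer` and `drop`, and the stream names are assumptions for illustration, not part of any RSocket library.

```python
from collections import deque


class LogicalStream:
    """One channel inside a shared connection, with its own overflow policy."""

    def __init__(self, name, capacity, on_overflow):
        self.name = name
        self.capacity = capacity
        self.on_overflow = on_overflow  # "buffer" or "drop" (hypothetical policies)
        self.pending = deque()
        self.dropped = 0

    def publish(self, message):
        """Queue a message for a consumer that is not keeping up."""
        if len(self.pending) < self.capacity or self.on_overflow == "buffer":
            self.pending.append(message)
        else:
            self.dropped += 1  # overflow on a "drop" stream is discarded

    def drain(self, n):
        """Consumer signals it can take up to n messages."""
        count = min(n, len(self.pending))
        return [self.pending.popleft() for _ in range(count)]


analytics = LogicalStream("analytics", capacity=2, on_overflow="buffer")
live_updates = LogicalStream("live_updates", capacity=2, on_overflow="drop")

for i in range(5):
    analytics.publish(i)
    live_updates.publish(i)
```

Each stream keeps its own queue and counters, so a slow consumer on one channel doesn't affect the other. A real buffering stream would still need an upper bound on memory.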
The client can inform the server about the number of messages it is ready to receive. And the server returns no more than that number of messages.
Also the server can inform the client about its capacity via lease frames, which grant a number of requests for a limited time. It's called Leasing. Thus it acts as a natural rate limiter and circuit breaker.
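In the RSocket specification, a lease grants the requester a number of requests that stay valid for a limited time. The sketch below models only that bookkeeping. The `Lease` class and its method names are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
class Lease:
    """A grant from the responder: N requests, valid until the lease expires."""

    def __init__(self, allowed_requests, ttl_seconds, issued_at):
        self.remaining = allowed_requests
        self.expires_at = issued_at + ttl_seconds

    def try_acquire(self, now):
        """Return True if the requester may send one more request right now."""
        if now >= self.expires_at or self.remaining <= 0:
            return False  # requester must wait for a new lease
        self.remaining -= 1
        return True


# The responder grants 2 requests for the next 10 seconds.
lease = Lease(allowed_requests=2, ttl_seconds=10.0, issued_at=0.0)
results = [lease.try_acquire(now=t) for t in (1.0, 2.0, 3.0)]

# A lease with unused requests is still useless once it has expired.
expired = Lease(allowed_requests=5, ttl_seconds=10.0, issued_at=0.0)
after_expiry = expired.try_acquire(now=11.0)
```

The requester rejects the extra request locally, before it touches the network. That is what makes the lease behave like a rate limiter and a circuit breaker.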
Besides RSocket supports keepalive heartbeat signals.
5. Communication Pattern
Once a connection is created, the client vs server distinction gets removed in RSocket.
And both sides become symmetrical via peer-to-peer communication. Put another way, each side can initiate the interactions.
RSocket supports these communication patterns:
Request-Response: send and receive a message
Request-Stream: send a message and receive a stream of messages
Request-Channel: send streams of messages in both directions
Fire-and-Forget: send a message one way
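The four patterns can be sketched with plain asyncio. This is not an RSocket client or server. The `Responder` class and the message contents are made up to show the shape of each interaction.

```python
import asyncio


class Responder:
    """Toy responder that handles the four interaction patterns in memory."""

    def __init__(self):
        self.events = []

    async def request_response(self, message):
        # One message in, one message out.
        return f"ack:{message}"

    async def request_stream(self, message):
        # One message in, a stream of messages out.
        for i in range(3):
            yield f"{message}#{i}"

    async def request_channel(self, incoming):
        # A stream in, a stream out.
        async for message in incoming:
            yield message.upper()

    async def fire_and_forget(self, message):
        # One message in, nothing returned to the sender.
        self.events.append(message)


async def main():
    responder = Responder()
    single = await responder.request_response("open design")
    streamed = [item async for item in responder.request_stream("edits")]

    async def outgoing():
        for message in ("move", "resize"):
            yield message

    echoed = [item async for item in responder.request_channel(outgoing())]
    await responder.fire_and_forget("cursor moved")
    return single, streamed, echoed, responder.events


single, streamed, echoed, events = asyncio.run(main())
```

Because both sides are symmetrical in RSocket, either peer could play the responder role shown here.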
Canva Architecture
The client connects to the WebSocket Gateway server over HTTP, and a WebSocket connection starts as an HTTP upgrade request.
And the WebSocket Gateway server connects to the backends via RSocket.
And they run Java on the backend.
They manage many channels within a single RSocket connection. Thus the total number of connections is kept low.
Besides, the least-loaded algorithm is used to load balance the RSocket channels, because connections to backends are mostly long-lived.
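A least-loaded balancer can be sketched in a few lines: count the active channels on each backend and open the next channel on the one with the fewest. The class and the backend names are hypothetical, and Canva's actual load metric isn't described in this post.

```python
class LeastLoadedBalancer:
    """Assign each new channel to the backend with the fewest active channels."""

    def __init__(self, backends):
        self.active_streams = {backend: 0 for backend in backends}

    def open_stream(self):
        # On a tie, min() returns the first backend in insertion order.
        backend = min(self.active_streams, key=self.active_streams.get)
        self.active_streams[backend] += 1
        return backend

    def close_stream(self, backend):
        self.active_streams[backend] -= 1


balancer = LeastLoadedBalancer(["backend-a", "backend-b"])
first = balancer.open_stream()
second = balancer.open_stream()
balancer.close_stream("backend-a")
third = balancer.open_stream()
```

With long-lived connections, balancing per channel matters. Balancing only when a connection is created would lock in whatever load each backend had at that moment.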
Real-time streaming offers a better user experience. Yet it's complex to implement.
And a wrong protocol could increase the system costs.
RSocket is a cloud-native protocol designed for high performance. It's slowly gaining popularity, so consider it when an application needs bidirectional streaming with flow control.