Real-time communication

Alex, a day trader, was monitoring a promising tech stock during market hours.

Alex saw the stock at $78.50 and decided to wait for it to drop to $77 before buying. After seeing the price unchanged for a few minutes, Alex placed a market order for 200 shares.

Unknown to Alex, the stock had jumped to $82.75 following a product announcement. The order executed at this higher price, though Alex’s screen still showed the outdated value.

Confused, Alex restarted the trading app and only then noticed that the order had cost $850 more than expected!

This is just one example of why applications that show time-sensitive information need updates to appear as they happen.

Why real-time communication?

Real-time communication is the ability for information to be transferred between systems with minimal latency and without manual intervention.

In theory, Alex’s problem could have been solved if the trading app periodically polled the information server for updates. However, there may be many other users like Alex who are also monitoring the same stock, and the constant polling requests would have strained the server’s resources.

Real-time approaches also consume server resources at that scale, but they use them far more efficiently. The key advantage is that updates are pushed only when something changes. If nothing changes, the server sends nothing, whereas polling clients keep sending requests regardless.
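To make the difference concrete, here is a rough back-of-the-envelope sketch. The client count, polling interval, and number of price changes are hypothetical figures chosen for illustration, and the model ignores connection setup and keep-alive traffic.

```python
def polling_requests(clients: int, duration_s: int, interval_s: int) -> int:
    """Requests the server receives when every client polls on a fixed interval."""
    return clients * (duration_s // interval_s)


def push_messages(clients: int, price_changes: int) -> int:
    """Messages the server sends when it pushes only after a change."""
    return clients * price_changes


# Hypothetical scenario: 10,000 traders watch one stock for 10 minutes,
# polling every 2 seconds, while the price changes 25 times.
polled = polling_requests(clients=10_000, duration_s=600, interval_s=2)
pushed = push_messages(clients=10_000, price_changes=25)
print(polled, pushed)  # 3000000 250000
```

Under these assumptions, polling generates twelve times as much traffic, and the gap widens the less often the price moves.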

Ways to implement real-time updates

If we want to use real-time communication, there are a few options to choose from.

WebSockets

WebSockets provide a persistent, bidirectional connection between a client and a server, most commonly a web browser and a backend server.

The WebSocket protocol (URI schemes ws://, or wss:// for TLS-encrypted connections) is built on top of the Transmission Control Protocol (TCP) and was standardized in RFC 6455.

A WebSocket connection starts with a standard HTTP request that includes an “Upgrade” header asking to switch to the WebSocket protocol. If the server agrees, it responds with status 101 Switching Protocols.
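As a minimal sketch of that handshake, the snippet below builds the client's opening request and computes the `Sec-WebSocket-Accept` value that RFC 6455 requires the server to send back. The host `example.com` and path `/prices` are placeholders, and the function names are made up for this example; in practice a WebSocket library handles all of this.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Value the server returns in Sec-WebSocket-Accept for a given client key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_request(host: str, path: str, key: str) -> str:
    """Build the client's opening HTTP/1.1 request (host and path are placeholders)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


# Sample key taken from the handshake example in RFC 6455.
key = "dGhlIHNhbXBsZSBub25jZQ=="
print(upgrade_request("example.com", "/prices", key))
print(accept_key(key))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The client checks the returned accept value against its own key, which confirms that the server actually understood the WebSocket upgrade rather than echoing an ordinary HTTP response.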

Once established, the connection remains open, and during this entire time both the client and server are listening for messages. Both can send messages to each other until one of them explicitly closes the connection.

%%{ init: { "sequence": { "mirrorActors": false } } }%%
sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: HTTP Request with Upgrade header
    S-->>C: HTTP 101 Switching Protocols
    note over C,S: Connection established
    rect rgba(240, 240, 240)
        loop Until connection closed
            S->>C: Server pushes message
            C->>S: Client sends message
        end
    end
    C->>S: Close frame
    S-->>C: Close frame

WebSockets are most useful when you need bidirectional communication and the ability to send either text or binary messages. The usual blocker is that the client, the server, or something in between (such as a proxy or firewall) may not support the WebSocket protocol.
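To show how text and binary messages differ on the wire, here is a simplified sketch of RFC 6455 framing. It only covers unfragmented, unmasked frames (the server-to-client direction; frames sent by a client must additionally be masked), and `encode_frame` is a name invented for this example.

```python
import struct
from typing import Union

OPCODE_TEXT = 0x1    # payload is UTF-8 text
OPCODE_BINARY = 0x2  # payload is arbitrary bytes


def encode_frame(payload: Union[str, bytes]) -> bytes:
    """Encode one unfragmented, unmasked frame (server-to-client direction)."""
    if isinstance(payload, str):
        opcode, body = OPCODE_TEXT, payload.encode("utf-8")
    else:
        opcode, body = OPCODE_BINARY, bytes(payload)
    first_byte = 0x80 | opcode  # FIN bit set: this frame is the whole message
    length = len(body)
    if length < 126:
        header = struct.pack("!BB", first_byte, length)
    elif length < 65536:
        header = struct.pack("!BBH", first_byte, 126, length)  # 16-bit length
    else:
        header = struct.pack("!BBQ", first_byte, 127, length)  # 64-bit length
    return header + body


print(encode_frame("Hello").hex())      # 810548656c6c6f
print(encode_frame(b"\x00\x01").hex())  # 82020001
```

The only difference between the two message types is the opcode in the first byte; the payload itself is passed through unchanged.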

Server-sent events (SSE)

Server-sent events are a way for a server to push events that a client can listen to. There is a browser API called EventSource that a web application can use to receive events from the server.

This is a unidirectional channel: the client cannot send messages to the server over it.

%%{ init: { "sequence": { "mirrorActors": false } } }%%
sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: HTTP Request for event stream
    S-->>C: HTTP 200 with text/event-stream
    note over C,S: Connection established
    rect rgba(240, 240, 240)
        loop Until connection closed
            S->>C: Server sends event
        end
    end

SSE works well when there’s no need for bidirectional communication and the messages are text-based. If either of those assumptions does not hold, SSE is usually not a viable choice.
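The text-based nature of SSE is easiest to see in the wire format itself. The sketch below serializes one event the way a server would write it into a `text/event-stream` response; `format_sse` and the `price` event name are hypothetical, chosen only for illustration.

```python
from typing import List, Optional


def format_sse(data: str, event: Optional[str] = None, event_id: Optional[str] = None) -> str:
    """Serialize one event in the text/event-stream wire format."""
    lines: List[str] = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Each line of the payload gets its own "data:" field.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    # A blank line marks the end of the event.
    return "\n".join(lines) + "\n\n"


print(repr(format_sse("82.75", event="price", event_id="7")))
# 'id: 7\nevent: price\ndata: 82.75\n\n'
```

On the browser side, EventSource parses these fields and dispatches an event of the given type with the concatenated data as its payload.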

Long polling

Long polling is a technique that uses standard HTTP requests, but unlike regular polling, the server does not respond right away.

The client sends an HTTP request, and the server holds it open until new information is available or a timeout expires. The server then responds, and the client immediately sends a new request.

The cycle repeats as long as a client wishes to receive updates.

%%{ init: { "sequence": { "mirrorActors": false } } }%%
sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: HTTP Request (holds connection open)
    note over S: Server holds request until data is available or timeout
    S-->>C: HTTP Response with data
    C->>S: New HTTP Request (immediately)
    note over S: Server holds new request
    S-->>C: HTTP Response when more data available

Long polling can carry both binary and text data, but it does not lend itself to bidirectional communication. It’s most commonly used as a fallback when WebSockets are not available (e.g. older browsers, firewall issues).
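The hold-and-respond cycle can be sketched without any networking by letting a queue stand in for the server's source of updates. This is an in-process simulation, not an HTTP implementation: `handle_long_poll` and `poll_until` are hypothetical names, and the 0.5 second timeout is an arbitrary choice.

```python
import queue
import threading
from typing import List, Optional


def handle_long_poll(updates: "queue.Queue[str]", timeout_s: float) -> Optional[str]:
    """Server side: hold the request until an update arrives or the timeout expires."""
    try:
        return updates.get(timeout=timeout_s)
    except queue.Empty:
        return None  # a real server might send an empty response here


def poll_until(updates: "queue.Queue[str]", wanted: int, timeout_s: float = 0.5) -> List[str]:
    """Client side: issue a new request immediately after every response."""
    received: List[str] = []
    while len(received) < wanted:
        response = handle_long_poll(updates, timeout_s)
        if response is not None:
            received.append(response)
    return received


updates: "queue.Queue[str]" = queue.Queue()


def publish_prices() -> None:
    for price in ("78.50", "82.75"):
        updates.put(price)


# Publish two price updates shortly after the client starts waiting.
threading.Timer(0.1, publish_prices).start()
print(poll_until(updates, wanted=2))  # ['78.50', '82.75']
```

The client never sleeps between requests. All of the waiting happens on the server side, which is what distinguishes long polling from polling on a fixed interval.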

Considerations

While real-time communication offers significant benefits, it comes with some overhead.

Resource consumption

Persistent connections consume server resources. Each connected client requires memory for its connection state, potentially limiting scalability. A server might handle thousands of WebSocket connections, but millions would require specialized architecture.

Complexity of implementation

Implementing real-time systems introduces complexity: the application has to handle reconnection, authentication, and connection timeouts.
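For reconnection, one common approach (not the only one) is exponential backoff with jitter, so that clients that lose their connection at the same moment do not all reconnect at once. The sketch below assumes a 1 second base delay and a 30 second cap; both values and the function names are illustrative.

```python
import random


def backoff_ceiling(attempt: int, base_s: float = 1.0, cap_s: float = 30.0) -> float:
    """Upper bound on the wait before reconnect attempt `attempt` (0-based)."""
    return min(cap_s, base_s * (2 ** attempt))


def reconnect_delay(attempt: int, rng: random.Random,
                    base_s: float = 1.0, cap_s: float = 30.0) -> float:
    """Pick a random wait below the ceiling to spread reconnects out over time."""
    return rng.uniform(0.0, backoff_ceiling(attempt, base_s, cap_s))


print([backoff_ceiling(n) for n in range(7)])
# [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

A client would call `reconnect_delay` with an increasing attempt number after each failed reconnect and reset the counter once a connection succeeds.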

Also, if event ordering matters, ensuring that events are processed in the correct sequence adds further overhead, for example by tagging each event with a sequence number and buffering any that arrive early.
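One way to enforce ordering, assuming the server attaches an increasing sequence number to every event, is a small reorder buffer on the receiving side. The `InOrderBuffer` class below is a hypothetical sketch of that idea and does not handle gaps that never get filled.

```python
from typing import Dict, List


class InOrderBuffer:
    """Release events strictly by sequence number, buffering any that arrive early."""

    def __init__(self, first_seq: int = 1) -> None:
        self.next_seq = first_seq
        self.pending: Dict[int, str] = {}

    def receive(self, seq: int, event: str) -> List[str]:
        """Accept one event and return every event that is now safe to process."""
        if seq < self.next_seq:
            return []  # duplicate of an event that was already released
        self.pending[seq] = event
        ready: List[str] = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


buffer = InOrderBuffer()
print(buffer.receive(2, "price 82.75"))  # []
print(buffer.receive(1, "price 78.50"))  # ['price 78.50', 'price 82.75']
```

The memory held in `pending` is the overhead in question: the longer an event is missing, the more later events have to wait behind it.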