Real-Time: Polling, WebSockets & SSE
HTTP has a built-in bias: the client asks, the server answers. That works beautifully for “give me this page.” It works terribly for “tell me the moment something changes” — a new chat message, a price tick, a notification, a live score. The server knows something happened, but in plain request-response it has no way to speak unless spoken to. This page is about the techniques for getting fresh data to clients with low latency, and how to choose among them by direction and scale.
The naive baseline: short polling
The simplest answer to “how do I know when something changed?” is to keep asking. The client sends a request every few seconds: anything new? anything new? anything new?
    client → GET /messages?since=...    server → "nothing"    (wait 3s)
    client → GET /messages?since=...    server → "nothing"    (wait 3s)
    client → GET /messages?since=...    server → "here's 1 new message"

It is trivial to build on ordinary HTTP. But the cost is brutal at scale: most requests return nothing, yet each one still pays the full price of a round trip, headers, and (often) a fresh connection. With 100,000 clients polling every 3 seconds, that’s ~33,000 requests per second, of which the vast majority are wasted. And your data is still up to 3 seconds stale. Short polling trades near-zero implementation effort for enormous waste and mediocre freshness.
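The arithmetic above can be sketched as a small helper. This is only a back-of-the-envelope model: the function name `polling_load` and the `hit_rate` parameter (the fraction of polls that actually return new data) are assumptions made for illustration, and the average-staleness figure assumes events arrive uniformly within a polling interval.

```python
def polling_load(clients: int, interval_s: float, hit_rate: float) -> dict:
    """Estimate the request load generated by short polling.

    hit_rate is the assumed fraction of polls that return new data;
    everything else is an empty, wasted round trip.
    """
    total_rps = clients / interval_s
    wasted_rps = total_rps * (1.0 - hit_rate)
    return {
        "total_rps": total_rps,
        "wasted_rps": wasted_rps,
        # Worst case: the event happened right after the previous poll.
        "max_staleness_s": interval_s,
        # Average case, assuming events land uniformly within the interval.
        "avg_staleness_s": interval_s / 2,
    }


load = polling_load(clients=100_000, interval_s=3, hit_rate=0.01)
print(round(load["total_rps"]))  # 33333
```

Halving the interval halves the staleness but doubles the request rate, which is the trade-off that makes short polling hard to tune.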
Long polling: hold the request open
Long polling is a clever hack on the same machinery. The client makes a request, and instead of answering “nothing” immediately, the server holds the connection open until it actually has something to say (or a timeout fires). The instant news arrives, the server responds; the client immediately reconnects to wait for the next event.
    client → GET /messages        (server holds it open... waiting...)
             ── new message arrives ──►  server responds instantly
    client → GET /messages        (reconnect, hold open again...)

This cuts the wasted empty responses and delivers events with near-real-time latency, all over ordinary HTTP, which is why it was the workhorse of the real-time web before better transports existed. But it still costs one request per delivered message, ties up a server connection per waiting client, and is fiddly around timeouts and reconnection. It’s a bridge technology: better than short polling, clumsier than what came next.
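The server-side half of long polling, holding a request until news arrives or a timeout fires, can be sketched with a condition variable. This is a minimal in-memory illustration rather than a real HTTP handler: the `MessageBox` class and its method names are hypothetical, and a production server would call something like `wait_for_new` from inside its request handler.

```python
import threading


class MessageBox:
    """In-memory message log that held requests can block on."""

    def __init__(self) -> None:
        self._messages: list[str] = []
        self._cond = threading.Condition()

    def publish(self, text: str) -> None:
        with self._cond:
            self._messages.append(text)
            self._cond.notify_all()  # wake every request currently held open

    def wait_for_new(self, since: int, timeout: float) -> list[str]:
        """Return messages after index `since`, blocking up to `timeout` seconds.

        An empty list means the timeout fired; the client should reconnect.
        """
        with self._cond:
            self._cond.wait_for(lambda: len(self._messages) > since, timeout=timeout)
            return self._messages[since:]


box = MessageBox()
# Simulate an event arriving 0.1 s after the client starts waiting.
threading.Timer(0.1, box.publish, args=["hello"]).start()
print(box.wait_for_new(since=0, timeout=2.0))   # ['hello']
print(box.wait_for_new(since=1, timeout=0.05))  # [] (timed out, reconnect)
```

Note that each waiting client occupies a blocked thread (or an open connection) for the whole hold period, which is the per-client cost the text describes.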
WebSockets: a real two-way pipe
A WebSocket abandons the request-response model entirely. It starts as a normal HTTP request that asks to “upgrade” the connection, and once the server agrees, the same TCP connection becomes a persistent, full-duplex channel: either side can send a message at any time, with very little per-message overhead.
    client → HTTP GET ... Upgrade: websocket
    server → 101 Switching Protocols
    ═══════════ persistent bidirectional connection ═══════════
    client ⇄ server    ← either side sends, anytime, low overhead →

This is the right tool when communication is genuinely two-way and chatty: multiplayer games, collaborative editing (think cursors and edits flying both directions), chat, live trading. What does this buy us, and what does it cost? It buys true bidirectional, low-latency, low-overhead messaging. It costs a persistent stateful connection: each open socket consumes server memory and a connection slot, which complicates load balancing and cuts against statelessness. It’s a different protocol (ws://, or wss:// over TLS), so some proxies and infrastructure need special handling, and you must manage reconnection yourself.
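To make the upgrade step concrete, here is a minimal sketch of how a server builds its `101 Switching Protocols` reply. Under RFC 6455 the client sends a random `Sec-WebSocket-Key` header, and the server proves it understood the request by hashing that key with a fixed GUID and returning the result as `Sec-WebSocket-Accept`. The function names below are illustrative, and the sketch covers only the handshake, not message framing.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_response(sec_websocket_key: str) -> str:
    """Build the raw HTTP response that switches the connection to WebSocket."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {websocket_accept(sec_websocket_key)}\r\n"
        "\r\n"
    )


# Sample key taken from the example handshake in RFC 6455.
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

After this response is written, the same TCP connection stops carrying HTTP and starts carrying WebSocket frames in both directions.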
Server-Sent Events (SSE): one-way, the easy way
Often you don’t need two-way. For a news feed, a notification stream, a progress bar, or a live dashboard, the data flows server → client only. For that, a WebSocket is overkill. Server-Sent Events is a much simpler, often-overlooked tool: a single long-lived HTTP response that the server keeps writing to, streaming events down to the client as they happen.
    client → GET /stream   (Accept: text/event-stream)
    server → (keeps the response open, writes events as they occur)
             data: price 101
             data: price 102
             data: price 103
             ...

Because it’s just a long HTTP response, SSE inherits HTTP’s virtues: it works over plain HTTP and passes through proxies and firewalls cleanly. As a genuinely nice touch, the browser’s EventSource automatically reconnects and, if the server tags its events with IDs, can resume from the last event ID. You give up the upstream channel; the client can’t push back over the same connection (it just makes ordinary requests for that).
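The wire format behind that stream is simple enough to sketch by hand. In `text/event-stream`, each event is a group of `field: value` lines terminated by a blank line, and an `id:` field is what lets a reconnecting `EventSource` report where it left off through the `Last-Event-ID` request header. The helper names `format_sse` and `price_stream`, and the `start_after` parameter, are hypothetical; a real server would write these strings to an open HTTP response.

```python
from typing import Iterable, Iterator, Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one event in the text/event-stream format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # A multi-line payload becomes one data: field per line.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line marks the end of the event.
    return "\n".join(lines) + "\n\n"


def price_stream(prices: Iterable[int], start_after: int = 0) -> Iterator[str]:
    """Yield numbered events, skipping those the client has already seen.

    start_after stands in for the value of a Last-Event-ID header.
    """
    for i, price in enumerate(prices, start=1):
        if i > start_after:
            yield format_sse(f"price {price}", event_id=str(i))


print(format_sse("price 101", event_id="1"), end="")
# id: 1
# data: price 101
# (blank line)
```

Because resumption depends on the server honoring the ID it is given, assigning IDs that map to a replayable position (a sequence number or log offset) is the design choice that makes reconnection lossless.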
Choosing by direction and scale
| Technique | Direction | Connection cost | Latency | Best for |
|---|---|---|---|---|
| Short polling | client-initiated pull | wasteful (many empty hits) | poor (stale) | rare updates, dead-simple needs |
| Long polling | client-initiated pull | one conn per waiting client | near-real-time | fallback, legacy infra |
| SSE | server → client | one open response/client | real-time | feeds, notifications, dashboards |
| WebSockets | bidirectional | persistent socket/client | real-time | chat, games, collaboration |
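The table's decision logic can be written out as a small function. This is one possible reading of the table, not a definitive rule: the function name and its three boolean parameters are assumptions chosen for illustration, and real decisions will weigh more factors than these.

```python
def choose_transport(bidirectional: bool,
                     rare_updates: bool = False,
                     legacy_infra: bool = False) -> str:
    """Pick the simplest technique from the table that fits the stated needs.

    bidirectional: the client also sends frequent messages to the server.
    rare_updates:  data changes so seldom that some staleness is acceptable.
    legacy_infra:  proxies or servers cannot handle streaming or upgrades.
    """
    if rare_updates and not bidirectional:
        return "short polling"
    if legacy_infra:
        return "long polling"
    if bidirectional:
        return "WebSockets"
    return "SSE"


print(choose_transport(bidirectional=False))  # SSE
print(choose_transport(bidirectional=True))   # WebSockets
```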
The scale axis matters because every option except short polling holds a connection open per client. A million concurrent users means a million live connections — a real capacity and cost problem that pushes you toward connection-efficient servers, careful load balancing, and sometimes a dedicated real-time tier. Persistent connections are not free; they are state, and state is the thing horizontal scaling works hardest to avoid.
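A rough capacity estimate shows why a million connections is a planning problem. The two helpers below are a sketch, and the per-connection memory and per-server connection limits used in the example are purely hypothetical placeholders; measure your own stack before relying on any such numbers.

```python
def connection_memory_gib(concurrent_clients: int, bytes_per_connection: int) -> float:
    """Total memory held by open connections, in GiB."""
    return concurrent_clients * bytes_per_connection / 2**30


def servers_needed(concurrent_clients: int, max_connections_per_server: int) -> int:
    """Minimum number of servers, using ceiling division."""
    return -(-concurrent_clients // max_connections_per_server)


# Hypothetical figures: 20 KiB of state per connection, 60,000 connections per server.
print(round(connection_memory_gib(1_000_000, 20 * 1024), 1))  # 19.1
print(servers_needed(1_000_000, 60_000))                      # 17
```

Unlike stateless request handling, these servers cannot be swapped freely: each one holds live connections that must be drained or reconnected when it goes away.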
The thread
How does a server tell a client about something the instant it happens, when HTTP only lets clients ask? The answers form a ladder: polling fakes it by asking repeatedly; long polling holds the question open; SSE turns one response into a one-way stream; WebSockets open a true two-way pipe. Climbing the ladder buys you freshness and bidirectionality, and costs you persistent stateful connections, so you pick the lowest rung that satisfies your direction and your scale.
Check your understanding
- Why does plain request-response HTTP make server-initiated updates awkward in the first place?
- How does long polling reduce the waste of short polling, and what does it still cost?
- What does the WebSocket “upgrade” accomplish, and what new operational cost does a persistent socket introduce?
- When is SSE the better choice than a WebSocket, and what capability are you giving up by choosing it?
- Why does the scale of concurrent clients push back against every option except short polling?