Engineering · API design

Why SSE beats polling for agent-facing APIs.

6 min read · by Cards402 engineering

When Cards402 launched, GET /v1/orders/:id was the only way to watch an order. It worked — poll every few seconds until phase: "ready" showed up — and it was the wrong primitive for agents. This post explains why we default to Server-Sent Events now, why we kept the polling endpoint as a fallback rather than deleting it, and the little backpressure quirks that matter when your clients are long-lived processes instead of browsers.

The three-option question

When a client needs to observe a long-running server-side job, there are basically three families of answers:

  1. Poll. The client hits a read endpoint on a timer. Simple, cache-friendly, stateless.
  2. Push to a webhook. The server calls the client back when something changes.
  3. Stream over one long-lived connection. The client opens a connection and the server writes updates to it until a terminal event.

We looked at all three seriously. The requirements that broke the decision for us were specific to the agent audience: agents that create dozens or hundreds of orders per day, run as long-lived processes (not request/response apps), and often run behind consumer NAT where hosting an inbound webhook endpoint is awkward at best.

Why polling loses

Polling has three failure modes that bite agents harder than they bite browsers:

Latency is a choice you're always making wrong. Poll every second, and you're burning an HTTP round-trip every second for a job that might finish in the next 30 seconds. Poll every ten seconds and your median “time to know” is 5 seconds slower than the underlying pipeline. There's no right interval; there's only a trade-off you pick at call time with no visibility into how long the job will actually take.

Rate limits and polling are adversaries. A well-behaved poll loop on a busy agent ends up being the biggest fraction of our rate-limit traffic, burning slots for reads that mostly return the same state you already knew. You can widen the limit, but then the limit stops protecting you from the misbehaving clients you built it for.

Resumption is manual. Agent crashes mid-poll, comes back, and has to rediscover which orders were still in flight. Either you persist the order list locally and walk it, or you hit a GET /v1/orders list endpoint and fan out a poll per in-flight order. Fine in small numbers, obnoxious once the agent has fifty concurrent orders.
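
The loop every polling client ends up writing looks roughly like this. This is an illustrative sketch, not SDK code: `fetchOrder` stands in for `GET /v1/orders/:id`, and the sleep is injected so you can see where the interval trade-off lives. The terminal phases match the ones listed later in this post.

```typescript
type Order = { id: string; phase: string };

const TERMINAL = new Set(["ready", "failed", "refunded", "rejected", "expired"]);

// Poll until the order reaches a terminal phase. `fetchOrder` stands in for
// GET /v1/orders/:id; `sleep` is injected so tests don't have to wait.
async function pollOrder(
  fetchOrder: () => Promise<Order>,
  intervalMs: number,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
): Promise<Order> {
  for (;;) {
    const order = await fetchOrder();
    if (TERMINAL.has(order.phase)) return order;
    await sleep(intervalMs); // intervalMs IS the latency/rate-limit trade-off
  }
}
```

Every call to `fetchOrder` here is a rate-limit slot spent mostly re-reading state you already had.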

Why webhooks also lose (for this audience)

Webhooks solve the latency and the rate-limit problems. They introduce a different set:

You have to host an endpoint. For an agent running on a laptop, a cloud function without inbound HTTP, or a Claude Desktop extension, “just spin up a public HTTPS listener” is a real barrier. The agents that can host one could usually just poll instead; the ones that can't host one are exactly the audience we're designing for.

Delivery semantics need work. Cards402 webhooks are retried with exponential backoff (30s, 5m, 30m), signed with HMAC, and require the receiver to be idempotent. That's absolutely the right design — see the webhooks section of /docs for the details — but it's a lot of code for an agent developer to write correctly just to watch one order.
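
To make “a lot of code” concrete, here is just the signature-verification piece of that receiver contract. The header name and SHA-256 as the HMAC hash are assumptions for illustration — check the webhooks section of /docs for the actual contract:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verify an HMAC signature over the raw webhook body. The constant-time
// compare avoids leaking how many leading bytes of the signature matched.
function verifySignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const given = Buffer.from(signatureHex, "hex");
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

And that's before idempotent handling of the exponential-backoff retries, which is where most receiver bugs actually live.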

Webhook-first agents don't compose. The moment you have two agents running on the same machine each needing their own inbound URL, you start building a routing layer. At three agents you start writing a load balancer. That's infrastructure the agent author shouldn't be building.

Why SSE fits

Server-Sent Events solve the problems with polling without creating the problems with webhooks:

  • One open connection. No round-trips, no rate-limit pressure, no latency negotiation. The server pushes on every state change.
  • Outbound-only. The agent opens the connection. No inbound HTTP listener, no routing infrastructure, no NAT-punch required. Works the same on a laptop and in a Lambda.
  • Resumable. Every SSE event from Cards402 carries the full current state as its data: payload — not a delta. A client that reconnects always sees the latest phase on the first message, without needing Last-Event-ID replay. If the agent crashes mid-order, it just re-opens the stream and gets the current state.
  • Plain HTTP. Works through every proxy, CDN, and load balancer that already passes regular HTTP responses. No WebSocket upgrade, no sticky-session requirement.
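
Part of why SSE traverses all that infrastructure is that the wire format is plain text. A minimal parser for the subset of the format described above — an `event:` line and a `data:` line per message, messages separated by a blank line — fits in a few lines. (Real streams can also split `data:` across multiple lines; this sketch ignores that.)

```typescript
type SseEvent = { event: string; data: string };

// Parse a chunk of an SSE stream into named events. Comment lines
// (starting with ":") are keepalives and are ignored by clients.
function parseSse(chunk: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of chunk.split("\n\n")) {
    let event = "message";
    let data = "";
    for (const line of block.split("\n")) {
      if (line.startsWith(":")) continue; // keepalive comment
      if (line.startsWith("event:")) event = line.slice(6).trim();
      if (line.startsWith("data:")) data = line.slice(5).trim();
    }
    if (data !== "") events.push({ event, data });
  }
  return events;
}
```

In practice you'd use the browser `EventSource` or a maintained client library rather than hand-rolling this, but it's useful to see how little is on the wire.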

The fallback story

We did not delete the polling endpoint when we added SSE. We kept it and made it the fallback path in client.waitForCard(): the SDK tries SSE first, and if the text/event-stream header is stripped in transit — which happens more often than you'd think behind corporate proxies, some CDN caching configurations, and at least one specific enterprise egress gateway we don't want to name — it silently falls back to polling.

This matters because SSE dependencies travel poorly. A customer who tests their integration in dev against our real SSE stream and then deploys into a production environment with a corporate proxy can find themselves with a broken integration they didn't write. The SDK handling both under one surface means their code doesn't care.

The fallback poll interval defaults to 3 seconds — faster than you'd normally pick, because we're only paying for it when the primary path is unavailable and the customer probably has a user-visible timeout they're fighting against. The whole point of the fallback is to degrade gracefully; we want it to hurt a little so we know when it's firing.
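
The shape of that fallback can be sketched as follows. The function names are illustrative, not the SDK's actual internals — the point is that the stream attempt rejects when the response Content-Type isn't text/event-stream, and the caller degrades to the poll loop behind the same return type:

```typescript
// True only when the response advertises a real SSE body; proxies that
// buffer or rewrite the stream typically change this header.
function isEventStream(contentType: string | null): boolean {
  return contentType !== null && contentType.split(";")[0].trim() === "text/event-stream";
}

type Order = { id: string; phase: string };

// Try the SSE path; on any failure, degrade to the 3-second poll loop.
// Both paths resolve to the same Order shape, so callers never notice.
async function waitForOrder(
  tryStream: () => Promise<Order>, // rejects if Content-Type was rewritten
  pollLoop: () => Promise<Order>,  // the fallback described above
): Promise<Order> {
  try {
    return await tryStream();
  } catch {
    return await pollLoop(); // silent degradation, same surface
  }
}
```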

Operational details

A few things we had to get right for SSE to work reliably in practice:

  • : keepalive every 15 seconds. SSE comment lines are ignored by clients, but intermediate proxies idle-kill long-lived connections after anywhere from 30 seconds to a few minutes of no bytes moving. Writing : keepalive every 15s keeps the socket warm in every production path we've tested.
  • X-Accel-Buffering: no. Tells nginx to pass bytes through instead of buffering until the response completes. Without this, you can have a perfectly functional stream that still arrives at the client in one batch after the terminal event fires.
  • Terminal event closes the stream. After ready, failed, refunded, rejected, or expired, we write the final event and immediately close. Don't make clients guess when to stop listening.
  • Full state on reconnect. Every event includes the full order state, not just the delta. A naive client that reconnects without any tracking state still sees the current phase on its first message. This is the feature that makes SSE strictly cheaper than polling for the common case — a reconnect is exactly one message.
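
The server-side framing those details demand is small. A sketch, with the event name and payload shape as assumptions — only the : keepalive comment, the terminal-close rule, and the headers come from the list above:

```typescript
// Format one SSE frame: a named event plus the full order state as data,
// never a delta, so a reconnecting client is current after one message.
function formatEvent(name: string, payload: unknown): string {
  return `event: ${name}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Comment frames are ignored by clients but keep bytes moving so
// idle-kill proxies don't drop the socket.
const KEEPALIVE = ": keepalive\n\n";

// Headers for the stream response, including the nginx unbuffering hint.
const SSE_HEADERS = {
  "Content-Type": "text/event-stream",
  "Cache-Control": "no-cache",
  "X-Accel-Buffering": "no",
} as const;
```

After writing the frame for a terminal phase, the handler closes the response immediately rather than leaving the client to guess.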

What we gave up

Three things polling did better:

  • Cacheability. A poll hits a GET endpoint with a cacheable body. An SSE stream is never cacheable because it's a streaming response. For agents this almost never matters, but it's the reason most public APIs ship poll first and stream later.
  • Operator debuggability. A poll loop leaves nice even stripes in the access log. An SSE connection is a single request that might be open for 90 seconds with bytes written in the middle — existing log analysis tools don't always know what to do with it. We had to build dedicated dashboards.
  • Connection budget. Every SSE connection holds a server thread / event loop slot open. At our current volume it's cheap; at 100×, we'll need to pick whether to move the SSE termination to an edge component or multiplex inside the existing Node process.

When you should still use polling

If you're integrating Cards402 and for whatever reason SSE isn't available — a framework that makes streaming responses hard, a corporate proxy that fingerprints and blocks long-lived connections, a test environment where you'd rather not manage socket lifetimes — GET /v1/orders/:id is a first-class supported path, not a deprecated fallback. The docs cover both. You can mix them freely.

But if you're writing new agent code today, default to SSE (or just use purchaseCardOWS(), which picks for you). You'll spend less time on the timing logic and your agent will know about terminal events a few seconds sooner than the polling version ever could.

Related

The technical walk-through of the receiver contract and watcher is at non-custodial card issuance on Soroban, the 33-second timeline of a full order is at anatomy of a Cards402 order, and the full API reference for the SSE endpoint is in the stream section of /docs.

Subscribe

New posts cross-post to the changelog. RSS feed →
