Zero is a local-first sync engine by Rocicorp. I use it in Dara as the sync layer between the client and our API. In production, we hit two related bugs in how Zero handles JWT expiry that together caused silent data loss.
The race condition
When a JWT expires, the server sends an AuthInvalidated error followed by a websocket close with code 3000. Both events call #disconnect() on the client. The error handler transitions state to NeedsAuth. The close handler calls connecting() with a CleanClose reason.
The problem is ordering. connecting() in connection-manager.ts only guards against transitions from Closed and Disconnected:
if (this.#state.name === ConnectionStatus.Closed) {
  return;
}
if (this.#state.name === ConnectionStatus.Disconnected && !isHiddenDisconnect) {
  return;
}
NeedsAuth and Error are terminal states, but connecting() doesn't check for them. If the close event fires after the error, connecting() overrides NeedsAuth and the client enters a 60-second retry loop with an expired token.
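To make the ordering problem concrete, here is a minimal model of the race. It is a sketch, not Zero's code: ConnectionModel, simulateExpiry and the guardTerminal flag are names I made up for illustration, and the Disconnected check from the real connecting() is left out to keep it short.

```typescript
// Simplified, hypothetical model of the race. Not Zero's actual implementation.
type Status = "Connecting" | "Connected" | "NeedsAuth" | "Error" | "Closed";

// States that should only be left by an explicit user action (e.g. re-auth).
const TERMINAL: ReadonlySet<Status> = new Set<Status>(["NeedsAuth", "Error", "Closed"]);

class ConnectionModel {
  status: Status = "Connected";
  readonly guardTerminal: boolean;

  constructor(guardTerminal: boolean) {
    this.guardTerminal = guardTerminal;
  }

  // Error handler path: AuthInvalidated moves the client to NeedsAuth.
  needsAuth(): void {
    this.status = "NeedsAuth";
  }

  // Close handler path: a clean close asks the client to reconnect.
  connecting(): void {
    if (this.status === "Closed") return; // the guard that already existed
    if (this.guardTerminal && TERMINAL.has(this.status)) return; // the missing guard
    this.status = "Connecting";
  }
}

function simulateExpiry(guardTerminal: boolean): Status {
  const conn = new ConnectionModel(guardTerminal);
  conn.needsAuth(); // AuthInvalidated error is handled first
  conn.connecting(); // websocket close (code 3000) is handled second
  return conn.status;
}

console.log(simulateExpiry(false)); // "Connecting": NeedsAuth was overridden
console.log(simulateExpiry(true)); // "NeedsAuth": the terminal state holds
```

Without the guard, the later close event wins and the client goes back to reconnecting with a token that can no longer succeed.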
I opened #5500 with a fix and test case. Rocicorp shipped a cleaner version in #5504 that guards connecting() against all terminal states.
The stale token
With the race condition fixed, NeedsAuth fired correctly and the client refreshed its token. But mutations kept failing with JWTExpired on the API side.
Zero-cache caches the auth token at connection time in a private field:
// DEPRECATED: remove #token
// and forward auth and cookie headers that were
// sent with the push.
readonly #token: string | undefined;
This token is used for all subsequent pushes to the upstream API. When the client refreshes its JWT and sends a mutation, zero-cache ignores the fresh token and forwards the stale connection token instead.
When the API returns 401, zero-cache marks the mutation as processed without retrying or surfacing an error. The write is silently lost.
Here's what this looked like in production with a 5-minute token lifetime:
token issued: 00:44:44 (5 min lifetime)
token expires: 00:49:44
00:49:49 - zero-cache forwards token, expiresIn: -5s → auth ok (60s tolerance)
00:50:24 - zero-cache forwards token, expiresIn: -40s → auth ok
00:50:52 - zero-cache forwards token, expiresIn: -68s → 401 FAIL
The client refreshed multiple times during this window. Zero-cache kept using the 6-minute-old connection token.
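The arithmetic behind that log can be reproduced with a few lines. This is only a sketch: the 60-second tolerance comes from the log above, but the function names, the inclusive boundary at exactly -60s, and the same-day clock parsing are my assumptions, not the API's actual validation code.

```typescript
// Assumed from the log: the API accepts tokens up to 60s past expiry.
const TOLERANCE_SECONDS = 60;

// Parse "HH:MM:SS" into seconds since midnight (ignores day rollover).
function toSeconds(clock: string): number {
  const parts = clock.split(":").map(Number);
  return (parts[0] ?? 0) * 3600 + (parts[1] ?? 0) * 60 + (parts[2] ?? 0);
}

// Negative values mean the token has already expired.
function expiresIn(expiresAt: string, now: string): number {
  return toSeconds(expiresAt) - toSeconds(now);
}

// Hypothetical acceptance rule: expired tokens pass while inside the tolerance.
function isAccepted(expiresAt: string, now: string): boolean {
  return expiresIn(expiresAt, now) >= -TOLERANCE_SECONDS;
}

const tokenExpiresAt = "00:49:44";
for (const now of ["00:49:49", "00:50:24", "00:50:52"]) {
  const delta = expiresIn(tokenExpiresAt, now);
  console.log(`${now} expiresIn: ${delta}s accepted: ${isAccepted(tokenExpiresAt, now)}`);
}
```

The first two pushes only succeeded because of the tolerance window, which is why the failure showed up about a minute after the token expired rather than immediately.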
The fix
I added an optional auth field to the PushBody schema so the client can include its current token with each push. Zero-cache now prefers the auth sent with the push over the cached connection token:
const authToken = msg[1].auth ?? this.#token;
This is backwards compatible for older clients, since a push without the field falls back to the existing behavior. I bumped PROTOCOL_VERSION to 47 because the schema change affects the protocol hash. There was discussion about whether the new field would break old servers, since connection.ts rejects unknown fields during schema validation. We kept the approach and documented that servers need to update before clients.
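The fallback can be shown in isolation. This is a sketch under assumptions: PushBodySketch and selectAuthToken are illustrative names, and the real PushBody schema has more fields than the two shown here.

```typescript
// Hypothetical, trimmed-down shape of a push body. Only `auth` matters here.
interface PushBodySketch {
  mutations: unknown[];
  auth?: string; // optional: older clients never send it
}

// Prefer the token sent with the push; fall back to the connection-time token.
function selectAuthToken(
  body: PushBodySketch,
  connectionToken: string | undefined,
): string | undefined {
  return body.auth ?? connectionToken;
}

// New client: the fresh token sent with the push wins over the stale one.
console.log(selectAuthToken({ mutations: [], auth: "fresh" }, "stale")); // "fresh"

// Old client: no auth field, so behavior is unchanged.
console.log(selectAuthToken({ mutations: [] }, "stale")); // "stale"
```

Because ?? only falls through on null or undefined, an empty-string auth would still be forwarded as-is, so a client should omit the field rather than send an empty value.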
The fix is #5503, reviewed and merged by Matt Wonlaw. It led to a follow-up from Rocicorp, #5530, which extends the same pattern to changeDesiredQueries so that query auth is also refreshed per request rather than cached at connection time.