Skip to content

Centralize response in server and ensure response for transactional requests#3457

Open
kenhuuu wants to merge 2 commits into
masterfrom
server-coordinator
Open

Centralize response in server and ensure response for transactional requests#3457
kenhuuu wants to merge 2 commits into
masterfrom
server-coordinator

Conversation

@kenhuuu

@kenhuuu kenhuuu commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Why

Response writing in the 4.x HTTP endpoint was coordinated through three uncoordinated state holders (Context.requestState, Context.startedResponse, StateKey.HTTP_RESPONSE_SENT) touched by two threads (eval worker + timeout scheduler) with no happens-before relationship. The non-atomic check-then-act guards allowed a double-response race under timeout and an unterminated chunked stream (which hangs the client and poisons the
keep-alive channel). Write logic was also scattered across makeChunk and HttpHandlerUtil, so "who writes, in what order" was an unwritten contract.

What changed

A single per-request HttpResponseCoordinator is now the sole owner of response writes, guarded by one monitor with a 3-state machine (NOT_STARTED → STREAMING → COMPLETED). This makes the key invariant — exactly one well-formed, terminated response — true by construction:

  • Both the eval task and the timeout task enter the same monitor; whichever reaches COMPLETED first wins, the other no-ops. No more check-then-act gap.
  • complete() is idempotent and called from a finally, so the terminal chunk is always written.

The three state holders, makeChunk, and the HttpHandlerUtil write helpers are removed/absorbed (sendError stays for the pre-evaluation rejection path). Backpressure stays outside the lock, so there's no deadlock surface.

This is a pure internal refactor: wire output is byte-identical, no public/driver API or status-code changes.

Second fix: enqueued transactions must return errors

Validating the refactor surfaced an intermittent CI hang. Transactions run on a single-threaded executor; a request queued behind a commit was silently discarded by shutdownNow(), so its client never got a response. Fixed in UnmanagedTransaction.close by using a graceful shutdown() (queued tasks run and return 404 instead of vanishing) and removing the transaction from the manager before shutdown so those tasks fail fast without mutating a closed transaction. The ordering is documented as load-bearing.

NOTE: No documentation or CHANGELOG as this is just a refactoring.

VOTE+1

kenhuuu added 2 commits June 11, 2026 21:15
Introduce a per-request HttpResponseCoordinator that is the sole owner of
response writes for the HTTP endpoint, guarded by a single monitor. It replaces
three uncoordinated state holders (Context.requestState, Context.startedResponse,
StateKey.HTTP_RESPONSE_SENT) with one {NOT_STARTED, STREAMING, COMPLETED} state
machine, closing the timeout-vs-eval double-response race: whichever thread
terminates the response first wins and the others no-op via the COMPLETED
short-circuit.

Assisted-by: Claude Code:claude-opus-4-8
Makes sure that if a traversal closes the transaction that
all subsequent requests meant for that transaction receive
a 404 and aren't left waiting for a response.

Assisted-by: Claude Code:claude-opus-4-8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant