fix: disable HTTP/2 connection pooling to prevent "channel closed" errors #4274
Open
zhenguo1492 wants to merge 1 commit into
…rors

When the local application (e.g. Quarkus gRPC with Vert.x) closes idle HTTP/2 connections, the pooled sender becomes a dead reference. Subsequent requests that reuse this sender fail with "channel closed". The retry logic exists but also pulls from the pool, getting the same dead connection.

This was already documented in the code comments as a known issue on Windows, but it also affects Linux with any HTTP/2 server that has short idle timeouts.

Root cause timeline:
1. First request → new HTTP/2 connection → success
2. Connection idle ~3s → server sends GOAWAY → connection closes normally
3. Next request → pool returns cached dead sender → "channel closed"
4. Retry → pool returns same dead sender → "channel closed" again

The proper fix would be to check sender.is_closed() when retrieving from the pool, but disabling pooling is the minimal safe change.

Tested with: Istio 1.24 + Quarkus 3.34 gRPC + mirrord steal mode.
Contributor
Thanks for contributing to mirrord! 🤘 TBH I don't think the described root cause timeline applies. The sender is only returned to the pool explicitly here, after a successful HTTP exchange. So the next attempt should use a new sender.
Summary

Disable HTTP/2 connection pooling in `ClientStore` to fix "channel closed" errors when stealing HTTP/2 / gRPC traffic.

Problem
When the local application closes idle HTTP/2 connections (e.g. Quarkus gRPC with Vert.x sends GOAWAY after ~3s idle), the pooled `http2::SendRequest` sender becomes a dead reference. Subsequent requests that reuse this sender fail with "channel closed".

The retry logic (`can_retry()`) correctly identifies `SendFailed` as retryable, but `get_with_pooling()` returns the same dead connection from the pool, causing all retries to fail.

Root Cause Timeline
- `make_client()` creates new HTTP/2 connection → request succeeds
- `wait_for_ready()` returns cached dead sender from pool
- `sender.send_request()` → "channel closed" (sender's internal channel already closed)

This was already documented in the code as a known issue on Windows (lines 90-92 in `client_store.rs`), but it equally affects Linux with any HTTP/2 server that has short idle timeouts.

Fix
Set `should_enable_connection_pooling()` to return `false`. Each request now creates a fresh HTTP/2 connection.

Better Alternative (suggestion)
A more targeted fix would be to check `sender.is_closed()` when retrieving from the pool and discard stale entries, or to have the spawned connection task notify the pool when the connection closes. Disabling pooling entirely is the minimal safe change.

Test Environment
Test Plan
- `grpcurl -plaintext localhost:8080` works (local gRPC server is functional)
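
For reference, the targeted alternative mentioned under "Better Alternative" (check `is_closed()` on retrieval and discard stale entries) could look roughly like the sketch below. This is a standard-library-only model, not mirrord code: `FakeSender`, `Pool`, `put`, and `get` are hypothetical names invented for illustration, and `FakeSender` merely stands in for a pooled HTTP/2 sender that exposes an `is_closed()` check. It also does not settle whether the root cause timeline in this PR is accurate.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a pooled HTTP/2 sender (not a mirrord or hyper type).
#[derive(Clone, Debug)]
struct FakeSender {
    id: u32,
    closed: Arc<AtomicBool>,
}

impl FakeSender {
    fn new(id: u32) -> Self {
        Self { id, closed: Arc::new(AtomicBool::new(false)) }
    }

    /// Models the `is_closed()` check mentioned in the PR description.
    fn is_closed(&self) -> bool {
        self.closed.load(Ordering::Acquire)
    }

    /// Simulates the peer closing the connection (e.g. after GOAWAY).
    fn close(&self) {
        self.closed.store(true, Ordering::Release);
    }
}

/// Hypothetical pool of idle senders, keyed by destination address.
#[derive(Default)]
struct Pool {
    idle: HashMap<String, Vec<FakeSender>>,
    next_id: u32,
}

impl Pool {
    /// Returns a sender to the pool for later reuse.
    fn put(&mut self, addr: &str, sender: FakeSender) {
        self.idle.entry(addr.to_owned()).or_default().push(sender);
    }

    /// Pops idle senders until a live one is found; closed ones are dropped.
    /// If none is left, "connects" by allocating a sender with a new id.
    /// The returned flag tells whether the sender was reused from the pool.
    fn get(&mut self, addr: &str) -> (FakeSender, bool) {
        if let Some(senders) = self.idle.get_mut(addr) {
            while let Some(sender) = senders.pop() {
                if !sender.is_closed() {
                    return (sender, true);
                }
                // Stale entry: discard it and keep looking.
            }
        }
        self.next_id += 1;
        (FakeSender::new(self.next_id), false)
    }
}
```

With this shape, a retry after a closed connection would get a fresh sender instead of the stale one, while live connections would still be reused. A check at retrieval time cannot rule out the connection closing between `get` and the actual send, so the existing retry path would presumably still be needed.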