fix: disable HTTP/2 connection pooling to prevent "channel closed" errors by zhenguo1492 · Pull Request #4274 · metalbear-co/mirrord

zhenguo1492 · 2026-05-18T05:05:47Z

Summary

Disable HTTP/2 connection pooling in ClientStore to fix "channel closed" errors when stealing HTTP/2 / gRPC traffic.

Problem

When the local application closes idle HTTP/2 connections (e.g. Quarkus gRPC with Vert.x sends GOAWAY after ~3s idle), the pooled http2::SendRequest sender becomes a dead reference. Subsequent requests that reuse this sender fail with:

WARN: failed to send the request to the local application's HTTP server: channel closed

The retry logic (can_retry()) correctly identifies SendFailed as retryable, but get_with_pooling() returns the same dead connection from the pool, causing all retries to fail.

Root Cause Timeline

1. First request arrives → make_client() creates new HTTP/2 connection → request succeeds
2. Connection goes idle for ~3s → local server sends GOAWAY → spawned connection task finishes normally
3. Next request arrives → wait_for_ready() returns cached dead sender from pool
4. sender.send_request() → "channel closed" (sender's internal channel already closed)
5. Retry → pool returns same dead sender → same error
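
The "dead sender" state in the timeline above can be modelled in miniature with a plain channel. This is only a sketch under stated assumptions: it uses std::sync::mpsc and a thread in place of hyper's internal channel and the spawned connection task, and open_connection is a hypothetical helper, not a mirrord function.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Hypothetical stand-in for opening a connection: the spawned task owns the
/// receiving end, handles `limit` requests, then exits (as a connection task
/// would after the peer closes the connection).
fn open_connection(limit: usize) -> (Sender<u32>, thread::JoinHandle<Vec<u32>>) {
    let (tx, rx) = channel();
    // When the closure returns, `rx` is dropped and the channel is closed.
    let task = thread::spawn(move || rx.into_iter().take(limit).collect());
    (tx, task)
}

/// Sends one request through a cached sender and reports whether it was accepted.
fn try_send(sender: &Sender<u32>, request: u32) -> bool {
    sender.send(request).is_ok()
}
```

Once the task has finished, every send through the cached sender fails, no matter how often it is retried. Whether the pool in mirrord actually hands such a sender back out is a separate question that this model does not answer.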

This was already documented in the code as a known issue on Windows (lines 90-92 in client_store.rs), but it equally affects Linux with any HTTP/2 server that has short idle timeouts.

Fix

Set should_enable_connection_pooling() to return false. Each request now creates a fresh HTTP/2 connection.

Better Alternative (suggestion)

A more targeted fix would be to check sender.is_closed() when retrieving from the pool and discard stale entries, or to have the spawned connection task notify the pool when the connection closes. Disabling pooling entirely is the minimal safe change.
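
A minimal sketch of the first alternative, assuming a pool that stores idle senders in a queue. Pool, PooledSender, take_live and FakeSender are hypothetical names for illustration and do not reflect mirrord's ClientStore; the trait mirrors the is_closed() method that hyper's http2::SendRequest provides.

```rust
use std::collections::VecDeque;

/// Hypothetical abstraction over a pooled HTTP/2 sender.
trait PooledSender {
    fn is_closed(&self) -> bool;
}

/// Hypothetical pool of idle senders (not mirrord's actual ClientStore).
struct Pool<S> {
    idle: VecDeque<S>,
}

impl<S: PooledSender> Pool<S> {
    fn new() -> Self {
        Self { idle: VecDeque::new() }
    }

    /// Stores an idle sender for later reuse.
    fn put(&mut self, sender: S) {
        self.idle.push_back(sender);
    }

    /// Pops idle senders until a live one is found, dropping closed ones.
    /// Returns `None` when the caller has to open a fresh connection.
    fn take_live(&mut self) -> Option<S> {
        while let Some(sender) = self.idle.pop_front() {
            if !sender.is_closed() {
                return Some(sender);
            }
            // Closed sender: dropped here, so it is never handed out again.
        }
        None
    }
}

/// Test double whose closed state is fixed at construction.
struct FakeSender {
    id: u32,
    closed: bool,
}

impl PooledSender for FakeSender {
    fn is_closed(&self) -> bool {
        self.closed
    }
}
```

A check like this is inherently racy: the connection can still close between the check and the send, so the existing retry path would remain necessary. It only narrows the window in which a stale sender is reused.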

Test Environment

mirrord 3.210.0 (OSS) steal mode
Istio 1.24.3 service mesh (mTLS, PERMISSIVE)
Quarkus 3.34.6 gRPC server (Vert.x transport, Virtual Threads)
Kubernetes 1.30.14
Envoy gRPC-JSON transcoder → gRPC backend

Test Plan

Verified grpcurl -plaintext localhost:8080 works (local gRPC server is functional)
Confirmed "channel closed" with connection pooling enabled (multiple attempts)
Confirmed zero "channel closed" errors with connection pooling disabled
Tested with Istio sidecar enabled (full mTLS mesh) — works correctly
Tested Quarkus dev mode hot reload through mirrord — works correctly

…rors

When the local application (e.g. Quarkus gRPC with Vert.x) closes idle HTTP/2 connections, the pooled sender becomes a dead reference. Subsequent requests that reuse this sender fail with "channel closed". The retry logic exists but also pulls from the pool, getting the same dead connection.

This was already documented in the code comments as a known issue on Windows, but it also affects Linux with any HTTP/2 server that has short idle timeouts.

Root cause timeline:

1. First request → new HTTP/2 connection → success
2. Connection idle ~3s → server sends GOAWAY → connection closes normally
3. Next request → pool returns cached dead sender → "channel closed"
4. Retry → pool returns same dead sender → "channel closed" again

The proper fix would be to check sender.is_closed() when retrieving from the pool, but disabling pooling is the minimal safe change.

Tested with: Istio 1.24 + Quarkus 3.34 gRPC + mirrord steal mode.

Razz4780 · 2026-05-19T10:02:08Z

Thanks for contributing to mirrord! 🤘

TBH I don't think the described root cause timeline applies. The sender is only returned to the pool explicitly, after a successful HTTP exchange, so the next attempt should use a new sender.
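
The pooling behaviour described in this comment can be sketched as a checkout-style pool: a sender leaves the pool when it is taken and re-enters only after a successful exchange. Store, Conn, checkout and send are hypothetical names for illustration; this is a model of the reviewer's description, not mirrord's code.

```rust
/// Hypothetical connection handle, identified by an id.
#[derive(Debug, PartialEq)]
struct Conn(u32);

/// Hypothetical store: idle connections plus a counter for new ones.
struct Store {
    idle: Vec<Conn>,
    next_id: u32,
}

impl Store {
    fn new() -> Self {
        Store { idle: Vec::new(), next_id: 0 }
    }

    /// Removes an idle connection from the pool, or opens a new one.
    fn checkout(&mut self) -> Conn {
        match self.idle.pop() {
            Some(conn) => conn,
            None => {
                self.next_id += 1;
                Conn(self.next_id)
            }
        }
    }

    /// Runs one exchange and returns (connection id, success).
    /// The connection goes back into the pool only on success.
    fn send(&mut self, exchange: impl FnOnce(&Conn) -> bool) -> (u32, bool) {
        let conn = self.checkout();
        let id = conn.0;
        let ok = exchange(&conn);
        if ok {
            self.idle.push(conn);
        }
        (id, ok)
    }
}
```

Under this model a failed exchange drops the connection instead of returning it, so a retry finds the pool empty and opens a new connection rather than reusing the one that just failed.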

aviramha requested a review from Razz4780 May 19, 2026 08:44

zhenguo1492 wants to merge 1 commit into metalbear-co:main from zhenguo1492:fix/http2-connection-pooling-channel-closed
