Lazy initialize remote streams#2601
Conversation
|
Thanks! The design intent of this change makes sense to me, though I haven't yet completed a detailed review.
It might be slightly smaller, I guess? Pretty marginal benefit in return for code complexity though. |
I reverted it, it's just an extra conditional in a few places to avoid queuing the event. |
Ralith
left a comment
There was a problem hiding this comment.
Implementation logic LGTM on first pass; mostly cosmetic issues. Would appreciate a careful second set of eyes on this though.
|
As usual, please squash commits that revise earlier commits in the same PR. |
ef9957b to
1b34ca7
Compare
|
I think this is ready, although I haven't tested outside of running the unit tests. |
Ralith
left a comment
There was a problem hiding this comment.
LGTM (though please rebase rather than merging when updating)
heh sorry not used to force pushing |
`StreamsState::new` used to eagerly insert `(id, None)` placeholder entries in both `send` and `recv` FxHashMaps for every remote stream id With `max_concurrent_*_streams = 10_000`, this would cost ~0.5-2 MB per Connection. We now only allocate based on the largest stream ID received thus far. Any stream ID higher than the max, but lower than the flow control limit, is implicitly in the default state. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
33f5a5a to
ebf9e43
Compare
Port of quinn-rs/quinn#2601. `StreamsState::new` used to eagerly insert `(id, None)` placeholder entries in both `send` and `recv` FxHashMaps for every remote stream id in `0..max_remote[dir]`. With `max_concurrent_*_streams = 10_000` (e.g. for MoQ relay nodes that burn through short-lived streams), hashbrown rounded each map's bucket array up to ~65K buckets, costing ~0.5-2 MB of bucket memory per Connection before any stream data was sent. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixes #2600.
Previously, we would insert into the send/recv HashMaps based on our advertised MAX_STREAMS amount. So if we establish a connection with MAX_STREAMS_UNI=10k and MAX_STREAMS_BI=10k, we'll end up allocating 30k HashMap entries. Obviously this takes up a lot of RAM, even if the connection only uses a handful of streams.
Now we only insert into the HashMap based on the maximum stream ID ever received. If the maximum we've seen is 3 and we get a stream for 6, then we'll insert 4..=6 into the HashMap. Senders typically open stream IDs in ascending order (although it's not a requirement) so the usual case means we insert once. An attacker could open the maximum stream ID allowed to cause the old behavior... but again that's the current behavior, and the whole point of MAX_STREAMS is meant to limit this resource exhaustion attack.
The reason why we can't insert only 6 into the HashMap is because we need some way of tracking what streams have ever been closed. With this change, we're using HashMap removal as the Closed state, but only for streams below next_remote (7 in this example). We would need some separate HashSet to store closed (or not closed) streams which just kind of defeats the entire purpose of inserting into the HashMap in the first place? IDK unless somebody can think of an alternative.