PLT-459: Block-indexed tx→inclusion tracker #51
Build the submit→inclusion correlation that did not exist: a new InclusionTracker indexes txHash→inclusion per block (one BlockByNumber per block, O(blocks) not O(txs)), matches against a bounded in-flight registry the worker populates at send-completion, and stamps LoadTx.InclusionTime from the including block's header-arrival wall clock (single-clock with IntendedSendTime; not header.Time, not fetch time).

Retires the lossy per-tx receipt polling (watchTransactions/waitForReceipt/sentTxs) and the coordinated-omission-contaminated receiptLatency metric; --track-receipts now enables the tracker.

Conservation (asserting test): registered == included + expired + inflight_at_shutdown, with registered ⊆ succeeded; dropped-at-cap and WS-gap misses are surfaced (counters), never leaked. No backfill: WS gaps degrade conservatively. Bounded+reaped map, -race clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
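To make the conservation identity concrete, here is a minimal sketch of a bounded in-flight registry. The type Registry, its methods, and the demo scenario are hypothetical illustrations and are not taken from stats/inclusion_tracker.go; the sketch only assumes the behavior described above (register at send-completion, match once per block, reap what never lands, count drops at cap).

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Registry is a hypothetical stand-in for the tracker's bounded in-flight map.
type Registry struct {
	mu          sync.Mutex
	maxInflight int
	inflight    map[string]time.Time // tx hash -> registration time

	registered, included, expired, droppedAtCap int
}

func NewRegistry(maxInflight int) *Registry {
	return &Registry{maxInflight: maxInflight, inflight: make(map[string]time.Time)}
}

// Register records a successfully sent tx. At cap the tx is counted as
// dropped and stays outside the registered set.
func (r *Registry) Register(hash string, now time.Time) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, dup := r.inflight[hash]; dup {
		return true // already tracked; do not double count
	}
	if len(r.inflight) >= r.maxInflight {
		r.droppedAtCap++
		return false
	}
	r.inflight[hash] = now
	r.registered++
	return true
}

// MatchBlock walks one block's tx hashes and stamps every registered hash
// with that block's header-arrival time.
func (r *Registry) MatchBlock(hashes []string, headerArrival time.Time) map[string]time.Time {
	r.mu.Lock()
	defer r.mu.Unlock()
	stamped := make(map[string]time.Time)
	for _, h := range hashes {
		if _, ok := r.inflight[h]; ok {
			delete(r.inflight, h)
			stamped[h] = headerArrival
			r.included++
		}
	}
	return stamped
}

// Reap expires entries that have been in flight for at least reapAfter.
func (r *Registry) Reap(now time.Time, reapAfter time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for h, t := range r.inflight {
		if now.Sub(t) >= reapAfter {
			delete(r.inflight, h)
			r.expired++
		}
	}
}

// Counts returns registered, included, expired, dropped-at-cap and in-flight.
func (r *Registry) Counts() (int, int, int, int, int) {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.registered, r.included, r.expired, r.droppedAtCap, len(r.inflight)
}

func demo() (int, int, int, int, int) {
	t0 := time.Unix(1000, 0)
	r := NewRegistry(3)
	for _, h := range []string{"a", "b", "c", "d"} {
		r.Register(h, t0) // "d" is dropped at cap
	}
	r.MatchBlock([]string{"a", "x"}, t0.Add(2*time.Second)) // "x" is not ours
	r.Register("e", t0.Add(20*time.Second))
	r.Reap(t0.Add(31*time.Second), 30*time.Second) // "b" and "c" expire
	return r.Counts()
}

func main() {
	reg, inc, exp, dropped, inflight := demo()
	fmt.Println(reg, inc, exp, dropped, inflight)
	fmt.Println("conserved:", reg == inc+exp+inflight)
}
```

In this scenario the dropped tx never enters registered, so the identity holds with dropped_at_cap sitting outside it.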
PR Summary: Medium Risk. Overview: Workers drop … Adds OTel … Reviewed by Cursor Bugbot for commit 740eef6.
- Skip the inclusion-latency sample when IntendedSendTime is zero (prewarm txs are never scheduled) so the histogram isn't polluted with epoch-based durations (systems review).
- Don't wire the inclusion tracker under --dry-run: simulated sends never hit the chain and would all reap as expired (security review LOW-1).
- processHead short-circuits on a duplicate/out-of-order head (num <= lastSeen): no redundant re-fetch, no spurious gap count (systems nit).
- Note the ~2x reapAfter worst-case eviction latency (security INFO-1).
- Conservation test now exercises dropped_at_cap within the identity, proving it sits outside the registered set (systems nit).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ursor) A non-positive maxInflight made len(inflight) >= cap always true, so every Register dropped — silently disabling inclusion tracking when --max-in-flight is 0 (e.g. closed-loop + --track-receipts). NewInclusionTracker now floors a non-positive cap to defaultMaxInflight. Test added. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
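The failure mode is easy to reproduce in isolation. A minimal sketch follows; the value of defaultMaxInflight and the helper names effectiveCap and wouldDrop are assumptions for illustration, since the real constant and constructor body are not shown here.

```go
package main

import "fmt"

// defaultMaxInflight is an assumed value; the real default is not stated in this PR text.
const defaultMaxInflight = 1024

// effectiveCap floors a non-positive cap to the default (hypothetical helper).
func effectiveCap(maxInflight int) int {
	if maxInflight <= 0 {
		return defaultMaxInflight
	}
	return maxInflight
}

// wouldDrop mirrors the at-cap check: Register drops when len(inflight) >= cap.
func wouldDrop(inflight, limit int) bool {
	return inflight >= limit
}

func main() {
	// With an unfloored cap of 0, even the first Register on an empty map drops.
	fmt.Println(wouldDrop(0, 0))
	// With the floor applied, the empty registry accepts entries again.
	fmt.Println(wouldDrop(0, effectiveCap(0)))
}
```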
A failed BlockByNumber left the block's txs unmatched while lastSeen still advanced, so they reaped as expired with no signal. Count failures in a block_fetch_errors metric and document the conservative-undercount boundary (same treatment as a WS gap); no retry, to avoid adding RPC load to a struggling SUT. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
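A rough sketch of the head-processing path, combining the duplicate/out-of-order short-circuit, the gap counter, and the fetch-error counter. The headTracker type, its fields, and the choice to count gaps in missed blocks (rather than gap events) are assumptions; the fetch callback stands in for BlockByNumber.

```go
package main

import (
	"errors"
	"fmt"
)

// headTracker is a hypothetical reduction of the tracker's head handling.
type headTracker struct {
	seeded      bool
	lastSeen    uint64
	gaps        uint64   // stands in for block_gaps
	fetchErrors uint64   // stands in for block_fetch_errors
	fetched     []uint64 // blocks whose bodies were fetched and matched
	fetch       func(num uint64) error
}

func (h *headTracker) processHead(num uint64) {
	if h.seeded && num <= h.lastSeen {
		return // duplicate or out-of-order head: no re-fetch, no gap count
	}
	if h.seeded && num > h.lastSeen+1 {
		h.gaps += num - h.lastSeen - 1 // missed heads are counted, never backfilled
	}
	h.seeded = true
	h.lastSeen = num // advances even if the fetch below fails
	if err := h.fetch(num); err != nil {
		h.fetchErrors++ // this block's txs will later reap as expired; no retry
		return
	}
	h.fetched = append(h.fetched, num)
}

func demo() *headTracker {
	h := &headTracker{fetch: func(num uint64) error {
		if num == 15 {
			return errors.New("rpc unavailable")
		}
		return nil
	}}
	for _, n := range []uint64{10, 11, 11, 10, 14, 15} {
		h.processHead(n)
	}
	return h
}

func main() {
	h := demo()
	fmt.Println(h.fetched, h.gaps, h.fetchErrors, h.lastSeen)
}
```

Heads 12 and 13 show up only in the gap counter and block 15 only in the error counter, which is the conservative-undercount boundary described above.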
Blockers

1. Late register can miss one-shot block match

Successful sends are registered with the inclusion tracker after

Location:

Suggested fix: register immediately after a successful send, before

    err = w.sendTransaction(ctx, client, tx)
    if err == nil {
        if t, ok := w.cfg.Inclusion.Get(); ok {
            t.Register(tx)
        }
    }
    if tx.OnComplete != nil {
        tx.OnComplete(err)
    }

Update

2. Registry cap is undersized for sustained TPS

The inclusion registry cap is set from
not as

Location:

Suggested fix (pick one):

At minimum, log a warning when

Non-blockers
Issue | Severity | Notes
-- | -- | --
inclusion_latency in closed-loop | Nit | Histogram records arrival - IntendedSendTime, but in closed-loop IntendedSendTime is enqueue time, not schedule time. Fine if PLT-462 is open-loop-only; otherwise gate on RunSummary.ArrivalModel or use AttemptedSendTime.
reapAfter hardcoded at 30s | Nit | Congested chains with >30s inclusion will inflate expired. Consider exposing as a flag or tying to expected block time × N.
README stale | Nit | Still says --track-receipts tracks "transaction receipts"; behavior is now block-indexed inclusion.
Fetch failure = permanent miss | Accepted | block_fetch_errors surfaces it; conservative undercount is documented. Reasonable tradeoff vs retry storm on a struggling SUT.
- inclusion_latency recorded only on open-loop runs: closed-loop IntendedSendTime is enqueue time, so arrival-IntendedSendTime would mix an enqueue→inclusion latency into the histogram. Tracker is told the model at construction; included/expired counts still accrue in both.
- reapAfter is now configurable (--inclusion-reap-after / inclusionReapAfter, default 30s) so congested chains with >30s inclusion don't inflate expired.
- README: --track-receipts now documents the block-indexed inclusion tracker, not per-tx receipts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
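The recording rule for inclusion_latency can be sketched as a single predicate. The ArrivalModel type and the inclusionLatency function below are hypothetical names; the sketch assumes only what the commits describe: record on open-loop runs, and skip txs whose IntendedSendTime is zero.

```go
package main

import (
	"fmt"
	"time"
)

// ArrivalModel is a hypothetical enum for the run's arrival model.
type ArrivalModel int

const (
	OpenLoop ArrivalModel = iota
	ClosedLoop
)

// inclusionLatency returns the sample and whether it should be recorded.
// Closed-loop runs and zero IntendedSendTime (prewarm) produce no sample.
func inclusionLatency(model ArrivalModel, intended, headerArrival time.Time) (time.Duration, bool) {
	if model != OpenLoop || intended.IsZero() {
		return 0, false
	}
	return headerArrival.Sub(intended), true
}

func demo() (int64, bool, bool, bool) {
	intended := time.Unix(1000, 0)
	arrival := intended.Add(1500 * time.Millisecond)

	d, openOK := inclusionLatency(OpenLoop, intended, arrival)
	_, closedOK := inclusionLatency(ClosedLoop, intended, arrival)
	_, prewarmOK := inclusionLatency(OpenLoop, time.Time{}, arrival)
	return d.Milliseconds(), openOK, closedOK, prewarmOK
}

func main() {
	ms, openOK, closedOK, prewarmOK := demo()
	fmt.Println(ms, openOK, closedOK, prewarmOK)
}
```

Only the histogram sample is gated here; included/expired counting would still run in both models.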
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
There are 3 total unresolved issues (including 2 from previous reviews).
Reviewed by Cursor Bugbot for commit b68dbf6.
- reapAfter<=0 (explicit --inclusion-reap-after=0) made reapLoop call time.NewTicker(0), which panics and crashes the tracker. Floor to defaultInclusionReapAfter (30s), mirroring the maxInflight<=0 floor.
- The registry cap was MaxInFlight×4, but MaxInFlight bounds concurrent SENDS while a registry entry lives from send to inclusion (much longer). By Little's law size it as max(MaxInFlight×4, ceil(TPS×reapAfter×1.5)) so healthy high-TPS runs don't hit dropped_at_cap and undercount inclusion.
- Document the late-register one-shot-match race as an accepted boundary (microsecond window vs block time; rare conservative undercount).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
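A worked sketch of the sizing rule and the reapAfter floor. The function names registryCap and effectiveReapAfter are hypothetical; the formula and the 30s default are the ones stated in the commit message above.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

const defaultInclusionReapAfter = 30 * time.Second

// effectiveReapAfter floors a non-positive duration so a ticker is never
// created with a zero or negative interval.
func effectiveReapAfter(d time.Duration) time.Duration {
	if d <= 0 {
		return defaultInclusionReapAfter
	}
	return d
}

// registryCap sizes the registry as max(MaxInFlight*4, ceil(TPS*reapAfter*1.5)).
// By Little's law, TPS*reapAfter bounds the entries alive at once when every
// entry lives at most reapAfter; 1.5 is the headroom factor from the commit.
func registryCap(maxInFlight int, tps float64, reapAfter time.Duration) int {
	reapAfter = effectiveReapAfter(reapAfter)
	byConcurrency := maxInFlight * 4
	byLittle := int(math.Ceil(tps * reapAfter.Seconds() * 1.5))
	if byLittle > byConcurrency {
		return byLittle
	}
	return byConcurrency
}

func main() {
	// 1000 TPS with a 30s reap window: 1000 * 30 * 1.5 = 45000 entries.
	fmt.Println(registryCap(100, 1000, 30*time.Second))
	// At 1 TPS the concurrency term dominates: 100 * 4 = 400.
	fmt.Println(registryCap(100, 1, 30*time.Second))
	fmt.Println(effectiveReapAfter(0))
}
```

At 1000 TPS the old MaxInFlight×4 cap of 400 would fill in under half a second, which is why the Little's-law term is needed.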
Compute schedule_lag = AttemptedSendTime - IntendedSendTime per open-loop tx (bounded reservoir, Algorithm R), expose p99 every run, and render a run VERDICT: VOID when schedule_lag_p99 > threshold x (1/lambda) — a generator-bound run is void, not a footnote. Threshold is a named const (0.10, 'tune from first calibration run'), overridable via config. Gated on the actual arrival model (closed-loop / ramped-lambda => N/A); prewarm and zero-IntendedSendTime txs excluded. Stacked on PLT-459 (#51): needs the inclusion run-summary surface. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
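A sketch of the schedule_lag pipeline under stated assumptions: a bounded reservoir filled with Algorithm R, a nearest-rank p99, and the VOID rule p99 > threshold × (1/lambda). The Reservoir type, the seeded RNG, the nearest-rank choice, and the "VALID" label for a non-void run are all assumptions; only VOID, N/A, and the 0.10 threshold come from the description above.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
	"time"
)

// Reservoir keeps a uniform sample of at most k durations (Algorithm R).
type Reservoir struct {
	k       int
	seen    int64
	samples []time.Duration
	rng     *rand.Rand
}

func NewReservoir(k int, seed int64) *Reservoir {
	return &Reservoir{k: k, rng: rand.New(rand.NewSource(seed))}
}

// Add keeps the first k items; the i-th item (i > k) replaces a uniformly
// chosen slot with probability k/i.
func (r *Reservoir) Add(d time.Duration) {
	r.seen++
	if len(r.samples) < r.k {
		r.samples = append(r.samples, d)
		return
	}
	if j := r.rng.Int63n(r.seen); j < int64(r.k) {
		r.samples[j] = d
	}
}

// P99 returns the nearest-rank 99th percentile of the retained samples.
func (r *Reservoir) P99() time.Duration {
	n := len(r.samples)
	if n == 0 {
		return 0
	}
	s := append([]time.Duration(nil), r.samples...)
	sort.Slice(s, func(i, j int) bool { return s[i] < s[j] })
	return s[(99*n+99)/100-1] // ceil(0.99*n) - 1 in integer arithmetic
}

// scheduleLagThreshold is the named constant from the description (0.10).
const scheduleLagThreshold = 0.10

// verdict returns "VOID" when p99 exceeds threshold * (1/lambda), with lambda
// in tx/s. A non-positive lambda stands in for "not open-loop" here.
func verdict(p99 time.Duration, lambda float64) string {
	if lambda <= 0 {
		return "N/A"
	}
	budget := time.Duration(scheduleLagThreshold / lambda * float64(time.Second))
	if p99 > budget {
		return "VOID"
	}
	return "VALID" // assumed label; the source only names VOID and N/A
}

func main() {
	r := NewReservoir(1000, 1)
	for i := 0; i < 98; i++ {
		r.Add(100 * time.Microsecond)
	}
	r.Add(5 * time.Millisecond)
	r.Add(5 * time.Millisecond)
	p99 := r.P99()
	// lambda = 100 tx/s gives a budget of about 0.10 * 10ms = 1ms.
	fmt.Println(p99, verdict(p99, 100))
}
```

With 2 of 100 sends lagging 5ms against a roughly 1ms budget, the p99 lands on a lagging sample and the run is marked VOID.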

Implements PLT-459 (M1.3) — the submit→inclusion correlation that did not exist in sei-load.
What

- stats/inclusion_tracker.go: per block, fetch the body once (O(blocks)), match tx hashes against a bounded in-flight registry, stamp LoadTx.InclusionTime. Reaps un-included txs.
- Retires the lossy per-tx receipt polling (watchTransactions/waitForReceipt/sentTxs) + the contaminated receiptLatency metric. --track-receipts now enables the tracker.

Locked design decisions (cohort-reviewed)

- InclusionTime = wall-clock at the including block's newHead arrival (single-clock with IntendedSendTime; not header.Time, not fetch-completion). PLT-462's histogram builds on this.
- lastSeen to first head; WS gaps degrade conservatively (unmatched → expired) and are surfaced via a block_gaps counter.
- Endpoints[0] (no new flag); O(blocks) read load documented as accepted.

Conservation

registered == included + expired + inflight_at_shutdown, registered ⊆ succeeded (register only successful sends, at send-completion after OnComplete). dropped_at_cap tracked + excluded from the inclusion denominator (succeeded/txs_accepted). Asserting + concurrent -race tests.

Verify

make lint 0 issues · go build ./... · go test -race ./... green.

🤖 Generated with Claude Code
📐 Design: decision brief — PLT-459 · parent design