PLT-459: Block-indexed tx→inclusion tracker by bdchatham · Pull Request #51 · sei-protocol/sei-load

bdchatham · 2026-06-15T21:59:35Z

Implements PLT-459 (M1.3) — the submit→inclusion correlation that did not exist in sei-load.

What

New stats/inclusion_tracker.go: per block, fetch the body once (O(blocks)), match tx hashes against a bounded in-flight registry, stamp LoadTx.InclusionTime. Reaps un-included txs.
Retires lossy per-tx receipt polling (watchTransactions/waitForReceipt/sentTxs) + the contaminated receiptLatency metric. --track-receipts now enables the tracker.
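The matching step described above can be sketched roughly as follows. This is a minimal illustration, not the code in stats/inclusion_tracker.go: the `Tracker` type, its fields, and the method signatures are assumptions, and the block body is reduced to a slice of hash strings.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// LoadTx is a stand-in for the load generator's transaction record.
type LoadTx struct {
	Hash          string
	InclusionTime time.Time
}

// Tracker is a hypothetical, simplified inclusion tracker.
type Tracker struct {
	mu       sync.Mutex
	limit    int
	inflight map[string]*LoadTx
	included int
	dropped  int
}

func NewTracker(limit int) *Tracker {
	return &Tracker{limit: limit, inflight: make(map[string]*LoadTx)}
}

// Register adds a successfully sent tx. At the cap the tx is counted as
// dropped instead of being stored, which keeps the map bounded.
func (t *Tracker) Register(tx *LoadTx) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if len(t.inflight) >= t.limit {
		t.dropped++
		return false
	}
	t.inflight[tx.Hash] = tx
	return true
}

// MatchBlock stamps every registered tx found in the block body with the
// wall-clock time at which the block's head arrived, then forgets it.
func (t *Tracker) MatchBlock(hashes []string, headArrival time.Time) int {
	t.mu.Lock()
	defer t.mu.Unlock()
	matched := 0
	for _, h := range hashes {
		if tx, ok := t.inflight[h]; ok {
			tx.InclusionTime = headArrival
			delete(t.inflight, h)
			t.included++
			matched++
		}
	}
	return matched
}

func main() {
	tr := NewTracker(2)
	tr.Register(&LoadTx{Hash: "0xa"})
	tr.Register(&LoadTx{Hash: "0xb"})
	tr.Register(&LoadTx{Hash: "0xc"}) // dropped: registry is at cap
	n := tr.MatchBlock([]string{"0xa", "0xz"}, time.Now())
	fmt.Println(n, tr.included, tr.dropped, len(tr.inflight))
}
```

One map lookup per hash in the block keeps the work proportional to block size, and no RPC call is made per transaction.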

Locked design decisions (cohort-reviewed)

Clock: InclusionTime = wall-clock at the including block's newHead arrival (single-clock with IntendedSendTime; not header.Time, not fetch-completion). PLT-462's histogram builds on this.
No backfill: startup inits lastSeen to first head; WS gaps degrade conservatively (unmatched → expired) and are surfaced via a block_gaps counter.
Fetch endpoint: Endpoints[0] (no new flag); O(blocks) read load documented as accepted.
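A rough sketch of the no-backfill head bookkeeping, under stated assumptions: `headState` and its fields are hypothetical names, the first head is assumed to be fetched like any other, and block_gaps is assumed to count missed blocks rather than gap events. The duplicate or out-of-order short-circuit follows the behavior described later in this PR.

```go
package main

import "fmt"

// headState is a hypothetical reduction of the tracker's head bookkeeping.
type headState struct {
	started   bool
	lastSeen  uint64
	blockGaps uint64
}

// processHead reports whether the block body for num should be fetched.
func (s *headState) processHead(num uint64) bool {
	if !s.started {
		// Startup: adopt the first head as the baseline, no backfill.
		s.started = true
		s.lastSeen = num
		return true
	}
	if num <= s.lastSeen {
		// Duplicate or out-of-order head: no re-fetch, no gap counted.
		return false
	}
	if num > s.lastSeen+1 {
		// Heads were missed on the subscription. They are surfaced in a
		// counter and never fetched; their txs later reap as expired.
		s.blockGaps += num - s.lastSeen - 1
	}
	s.lastSeen = num
	return true
}

func main() {
	s := &headState{}
	for _, n := range []uint64{100, 101, 101, 104, 103} {
		fmt.Println(n, s.processHead(n))
	}
	fmt.Println("gaps:", s.blockGaps)
}
```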

Conservation

registered == included + expired + inflight_at_shutdown, registered ⊆ succeeded (register only successful sends, at send-completion after OnComplete). dropped_at_cap tracked + excluded from the inclusion denominator (succeeded/txs_accepted). Asserting + concurrent -race tests.
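The identity can be expressed as a small check over the final counters. The `Counts` struct and `Check` method below are illustrative names, not the PR's test code; only the two relations stated above are asserted, and `DroppedAtCap` is carried separately because it sits outside the registered set.

```go
package main

import "fmt"

// Counts mirrors the counters named in the conservation statement.
type Counts struct {
	Succeeded          uint64
	Registered         uint64
	Included           uint64
	Expired            uint64
	InflightAtShutdown uint64
	DroppedAtCap       uint64 // tracked, but not part of the identity
}

// Check returns an error if either stated relation is violated.
func (c Counts) Check() error {
	if got := c.Included + c.Expired + c.InflightAtShutdown; c.Registered != got {
		return fmt.Errorf("conservation violated: registered=%d, included+expired+inflight=%d",
			c.Registered, got)
	}
	if c.Registered > c.Succeeded {
		return fmt.Errorf("registered=%d exceeds succeeded=%d", c.Registered, c.Succeeded)
	}
	return nil
}

func main() {
	ok := Counts{Succeeded: 100, Registered: 90, Included: 80, Expired: 7,
		InflightAtShutdown: 3, DroppedAtCap: 10}
	fmt.Println(ok.Check())

	leaky := ok
	leaky.Expired = 6 // one registered tx is unaccounted for
	fmt.Println(leaky.Check())
}
```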

Verify

make lint 0 issues · go build ./... · go test -race ./... green.

🤖 Generated with Claude Code

📐 Design: decision brief — PLT-459 · parent design

Build the submit→inclusion correlation that did not exist: a new InclusionTracker indexes txHash→inclusion per block (one BlockByNumber per block, O(blocks) not O(txs)), matches against a bounded in-flight registry the worker populates at send-completion, and stamps LoadTx.InclusionTime from the including block's header-arrival wall clock (single-clock with IntendedSendTime; not header.Time, not fetch time).

Retires the lossy per-tx receipt polling (watchTransactions/waitForReceipt/sentTxs) and the coordinated-omission-contaminated receiptLatency metric; --track-receipts now enables the tracker.

Conservation (asserting test): registered == included + expired + inflight_at_shutdown, with registered ⊆ succeeded; dropped-at-cap and WS-gap misses are surfaced (counters), never leaked. No backfill: WS gaps degrade conservatively. Bounded+reaped map, -race clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

cursor · 2026-06-15T21:59:41Z

PR Summary

Medium Risk
Changes core load-test observability and RPC usage (WS heads + block bodies on Endpoints[0]); inclusion counts can undercount on gaps/fetch failures by design, but send path behavior is unchanged aside from removed receipt polling.

Overview
--track-receipts now turns on a block-indexed inclusion tracker instead of per-tx receipt polling. Successful sends are registered after OnComplete; the tracker matches txs against one block fetch per head, stamps LoadTx.InclusionTime, reaps stale entries, and reports included / expired / dropped-at-cap / inflight-at-shutdown.

Workers drop watchTransactions, sentTxs, and the receipt_latency metric. NewShardedSender takes an optional shared InclusionTracker; wiring is skipped for --dry-run. New --inclusion-reap-after (default 30s) and inclusionRegistryCap (TPS × reap window vs MaxInFlight) size the registry.

Adds OTel inclusion_* / block_gaps metrics and RunSummary inclusion fields; docs and tests cover conservation and worker register ordering.

Reviewed by Cursor Bugbot for commit 740eef6. Bugbot is set up for automated code reviews on this repo.

- Skip the inclusion-latency sample when IntendedSendTime is zero (prewarm txs are never scheduled) so the histogram isn't polluted with epoch-based durations (systems review).
- Don't wire the inclusion tracker under --dry-run: simulated sends never hit the chain and would all reap as expired (security review LOW-1).
- processHead short-circuits on a duplicate/out-of-order head (num <= lastSeen): no redundant re-fetch, no spurious gap count (systems nit).
- Note the ~2x reapAfter worst-case eviction latency (security INFO-1).
- Conservation test now exercises dropped_at_cap within the identity, proving it sits outside the registered set (systems nit).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
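The ~2x reapAfter worst case follows if the reaper ticks once per reapAfter, which is an assumption here: an entry registered just after a tick is still too young at the next tick and is only evicted at the one after. The `registry` type, the `reap` function, and the `>=` age comparison are hypothetical; the timestamps are synthetic.

```go
package main

import (
	"fmt"
	"time"
)

// registry maps tx hash to registration time (hypothetical layout).
type registry map[string]time.Time

const window = 30 * time.Second // reapAfter, using the 30s default

// epoch is an arbitrary instant standing in for a reaper tick.
var epoch = time.Date(2026, 6, 15, 22, 0, 0, 0, time.UTC)

// reap removes entries at least reapAfter old and returns how many expired.
func reap(reg registry, now time.Time, reapAfter time.Duration) int {
	expired := 0
	for h, at := range reg {
		if now.Sub(at) >= reapAfter {
			delete(reg, h) // deleting during range is safe in Go
			expired++
		}
	}
	return expired
}

func main() {
	// Registered one second after the tick at epoch.
	reg := registry{"0xa": epoch.Add(time.Second)}

	// Next tick: the entry is 29s old, so it survives.
	fmt.Println(reap(reg, epoch.Add(window), window))

	// Following tick: the entry is 59s old, close to 2x reapAfter.
	fmt.Println(reap(reg, epoch.Add(2*window), window))
}
```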

…ursor)

A non-positive maxInflight made len(inflight) >= cap always true, so every Register dropped, silently disabling inclusion tracking when --max-in-flight is 0 (e.g. closed-loop + --track-receipts). NewInclusionTracker now floors a non-positive cap to defaultMaxInflight. Test added.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
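A small sketch of why the floor matters. The helper names `effectiveCap` and `wouldDrop` are illustrative, and the value given to `defaultMaxInflight` is a placeholder, since the actual constant is not shown in this thread.

```go
package main

import "fmt"

// defaultMaxInflight is a placeholder value; the real constant is not
// shown in this thread.
const defaultMaxInflight = 10_000

// effectiveCap floors a non-positive cap to the default.
func effectiveCap(maxInflight int) int {
	if maxInflight <= 0 {
		return defaultMaxInflight
	}
	return maxInflight
}

// wouldDrop is the cap check performed on each register attempt.
func wouldDrop(registered, limit int) bool {
	return registered >= limit
}

func main() {
	// Unfloored zero cap: even an empty registry rejects the first tx.
	fmt.Println(wouldDrop(0, 0))
	// With the floor applied, the first tx is accepted.
	fmt.Println(wouldDrop(0, effectiveCap(0)))
}
```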

A failed BlockByNumber left the block's txs unmatched while lastSeen still advanced, so they reaped as expired with no signal. Count failures in a block_fetch_errors metric and document the conservative-undercount boundary (same treatment as a WS gap); no retry, to avoid adding RPC load to a struggling SUT.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
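The count-and-move-on handling might look roughly like this. The `blockStats` type, the `handle` method, and the injected `fetch` and `match` functions are assumptions made so the sketch runs without an RPC endpoint.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnavailable simulates a failing block fetch.
var errUnavailable = errors.New("rpc unavailable")

// fetchFunc stands in for a BlockByNumber call that returns tx hashes.
type fetchFunc func(num uint64) ([]string, error)

// blockStats is a hypothetical holder for the relevant state and counter.
type blockStats struct {
	lastSeen         uint64
	blockFetchErrors uint64
}

// handle advances lastSeen, then tries the fetch exactly once. A failure
// is counted and the block is skipped; there is no retry.
func (s *blockStats) handle(num uint64, fetch fetchFunc, match func([]string)) {
	s.lastSeen = num
	hashes, err := fetch(num)
	if err != nil {
		s.blockFetchErrors++
		return
	}
	match(hashes)
}

func main() {
	s := &blockStats{}
	matched := 0
	match := func(h []string) { matched += len(h) }

	s.handle(7, func(uint64) ([]string, error) { return nil, errUnavailable }, match)
	s.handle(8, func(uint64) ([]string, error) { return []string{"0xa"}, nil }, match)

	fmt.Println(s.lastSeen, s.blockFetchErrors, matched)
}
```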

amir-deris · 2026-06-15T22:33:27Z

Blockers

1. Late register can miss one-shot block match

Successful sends are registered with the inclusion tracker after OnComplete and RecordTransaction, while each block is matched exactly once when its newHead is handled. If matchBlock for the including block runs before Register inserts the hash, that tx is never matched again and later reaps as expired.

Location: sender/worker.go (register after OnComplete)

Suggested fix: register immediately after a successful send, before OnComplete:

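    err = w.sendTransaction(ctx, client, tx)
    if err == nil {
        // Register before OnComplete so the hash is already in the
        // registry when the including block is matched.
        if t, ok := w.cfg.Inclusion.Get(); ok {
            t.Register(tx)
        }
    }
    if tx.OnComplete != nil {
        tx.OnComplete(err)
    }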

Update TestRunTxSender_RegistersSuccessfulSend accordingly. registered ⊆ succeeded still holds because registration remains gated on err == nil.

2. Registry cap is undersized for sustained TPS

The inclusion registry cap is set from MaxInFlight × 4, but open-loop permits are released when each send finishes while entries stay registered until a block match or reapAfter (30s). Under sustained TPS, registry size scales roughly as:

TPS × inclusion_latency

not as MaxInFlight. Example: 2k TPS × 30s reap window → ~60k entries; default cap is 40k (10_000 × 4). Healthy txs can hit dropped_at_cap and never get matched — a silent inclusion undercount.

Location: main.go (MaxInFlight * maxInflightMultiple)

Suggested fix (pick one):

Size cap from TPS × inclusionReapAfter (with a floor)
Add --inclusion-max-inflight (default derived)
Document that --track-receipts requires tuning max-in-flight for high-TPS runs and increase the multiplier substantially

At minimum, log a warning when dropped_at_cap > 0 at shutdown.

Non-blockers

| Issue | Severity | Notes |
| --- | --- | --- |
| inclusion_latency in closed-loop | Nit | Histogram records arrival - IntendedSendTime, but in closed-loop IntendedSendTime is enqueue time, not schedule time. Fine if PLT-462 is open-loop-only; otherwise gate on RunSummary.ArrivalModel or use AttemptedSendTime. |
| reapAfter hardcoded at 30s | Nit | Congested chains with >30s inclusion will inflate expired. Consider exposing as a flag or tying to expected block time × N. |
| README stale | Nit | Still says --track-receipts tracks "transaction receipts"; behavior is now block-indexed inclusion. |
| Fetch failure = permanent miss | Accepted | block_fetch_errors surfaces it; conservative undercount is documented. Reasonable tradeoff vs retry storm on a struggling SUT. |

- inclusion_latency recorded only on open-loop runs: closed-loop IntendedSendTime is enqueue time, so arrival-IntendedSendTime would mix an enqueue→inclusion latency into the histogram. Tracker is told the model at construction; included/expired counts still accrue in both.
- reapAfter is now configurable (--inclusion-reap-after / inclusionReapAfter, default 30s) so congested chains with >30s inclusion don't inflate expired.
- README: --track-receipts now documents the block-indexed inclusion tracker, not per-tx receipts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
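A hedged sketch of the sampling gate: a latency sample is produced only on open-loop runs and only for txs that were actually scheduled. The `latencySample` function and the boolean `openLoop` parameter are illustrative; the tracker's real representation of the arrival model is not shown in this thread.

```go
package main

import (
	"fmt"
	"time"
)

// LoadTx carries the two timestamps the histogram is derived from.
type LoadTx struct {
	IntendedSendTime time.Time
	InclusionTime    time.Time
}

// latencySample returns the inclusion latency and whether it should be
// recorded. Closed-loop runs and unscheduled (zero IntendedSendTime) txs
// are skipped, so no epoch-based durations reach the histogram.
func latencySample(tx *LoadTx, openLoop bool) (time.Duration, bool) {
	if !openLoop || tx.IntendedSendTime.IsZero() || tx.InclusionTime.IsZero() {
		return 0, false
	}
	return tx.InclusionTime.Sub(tx.IntendedSendTime), true
}

func main() {
	sent := time.Date(2026, 6, 15, 22, 0, 0, 0, time.UTC)
	scheduled := &LoadTx{IntendedSendTime: sent, InclusionTime: sent.Add(800 * time.Millisecond)}
	prewarm := &LoadTx{InclusionTime: sent} // never scheduled

	fmt.Println(latencySample(scheduled, true))  // recorded
	fmt.Println(latencySample(scheduled, false)) // closed-loop: skipped
	fmt.Println(latencySample(prewarm, true))    // prewarm: skipped
}
```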

cursor

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

There are 3 total unresolved issues (including 2 from previous reviews).


Reviewed by Cursor Bugbot for commit b68dbf6.

- reapAfter<=0 (explicit --inclusion-reap-after=0) made reapLoop call time.NewTicker(0), which panics and crashes the tracker. Floor to defaultInclusionReapAfter (30s), mirroring the maxInflight<=0 floor.
- The registry cap was MaxInFlight×4, but MaxInFlight bounds concurrent SENDS while a registry entry lives from send to inclusion (much longer). By Little's law, size it as max(MaxInFlight×4, ceil(TPS×reapAfter×1.5)) so healthy high-TPS runs don't hit dropped_at_cap and undercount inclusion.
- Document the late-register one-shot-match race as an accepted boundary (microsecond window vs block time; rare conservative undercount).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
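The sizing rule and the reapAfter floor can be worked through with the numbers from the review (2k TPS, 30s window, MaxInFlight 10_000). The function names and the `headroom` constant name are illustrative; the formula is the one stated in the commit message.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

const (
	defaultInclusionReapAfter = 30 * time.Second
	maxInflightMultiple       = 4
	headroom                  = 1.5 // safety factor from the stated formula
)

// flooredReapAfter guards against a non-positive duration, which would
// make time.NewTicker panic.
func flooredReapAfter(d time.Duration) time.Duration {
	if d <= 0 {
		return defaultInclusionReapAfter
	}
	return d
}

// registryCap returns max(MaxInFlight×4, ceil(TPS×reapAfter×1.5)).
func registryCap(maxInFlight int, tps float64, reapAfter time.Duration) int {
	bySends := maxInFlight * maxInflightMultiple
	byRate := int(math.Ceil(tps * flooredReapAfter(reapAfter).Seconds() * headroom))
	if byRate > bySends {
		return byRate
	}
	return bySends
}

func main() {
	fmt.Println(registryCap(10_000, 2_000, 30*time.Second)) // 90000: rate term dominates
	fmt.Println(registryCap(10_000, 100, 30*time.Second))   // 40000: send term dominates
	fmt.Println(registryCap(10_000, 2_000, 0))              // 90000: zero window floored to 30s
}
```

At 2k TPS the steady-state registry can hold about 60k entries (2,000 × 30), so the previous 40k cap would drop healthy txs, while the rate-based term leaves 1.5× headroom.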

Compute schedule_lag = AttemptedSendTime - IntendedSendTime per open-loop tx (bounded reservoir, Algorithm R), expose p99 every run, and render a run VERDICT: VOID when schedule_lag_p99 > threshold × (1/lambda): a generator-bound run is void, not a footnote. Threshold is a named const (0.10, 'tune from first calibration run'), overridable via config. Gated on the actual arrival model (closed-loop / ramped-lambda => N/A); prewarm and zero-IntendedSendTime txs excluded. Stacked on PLT-459 (#51): needs the inclusion run-summary surface.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
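A minimal sketch of the schedule_lag gate, assuming a nearest-rank p99 over a reservoir filled with Algorithm R. The `reservoir` type, `generatorBound`, and the quantile method are hypothetical names; lambda is taken to be the target arrival rate in tx/s, and the lag values are synthetic.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
	"time"
)

const defaultThreshold = 0.10 // named const from the description

// reservoir keeps a bounded uniform sample of a stream (Algorithm R).
type reservoir struct {
	size    int
	seen    int64
	samples []time.Duration
	rng     *rand.Rand
}

func newReservoir(size int, seed int64) *reservoir {
	return &reservoir{size: size, rng: rand.New(rand.NewSource(seed))}
}

// add fills the reservoir first; afterwards the i-th item replaces a
// random slot with probability size/i.
func (r *reservoir) add(d time.Duration) {
	r.seen++
	if len(r.samples) < r.size {
		r.samples = append(r.samples, d)
		return
	}
	if j := r.rng.Int63n(r.seen); j < int64(r.size) {
		r.samples[j] = d
	}
}

// quantile returns the nearest-rank q-quantile of the sampled values.
func (r *reservoir) quantile(q float64) time.Duration {
	if len(r.samples) == 0 {
		return 0
	}
	s := append([]time.Duration(nil), r.samples...)
	sort.Slice(s, func(i, j int) bool { return s[i] < s[j] })
	idx := int(math.Ceil(q*float64(len(s)))) - 1
	if idx < 0 {
		idx = 0
	}
	return s[idx]
}

// generatorBound reports whether p99 lag exceeds threshold × (1/lambda).
func generatorBound(p99 time.Duration, lambda, threshold float64) bool {
	return p99.Seconds() > threshold/lambda
}

func main() {
	r := newReservoir(1024, 1)
	for i := 0; i < 100_000; i++ {
		r.add(time.Duration(i%50) * time.Microsecond) // synthetic lag, 0 to 49µs
	}
	p99 := r.quantile(0.99)
	fmt.Println(len(r.samples), p99, generatorBound(p99, 1000, defaultThreshold))
}
```

At lambda = 1000 tx/s the mean inter-arrival time is 1ms, so the gate trips when p99 lag exceeds 100µs; the synthetic lags above stay below that.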

cursor Bot reviewed Jun 15, 2026

Comment thread stats/inclusion_tracker.go Outdated

Comment thread stats/inclusion_tracker.go

bdchatham and others added 2 commits June 15, 2026 15:11

cursor Bot reviewed Jun 15, 2026

Comment thread stats/inclusion_tracker.go

cursor Bot reviewed Jun 15, 2026

Comment thread sender/worker.go

cursor Bot reviewed Jun 15, 2026

Comment thread main.go Outdated

bdchatham requested a review from amir-deris June 15, 2026 22:27

bdchatham assigned masih and unassigned masih Jun 15, 2026

bdchatham requested a review from masih June 15, 2026 22:27

amir-deris approved these changes Jun 15, 2026

bdchatham mentioned this pull request Jun 15, 2026

PLT-463: schedule_lag gate + run verdict (stacked on #51) #53

Closed

cursor Bot reviewed Jun 15, 2026

Comment thread stats/inclusion_tracker.go

bdchatham merged commit 087369b into main Jun 15, 2026
4 checks passed

bdchatham deleted the brandon2/plt-459-m13-block-indexed-txinclusion-tracker branch June 15, 2026 23:26

bdchatham merged 6 commits into main from brandon2/plt-459-m13-block-indexed-txinclusion-tracker

3 participants
