PLT-463: schedule_lag gate + run verdict (stacked on #51) #53
PR Summary
Medium Risk
Overview: Workers record schedule_lag; main arms the VOID bound on fixed-λ open-loop runs, computes the verdict from the actual arrival model, logs it, and extends the run summary and OTel gauges.
Reviewed by Cursor Bugbot for commit 7caba67.
Compute schedule_lag = AttemptedSendTime - IntendedSendTime per open-loop tx (bounded reservoir, Algorithm R), expose p99 every run, and render a run VERDICT: VOID when schedule_lag_p99 > threshold x (1/lambda) — a generator-bound run is void, not a footnote. Threshold is a named const (0.10, 'tune from first calibration run'), overridable via config. Gated on the actual arrival model (closed-loop / ramped-lambda => N/A); prewarm and zero-IntendedSendTime txs excluded. Stacked on PLT-459 (#51): needs the inclusion run-summary surface. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- RampUp => N/A (checked before TPS): the ramper drives the live limit via SetLimit, so cfg.Settings.TPS is stale and gating against 1/TPS is wrong.
- Zero samples on a fixed-λ open-loop run => N/A, never VALID (a trust gate must not bless 'no data' as a clean run). Thread the admitted count from the dispatcher conservation counters; if admitted>0 yet samples==0, log loudly (recorder may be mis-wired) and flag Anomaly.
- Drop the redundant inline comment on ScheduleLagVoidThreshold (go-doc keeps the rationale).
Cohort: security (false-VALID F1/F2), systems (F2 confirm), idiom (doc dup).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
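The ordering of the N/A checks in this commit could look something like the sketch below. All names here (`gateInputs`, `preGate`, the `Verdict` constants) are hypothetical and chosen for illustration; the actual function and field names in the PR may differ.

```go
package main

// Verdict is a hypothetical run-verdict type.
type Verdict string

const (
	VerdictValid Verdict = "VALID"
	VerdictVoid  Verdict = "VOID"
	VerdictNA    Verdict = "N/A"
)

// gateInputs is a hypothetical bundle of what the gate needs to know.
type gateInputs struct {
	OpenLoop bool
	RampUp   bool
	TPS      float64
	Samples  int   // lags held by the recorder
	Admitted int64 // txs admitted by the dispatcher
}

// preGate decides the cases where the schedule_lag gate does not apply.
// decided=false means the caller should go on to the threshold check.
func preGate(in gateInputs) (v Verdict, anomaly bool, decided bool) {
	switch {
	case !in.OpenLoop:
		return VerdictNA, false, true
	case in.RampUp:
		// Checked before TPS: under a ramp the configured TPS is stale.
		return VerdictNA, false, true
	case in.TPS <= 0:
		return VerdictNA, false, true
	case in.Samples == 0:
		// No data is never VALID. Admitted txs with no samples
		// suggest a mis-wired recorder, so flag an anomaly.
		return VerdictNA, in.Admitted > 0, true
	}
	return "", false, false
}
```

Returning N/A rather than VALID on an empty sample set keeps the gate from certifying a run it never measured.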
The whole-run p99 (from a uniform reservoir) can dilute a sub-percentile late-run tail blowup → false VALID. Add an UNSAMPLED over-bound counter (incremented per recorded send, not sampled) + max lag: VOID when p99 > bound OR > scheduleLagOverBoundFraction (0.5%, provisional) of sends exceed the bound, with a distinct reason per criterion. Bound is single-sourced (ScheduleLagBound) so run-start arming and verdict-time can't drift; armed only on fixed-λ open-loop runs (inert elsewhere, matching the N/A set). EvaluateScheduleLag now takes ScheduleLagInputs (kills the adjacent-bool positional trap). VOID stays advisory. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Both re-reviewers flagged that the over-bound counter was armed whenever TPS>0, including ramped open-loop runs (RampUp+TPS>0 is a valid config) — the verdict is N/A there so it was never a false-VOID, but it emitted a meaningless over_bound_fraction and contradicted the 'inert on ramped runs' comment. Gate arming on !RampUp so the counter stays inert exactly where the verdict is N/A. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Force-pushed: 8beecca to c1da595
Strip bare (PLT-463) self-labels and a was-X-now-Y changelog line, drop a standalone TODO and a few what-comments. Keep load-bearing why/invariant comments (reservoir-dilution rationale, Little's-law sizing, registered ⊆ succeeded, negative-lag clamp) and forward-pointing cross-refs. Comment-only. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Implements PLT-463 (M1.5) — the self-check that proves the open-loop fix is actually open-loop.
main once #51 merges.
What
- schedule_lag = AttemptedSendTime − IntendedSendTime per open-loop tx, recorded into a bounded reservoir (Algorithm R, cap 16384). p99 reported on every run.
- VOID when schedule_lag_p99 > threshold × (1/λ); named const scheduleLagVoidThreshold = 0.10 ('tune from first calibration run'), overridable via config. VALID otherwise; N/A for closed-loop or ramped-λ (p99 still reported). Logged loudly + run-summary fields (ScheduleLagP99, Verdict, VoidReason).
- Gated on the actual ArrivalModel.
Design forks (flagged for review)
- cfg.Settings.TPS; ramped-λ runs → N/A (not gated).
Verify
make lint: 0 issues · go build · go test -race ./... green.
🤖 Generated with Claude Code