feat(core): implement validatorapi beacon/sync committee selections handlers#459
Open
varex83agent wants to merge 8 commits into
Open
feat(core): implement validatorapi beacon/sync committee selections handlers#459varex83agent wants to merge 8 commits into
varex83agent wants to merge 8 commits into
Conversation
f5c3b49 to
8eedb3f
Compare
Collaborator
Author
/loop-review-pr summaryRan 3 review-and-fix iteration(s) against this PR. Terminated by: completion_promise ( Quality gates (final)
Resolved during the loopMajor (8)
Minor (7)
Nits (3)
Outstanding (intentional deferrals)These were flagged during the loop but deferred because they sit outside this PR's stated scope or trade off poorly against the size of the change required:
VerdictPR is ideal — all bug/major findings resolved, gates green. State log: |
…andlers Ports `BeaconCommitteeSelections` and `SyncCommitteeSelections` from `core/validatorapi/validatorapi.go` (lines 798-864 and 1072-1138). Each handler resolves every selection's validator-index to a DV root pubkey via a new `register_active_validators` hook, builds the matching `signeddata::*::new_partial` wrapper, verifies the per-share selection proof, groups the partial-signed data by slot, fans the per-slot `ParSignedDataSet` out to every subscriber under the corresponding `PrepareAggregator` / `PrepareSyncContribution` duty, then pulls each aggregated reply back out of the AggSigDB via `await_agg_sig_db` and stitches it into the response. The `BeaconCommitteeSelection` / `SyncCommitteeSelection` placeholder structs in `validatorapi/types.rs` are now aliases for the consensus `v1::BeaconCommitteeSelection` / `v1::SyncCommitteeSelection`, matching the established pattern (`AttestationData = phase0::AttestationData`) so the handler signatures carry the real validator-index / slot / selection-proof tuple Go uses. `fetch_active_validators` is bounded by `UPSTREAM_REQUEST_TIMEOUT`; each `await_agg_sig_db` lookup is bounded by `SELECTIONS_AGG_SIG_DB_TIMEOUT` (~two slots). Upstream stalls surface as 504/408; unknown validator indices surface as 400 with a generic "validator not found" message; verification failures surface as 400; subscriber / lookup errors surface as 500 with the original error attached as `source` for debug logging only. Tests (10 new): - happy path for each handler (one selection in, one aggregated out) - multi-selection input — per-slot subscriber fanout + response stitching - unknown validator-index short-circuits with 400 before the AggSigDB is touched - tampered (zero) selection proof short-circuits with 400 before the AggSigDB is touched - stalled `await_agg_sig_db` trips `SELECTIONS_AGG_SIG_DB_TIMEOUT` and returns 408 Co-Authored-By: Bohdan Ohorodnii <35969035+varex83@users.noreply.github.com>
The Handler trait methods Component::{beacon,sync}_committee_selections
were reachable only from the Handler trait — the router still had
todo!() stubs at the corresponding axum entry points, so any real
request to POST /eth/v1/validator/{beacon,sync}_committee_selections
would panic the request task.
Wires both endpoints into AppState::handler, mirroring Charon's
beaconCommitteeSelectionsJSON / syncCommitteeSelectionsJSON wire shape
(${data: [...]}) via two small response wrappers in types.rs.
Adds SELECTIONS_BODY_LIMIT (64 KiB) and a selections_post helper so
the new POST routes are bounded before deserialisation, defending
against the request-amplification surface (one BLS verification per
selection) once the routes are reachable.
Extends TestHandler with selection response stubs and adds router
tests for: response shape, sync-variant wrapping, and the body-limit
layer rejecting oversized POSTs.
The previous SELECTIONS_AGG_SIG_DB_TIMEOUT was applied per AggSigDB
await, so a batch of N selections could legitimately spend N*24s in
the lookup phase before failing; the subscriber fan-out had no
deadline at all, letting a stalled subscriber pin the handler.
Both selection handlers now anchor a single Instant deadline at the
start of the fan-out phase and use tokio::time::timeout_at for every
subscriber call and every await_agg_sig_db_fn call. The renamed
SELECTIONS_PHASE_TIMEOUT is the total budget for that phase rather
than a per-await bound.
Each error path in the fan-out and AggSigDB-await loops now emits a
tracing::warn with the slot and the underlying source error so an
operator can correlate a 408/500 response with a debug-pipeline log
without leaking either into the client-visible message.
Adds six tests:
* subscriber timeout (beacon, sync)
* empty selections input is a 200 no-op (beacon, sync)
* AggSigDB returning the wrong concrete SignedData type produces 500
via the downcast helper (beacon, sync)
Cleanup after the selection-route wiring change — rustfmt collapses the BeaconCommitteeSelectionsResponse import block onto one line and trims the let bindings in the new tests to match the file's existing formatting.
clippy::arithmetic_side_effects (workspace-wide deny) flags `Instant::now() + Duration` even though overflow is unreachable in practice. Switches to `checked_add` with an .expect() carrying the invariant, matching the pattern already used elsewhere in the crate (qbft/consensus timer code).
The selection endpoints (and the duties endpoints on this branch)
extracted the body as bare `Json<...>`, so malformed JSON, wrong
element types, and missing content-type produced axum's default
plain-text 400 / 422 / 415 — out of line with the router's other
4xx responses which already use the `{code, message}` envelope via
`query_rejection_to_api_error`.
Mirrors the fix already on main for the duties endpoints
(b5eb0ca) which this branch predates. Adds a
`json_rejection_to_api_error` helper that normalises every parse
failure to a uniform 400 (matching Charon's `unmarshal`) while
preserving the 413 path so the body-size cap still surfaces
correctly. Both selection routes and both duties routes route their
`Json<...>` rejection through it.
Also:
* Reorders router.rs so all `use` items come before the
`const` declarations (style — the SELECTIONS_BODY_LIMIT
addition had split the `use` blocks).
* Collapses the duplicate `duties_post` / `selections_post`
helpers into a single generic `bounded_post(handler, limit)`.
* Tightens the `SELECTIONS_BODY_LIMIT` doc-comment per-entry
size estimate (~210-250 B, ~250-300 entries per request).
* Adds two router tests asserting the envelope shape for a
malformed body — one per selection endpoint.
* Extract a `selections_deadline()` helper so the
`Instant::checked_add(...).expect(...)` pattern lives in one place
and the panic message tracks the constant rather than embedding a
hard-coded duration string.
* Switch `beacon_committee_selections` to call
`selection.message_root()` instead of inlining
`selection.slot.tree_hash_root().0` — matches the sync handler
and centralises the SSZ selection-proof root in
`crates/eth2api/src/v1.rs`. The `tree_hash::TreeHash` import is
no longer needed.
* Add `tracing::error!` to the two "hook not registered" branches
in `fetch_active_validators` and the new
`await_agg_sig_db_fn` check inside each selection handler.
A handler running with a half-wired `Component` is exactly the
kind of operational red flag the surrounding logs are designed
to surface.
* Bump the subscriber-`Err` and aggsigdb-`Err` log calls in the
selection fan-out / await loops from `tracing::warn!` to
`tracing::error!`. Timeouts stay at `warn` because they are
transient peer-side issues; non-transient wiring failures
belong at `error` so production alerting picks them up.
* Destructure-by-copy in both per-slot loops
(`for (&slot, set) in &psigs_by_slot`) so the `*slot` deref
noise in eight `tracing` field assignments and four
`SlotNumber::new(*slot)` calls collapses to plain `slot`.
* `TestHandler::{beacon,sync}_committee_selections` switch from
`.clone().expect(...)` to `match .as_ref()` so unstubbed
call-sites panic with `unimplemented!` (matching every other
stubbed method in the file) and the clone only happens on the
`Some` path.
* Reorder `component.rs` imports back to the workspace
`std` → external → `super`/`crate` grouping. Iter2's helper
changes had left the `super::` block sitting between `std`
and the external crates, drifting from every other file in
`validatorapi/` and the sibling core modules.
* Apply the iter2 `match self.X.as_ref() { Some(r) => Ok(r.clone()),
None => unimplemented!(...) }` stub shape to the remaining four
`Option`-based `TestHandler` methods (attester_duties,
proposer_duties, sync_committee_duties, attestation_data) so the
whole file uses one style instead of two.
* Add `attester_duties_returns_api_error_shape_on_malformed_body`
— the duties endpoints share the same `json_rejection_to_api_error`
plumbing as the selection routes, and the iter2 envelope tests
only locked the selection side. This test would now fail if a
future refactor accidentally re-introduced bare `Json<ValIndexes>`
extraction on a duties handler. The selection envelope test
docstring is updated to point at the new duties counterpart.
554ce69 to
bc5f269
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
BeaconCommitteeSelections(core/validatorapi/validatorapi.go:798-864) andSyncCommitteeSelections(validatorapi.go:1072-1138) intocrates/core/src/validatorapi/component.rs, replacing bothunimplemented!()arms.register_active_validatorshook onComponentso the handlers can translate per-selectionvalidator_indexvalues to DV root pubkeys, mirroring Go'sc.eth2Cl.ActiveValidators(ctx)lookup.BeaconCommitteeSelection {}/SyncCommitteeSelection {}placeholder structs invalidatorapi/types.rswith type aliases forpluto_eth2api::v1::BeaconCommitteeSelection/v1::SyncCommitteeSelection, matching the establishedAttestationData = phase0::AttestationDatapattern.await_agg_sig_dblookup with a newSELECTIONS_AGG_SIG_DB_TIMEOUT(~two slots) so a stalled AggSigDB cannot pin a selections request forever; bounds theactive_validatorshook withUPSTREAM_REQUEST_TIMEOUT.ApiError::sourcefor debug logs only.Go reference
beacon_committee_selectionscore/validatorapi/validatorapi.go:798-864Component::beacon_committee_selectionsinvalidatorapi/component.rssync_committee_selectionscore/validatorapi/validatorapi.go:1072-1138Component::sync_committee_selectionsinvalidatorapi/component.rsActiveValidatorslookupc.eth2Cl.ActiveValidators(ctx)(called at the top of both handlers)register_active_validatorshook onComponentreturningHashMap<ValidatorIndex, BLSPubKey>verifyPartialSigvalidatorapi.go:1352-1368Component::verify_partial_sig(PR-1 plumbing)NewPartialSignedBeaconCommitteeSelectioncore/signeddata.go:1241signeddata::BeaconCommitteeSelection::new_partialNewPartialSignedSyncCommitteeSelectioncore/signeddata.go:1303signeddata::SyncCommitteeSelection::new_partialNewPrepareAggregatorDuty(slot)Duty::new_prepare_aggregator_duty(SlotNumber::new(slot))NewPrepareSyncContributionDuty(slot)Duty::new_prepare_sync_contribution_duty(SlotNumber::new(slot))awaitAggSigDBFunc(ctx, duty, pk)validatorapi.go:849/:1123await_agg_sig_db_fnhook (PR-1 plumbing) bounded bySELECTIONS_AGG_SIG_DB_TIMEOUTTest plan
cargo +nightly fmt --all --checkcargo clippy -p pluto-core --all-targets --all-features -- -D warningscargo test -p pluto-core --all-features- 384/384 passing (10 new tests covering: happy path / multi-selection fanout+stitching / unknown validator-index 400 / verification-failure short-circuit / await timeout - for each of the two handlers)