Skip to content

feat(core): implement validatorapi voluntary_exit + validator_registrations handlers#460

Open
varex83agent wants to merge 13 commits into
mainfrom
bohdan/validatorapi-exit-registrations
Open

feat(core): implement validatorapi voluntary_exit + validator_registrations handlers#460
varex83agent wants to merge 13 commits into
mainfrom
bohdan/validatorapi-exit-registrations

Conversation

@varex83agent
Copy link
Copy Markdown
Collaborator

Summary

  • submit_voluntary_exit: implements validatorapi.go:752-795 — looks up
    the DV root pubkey via a new active_validators_fn hook, derives the
    duty slot as slots_per_epoch * epoch, builds the ParSignedData
    through signeddata::SignedVoluntaryExit::new_partial, verifies the
    partial signature against the DOMAIN_VOLUNTARY_EXIT domain at the
    exit's epoch, and fans the set out to every registered subscriber.
  • submit_validator_registrations: implements validatorapi.go:674-749
    — early-returns on empty input, swallows on disabled builder mode with
    a tracing::warn!, iterates entries through submit_one_registration
    which silently skips non-DV pubkeys (mirrors Go's swallowRegFilter),
    derives the slot from the registration timestamp + genesis time +
    slot duration, and verifies under DOMAIN_APPLICATION_BUILDER at
    epoch 0 (matches Go's VersionedSignedValidatorRegistration.Epoch).
  • Replaces the placeholder SignedVoluntaryExit and
    SignedValidatorRegistration unit structs in validatorapi/types.rs
    with real wrappers around phase0::SignedVoluntaryExit and
    versioned::VersionedSignedValidatorRegistration. The handler trait
    signatures are unchanged.
  • Adds one new ActiveValidatorsFn Component hook +
    register_active_validators method (same PR-1 pattern: Arc<…>,
    overwrite-on-re-register). Drops the now-stale #[allow(dead_code)]
    attributes on share_idx, builder_enabled, subs, and
    verify_partial_sig now that they're all consumed.

Go reference

Pluto Charon Go
Component::submit_voluntary_exit Component.SubmitVoluntaryExit (validatorapi.go:752-795)
Component::submit_validator_registrations Component.SubmitValidatorRegistrations (validatorapi.go:731-749)
Component::submit_one_registration (private) Component.submitRegistration (validatorapi.go:674-728)
slot_from_timestamp (private helper) SlotFromTimestamp (validatorapi.go:41-70)
ActiveValidatorsFn hook c.eth2Cl.ActiveValidators(ctx)
signeddata::SignedVoluntaryExit::new_partial core.NewPartialSignedVoluntaryExit
signeddata::VersionedSignedValidatorRegistration::new_partial core.NewPartialVersionedSignedValidatorRegistration

Test plan

  • cargo +nightly fmt --all --check
  • cargo clippy -p pluto-core --all-targets --all-features -- -D warnings
  • cargo test -p pluto-core --all-features — 383/383 passing (9 new):
    • submit_voluntary_exit_resolves_validator_and_fanouts (happy path),
    • submit_voluntary_exit_rejects_unknown_validator (400),
    • submit_voluntary_exit_returns_500_when_hook_unregistered,
    • submit_voluntary_exit_rejects_bad_signature (real BLS, 400),
    • submit_validator_registrations_swallows_when_builder_disabled,
    • submit_validator_registrations_no_op_on_empty_input,
    • submit_validator_registrations_swallows_non_dv_pubkey,
    • submit_validator_registrations_happy_path_fanouts,
    • submit_validator_registrations_rejects_bad_signature (real BLS, 400).

varex83 and others added 11 commits May 28, 2026 14:08
Threads the Handler through Axum state via AppState<H> + with_state,
wires the node_version route to the real handler, and adds a TestHandler
mock that future PRs will extend per-endpoint.
Re-uses the auto-generated pluto_eth2api envelopes
(GetProposerDutiesResponseResponse, GetVersionResponseResponse) as the
on-the-wire shape rather than hand-rolling parallel types. node_version
is migrated to the same pattern; the body.rs hand-rolled wrapper module
is removed.
Drops the per-handler generic parameter and routes through
Arc<dyn Handler> via AppState. The Handler trait is object-safe
(Send + Sync + 'static + async_trait-generated methods), so this
is a pure type change with no surface impact.
Adds the Handler impl that the router has been calling through.
node_version returns the obolnetwork/pluto/{version}-{commit}/{arch}-{os}
identity string; proposer_duties calls the upstream beacon node and
rewrites known DV root public keys to this node's public share so the
validator client sees keys matching its keystore. The remaining 17
trait methods are unimplemented!() stubs that land per-PR as their
router handlers are ported.
Wires POST /eth/v1/validator/duties/attester/{epoch}: dual-format
(numeric or string-encoded) validator index body, upstream call,
pubshare swap.
Wires POST /eth/v1/validator/duties/sync/{epoch}, reusing the
ValIndexes dual-format body extractor.
Wires GET /eth/v1/validator/attestation_data. The Component now
holds an Arc<MemDB> and awaits unsigned attestation data from the
local DutyDB rather than hitting upstream.
Bug fixes (must-fix per review):

- attestation_data: wrap MemDB::await_attestation in tokio::time::timeout
  (24s) so a request for a slot that never produces consensus output
  cannot hold a handler task indefinitely. delete_duty now records
  evicted keys per duty type and notifies waiters, so await_data returns
  Error::AwaitDutyExpired immediately when the awaited duty is gone
  instead of spinning until the timeout fires. Maps to 408 on the wire.
- Stop leaking upstream BlindedBlock400Response Debug output (incl.
  stacktraces) into the client-visible ApiError.message. The variant
  payload is now attached as `source` for debug logs; the message stays
  generic.

Hardening:

- new_insecure is gated behind #[cfg(test)] so the insecure_test flag
  cannot reach production builds.
- new_router applies DefaultBodyLimit::max(64 KiB) on the two
  POST /duties/{attester,sync}/{epoch} routes — defends against the
  Vec<u64> parse amplification on the ValIndexes deserializer.
- All upstream eth2_cl calls are wrapped in tokio::time::timeout(12s)
  so a hanging beacon node cannot stall handler tasks.
- proposer_duties / attester_duties / sync_committee_duties propagate
  upstream BadRequest as 400 and ServiceUnavailable as 503 instead of
  collapsing every non-Ok variant to 502 — the VC can now back off on
  upstream syncing instead of treating it as a gateway failure.
- swap_attester_pubshares / swap_sync_committee_pubshares now return
  500 (cluster misconfig) instead of 502 when a pubshare is missing —
  the upstream returned well-formed data, the failure is local.

ValIndexes:

- Replace #[serde(untagged)] with a streaming Visitor that validates
  each element via SeqAccess::next_element. Avoids the speculative
  Vec<u64> parse and the serde Content cache. Now accepts mixed
  numeric/string elements and rejects negative integers.
- Hard cap at 8192 indices per request.

ApiError:

- with_boxed_source for sources that aren't std::error::Error (e.g.
  anyhow::Error from auto-gen request builders).

Router:

- attestation_data uses Result<Query<...>, QueryRejection> so 4xx
  responses from missing/malformed query params share the same
  { code, message } envelope as the rest of the router.

Tests (+13):

- attestation_data: timeout when data never arrives; 408 when duty is
  evicted while a waiter is parked; cancellation cleanup when the
  handler future is dropped; negative lookup on wrong committee_index.
- Status-mapping helpers: confirm upstream Debug output is never
  serialized into the message.
- Router: ApiError envelope on bad query; oversized body rejection;
  ValIndexes empty/mixed/oversized/negative cases.

Co-Authored-By: Bohdan Ohorodnii <35969035+varex83@users.noreply.github.com>
Adds the plumbing every subsequent submit/await handler needs without
implementing any of the unimplemented!() arms. Mirrors Charon's
core/validatorapi/validatorapi.go:196-256 (subscriber list + six
Register* hooks) plus :1352 (verifyPartialSig).

- New Component fields: subs, await_proposal_fn, await_agg_attestation_fn,
  await_sync_contribution_fn, await_agg_sig_db_fn, duty_def_fn,
  pub_key_by_att_fn. All Option<Arc<…>> so registration before the
  Component is shared in an Arc, then read-only thereafter.
- subscribe() wraps the user closure with a set-clone step so each
  subscriber receives its own ParSignedDataSet — matches Go's
  Subscribe clone-before-fanout at validatorapi.go:249-256.
- register_* methods replace any prior registration, matching Go's
  single-function input semantics.
- verify_partial_sig() honours insecure_test, looks up this node's
  public share from pub_share_by_pubkey, then delegates to
  pluto_eth2util::signing::verify. Unlike Go — which projects domain /
  epoch / message-root through the core.Eth2SignedData interface — the
  Rust hook takes those three values directly so we don't have to port
  the Eth2SignedData trait in this plumbing PR; submit handlers in PRs
  3-6 will derive the triple from their concrete signed-data wrapper.

Tests: subscribe fanout clones per subscriber; the six register hooks
all overwrite on re-register; unregistered hooks default to None;
verify_partial_sig accepts a real BLS signature, rejects a tampered
one, rejects an unknown DV pubkey, and short-circuits in insecure_test
mode.

Co-Authored-By: Bohdan Ohorodnii <35969035+varex83@users.noreply.github.com>
…ations handlers

Replaces the two unimplemented!() arms on Component with the
real submit_voluntary_exit and submit_validator_registrations
handlers. Mirrors Charon's core/validatorapi/validatorapi.go:752-795
(SubmitVoluntaryExit), :674-728 (submitRegistration), and
:731-749 (SubmitValidatorRegistrations).

- types.rs: replace placeholder unit structs with real wrappers
  around phase0::SignedVoluntaryExit and
  versioned::VersionedSignedValidatorRegistration so handlers can
  read message-root, pubkey, timestamp, and signature.

- component.rs/submit_voluntary_exit:
  * Resolve the DV root pubkey via the new active_validators_fn
    hook (mirrors Go's c.eth2Cl.ActiveValidators). Unknown
    validator-index -> 400 ("validator not found") matching Go's
    errors.New branch.
  * Derive the duty slot as slots_per_epoch * exit_epoch via
    fetch_slots_config — same as Go's eth2wrap.FetchSlotsConfig
    path.
  * Build the ParSignedData through
    signeddata::SignedVoluntaryExit::new_partial, then call
    verify_partial_sig with DomainName::VoluntaryExit + exit.epoch
    + tree-hash of the unsigned message.
  * Fanout one set per subscriber via the existing subs vec.

- component.rs/submit_validator_registrations:
  * Empty-input early return + builder-mode gate (warn-and-swallow
    when builder_enabled = false) matching Go:732-739.
  * Per-entry submit_one_registration:
    - Group pubkey comes from v1.message.pubkey; if not a DV pubkey
      on this node, swallow with a tracing::debug and skip (mirrors
      Go's swallowRegFilter at :686-691).
    - SlotFromTimestamp inlined as slot_from_timestamp() — pure
      function over genesis_time + slot_duration.
    - ParSignedData built through
      signeddata::VersionedSignedValidatorRegistration::new_partial.
    - verify_partial_sig called with DomainName::ApplicationBuilder
      and epoch 0 — matches Go's
      VersionedSignedValidatorRegistration.Epoch returning 0.

- New ActiveValidatorsFn hook + register_active_validators method
  matches the existing PR-1 hook pattern (Arc<…> + overwrite-on-
  re-register). Dropped the dead_code allow attributes on
  share_idx, builder_enabled, subs, and verify_partial_sig now that
  they are all consumed by these handlers.

Tests (9 new):
- happy-path fanout for both endpoints (insecure_test mode),
- voluntary exit unknown-validator -> 400,
- voluntary exit unregistered hook -> 500,
- voluntary exit bad signature -> 400 (real BLS + BeaconMock),
- registrations empty-input no-op,
- registrations builder-disabled swallows without fanout,
- registrations non-DV pubkey swallowed (no fanout),
- registrations bad signature -> 400 (real BLS + BeaconMock).

cargo test -p pluto-core --all-features: 383/383 passing.

Co-Authored-By: Bohdan Ohorodnii <35969035+varex83@users.noreply.github.com>
@varex83agent varex83agent force-pushed the bohdan/validatorapi-plumbing branch from f5c3b49 to 8eedb3f Compare June 2, 2026 14:09
Base automatically changed from bohdan/validatorapi-plumbing to main June 3, 2026 20:02
varex83agent and others added 2 commits June 5, 2026 12:47
- submit_validator_registrations: hoist fetch_slots_config and
  fetch_genesis_time out of the per-registration loop. Charon's
  eth2wrap caches these so per-entry calls are cheap there; Pluto's
  eth2 client is not yet cached, so a batched submission would
  otherwise issue 2N upstream calls. submit_one_registration now
  takes the resolved slot_duration / genesis_time as arguments.

- Downgrade builder-disabled swallow log from warn! to debug! to
  match Charon's log.Debug (validatorapi.go:736-739). Vouch-style
  VCs send registrations every slot, so warn! would be noisy in
  non-builder configurations.

- Add doc comment on submit_validator_registrations documenting
  that fan-out is per-entry and not transactional — matches Charon
  but worth calling out for downstream subscriber authors.

- Drop the no-op sig_to_array helper. phase0::BLSSignature and
  crate::types::Signature are both [u8; 96], so the call sites can
  pass &v1.signature / &exit.0.signature directly.

- Drop the |_: Elapsed| type annotation and the now-unused
  tokio::time::error::Elapsed import in attestation_data, matching
  the |_| pattern used by every other timeout site in the file.

Co-Authored-By: Bohdan Ohorodnii <35969035+varex83@users.noreply.github.com>
…ion batches

The iteration-1 hoist of fetch_slots_config / fetch_genesis_time
addressed only two of the four uncached upstream calls per
submit_one_registration. The dominant per-entry cost was
verify_partial_sig → signing::verify → signing::get_domain, which
issues another `fetch_domain_type` (GET /eth/v1/config/spec) and
`fetch_genesis_domain` (GET /eth/v1/beacon/genesis) per entry. For
a batch of N DV-pubkey registrations with bogus signatures this was
still 2N upstream calls past the hoist — the membership filter
admits the entry because the pubkey is on this node, and BLS verify
fails only AFTER the two domain-lookup calls.

All entries in a SubmitValidatorRegistrations batch use the same
DomainName::ApplicationBuilder signing domain at epoch 0, so the
domain itself can be resolved once and reused.

- eth2util/signing: add verify_with_domain, a synchronous variant of
  verify that takes a pre-resolved phase0::Domain. Two unit tests
  mirror verify_accepts_valid_signature and verify_rejects_zero_signature
  for the new entry-point.
- core/validatorapi/component: add Component::verify_partial_sig_with_domain
  that calls the new entry-point. submit_validator_registrations
  resolves the ApplicationBuilder domain once before the loop and
  threads it through submit_one_registration alongside slot_duration
  and genesis_time. Total upstream cost for a registration batch is
  now bounded at 3 calls (slots + genesis + domain), independent of
  N.

Charon achieves the same effect via `eth2wrap` caching of these
endpoints. Pluto's eth2 client is not cached yet; once it is, the
explicit hoist can be removed without behaviour change.

cargo test -p pluto-core --all-features: 383 passing
cargo test -p pluto-eth2util --all-features: 146 passing (+2)

Co-Authored-By: Bohdan Ohorodnii <35969035+varex83@users.noreply.github.com>
@varex83agent
Copy link
Copy Markdown
Collaborator Author

/loop-review-pr summary

Ran 3 review-and-fix iteration(s) against this PR.
Terminated by: completion_promise (PR_IDEAL).

Quality gates (final)

  • cargo +nightly fmt --all --check — pass
  • cargo clippy --workspace --all-targets --all-features -- -D warnings — pass
  • cargo test --workspace --all-features — pass (1375 passed, 0 failed)

Resolved during the loop

Bugs (0)

  • (none)

Major (1)

  • verify_partial_sig per-entry domain-resolution amplification (2N upstream calls on /spec + /genesis for an N-entry batch) — crates/core/src/validatorapi/component.rs:376 — fix in afea16b. Iteration 1 hoisted fetch_slots_config + fetch_genesis_time out of the loop, but verify_partial_sig → signing::verify → get_domain was still issuing two HTTP calls per entry. Iteration 2 added signing::verify_with_domain (synchronous, takes a pre-resolved phase0::Domain) plus Component::verify_partial_sig_with_domain, and submit_validator_registrations now resolves the ApplicationBuilder domain once before the loop. Total upstream cost is now bounded at 4 HTTP calls per batch regardless of N.

Minor (4)

  • submit_validator_registrations per-entry upstream amplification on fetch_slots_config / fetch_genesis_timecrates/core/src/validatorapi/component.rs:350 — fix in 2ac73b9. Hoisted both calls out of the per-registration loop; submit_one_registration now takes (slot_duration, genesis_time) as args.
  • Subscriber fan-out is not transactional (Charon parity) — crates/core/src/validatorapi/component.rs:711 — fix in 2ac73b9. Added a doc comment on submit_validator_registrations documenting the per-entry, non-transactional semantics so subscriber authors know to be idempotent.
  • Inconsistent |_: Elapsed| closure annotation in attestation_datacrates/core/src/validatorapi/component.rs:640 — fix in 2ac73b9. Dropped the annotation and the now-unused tokio::time::error::Elapsed import to match the |_| pattern used by every other timeout site in the file.
  • Builder-disabled swallow logged at warn!crates/core/src/validatorapi/component.rs:725 — fix in 2ac73b9. Downgraded to debug!. Vouch-style VCs send registrations every slot, so warn! would be noisy in non-builder configurations.

Nits (1)

  • sig_to_array no-op identity helper — crates/core/src/validatorapi/component.rs:988 — fix in 2ac73b9. Both phase0::BLSSignature and crate::types::Signature resolve to [u8; 96]; the two call sites now pass &v1.signature / &exit.0.signature directly.

Tests added

  • verify_with_domain_accepts_valid_signature in crates/eth2util/src/signing.rs — locks in that the new pre-resolved-domain path produces a bit-identical signing root to the original verify.
  • verify_with_domain_rejects_zero_signature in crates/eth2util/src/signing.rs — mirrors the zero-signature rejection on the new entry-point.

Outstanding (deferred / out-of-scope, not addressed by this loop)

  • dutydb/memory.rs unbounded eviction HashSets. Inherited from an earlier stacked commit (5da9093 "fix(core): address PR feat(core): implement attester_duties, sync_committee_duties, attestation_data handlers #451 review feedback"); a high-water-marks bounding fix already exists on main as commit 6a8a94c "fix(core): bound dutydb eviction state with high-water marks". Will resolve on rebase, so not a fix for this PR.
  • crates/core/src/validatorapi/router.rs:246 still routes submit_validator_registrations and submit_exit to todo!(). HTTP wiring is deliberately deferred to a follow-up PR per the PR description.
  • HTTP 400 vs Charon's default 500 for several client-error paths (validator-not-found, missing V1 payload, signing-failure). Documented in the code as a deliberate Pluto improvement (non-retriable rejection without leaking upstream details).
  • slot_from_timestamp falls back to slot 0 when timestamp is before genesis, where Charon falls back to time.Now(). Documented as a deliberate determinism improvement (cluster-consistent rather than per-node).
  • active_validators_fn unregistered → 500. Pluto-specific safety feature; Charon assumes the hook is always present and would panic.

Charon parity notes

  • Iteration 2's commit message claims "Total upstream cost for any registration batch is now bounded at 3 calls." The actual count is 4 HTTP calls (fetch_slots_config and fetch_domain_type both internally invoke fetch_spec_data; fetch_genesis_time and fetch_genesis_domain both internally invoke fetch_genesis_data). The security property — O(1) regardless of N — is unaffected, which is what the fix was about. Charon's eth2wrap caches both endpoints; once the Pluto eth2 client gains caching, the explicit hoist can be removed without behaviour change.
  • Iteration 1's commit message attributes the warn!→debug! downgrade to "matching Charon's log.Debug". Charon actually has no log at the builder-disabled gate (validatorapi.go:737-739); the log.Debug is on the per-entry non-DV swallow filter. Pluto's tracing::debug! is an extra log, not a parity restoration. Harmless at debug level.

Verdict

PR is ideal — all bug/major findings resolved, gates green.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants