Add EVM support #299
Supernode's startup migration path holds a single signing key and cannot run the K-of-N ceremony required for a multisig legacy account. Detect is_multisig=true from MigrationEstimate and return an actionable error pointing to lumerad's 4-step offline flow (generate-proof-payload → sign-proof → assemble-proof → submit-proof). After the operator completes the offline ceremony, the next restart finds the on-chain MigrationRecord and drives local cleanup through the existing alreadyMigrated branch — no new multisig-aware code in the daemon. Also log the new is_multisig/threshold/num_signers fields from QueryMigrationEstimate in the pre-flight summary, and bump the lumera pseudo-version pin to origin/evm HEAD (e09830d70704) so a clean checkout without ../lumera still resolves to a version that has the LegacyProof oneof, SingleKeyProof/MultisigProof types, and MaxMultisigSubKeys param. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sncli transitively depends on lumera's app package (via x/lumeraid/securekeyx tests), which pulls in cosmos/evm's statedb → github.com/ethereum/go-ethereum/trie/utils. Upstream go-ethereum doesn't export that package; cosmos/evm requires the cosmos/go-ethereum fork. The main go.mod already has this replace; mirror it here so `go build ./...` from cmd/sncli resolves. Bump the lumera require to the pseudo-version at origin/evm HEAD (was v1.11.0 from before the EVM work) and let go mod tidy refresh the rest of the indirect graph to match the root module. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lumera unified legacy and new sides under a single MigrationProof oneof. Wrap newSig in a SingleKeyProof and pass it via NewProof on both MsgMigrateValidator and MsgMigrateAccount. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	.gitignore
#	go.mod
#	go.sum
#	p2p/kademlia/bootstrap.go
#	p2p/kademlia/dht.go
#	supernode/supernode_metrics/reachability_active_probing_test.go
#	tests/system/go.mod
#	tests/system/go.sum
- Bump cmd/sncli and tests/system to lumera v1.20.0-rc2; drop local ../lumera replace now that the EVM-enabled release exists upstream.
- tests/system: bump devnet tx fee to 1000ulume and make initSDKConfig idempotent so multiple test invocations don't trip the SDK's "Config is sealed" panic.
- CHANGELOG: add Upcoming EVM Release section summarising the evm-support work (migration flow, keyring, p2p reactions, deps).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mateeullahmalik
left a comment
I reviewed this as the EVM support branch, not just the migration helper. The main migration path is directionally sound and I verified the important unsigned evmigration tx assumption against Lumera v1.20.0-rc2: the chain intentionally registers empty signers for MsgClaimLegacyAccount / MsgMigrateValidator, so the daemon-side direct broadcast is not a problem by itself.
I do not think this is mergeable yet though. Two blockers need fixing first:
- The PR is currently conflicted with `master`. `git merge-tree origin/master origin/pr-299` reports conflicts in `go.mod`, `go.sum`, `supernode/host_reporter/service.go`, `tests/system/cli.go`, `tests/system/go.mod`, `tests/system/go.sum`, and `tests/system/main_test.go`.
- `cmd/sncli` does not build/test from a clean checkout because its module replaces Lumera with a local relative path.
Validation I ran locally:
- `go test ./...` from the repo root: passed.
- `go test ./supernode/cmd ./pkg/keyring ./pkg/lumera/codec ./sdk/task`: passed.
- `go test ./...` in `sn-manager`: passed.
- `go test ./...` in `cmd/sncli`: failed immediately due to the local Lumera replace.
Once the conflicts and nested-module replace are fixed, I’d re-run root + nested module tests and then do a devnet migration/cascade pass before calling this ready.
Pull request overview
This PR introduces end-to-end EVM compatibility across supernode, SDK, CLI, and tests, centered on migrating legacy Cosmos secp256k1 (coin type 118) accounts to EVM eth_secp256k1 (coin type 60), plus aligning dependencies and operational workflows for the EVM-enabled Lumera stack.
Changes:
- Adds automatic legacy→EVM account migration at supernode startup (with chain capability checks, dual-signing proofs, config rewrite, and legacy key cleanup).
- Updates SDK/client plumbing for EVM-era chain interactions (gRPC conn exposure, spendable-balance checks, ICA signature logic tweaks, and codec registrations).
- Updates system/integration tests, docs, and module dependencies to the EVM-enabled Lumera versions and Go toolchain.
Reviewed changes
Copilot reviewed 39 out of 46 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| tests/system/main_test.go | Makes SDK bech32 config init resilient to sealed config / repeated init in system tests. |
| tests/system/go.mod | Aligns system test module dependencies (including cosmos/evm and go-ethereum replace) and Go version. |
| tests/system/e2e_cascade_test.go | Updates expected bech32 addresses for EVM-derived keys in E2E flow. |
| tests/system/config.test-1.yml | Updates test identity address to new EVM-derived address. |
| tests/system/config.test-2.yml | Updates test identity address to new EVM-derived address. |
| tests/system/config.test-3.yml | Updates test identity address to new EVM-derived address. |
| tests/system/cli.go | Increases default fee/amount used by system CLI helper. |
| tests/integration/evmigration/evmigration_test.go | Adds integration tests for legacy→EVM keyring lifecycle, dual-signing, and config persistence. |
| supernode/supernode_metrics/reachability_active_probing_test.go | Updates fake Lumera client to satisfy new Conn() method in interface. |
| supernode/host_reporter/service.go | Stops reporting cascade kademlia DB bytes in epoch report payload. |
| supernode/config/config.go | Adds evm_key_name config field to support one-time migration workflow. |
| supernode/cmd/start.go | Integrates EVM chain check + auto-migration at startup and triggers P2P refresh on migration. |
| supernode/cmd/evmigration.go | Implements chain detection, migration validation, dual signing, broadcast + confirmation polling, and local cleanup. |
| supernode/cmd/evmigration_test.go | Adds unit tests for migration decision logic, message selection, persistence, and tx confirmation polling. |
| sn-manager/go.sum | Updates checksums following dependency/toolchain alignment. |
| sn-manager/go.mod | Aligns sn-manager module dependencies and Go version; points Lumera to local checkout. |
| sdk/task/task.go | Switches eligibility check to use spendable balance and refactors probe logic. |
| sdk/task/spendable_balance_test.go | Adds unit tests for spendable-balance eligibility behavior. |
| sdk/task/ica_signature.go | Adds helper to detect whether an action creator is an ICA account. |
| sdk/task/ica_signature_test.go | Extends test fake client + adds tests around vesting/non-ICA creator behavior. |
| sdk/task/helpers.go | Restricts ICA fallback signature validation to configured ICA usage or actual ICA creators. |
| sdk/adapters/lumera/adapter.go | Adds spendable-balance query method to Lumera adapter interface and implementation. |
| pkg/testutil/lumera.go | Extends mock Lumera client/bank module with Conn() and spendable-balance support. |
| pkg/net/credentials/lumeratc.go | Adds cache-clearing hook for key exchanger instances after identity changes. |
| pkg/lumera/modules/bank/interface.go | Extends bank module interface with spendable-balance query. |
| pkg/lumera/modules/bank/impl.go | Implements spendable-balance query against gRPC bank client. |
| pkg/lumera/modules/bank/bank_mock.go | Updates generated mock to include spendable-balance method. |
| pkg/lumera/lumera_mock.go | Updates generated Lumera client mock to include Conn(). |
| pkg/lumera/interface.go | Adds Conn() *grpc.ClientConn to Lumera client interface. |
| pkg/lumera/codec/encoding.go | Registers cosmos/evm, vesting, and evmigration interfaces for codec unpacking. |
| pkg/lumera/codec/encoding_test.go | Adds test ensuring delayed vesting accounts unpack via encoding config. |
| pkg/lumera/client.go | Exposes underlying gRPC connection via Conn(). |
| pkg/keyring/keyring.go | Switches default derivation to coin type 60 and adds eth_secp256k1 support + signature normalization. |
| pkg/keyring/keyring_test.go | Adds EVM key derivation/signing/idempotent SDK config tests. |
| p2p/p2p.go | Adds P2P-level migration notification API. |
| p2p/kademlia/dht.go | Adds migration notification channel for bootstrap refresher acceleration. |
| p2p/kademlia/bootstrap.go | Implements migration-triggered immediate bootstrap refresh + temporary accelerated interval; clears credential cache/connection pool. |
| go.mod | Aligns root module deps for EVM-enabled Lumera + cosmos/evm and toolchain updates. |
| docs/evm-migration.md | Adds operator documentation for migration workflow, including multisig offline path. |
| cmd/sncli/go.mod | Aligns sncli module deps for EVM-enabled Lumera + toolchain updates. |
| CHANGELOG.md | Adds “Upcoming EVM Release” notes describing migration, keyring, P2P, SDK, and dependency changes. |
| .gitignore | Adds ignores for debug output and local Claude settings. |
| .github/workflows/build&release.yml | Adds [no-deploy] tag message handling to create draft releases. |
Files not reviewed (2)
- pkg/lumera/lumera_mock.go: Language not supported
- pkg/lumera/modules/bank/bank_mock.go: Language not supported
# Conflicts:
#	go.mod
#	go.sum
#	supernode/host_reporter/service.go
#	tests/system/cli.go
#	tests/system/go.mod
#	tests/system/go.sum
#	tests/system/main_test.go
- go.mod: drop local `=> ../../../lumera` replace in cmd/sncli and sn-manager so a clean checkout builds against the tagged lumera v1.20.0-rc2 instead of failing on a missing directory.
- evmigration: isEthSecp256k1Key now type-asserts *ethsecp256k1.PubKey instead of treating any non-legacy key as EVM, so ed25519/multisig/offline keys are rejected at the pre-flight gate. Add negative test.
- evmigration_test: compare legacy vs EVM addresses in the same bech32 encoding (was bech32 vs hex, which could never collide).
- sdk/task: report actual and required spendable balance from `min` and denom instead of hard-coding ">= 1 LUME".
- docs/evm-migration: fix "superno0de" typo and the `keys recover` example (positional [name], not --name).
- gofmt reachability_active_probing_test.go.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
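The isEthSecp256k1Key change above can be illustrated with a minimal sketch. The key types below are local stand-ins invented for this example (they are not the SDK or cosmos/evm types), and the two helper names are hypothetical; the point is only the difference between "not legacy" and a positive type assertion.

```go
package main

import "fmt"

// PubKey is a hypothetical stand-in for the SDK public-key interface.
type PubKey interface{ Type() string }

// Stand-in key types, defined here only for illustration.
type legacySecp256k1PubKey struct{}
type ethSecp256k1PubKey struct{}
type ed25519PubKey struct{}

func (*legacySecp256k1PubKey) Type() string { return "secp256k1" }
func (*ethSecp256k1PubKey) Type() string    { return "eth_secp256k1" }
func (*ed25519PubKey) Type() string         { return "ed25519" }

// isEthKeyLoose mirrors the old behaviour: anything that is not the
// legacy type is treated as an EVM key.
func isEthKeyLoose(pk PubKey) bool {
	_, isLegacy := pk.(*legacySecp256k1PubKey)
	return !isLegacy
}

// isEthKeyStrict accepts only the concrete EVM key type, so ed25519 and
// any other unexpected key type is rejected.
func isEthKeyStrict(pk PubKey) bool {
	_, ok := pk.(*ethSecp256k1PubKey)
	return ok
}

func main() {
	keys := []PubKey{&legacySecp256k1PubKey{}, &ethSecp256k1PubKey{}, &ed25519PubKey{}}
	for _, k := range keys {
		fmt.Printf("%-14s loose=%v strict=%v\n", k.Type(), isEthKeyLoose(k), isEthKeyStrict(k))
	}
}
```

With the loose check an ed25519 key passes the gate; with the strict assertion it does not, which is the case the added negative test is described as covering.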
The merge brought master's test fakes together with the EVM branch's extended interfaces, leaving the fakes incomplete:
- lumera.Client gained Conn() *grpc.ClientConn; add it to fakeLumera (self_healing) and handlerLumera (transport/grpc/self_healing), delegating to the underlying testutil mock. Fixes the unit-tests build failure in both packages.
- p2p.P2P gained NotifyEVMMigration(); add a no-op to the lep6StoreBackedP2P fake. Fixes the tests/systemtests build failure that broke both the cascade-e2e and lep6-e2e jobs.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The merged-in LEP-6 e2e configs and tests still used legacy secp256k1
(coin type 118) supernode addresses. With the EVM-enabled binary,
testkey1/2/3 derive eth_secp256k1 (coin type 60) addresses, so the
supernodes failed config verification at startup ("Key resolves to X
but config identity is Y") and never came up — cascading into the
concurrency/enforcement/restart/matrix test failures.
Update config.lep6-{1,2,3}.yml identities and the hardcoded address
constants/assertions in the runtime and negative-matrix tests to the
EVM-derived addresses, matching the mapping already applied for the
cascade e2e in 651b9d8 (which now passes).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
generateLEP6UploadUsers created and funded user accounts with legacy secp256k1 (sdkhd.Secp256k1) while newLEP6ActionClientsForKey signs via keyring.RecoverAccountFromMnemonic, which uses eth_secp256k1 (coin type 60). The funded address and the signing address therefore differed, so RequestAction failed with "account ... not found" in the concurrency and restart e2e tests. Derive upload users with evmhd.EthSecp256k1 so the funded address matches the address the action client signs with. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
a-ok123
left a comment
Review: EVM support (evm-support → master)
Architecturally solid and unusually well-tested for a migration of this scope. The migration flow in supernode/cmd/evmigration.go is the strongest part: idempotent/rerunnable (checks MigrationRecord before broadcasting), correct state-mutation ordering (config persisted to disk before legacy key deletion, with a config-restore-on-failure path), fail-closed on MigrationEstimate errors, and an explicit refusal for multisig accounts with an offline lumerad playbook. The migrationChainClient interface gives good unit-test coverage without a live gRPC connection, and the P2P refresh handling (NotifyEVMMigration → cache clear + conn-pool release + accelerated bootstrap) is well thought through, including the buffered-channel / migrationPending replay when the DHT isn't up yet.
CI: build, unit-tests, integration-tests, cascade-e2e-tests green. lep6-e2e-tests is failing — the one blocker (see inline comment).
Main points (details inline)
- `lep6-e2e` red: package `FAIL` with passing visible subtests; needs root-causing before merge.
- Signature-length inconsistency: migration EVM proof keeps the 65-byte sig (`evmigration.go`) while `keyring.SignBytes` now truncates to 64. Both may be correct for their respective verifiers, but please confirm + document.
- Unsigned broadcast tx: depends on the chain's evmigration ante decorator; worth a code comment so it isn't "fixed" later.
- `requireEVMChain` fatal on transient errors: a chain blip at boot becomes a crash; consider distinguishing "absent" from "unreachable".
- Minor: `return sig, nil` clarity; import grouping in `testutil/lumera.go`.
Nothing here is architecturally wrong — the open items are the failing e2e job and a couple of "please confirm the chain side agrees" assumptions around signature encoding. Leaving as a non-blocking comment review since it's a draft.
Generated by Claude Code
Update lumera to v1.20.0-rc3, the new release with LEP-6, across go.mod/go.sum for the root, cmd/sncli, and tests/system modules. Ran go mod tidy in each. Updated CHANGELOG reference. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- requireEVMChain: distinguish a transient module-version query failure (chain unreachable / gRPC blip at boot) from a definitive "EVM module absent" answer. Transient errors are now retried with backoff instead of immediately Fatal-ing into a crash loop; module-absent still fails fast. Extracted requireEVMChainWithQuerier behind a small interface and added unit tests (present, absent-no-retry, transient-then-success, persistent-failure, context-cancelled).
- evmigration: document why the EVM proof signature intentionally keeps the 65th recovery byte (chain verifier requires R||S||V and strips V itself), and why the migration broadcast tx is deliberately unsigned (self-authenticating via dual MigrationProof + evmigration ante).
- keyring.SignBytes: return sig, nil (err is nil here) and document that the 65->64 truncation targets cosmos-form chain verifiers; the P2P securekeyx handshake is unaffected (separate sign/verify path).
- testutil/lumera.go: gofmt import grouping (grpc was mid-group).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
TestGetKeysForReplication_CanceledContext intermittently failed in CI with "TempDir RemoveAll cleanup: directory not empty" because the sqlite store's background goroutines could still write WAL/-shm files after Close() returned.
Root cause: Worker.Stop() sent a single value on an unbuffered quit channel that BOTH the DB worker and the checkpoint worker selected on, so only one was released and the checkpoint worker leaked. It also slept up to checkpointInterval inside backoff.RetryNotify before re-checking quit, and Close() never waited for the goroutines before closing the DB.
Fix:
- Worker.Stop() now closes the quit channel (broadcast to all listeners) via sync.Once instead of sending one value.
- The checkpoint worker bails out of its retry loop on shutdown (backoff.Permanent(errStoreClosing)) and uses an interruptible inter-checkpoint wait so it stops promptly.
- Store tracks its three long-lived goroutines (DB worker, checkpoint worker, replication writer) with a WaitGroup; Close() signals stop, waits for them to exit, then closes the DB. Close() is now idempotent (sync.Once), and startRepWriter no longer spawns a nested goroutine.
- Converted RetrieveBatchValues / retrieveBatchValues from value to pointer receiver (a Store value can no longer be copied now that it holds sync primitives).
Added TestStore_CloseStopsWorkersAndIsIdempotent. Verified with go test -race -count=10 (sqlite + kademlia packages).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
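The shutdown pattern described in this commit (close the quit channel once, then wait for every goroutine) can be sketched in isolation. The `store` type and worker names below are simplified, hypothetical stand-ins, not the real sqlite store.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a stripped-down, hypothetical stand-in for the sqlite store.
type store struct {
	quit     chan struct{}
	stopOnce sync.Once
	wg       sync.WaitGroup

	mu     sync.Mutex
	exited []string
}

// newStore starts one goroutine per named worker and tracks each in the WaitGroup.
func newStore(workers ...string) *store {
	s := &store{quit: make(chan struct{})}
	for _, name := range workers {
		s.wg.Add(1)
		go s.run(name)
	}
	return s
}

func (s *store) run(name string) {
	defer s.wg.Done()
	// Receiving from a closed channel never blocks, so closing quit
	// releases every worker. A single send would release only one.
	<-s.quit
	s.mu.Lock()
	s.exited = append(s.exited, name)
	s.mu.Unlock()
}

// Close is idempotent: the channel is closed at most once, and every
// call waits until all workers have returned.
func (s *store) Close() {
	s.stopOnce.Do(func() { close(s.quit) })
	s.wg.Wait()
	// A real store would close its DB handle here, after the workers exit.
}

func main() {
	s := newStore("db-worker", "checkpoint-worker", "replication-writer")
	s.Close()
	s.Close() // second call is a no-op
	fmt.Println("workers exited:", len(s.exited))
}
```

Sending one value on an unbuffered channel wakes exactly one receiver, which matches the leak described in the root cause; `close` is the broadcast form.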
TestLEP6ConcurrentCascadesContendedReporter intermittently failed when a single supernode finalized several cascade actions concurrently: the finalize *simulation* pre-check (cascade/register.go step 11) fatally failed with "account sequence mismatch, expected N, got M". Simulation reads the signer's committed sequence into accountInfo, but under concurrent in-flight txs from the same signer that value goes stale between the read and the simulate ante check (TOCTOU). The actual FinalizeAction broadcast already retries sequence mismatches, so a simulate-only mismatch is a false negative that needlessly fails the upload. TxHelper.Simulate now retries on sequence mismatch — re-fetching fresh account info each attempt, bounded by the same SequenceMismatchMaxAttempts cap as the broadcast path. Added unit tests for the retry-then-succeed and exhaust-cap cases. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The other half of TestLEP6ConcurrentCascadesContendedReporter: under concurrent uploads, StoreArtefacts could fail with transient P2P conditions — "no eligible store peers" (routing not yet converged on store-eligible peers), "0.00% successful" (store RPCs to peers failing under load), or a momentary "zero peers" — and the cascade register path made a single attempt, failing the whole upload. storeArtefacts now retries these transient failures with bounded backoff (4 attempts, base 2s * attempt). Retries force IdempotentDirectoryRecord so the symbol-directory row (a plain INSERT on first pass) is upserted rather than colliding; the symbol/data key stores are already idempotent by key. Deterministic errors (e.g. layout/encoding) are not retried. Added tests: transient-then-succeed (asserting retries flip to the idempotent path), non-transient-no-retry, and exhaust-cap. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…g (LEP-6/EVM) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
mateeullahmalik
left a comment
Re-reviewed current head 14bddf0e0fae6b0f413856cfe35698d0885a0b11.
The blockers from my previous change request are resolved:
- PR now merges cleanly with `master`; `git merge-tree origin/master origin/pr-299` returns a clean tree hash, no conflict hunks.
- The committed local Lumera replaces are gone from the nested modules; clean-checkout `go test ./...` passes in both `cmd/sncli` and `sn-manager`.
I also checked the follow-up reliability commits after the green full-test run. The latest code delta adds bounded retries around transient P2P artefact-store failures and a focused test suite. It is context-aware, forces idempotent directory upsert on retry, and does not retry deterministic errors.
Local validation at PR head:
- `go test ./...` from repo root: PASS
- `go test ./supernode/cascade`: PASS
- `go test ./supernode/cmd ./pkg/keyring ./pkg/lumera/codec ./sdk/task`: PASS
- `go test ./...` in `cmd/sncli`: PASS
- `go test ./...` in `sn-manager`: PASS
- `git diff --check origin/master...HEAD`: PASS
CI evidence checked:
- Build green on current head `14bddf0`.
- Full tests workflow green on `fc063ff` for unit, integration, cascade-e2e, and lep6-e2e; later commits are the artefact-store retry + tests and changelog.
💡 Codex Review
supernode/p2p/kademlia/store/sqlite/sqlite.go
Lines 453 to 457 in 14bddf0
When Close races with an original/local StoreBatch after the job is queued, the DB worker can take the closed quit case and return without processing the queued job or sending on job.Done. In that case this wait has no escape unless the caller's context is canceled, so callers using context.Background() or a request context not tied to shutdown can hang indefinitely during store teardown; return on store shutdown or ensure pending Done channels are completed.
Summary
This PR adds broad EVM support across supernode, SDK, CLI, and supporting modules, then brings the branch current with `master`. The migration tooling is one part of the branch, but the overall change is larger: it updates account/key handling, Lumera dependencies, client interfaces, signing paths, and operator workflows for the EVM-enabled Lumera stack.
What changed
- … `eth_secp256k1` keys and the EVM coin type 60 derivation path (`m/44'/60'/0'/0/0`).
- … `secp256k1` accounts.
- … `evmigration` supernode command, including proof generation/validation coverage and integration tests.
- … `docs/evm-migration.md` for the migration workflow.
- … `cmd/sncli`, `sn-manager`, and `tests/system` with the EVM-enabled Lumera stack.
- … `master` changes via merge commit `25589ad`, including the host reporter storage-full compatibility updates from `80ecbb7`.
Why
The EVM-enabled Lumera path requires supernode and related tools to produce and understand EVM-compatible account keys, updated signing/proof payloads, and the newer Lumera module APIs. Without these changes, operators and developers would only have partial migration tooling, while normal account creation/recovery and SDK flows would still use the legacy key model.
User and developer impact
- … `evmigration` workflow for preparing migration proofs.
Validation
- `go test ./...` passed locally from the repository root on 2026-05-27.
Notes
This is the broader EVM support branch, not only an EVM migration PR. The migration command and docs are included because they are part of the operator path for adopting the EVM-enabled stack.
Update: merge with `master`, review fixes, and reliability hardening
Brought the branch current with `master` and resolved the follow-on issues found in review and CI.
Merge & dependency
- … `origin/master` (go.mod/go.sum, host_reporter, tests/system).
- … `v1.20.0-rc3`, which restores storage-truth recovery for POSTPONED nodes (lumera implemented p2p ping #139/sn-manager updates #143). `rc2` excluded postponed nodes from storage-truth target assignment, which broke the postpone→recover lifecycle and the LEP-6 enforcement/heal e2e tests.
- … `=> ../../../lumera` replace from `cmd/sncli` and `sn-manager` so clean checkouts build against the tagged dependency.
Reliability & concurrency hardening
- … `account sequence mismatch` (re-fetching account info) instead of failing the upload; this matters when one SuperNode finalizes several cascade actions concurrently.
- … `no eligible store peers`, `0.00% successful`, momentary `zero peers`) with bounded backoff, using the idempotent directory-upsert path on retries.
- `Close` now stops and waits for all background goroutines before closing the DB (broadcast quit + interruptible checkpoint worker), fixing a goroutine leak / WAL-write-after-close race that flaked unit tests.
- `requireEVMChain` distinguishes a transient chain-unreachable query failure (retry w/ backoff) from a definitive "EVM module absent" (fail fast), avoiding a boot crash-loop on a momentary blip.
Review fixes
- `isEthSecp256k1Key` type-asserts `*ethsecp256k1.PubKey` (rejects ed25519/multisig/offline) instead of "anything not legacy".
- … `SignBytes` path; documented the unsigned migration broadcast tx.
- `hasMinimumSpendableBalance` reports actual+required from `min`/`denom`; docs typo + `keys recover` usage fixes; gofmt/import grouping.
- … `lumera.Client`/`p2p.P2P` interfaces.
Validation
- `unit-tests`, `integration-tests`, `cascade-e2e`, and `build` green on CI; the full LEP-6 enforcement/heal e2e suite passes on `rc3`. New unit tests added for every behavior change above.