
Add EVM support#299

Merged
akobrin1 merged 25 commits into master from evm-support on Jun 10, 2026


@akobrin1 akobrin1 commented May 27, 2026

Contributor

Summary

This PR adds broad EVM support across supernode, SDK, CLI, and supporting modules, then brings the branch current with master. The migration tooling is one part of the branch, but the overall change is larger: it updates account/key handling, Lumera dependencies, client interfaces, signing paths, and operator workflows for the EVM-enabled Lumera stack.

What changed

  • Switched new/recovered supernode keyring accounts to EVM-compatible eth_secp256k1 keys and the EVM coin type 60 derivation path (m/44'/60'/0'/0/0).
  • Registered Cosmos EVM crypto interfaces so the keyring can store and load the new key type while retaining compatibility with existing keyring behavior.
  • Updated private key derivation, account recovery, and signing-related tests to validate deterministic EVM key generation and distinguish it from legacy Cosmos secp256k1 accounts.
  • Updated Lumera client, codec, bank, SDK task, and test utility interfaces for the EVM-enabled Lumera dependency set.
  • Added EVM migration support through the evmigration supernode command, including proof generation/validation coverage and integration tests.
  • Added operator documentation in docs/evm-migration.md for the migration workflow.
  • Added safety handling so automatic migration is refused for multisig legacy accounts.
  • Aligned dependency manifests across the root module, cmd/sncli, sn-manager, and tests/system with the EVM-enabled Lumera stack.
  • Included the latest master changes via merge commit 25589ad, including the host reporter storage-full compatibility updates from 80ecbb7.
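
The derivation-path change in the first bullet can be illustrated with a minimal sketch. It assumes only the standard BIP-44 layout and the SLIP-44 coin types (118 for legacy Cosmos, 60 for Ethereum); the helper name `bip44Path` is hypothetical and is not taken from the supernode keyring code.

```go
package main

import "fmt"

// SLIP-44 coin types: 118 is the legacy Cosmos value, 60 is the Ethereum/EVM value.
const (
	coinTypeCosmos uint32 = 118
	coinTypeEVM    uint32 = 60
)

// bip44Path is a hypothetical helper that renders the BIP-44 path
// m/44'/<coinType>'/<account>'/0/<index> as a string.
func bip44Path(coinType, account, index uint32) string {
	return fmt.Sprintf("m/44'/%d'/%d'/0/%d", coinType, account, index)
}

func main() {
	fmt.Println("legacy:", bip44Path(coinTypeCosmos, 0, 0))
	fmt.Println("evm:   ", bip44Path(coinTypeEVM, 0, 0))
}
```

Because the coin type is part of the hardened derivation path, the same mnemonic yields a different key (and therefore a different address) under each path, which is why the test configs and fixtures in this PR had to be updated.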

Why

The EVM-enabled Lumera path requires supernode and related tools to produce and understand EVM-compatible account keys, updated signing/proof payloads, and the newer Lumera module APIs. Without these changes, operators and developers would only have partial migration tooling, while normal account creation/recovery and SDK flows would still use the legacy key model.

User and developer impact

  • New supernode accounts are created as EVM-compatible accounts by default.
  • Recovered accounts use the EVM derivation path and key type, producing deterministic EVM-compatible addresses from the provided mnemonic.
  • Developers get updated interfaces and tests around EVM key handling, proof generation, balance checks, and signing helpers.
  • Operators get a documented evmigration workflow for preparing migration proofs.
  • Multisig legacy accounts are not migrated automatically, which avoids silently applying an unsafe migration path.
  • Downstream module users should expect dependency updates in all affected Go modules.

Validation

  • go test ./... passed locally from the repository root on 2026-05-27.

Notes

This is the broader EVM support branch, not only an EVM migration PR. The migration command and docs are included because they are part of the operator path for adopting the EVM-enabled stack.


Update — merge with master, review fixes, and reliability hardening

Brought the branch current with master and resolved the follow-on issues found in review and CI.

Merge & dependency

  • Resolved conflicts merging origin/master (go.mod/go.sum, host_reporter, tests/system).
  • Bumped lumera to v1.20.0-rc3, which restores storage-truth recovery for POSTPONED nodes (lumera implemented p2p ping #139/sn-manager updates #143). rc2 excluded postponed nodes from storage-truth target assignment, which broke the postpone→recover lifecycle and the LEP-6 enforcement/heal e2e tests.
  • Removed the local => ../../../lumera replace from cmd/sncli and sn-manager so clean checkouts build against the tagged dependency.

Reliability & concurrency hardening

  • Finalize simulation retries transient account sequence mismatch (re-fetching account info) instead of failing the upload — matters when one SuperNode finalizes several cascade actions concurrently.
  • Artefact storage retries transient P2P conditions (no eligible store peers, 0.00% successful, momentary zero peers) with bounded backoff, using the idempotent directory-upsert path on retries.
  • SQLite P2P store Close now stops and waits for all background goroutines before closing the DB (broadcast quit + interruptible checkpoint worker), fixing a goroutine leak / WAL-write-after-close race that flaked unit tests.
  • requireEVMChain distinguishes a transient chain-unreachable query failure (retry w/ backoff) from a definitive "EVM module absent" (fail fast), avoiding a boot crash-loop on a momentary blip.

Review fixes

  • Stricter EVM key validation: isEthSecp256k1Key type-asserts *ethsecp256k1.PubKey (rejects ed25519/multisig/offline) instead of "anything not legacy".
  • Documented the deliberately-65-byte EVM migration proof signature (chain verifier requires R||S||V) vs. the 64-byte SignBytes path; documented the unsigned migration broadcast tx.
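  • hasMinimumSpendableBalance now reports the actual and required spendable balance derived from min and denom instead of a hard-coded threshold; fixed a docs typo and the keys recover usage example; applied gofmt and import grouping.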
  • Fixed LEP-6 e2e identities/keys for coin type 60 (configs + tests) and test fakes for the EVM-extended lumera.Client/p2p.P2P interfaces.
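
The 65-byte versus 64-byte distinction above can be sketched as follows. This is an illustration only: `toCosmosForm` is a hypothetical name, and the sketch assumes the 65-byte form is R||S||V with the recovery byte last, as the bullet describes. It is not the actual keyring.SignBytes implementation.

```go
package main

import (
	"errors"
	"fmt"
)

var errBadSigLen = errors.New("signature must be 64 or 65 bytes")

// toCosmosForm (hypothetical) returns the 64-byte R||S form of a signature,
// dropping a trailing recovery byte V when the input is the 65-byte R||S||V form.
// The result is always a fresh copy so the caller's slice is never aliased.
func toCosmosForm(sig []byte) ([]byte, error) {
	if len(sig) != 64 && len(sig) != 65 {
		return nil, errBadSigLen
	}
	out := make([]byte, 64)
	copy(out, sig[:64])
	return out, nil
}

func main() {
	evmSig := make([]byte, 65) // placeholder bytes standing in for R||S||V
	cosmosSig, err := toCosmosForm(evmSig)
	fmt.Println(len(evmSig), len(cosmosSig), err)
}
```

Under this reading, the migration proof would carry the untouched 65-byte value, while only the cosmos-form signing path would apply the truncation.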

Validation

  • unit-tests, integration-tests, cascade-e2e, and build green on CI; the full LEP-6 enforcement/heal e2e suite passes on rc3. New unit tests added for every behavior change above.

akobrin1 and others added 11 commits April 21, 2026 15:10
Supernode's startup migration path holds a single signing key and cannot
run the K-of-N ceremony required for a multisig legacy account. Detect
is_multisig=true from MigrationEstimate and return an actionable error
pointing to lumerad's 4-step offline flow (generate-proof-payload →
sign-proof → assemble-proof → submit-proof). After the operator completes
the offline ceremony, the next restart finds the on-chain MigrationRecord
and drives local cleanup through the existing alreadyMigrated branch —
no new multisig-aware code in the daemon.

Also log the new is_multisig/threshold/num_signers fields from
QueryMigrationEstimate in the pre-flight summary, and bump the lumera
pseudo-version pin to origin/evm HEAD (e09830d70704) so a clean checkout
without ../lumera still resolves to a version that has the LegacyProof
oneof, SingleKeyProof/MultisigProof types, and MaxMultisigSubKeys param.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
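
The multisig refusal described in this commit message might look roughly like the sketch below. The `migrationEstimate` struct and `checkAutoMigratable` function are hypothetical stand-ins; only the is_multisig/threshold/num_signers fields and the four offline step names come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// migrationEstimate is a hypothetical stand-in for the fields the commit
// message says are read from the chain's migration estimate query.
type migrationEstimate struct {
	IsMultisig bool
	Threshold  uint32
	NumSigners uint32
}

var errMultisigUnsupported = errors.New("automatic migration refused for multisig legacy account")

// checkAutoMigratable returns an actionable error for multisig accounts,
// since a daemon holding a single signing key cannot run a K-of-N ceremony.
func checkAutoMigratable(est migrationEstimate) error {
	if !est.IsMultisig {
		return nil
	}
	return fmt.Errorf("%w (%d-of-%d): run the offline flow "+
		"(generate-proof-payload, sign-proof, assemble-proof, submit-proof), then restart",
		errMultisigUnsupported, est.Threshold, est.NumSigners)
}

func main() {
	err := checkAutoMigratable(migrationEstimate{IsMultisig: true, Threshold: 2, NumSigners: 3})
	fmt.Println(errors.Is(err, errMultisigUnsupported), err)
}
```
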
sncli transitively depends on lumera's app package (via
x/lumeraid/securekeyx tests), which pulls in cosmos/evm's
statedb → github.com/ethereum/go-ethereum/trie/utils. Upstream
go-ethereum doesn't export that package; cosmos/evm requires the
cosmos/go-ethereum fork. The main go.mod already has this replace;
mirror it here so `go build ./...` from cmd/sncli resolves.

Bump the lumera require to the pseudo-version at origin/evm HEAD
(was v1.11.0 from before the EVM work) and let go mod tidy refresh
the rest of the indirect graph to match the root module.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lumera unified legacy and new sides under a single MigrationProof oneof.
Wrap newSig in a SingleKeyProof and pass it via NewProof on both
MsgMigrateValidator and MsgMigrateAccount.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	.gitignore
#	go.mod
#	go.sum
#	p2p/kademlia/bootstrap.go
#	p2p/kademlia/dht.go
#	supernode/supernode_metrics/reachability_active_probing_test.go
#	tests/system/go.mod
#	tests/system/go.sum
- Bump cmd/sncli and tests/system to lumera v1.20.0-rc2; drop local
  ../lumera replace now that the EVM-enabled release exists upstream.
- tests/system: bump devnet tx fee to 1000ulume and make initSDKConfig
  idempotent so multiple test invocations don't trip the SDK's "Config
  is sealed" panic.
- CHANGELOG: add Upcoming EVM Release section summarising the
  evm-support work (migration flow, keyring, p2p reactions, deps).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@akobrin1 akobrin1 changed the title from "[codex] Add EVM migration support" to "EVM support" May 27, 2026
@akobrin1 akobrin1 changed the title from "EVM support" to "EVM migration support" May 27, 2026
@akobrin1 akobrin1 changed the title from "EVM migration support" to "[codex] Add EVM support" May 27, 2026
@akobrin1 akobrin1 changed the title from "[codex] Add EVM support" to "Add EVM support" May 27, 2026

@mateeullahmalik mateeullahmalik left a comment

Collaborator

I reviewed this as the EVM support branch, not just the migration helper. The main migration path is directionally sound and I verified the important unsigned evmigration tx assumption against Lumera v1.20.0-rc2: the chain intentionally registers empty signers for MsgClaimLegacyAccount / MsgMigrateValidator, so the daemon-side direct broadcast is not a problem by itself.

I do not think this is mergeable yet though. Two blockers need fixing first:

  1. The PR is currently conflicted with master. git merge-tree origin/master origin/pr-299 reports conflicts in go.mod, go.sum, supernode/host_reporter/service.go, tests/system/cli.go, tests/system/go.mod, tests/system/go.sum, and tests/system/main_test.go.
  2. cmd/sncli does not build/test from a clean checkout because its module replaces Lumera with a local relative path.

Validation I ran locally:

  • go test ./... from the repo root: passed.
  • go test ./supernode/cmd ./pkg/keyring ./pkg/lumera/codec ./sdk/task: passed.
  • go test ./... in sn-manager: passed.
  • go test ./... in cmd/sncli: failed immediately due to the local Lumera replace.

Once the conflicts and nested-module replace are fixed, I’d re-run root + nested module tests and then do a devnet migration/cascade pass before calling this ready.

Comment thread cmd/sncli/go.mod Outdated
Comment thread supernode/cmd/evmigration.go Outdated

Copilot AI left a comment

Contributor

Pull request overview

This PR introduces end-to-end EVM compatibility across supernode, SDK, CLI, and tests, centered on migrating legacy Cosmos secp256k1 (coin type 118) accounts to EVM eth_secp256k1 (coin type 60), plus aligning dependencies and operational workflows for the EVM-enabled Lumera stack.

Changes:

  • Adds automatic legacy→EVM account migration at supernode startup (with chain capability checks, dual-signing proofs, config rewrite, and legacy key cleanup).
  • Updates SDK/client plumbing for EVM-era chain interactions (gRPC conn exposure, spendable-balance checks, ICA signature logic tweaks, and codec registrations).
  • Updates system/integration tests, docs, and module dependencies to the EVM-enabled Lumera versions and Go toolchain.

Reviewed changes

Copilot reviewed 39 out of 46 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| tests/system/main_test.go | Makes SDK bech32 config init resilient to sealed config / repeated init in system tests. |
| tests/system/go.mod | Aligns system test module dependencies (including cosmos/evm and go-ethereum replace) and Go version. |
| tests/system/e2e_cascade_test.go | Updates expected bech32 addresses for EVM-derived keys in E2E flow. |
| tests/system/config.test-1.yml | Updates test identity address to new EVM-derived address. |
| tests/system/config.test-2.yml | Updates test identity address to new EVM-derived address. |
| tests/system/config.test-3.yml | Updates test identity address to new EVM-derived address. |
| tests/system/cli.go | Increases default fee/amount used by system CLI helper. |
| tests/integration/evmigration/evmigration_test.go | Adds integration tests for legacy→EVM keyring lifecycle, dual-signing, and config persistence. |
| supernode/supernode_metrics/reachability_active_probing_test.go | Updates fake Lumera client to satisfy new Conn() method in interface. |
| supernode/host_reporter/service.go | Stops reporting cascade kademlia DB bytes in epoch report payload. |
| supernode/config/config.go | Adds evm_key_name config field to support one-time migration workflow. |
| supernode/cmd/start.go | Integrates EVM chain check + auto-migration at startup and triggers P2P refresh on migration. |
| supernode/cmd/evmigration.go | Implements chain detection, migration validation, dual signing, broadcast + confirmation polling, and local cleanup. |
| supernode/cmd/evmigration_test.go | Adds unit tests for migration decision logic, message selection, persistence, and tx confirmation polling. |
| sn-manager/go.sum | Updates checksums following dependency/toolchain alignment. |
| sn-manager/go.mod | Aligns sn-manager module dependencies and Go version; points Lumera to local checkout. |
| sdk/task/task.go | Switches eligibility check to use spendable balance and refactors probe logic. |
| sdk/task/spendable_balance_test.go | Adds unit tests for spendable-balance eligibility behavior. |
| sdk/task/ica_signature.go | Adds helper to detect whether an action creator is an ICA account. |
| sdk/task/ica_signature_test.go | Extends test fake client + adds tests around vesting/non-ICA creator behavior. |
| sdk/task/helpers.go | Restricts ICA fallback signature validation to configured ICA usage or actual ICA creators. |
| sdk/adapters/lumera/adapter.go | Adds spendable-balance query method to Lumera adapter interface and implementation. |
| pkg/testutil/lumera.go | Extends mock Lumera client/bank module with Conn() and spendable-balance support. |
| pkg/net/credentials/lumeratc.go | Adds cache-clearing hook for key exchanger instances after identity changes. |
| pkg/lumera/modules/bank/interface.go | Extends bank module interface with spendable-balance query. |
| pkg/lumera/modules/bank/impl.go | Implements spendable-balance query against gRPC bank client. |
| pkg/lumera/modules/bank/bank_mock.go | Updates generated mock to include spendable-balance method. |
| pkg/lumera/lumera_mock.go | Updates generated Lumera client mock to include Conn(). |
| pkg/lumera/interface.go | Adds Conn() *grpc.ClientConn to Lumera client interface. |
| pkg/lumera/codec/encoding.go | Registers cosmos/evm, vesting, and evmigration interfaces for codec unpacking. |
| pkg/lumera/codec/encoding_test.go | Adds test ensuring delayed vesting accounts unpack via encoding config. |
| pkg/lumera/client.go | Exposes underlying gRPC connection via Conn(). |
| pkg/keyring/keyring.go | Switches default derivation to coin type 60 and adds eth_secp256k1 support + signature normalization. |
| pkg/keyring/keyring_test.go | Adds EVM key derivation/signing/idempotent SDK config tests. |
| p2p/p2p.go | Adds P2P-level migration notification API. |
| p2p/kademlia/dht.go | Adds migration notification channel for bootstrap refresher acceleration. |
| p2p/kademlia/bootstrap.go | Implements migration-triggered immediate bootstrap refresh + temporary accelerated interval; clears credential cache/connection pool. |
| go.mod | Aligns root module deps for EVM-enabled Lumera + cosmos/evm and toolchain updates. |
| docs/evm-migration.md | Adds operator documentation for migration workflow, including multisig offline path. |
| cmd/sncli/go.mod | Aligns sncli module deps for EVM-enabled Lumera + toolchain updates. |
| CHANGELOG.md | Adds “Upcoming EVM Release” notes describing migration, keyring, P2P, SDK, and dependency changes. |
| .gitignore | Adds ignores for debug output and local Claude settings. |
| .github/workflows/build&release.yml | Adds [no-deploy] tag message handling to create draft releases. |

Files not reviewed (2)
  • pkg/lumera/lumera_mock.go: Language not supported
  • pkg/lumera/modules/bank/bank_mock.go: Language not supported

Comment thread supernode/cmd/evmigration.go
Comment thread sdk/task/task.go
Comment thread supernode/supernode_metrics/reachability_active_probing_test.go Outdated
Comment thread docs/evm-migration.md Outdated
Comment thread docs/evm-migration.md
Comment thread supernode/cmd/evmigration_test.go Outdated
akobrin1 and others added 6 commits June 8, 2026 14:18
# Conflicts:
#	go.mod
#	go.sum
#	supernode/host_reporter/service.go
#	tests/system/cli.go
#	tests/system/go.mod
#	tests/system/go.sum
#	tests/system/main_test.go
- go.mod: drop local `=> ../../../lumera` replace in cmd/sncli and
  sn-manager so a clean checkout builds against the tagged
  lumera v1.20.0-rc2 instead of failing on a missing directory.
- evmigration: isEthSecp256k1Key now type-asserts *ethsecp256k1.PubKey
  instead of treating any non-legacy key as EVM, so ed25519/multisig/
  offline keys are rejected at the pre-flight gate. Add negative test.
- evmigration_test: compare legacy vs EVM addresses in the same bech32
  encoding (was bech32 vs hex, which could never collide).
- sdk/task: report actual and required spendable balance from `min`
  and denom instead of hard-coding ">= 1 LUME".
- docs/evm-migration: fix "superno0de" typo and the `keys recover`
  example (positional [name], not --name).
- gofmt reachability_active_probing_test.go.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
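
The difference between the old "anything not legacy" check and the stricter positive type assertion can be shown with a small self-contained sketch. The key types below are local, hypothetical stand-ins (the real code asserts `*ethsecp256k1.PubKey`); only the shape of the check is the point.

```go
package main

import "fmt"

// pubKey stands in for the SDK's public-key interface (hypothetical).
type pubKey interface{ Type() string }

type ethSecp256k1PubKey struct{}
type legacySecp256k1PubKey struct{}
type ed25519PubKey struct{}

func (*ethSecp256k1PubKey) Type() string    { return "eth_secp256k1" }
func (*legacySecp256k1PubKey) Type() string { return "secp256k1" }
func (*ed25519PubKey) Type() string         { return "ed25519" }

// isEVMKeyLoose mirrors the earlier behaviour: any key that is not the
// legacy type is treated as an EVM key.
func isEVMKeyLoose(pk pubKey) bool {
	_, legacy := pk.(*legacySecp256k1PubKey)
	return !legacy
}

// isEVMKeyStrict accepts only the concrete EVM key type, so ed25519 and
// other unexpected key types are rejected at the gate.
func isEVMKeyStrict(pk pubKey) bool {
	_, ok := pk.(*ethSecp256k1PubKey)
	return ok
}

func main() {
	for _, pk := range []pubKey{&ethSecp256k1PubKey{}, &legacySecp256k1PubKey{}, &ed25519PubKey{}} {
		fmt.Printf("%-14s loose=%v strict=%v\n", pk.Type(), isEVMKeyLoose(pk), isEVMKeyStrict(pk))
	}
}
```
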
The merge brought master's test fakes together with the EVM branch's
extended interfaces, leaving the fakes incomplete:

- lumera.Client gained Conn() *grpc.ClientConn — add it to fakeLumera
  (self_healing) and handlerLumera (transport/grpc/self_healing),
  delegating to the underlying testutil mock. Fixes the unit-tests
  build failure in both packages.
- p2p.P2P gained NotifyEVMMigration() — add a no-op to the
  lep6StoreBackedP2P fake. Fixes the tests/systemtests build failure
  that broke both the cascade-e2e and lep6-e2e jobs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The merged-in LEP-6 e2e configs and tests still used legacy secp256k1
(coin type 118) supernode addresses. With the EVM-enabled binary,
testkey1/2/3 derive eth_secp256k1 (coin type 60) addresses, so the
supernodes failed config verification at startup ("Key resolves to X
but config identity is Y") and never came up — cascading into the
concurrency/enforcement/restart/matrix test failures.

Update config.lep6-{1,2,3}.yml identities and the hardcoded address
constants/assertions in the runtime and negative-matrix tests to the
EVM-derived addresses, matching the mapping already applied for the
cascade e2e in 651b9d8 (which now passes).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
generateLEP6UploadUsers created and funded user accounts with legacy
secp256k1 (sdkhd.Secp256k1) while newLEP6ActionClientsForKey signs via
keyring.RecoverAccountFromMnemonic, which uses eth_secp256k1 (coin type
60). The funded address and the signing address therefore differed, so
RequestAction failed with "account ... not found" in the concurrency
and restart e2e tests.

Derive upload users with evmhd.EthSecp256k1 so the funded address
matches the address the action client signs with.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@a-ok123 a-ok123 left a comment

Contributor

Review: EVM support (evm-support → master)

Architecturally solid and unusually well-tested for a migration of this scope. The migration flow in supernode/cmd/evmigration.go is the strongest part: idempotent/rerunnable (checks MigrationRecord before broadcasting), correct state-mutation ordering (config persisted to disk before legacy key deletion, with a config-restore-on-failure path), fail-closed on MigrationEstimate errors, and an explicit refusal for multisig accounts with an offline lumerad playbook. The migrationChainClient interface gives good unit-test coverage without a live gRPC connection, and the P2P refresh handling (NotifyEVMMigration → cache clear + conn-pool release + accelerated bootstrap) is well thought through, including the buffered-channel / migrationPending replay when the DHT isn't up yet.
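
The state-mutation ordering praised above (persist config first, delete the legacy key second, restore config on failure) could be sketched as below. This is an assumption-laden illustration: `finalizeMigration` and its callback parameters are hypothetical, and the sketch assumes the restore path is triggered when legacy key deletion fails.

```go
package main

import (
	"errors"
	"fmt"
)

var errDeleteFailed = errors.New("keyring delete failed")

// finalizeMigration (hypothetical) persists the new config before touching the
// legacy key, and puts the old config back if the key deletion fails, so a
// crash or error never leaves a config that points at a missing key.
func finalizeMigration(saveConfig func(cfg string) error, deleteLegacyKey func() error, oldCfg, newCfg string) error {
	if err := saveConfig(newCfg); err != nil {
		return fmt.Errorf("persist config: %w", err) // legacy key is still intact
	}
	if err := deleteLegacyKey(); err != nil {
		if rerr := saveConfig(oldCfg); rerr != nil {
			return fmt.Errorf("delete legacy key: %v; config restore also failed: %w", err, rerr)
		}
		return fmt.Errorf("delete legacy key (config restored): %w", err)
	}
	return nil
}

func main() {
	cfg := "legacy-key"
	save := func(c string) error { cfg = c; return nil }

	err := finalizeMigration(save, func() error { return errDeleteFailed }, "legacy-key", "evm-key")
	fmt.Println(cfg, err)

	err = finalizeMigration(save, func() error { return nil }, "legacy-key", "evm-key")
	fmt.Println(cfg, err)
}
```
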

CI: build, unit-tests, integration-tests, cascade-e2e-tests green. ⚠️ lep6-e2e-tests is failing — the one blocker (see inline comment).

Main points (details inline)

  1. lep6-e2e red — package FAIL with passing visible subtests; needs root-causing before merge.
  2. Signature-length inconsistency — migration EVM proof keeps the 65-byte sig (evmigration.go) while keyring.SignBytes now truncates to 64. Both may be correct for their respective verifiers, but please confirm + document.
  3. Unsigned broadcast tx — depends on the chain's evmigration ante decorator; worth a code comment so it isn't "fixed" later.
  4. requireEVMChain fatal on transient errors — a chain blip at boot becomes a crash; consider distinguishing "absent" from "unreachable".
  5. Minor: return sig, nil clarity; import grouping in testutil/lumera.go.

Nothing here is architecturally wrong — the open items are the failing e2e job and a couple of "please confirm the chain side agrees" assumptions around signature encoding. Leaving as a non-blocking comment review since it's a draft.


Generated by Claude Code

Comment thread supernode/cmd/evmigration.go
Comment thread supernode/cmd/evmigration.go
Comment thread pkg/keyring/keyring.go Outdated
Comment thread supernode/cmd/start.go
Comment thread pkg/testutil/lumera.go Outdated
Comment thread tests/system/e2e_lep6_concurrency_test.go
akobrin1 and others added 2 commits June 9, 2026 16:29
Update lumera to v1.20.0-rc3, the new release with LEP-6, across
go.mod/go.sum for the root, cmd/sncli, and tests/system modules.
Ran go mod tidy in each. Updated CHANGELOG reference.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- requireEVMChain: distinguish a transient module-version query failure
  (chain unreachable / gRPC blip at boot) from a definitive "EVM module
  absent" answer. Transient errors are now retried with backoff instead
  of immediately Fatal-ing into a crash loop; module-absent still fails
  fast. Extracted requireEVMChainWithQuerier behind a small interface and
  added unit tests (present, absent-no-retry, transient-then-success,
  persistent-failure, context-cancelled).
- evmigration: document why the EVM proof signature intentionally keeps
  the 65th recovery byte (chain verifier requires R||S||V and strips V
  itself), and why the migration broadcast tx is deliberately unsigned
  (self-authenticating via dual MigrationProof + evmigration ante).
- keyring.SignBytes: return sig, nil (err is nil here) and document that
  the 65->64 truncation targets cosmos-form chain verifiers; the P2P
  securekeyx handshake is unaffected (separate sign/verify path).
- testutil/lumera.go: gofmt import grouping (grpc was mid-group).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
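
The transient-versus-definitive split for requireEVMChain might be sketched like this. The `moduleQuerier` interface, `requireEVM`, and the attempt/backoff parameters are hypothetical; the real requireEVMChainWithQuerier signature is not shown in this thread. The sketch assumes a non-nil query error means "chain unreachable" and a successful query with a negative answer means "module absent".

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

var errEVMModuleAbsent = errors.New("EVM module absent")

// moduleQuerier is a hypothetical interface: a non-nil error means the chain
// could not be reached; a nil error carries a definitive present/absent answer.
type moduleQuerier interface {
	HasEVMModule(ctx context.Context) (bool, error)
}

// requireEVM retries transient query failures with linear backoff and fails
// fast on a definitive "absent" answer. maxAttempts is assumed to be >= 1.
func requireEVM(ctx context.Context, q moduleQuerier, maxAttempts int, base time.Duration) error {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		present, err := q.HasEVMModule(ctx)
		if err == nil {
			if present {
				return nil
			}
			return errEVMModuleAbsent // definitive: do not retry
		}
		lastErr = err
		if attempt == maxAttempts {
			break
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(time.Duration(attempt) * base):
		}
	}
	return fmt.Errorf("chain unreachable after %d attempts: %w", maxAttempts, lastErr)
}

// flakyQuerier fails the first `failures` calls, then answers `present`.
type flakyQuerier struct {
	failures int
	calls    int
	present  bool
}

func (f *flakyQuerier) HasEVMModule(context.Context) (bool, error) {
	f.calls++
	if f.calls <= f.failures {
		return false, errors.New("connection refused")
	}
	return f.present, nil
}

// run is a small demo wrapper with short delays.
func run(q *flakyQuerier) error {
	return requireEVM(context.Background(), q, 3, time.Millisecond)
}

func main() {
	blip := &flakyQuerier{failures: 2, present: true}
	fmt.Println(run(blip), blip.calls)

	absent := &flakyQuerier{present: false}
	fmt.Println(run(absent), absent.calls)
}
```
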
akobrin1 and others added 4 commits June 9, 2026 22:53
TestGetKeysForReplication_CanceledContext intermittently failed in CI
with "TempDir RemoveAll cleanup: directory not empty" because the sqlite
store's background goroutines could still write WAL/-shm files after
Close() returned.

Root cause: Worker.Stop() sent a single value on an unbuffered quit
channel that BOTH the DB worker and the checkpoint worker selected on,
so only one was released and the checkpoint worker leaked. It also slept
up to checkpointInterval inside backoff.RetryNotify before re-checking
quit, and Close() never waited for the goroutines before closing the DB.

Fix:
- Worker.Stop() now closes the quit channel (broadcast to all listeners)
  via sync.Once instead of sending one value.
- The checkpoint worker bails out of its retry loop on shutdown
  (backoff.Permanent(errStoreClosing)) and uses an interruptible
  inter-checkpoint wait so it stops promptly.
- Store tracks its three long-lived goroutines (DB worker, checkpoint
  worker, replication writer) with a WaitGroup; Close() signals stop,
  waits for them to exit, then closes the DB. Close() is now idempotent
  (sync.Once), and startRepWriter no longer spawns a nested goroutine.
- Converted RetrieveBatchValues / retrieveBatchValues from value to
  pointer receiver (a Store value can no longer be copied now that it
  holds sync primitives).

Added TestStore_CloseStopsWorkersAndIsIdempotent. Verified with
go test -race -count=10 (sqlite + kademlia packages).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
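
The core of this fix, closing the quit channel so every listener is released and waiting for the goroutines before closing the database, can be illustrated with a stripped-down sketch. The `store` type here is hypothetical and replaces the real DB handle with a boolean flag.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// store is a hypothetical miniature of a component with background workers.
type store struct {
	quit      chan struct{}
	closeOnce sync.Once
	wg        sync.WaitGroup
	closed    bool // stands in for closing the underlying DB
}

func newStore(workers int, interval time.Duration) *store {
	s := &store{quit: make(chan struct{})}
	for i := 0; i < workers; i++ {
		s.wg.Add(1)
		go s.worker(interval)
	}
	return s
}

// worker does periodic work until the quit channel is closed.
func (s *store) worker(interval time.Duration) {
	defer s.wg.Done()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-s.quit:
			return
		case <-ticker.C:
			// periodic work (e.g. a checkpoint) would run here
		}
	}
}

// Close broadcasts shutdown by closing quit (a single send would release only
// one listener), waits for all workers, then marks the DB closed. Idempotent.
func (s *store) Close() {
	s.closeOnce.Do(func() {
		close(s.quit)
		s.wg.Wait()
		s.closed = true
	})
}

func main() {
	s := newStore(3, time.Millisecond)
	s.Close()
	s.Close() // second call is a no-op
	fmt.Println("closed:", s.closed)
}
```
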
TestLEP6ConcurrentCascadesContendedReporter intermittently failed when a
single supernode finalized several cascade actions concurrently: the
finalize *simulation* pre-check (cascade/register.go step 11) fatally
failed with "account sequence mismatch, expected N, got M".

Simulation reads the signer's committed sequence into accountInfo, but
under concurrent in-flight txs from the same signer that value goes stale
between the read and the simulate ante check (TOCTOU). The actual
FinalizeAction broadcast already retries sequence mismatches, so a
simulate-only mismatch is a false negative that needlessly fails the
upload.

TxHelper.Simulate now retries on sequence mismatch — re-fetching fresh
account info each attempt, bounded by the same SequenceMismatchMaxAttempts
cap as the broadcast path. Added unit tests for the retry-then-succeed and
exhaust-cap cases.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
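
A minimal sketch of the retry shape described here, assuming a sentinel error for the mismatch and a hypothetical cap of 3 (the actual SequenceMismatchMaxAttempts value is not stated in this thread). `simulateWithRetry` and its callbacks are illustrative, not the TxHelper API.

```go
package main

import (
	"errors"
	"fmt"
)

var errSequenceMismatch = errors.New("account sequence mismatch")

// sequenceMismatchMaxAttempts is a hypothetical value for the shared cap.
const sequenceMismatchMaxAttempts = 3

type accountInfo struct{ Sequence uint64 }

// simulateWithRetry re-fetches account info before every attempt and retries
// only on sequence mismatch. It returns the number of attempts made.
func simulateWithRetry(fetch func() (accountInfo, error), simulate func(accountInfo) error) (int, error) {
	var err error
	for attempt := 1; attempt <= sequenceMismatchMaxAttempts; attempt++ {
		var acc accountInfo
		if acc, err = fetch(); err != nil {
			return attempt, err
		}
		err = simulate(acc)
		if err == nil || !errors.Is(err, errSequenceMismatch) {
			return attempt, err
		}
	}
	return sequenceMismatchMaxAttempts, fmt.Errorf("simulation failed after %d attempts: %w",
		sequenceMismatchMaxAttempts, err)
}

func main() {
	// The committed sequence read by fetch lags one behind what the ante
	// check expects, then catches up after the first failed simulation.
	committed, expected := uint64(5), uint64(6)
	fetch := func() (accountInfo, error) { return accountInfo{Sequence: committed}, nil }
	simulate := func(a accountInfo) error {
		if a.Sequence != expected {
			committed = expected
			return errSequenceMismatch
		}
		return nil
	}
	attempts, err := simulateWithRetry(fetch, simulate)
	fmt.Println(attempts, err)
}
```
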
The other half of TestLEP6ConcurrentCascadesContendedReporter: under
concurrent uploads, StoreArtefacts could fail with transient P2P
conditions — "no eligible store peers" (routing not yet converged on
store-eligible peers), "0.00% successful" (store RPCs to peers failing
under load), or a momentary "zero peers" — and the cascade register path
made a single attempt, failing the whole upload.

storeArtefacts now retries these transient failures with bounded backoff
(4 attempts, base 2s * attempt). Retries force IdempotentDirectoryRecord
so the symbol-directory row (a plain INSERT on first pass) is upserted
rather than colliding; the symbol/data key stores are already idempotent
by key. Deterministic errors (e.g. layout/encoding) are not retried.

Added tests: transient-then-succeed (asserting retries flip to the
idempotent path), non-transient-no-retry, and exhaust-cap.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
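
The bounded-backoff retry with a forced idempotent path on retries can be sketched as follows. The names (`storeWithRetry`, `storeFn`, `isTransient`) are hypothetical, and classifying errors by message substring is an assumption made for the sketch; the commit message only lists the three transient conditions, the 4-attempt cap, and the "base * attempt" delay.

```go
package main

import (
	"context"
	"fmt"
	"strings"
	"time"
)

// transientMarkers lists the conditions the commit message names as transient.
// Matching on message text is an assumption of this sketch.
var transientMarkers = []string{"no eligible store peers", "0.00% successful", "zero peers"}

type strErr string

func (e strErr) Error() string { return string(e) }

func isTransient(err error) bool {
	if err == nil {
		return false
	}
	for _, m := range transientMarkers {
		if strings.Contains(err.Error(), m) {
			return true
		}
	}
	return false
}

// storeFn performs one store attempt; idempotent=true asks for the upsert path.
type storeFn func(ctx context.Context, idempotent bool) error

// storeWithRetry retries transient failures with delay base*attempt and forces
// the idempotent path on every retry. Deterministic errors return immediately.
func storeWithRetry(ctx context.Context, store storeFn, maxAttempts int, base time.Duration) error {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = store(ctx, attempt > 1)
		if err == nil || !isTransient(err) {
			return err
		}
		if attempt == maxAttempts {
			break
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(base * time.Duration(attempt)):
		}
	}
	return fmt.Errorf("store failed after %d attempts: %w", maxAttempts, err)
}

// runDemo fails the first `failures` attempts with failWith, using short delays.
func runDemo(failures int, failWith error) (calls int, sawIdempotent bool, err error) {
	store := func(_ context.Context, idempotent bool) error {
		calls++
		sawIdempotent = sawIdempotent || idempotent
		if calls <= failures {
			return failWith
		}
		return nil
	}
	err = storeWithRetry(context.Background(), store, 4, time.Millisecond)
	return
}

func main() {
	fmt.Println(runDemo(2, strErr("no eligible store peers")))
	fmt.Println(runDemo(1, strErr("invalid layout")))
}
```
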
…g (LEP-6/EVM)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@mateeullahmalik mateeullahmalik left a comment

Collaborator

Re-reviewed current head 14bddf0e0fae6b0f413856cfe35698d0885a0b11.

The blockers from my previous change request are resolved:

  • PR now merges cleanly with master; git merge-tree origin/master origin/pr-299 returns a clean tree hash, no conflict hunks.
  • The committed local Lumera replaces are gone from the nested modules; clean-checkout go test ./... passes in both cmd/sncli and sn-manager.

I also checked the follow-up reliability commits after the green full-test run. The latest code delta adds bounded retries around transient P2P artefact-store failures and a focused test suite. It is context-aware, forces idempotent directory upsert on retry, and does not retry deterministic errors.

Local validation at PR head:

  • go test ./... from repo root: PASS
  • go test ./supernode/cascade: PASS
  • go test ./supernode/cmd ./pkg/keyring ./pkg/lumera/codec ./sdk/task: PASS
  • go test ./... in cmd/sncli: PASS
  • go test ./... in sn-manager: PASS
  • git diff --check origin/master...HEAD: PASS

CI evidence checked:

  • Build green on current head 14bddf0.
  • Full tests workflow green on fc063ff for unit, integration, cascade-e2e, and lep6-e2e; later commits are the artefact-store retry + tests and changelog.

@akobrin1 akobrin1 marked this pull request as ready for review June 10, 2026 16:27

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

    select {
    case <-ctx.Done():
        return ctx.Err()
    case err := <-done:
        return err

P2: Avoid waiting forever on abandoned batch jobs

When Close races with an original/local StoreBatch after the job is queued, the DB worker can take the closed quit case and return without processing the queued job or sending on job.Done. In that case this wait has no escape unless the caller's context is canceled, so callers using context.Background() or a request context not tied to shutdown can hang indefinitely during store teardown; return on store shutdown or ensure pending Done channels are completed.

@akobrin1 akobrin1 merged commit 82cebf7 into master Jun 10, 2026
2 checks passed
@akobrin1 akobrin1 deleted the evm-support branch June 10, 2026 16:34