backup: Phase 0b M6 implementation - cmd/elastickv-snapshot-encode + library#904
Merged
Conversation
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Phase 0b M6 implementation per the merged design doc
docs/design/2026_06_01_proposed_snapshot_encode_cli.md(#896, merged at fe9e941).Wires the merged M1–M5 encoder slices into a user-facing CLI plus a library entrypoint mirroring the decoder's
DecodeSnapshot. Implements the round-trip self-test the parent doc mandates, with write-then-rename atomic publish so a self-test failure never reaches the restore path.What lands
Library (
internal/backup/):encode_snapshot.go:EncodeSnapshot(opts, out io.Writer) (EncodeResult, error)— high-level wrapper that dispatches per-adapter encoders in canonical fan-out order (redis → dynamodb → s3 → sqs), implements two-mode buffering (stream whenSelfTest=false, buffer whenSelfTest=true), runs the structural self-test against the in-memory buffer, and copies tooutonly on match. UnexportedcorruptBufferForTesthook lets same-package tests inject buffer corruption that reaches the self-test decode but neverout(codex P2 v6 docs(backup): propose Phase 0b M6 - cmd/elastickv-snapshot-encode CLI #896 — write-then-rename atomicity).encode_info.go:EncodeInfoschema +WriteEncodeInfo/ReadEncodeInfohelpers +EncodeInfoSidecarPath(path-derived<output>.encode_info.json, no static-name collisions per gemini medium v2 docs(backup): propose Phase 0b M6 - cmd/elastickv-snapshot-encode CLI #896). Format-version gate so a future bump surfaces asErrUnsupportedEncodeInfoFormatVersion.manifest.go:Exclusions.RenameS3Collisions booladded with JSON tagrename_s3_collisions. Intentionally NOT inexclusionsRequiredFieldsso legacy manifests decode safely with the zero valuefalse(claude v5 medium docs(backup): propose Phase 0b M6 - cmd/elastickv-snapshot-encode CLI #896).Decoder CLI update (
cmd/elastickv-snapshot-decode/main.go):emitManifestnow populates the newRenameS3Collisionsfield fromcfg.renameCollisions, completing the decoder→encoder round-trip for--rename-collisionsdumps.Encoder CLI (
cmd/elastickv-snapshot-encode/main.go):--input,--output,--adapter(decoder-parity CSV parser),--last-commit-ts,--self-test,--scratch-root.<output>.tmp-<random>, fsync+close, then rename. Self-test failure →mismatch.txtnext to where.fsmwould have been, temp file removed, exit 2.--last-commit-ts T < manifest→ exit 2 with typedErrSelfTestLowerLastCommitTS.encoder_versionstamped at build time via-ldflags "-X main.version=..."(mirrors decoder pattern atmain.go:45).Test plan
go test -race -count=1— all green:internal/backup/: 3 new tests (encode_info schema, format gate, legacy manifest forward-compat)internal/backup/encode_snapshot_test.go: 5 tests (library round-trip, self-test match against canonicalized input, corruption never reachesout, missing-input guard, sidecar path derivation)cmd/elastickv-snapshot-encode/main_test.go: 9 tests (missing manifest, unknown adapter, lower-TS fail-closed, equal+higher TS accept, path-derived sidecar, two-files-no-collision, full round-trip with--self-test, atomic-publish never leaves bad.fsm,--last-commit-tsparser)make lint: clean.Caller audit per CLAUDE.md semantic-change rule
Exclusionsstruct gained a field. Existing callers either build via field-tagged literals (decoder CLI'semitManifest— updated to populate the new field) or read it (encoder'sbuildSelfTestDecodeOptions— new code). No silent semantic change.DecodeOptions.RenameS3Collisionswas already a public field used by the decoder; the encoder now also reads it via the manifest. No caller-side change needed.Self-review (5 passes)
.fsm. Corrupted buffer never reaches theio.Writer(pinned byTestEncodeSnapshotSelfTestDetectsCorruptionassertingout.Len()==0). Allb.Adderrors propagate throughEncodeSnapshot.io.Writer. Temp-file suffix usescrypto/randso concurrent encodes against the same--outputcannot collide.SelfTest=falsestreams with onesha256.Writertee, no extra allocations.SelfTest=trueallocates one FSM-sized*bytes.Bufferplus the scratch decode tree (documented memory cost).--last-commit-ts T < manifestfail-closed with typed error; self-test threads MANIFEST DecodeOptions (Exclusions.*+DynamoDBLayout → DynamoDBBundleJSONL) so trees produced with non-default decoder flags round-trip cleanly.Risk
Low. The encoder is offline; restoration is non-destructive (new keyspace on a fresh cluster via stop-replace-restart). The only public-API change is the
Exclusions.RenameS3Collisionsfield, which is forward-compat for older manifests. All existing M1–M5 tests continue to pass.Summary by CodeRabbit
New Features
Bug Fixes
Tests