chore(dev): tag dev images :dev-<sha> instead of :latest (no cross-worktree poisoning)#587

Merged
OisinKyne merged 1 commit into main from chore/dev-image-sha-tags on Jun 3, 2026

@bussyjd
Collaborator

@bussyjd bussyjd commented Jun 3, 2026

Summary

What changed: Under OBOL_DEVELOPMENT, locally-built images are now tagged
<image>:dev-<short-git-sha> instead of the shared, mutable <image>:latest. The
manifest rewrite (internal/defaults) and the image build (internal/stack) agree on
the tag via a persisted $CONFIG_DIR/.dev-image-tag.
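
The tag format described above can be sketched roughly as follows. This is an illustration, not the code in internal/defaults: `formatDevTag` and `devImageTag` are hypothetical names, and the 12-character short SHA is an assumption inferred from the sample output later in this description.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// fallbackTag is used when no git SHA is available (for example, tarball builds).
const fallbackTag = "latest"

// formatDevTag turns a short git SHA into a dev image tag.
// An empty SHA yields the fallback tag.
func formatDevTag(shortSHA string) string {
	shortSHA = strings.TrimSpace(shortSHA)
	if shortSHA == "" {
		return fallbackTag
	}
	return "dev-" + shortSHA
}

// devImageTag asks git for the short SHA of HEAD in dir (hypothetical helper).
// Any git failure, such as dir not being a checkout, falls back to "latest".
func devImageTag(dir string) string {
	cmd := exec.Command("git", "rev-parse", "--short=12", "HEAD")
	cmd.Dir = dir
	out, err := cmd.Output()
	if err != nil {
		return fallbackTag
	}
	return formatDevTag(string(out))
}

func main() {
	fmt.Println(formatDevTag("d9527983fc74"))
	fmt.Println(devImageTag("."))
}
```

Because the tag is derived from HEAD, two worktrees on different commits produce different tags and cannot overwrite each other's images on a shared Docker daemon.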

Why it matters: :latest is shared across every obol-stack worktree on one Docker
daemon. One worktree's build silently overwrites :latest, and another worktree's
obol stack up then deploys the wrong binary. This bit rc9 QA: a sibling branch's
serviceoffer-controller (which reads a per-purchase Secret this branch's RBAC doesn't
grant) poisoned an unrelated stack and presented as a buy hang, not an obvious image
mismatch. A per-commit dev-<sha> tag isolates builds by source so this can't recur.
Bonus: committing changes the SHA and auto-triggers a fresh build.

Risk level: low — dev-only path (production digest pins untouched); :latest fallback
preserves behaviour for non-git/tarball builds; the build loop is otherwise unchanged
(only the tag string differs).

Commit under test: 7b9c479d

Base branch: main

Scope

  • Code
  • Charts / manifests
  • Flows / QA scripts
  • Docs / skills
  • Images / dependencies
  • Other:

Validation

Unit tests:

go test ./...   → ok (33 packages), gofmt + go vet clean   (commit 7b9c479d)
New/updated:
  internal/defaults  TestDevImageTag_Format, TestReadDevImageTag_FallbackWhenAbsent
  internal/defaults  TestCopyInfrastructure_DevModeRewritesDigestPins (now asserts dev-<sha>
                     + that the persisted tag matches the stamped manifest tag)

Manual / runtime:

OBOL_DEVELOPMENT=true obol stack init  →
  .dev-image-tag = dev-d9527983fc74
  x402.yaml  serviceoffer-controller pin → :dev-d9527983fc74   (no @sha256: residue)
  llm.yaml   x402-buyer pin              → :dev-d9527983fc74
docker build -f Dockerfile.serviceoffer-controller -t …:dev-d9527983fc74 .  → ok
k3d image import …:dev-d9527983fc74 -c <cluster>                            → ok
(the exact build+import path buildAndImportLocalImages runs)
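
The pin rewrite checked above (digest pin replaced by `:dev-<sha>`, with no `@sha256:` residue) could look something like the sketch below. It is a simplified stand-in, not the PR's rewriteDevDigestPins: `rewritePins`, the `registry.example/...` image names, and `exampleDigest` are all made up for illustration, and the real function operates on a defaults directory rather than a string.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// exampleDigest is a placeholder 64-hex-character digest, not a real image digest.
var exampleDigest = strings.Repeat("ab", 32)

// rewritePins replaces "<base>@sha256:<digest>" (optionally with a tag before
// the digest) by "<base>:<devTag>" for every listed image base. Images that
// are not listed are left untouched.
func rewritePins(manifest string, bases []string, devTag string) string {
	for _, base := range bases {
		re := regexp.MustCompile(regexp.QuoteMeta(base) + `(?::[\w.-]+)?@sha256:[0-9a-f]{64}`)
		// Literal replacement so "$" in an image name is never treated as a group reference.
		manifest = re.ReplaceAllLiteralString(manifest, base+":"+devTag)
	}
	return manifest
}

func main() {
	manifest := "image: registry.example/serviceoffer-controller@sha256:" + exampleDigest + "\n" +
		"image: registry.example/other@sha256:" + exampleDigest + "\n"
	bases := []string{"registry.example/serviceoffer-controller"}
	fmt.Print(rewritePins(manifest, bases, "dev-d9527983fc74"))
}
```

Requiring the `@sha256:` suffix in the pattern keeps the rewrite from touching an unrelated image whose name merely starts with one of the listed bases.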

Release smoke: not re-run for this change (dev-build tagging only). The build/import
path it touches is the same one release-smoke exercises every run; the only behavioural
delta is the tag string, validated above. A full dev obol stack up is the recommended
final confirmation.

Review Notes

Known gaps:

  • internal/defaults.devLocallyBuiltImageBases and internal/stack.baseLocalImages remain
    duplicated (intentional, to avoid a defaults→stack import cycle) — unchanged by this PR.
  • Uncommitted working-tree changes reuse the committed dev-<sha> (same caveat as the old
    :latest); OBOL_FORCE_REBUILD_LOCAL_DEV_IMAGES is the escape hatch.

Reviewer focus:

  • internal/defaults/defaults.go: DevImageTag / ReadDevImageTag, the persist in
    CopyInfrastructure, and rewriteDevDigestPins(defaultsDir, devTag).
  • internal/stack/stack.go: buildAndImportLocalImages reads the persisted tag and builds/
    imports <base>:<devTag> (force-rebuild short-name matching still works via localImageShortName).

Comment thread internal/defaults/defaults.go
Contributor

@OisinKyne OisinKyne left a comment

One worry: I'm not sure the write will overwrite an existing file. I guess it probably does?
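
On the overwrite question: if the persist step uses Go's `os.WriteFile` (an assumption, since the thread does not show the call), an existing file is truncated before writing, so a stale tag is replaced rather than appended to. A minimal sketch that checks this in a temp directory, with `persistTag` and `overwriteDemo` as hypothetical names:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// persistTag writes tag to <configDir>/.dev-image-tag. os.WriteFile creates
// the file if needed and truncates it if it already exists.
func persistTag(configDir, tag string) error {
	return os.WriteFile(filepath.Join(configDir, ".dev-image-tag"), []byte(tag+"\n"), 0o644)
}

// overwriteDemo persists a long tag, then a shorter one, and returns what is
// left on disk afterwards.
func overwriteDemo() (string, error) {
	dir, err := os.MkdirTemp("", "dev-tag-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	if err := persistTag(dir, "dev-0123456789ab-longer-old-value"); err != nil {
		return "", err
	}
	if err := persistTag(dir, "dev-d9527983fc74"); err != nil {
		return "", err
	}
	data, err := os.ReadFile(filepath.Join(dir, ".dev-image-tag"))
	return string(data), err
}

func main() {
	got, err := overwriteDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", got)
}
```

The second write is deliberately shorter than the first, so any leftover bytes from the old value would show up in the result if the file were not truncated.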

@OisinKyne OisinKyne force-pushed the chore/dev-image-sha-tags branch from 7b9c479 to 36146d0 Compare June 3, 2026 18:48
Under OBOL_DEVELOPMENT, both the manifest rewrite (internal/defaults) and the
image build (internal/stack) used a shared, mutable `<image>:latest` tag. On a
host running more than one obol-stack worktree against the same Docker daemon,
one worktree's build silently overwrites :latest and a different worktree's
`obol stack up` then deploys the wrong binary — observed during rc9 QA, where a
sibling branch's serviceoffer-controller (reading a per-purchase Secret this
branch's RBAC doesn't grant) poisoned an unrelated stack and presented as a
hang, not an obvious image mismatch.

Tag dev images with `dev-<short-git-sha>` instead:
- defaults.DevImageTag() = dev-<sha> of the working tree (`latest` fallback when
  not a git checkout, preserving prior behaviour for tarball builds).
- CopyInfrastructure stamps the rendered manifests with the dev tag AND persists
  it to $CONFIG_DIR/.dev-image-tag, so internal/stack builds/imports the exact
  tag the cluster pins — even if HEAD moves between `stack init` and `stack up`.
- Different branches/worktrees get distinct tags (no cross-worktree poisoning);
  committing changes the SHA and triggers a fresh build; uncommitted changes
  reuse the committed tag unless OBOL_FORCE_REBUILD_LOCAL_DEV_IMAGES is set.
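
The read side of that hand-off can be sketched as below. It assumes the persisted file holds just the tag and that a missing or empty file falls back to `latest`; the fallback value is a guess based on the non-git behaviour described above, and `readDevImageTag`, `localImageRef`, and the image base are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// fallbackTag is assumed to match the non-git fallback described above.
const fallbackTag = "latest"

// readDevImageTag returns the tag persisted in <configDir>/.dev-image-tag,
// or the fallback when the file is missing, unreadable, or empty.
func readDevImageTag(configDir string) string {
	data, err := os.ReadFile(filepath.Join(configDir, ".dev-image-tag"))
	if err != nil {
		return fallbackTag
	}
	if tag := strings.TrimSpace(string(data)); tag != "" {
		return tag
	}
	return fallbackTag
}

// localImageRef joins an image base and a tag into the reference that would
// be built and imported.
func localImageRef(base, tag string) string {
	return base + ":" + tag
}

func main() {
	dir, err := os.MkdirTemp("", "obol-config-demo")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)

	// No file yet: the fallback applies.
	fmt.Println(localImageRef("registry.example/x402-buyer", readDevImageTag(dir)))

	// After the init step has persisted a tag, the build step sees the same value.
	if err := os.WriteFile(filepath.Join(dir, ".dev-image-tag"), []byte("dev-d9527983fc74\n"), 0o644); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(localImageRef("registry.example/x402-buyer", readDevImageTag(dir)))
}
```

Reading the tag from disk instead of recomputing it from HEAD is what keeps the built image and the manifest pin in agreement when HEAD moves between `stack init` and `stack up`.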

Validated: go test ./... (33 pkgs); `obol stack init` (dev) rewrites both the
controller and buyer pins to :dev-<sha> with no leftover @sha256 and persists the
tag; `docker build -t …:dev-<sha>` + `k3d image import` succeed (the exact
build/import path buildAndImportLocalImages runs).
@OisinKyne OisinKyne force-pushed the chore/dev-image-sha-tags branch from 36146d0 to 18617a6 Compare June 3, 2026 18:49
@OisinKyne OisinKyne merged commit bc9edcb into main Jun 3, 2026
7 checks passed
@OisinKyne OisinKyne deleted the chore/dev-image-sha-tags branch June 3, 2026 18:49