fix(restore): clear chattr +i mount-guard fallback in --cleanup-guards#239
Conversation
During restore, when a storage mountpoint is offline, ProxSave applies a read-only bind-mount guard; if the bind mount fails it falls back to `chattr +i` on the mountpoint directory so PVE/PBS cannot silently write into the root filesystem. `--cleanup-guards` previously only undid the bind-mount guards: the immutable flag was left set, persists across reboots, and had to be cleared by hand. Record each chattr-guarded mountpoint in a host-side index (/var/lib/proxsave/guards/chattr-targets) and have --cleanup-guards reverse it: - record on a successful `chattr +i` at both apply sites (PBS + PVE), written atomically on the real filesystem, deduped, with strict target validation (datastore mount root only; reject newline/control chars). - clearImmutableGuards reads the index and runs `chattr -i`, but only on targets that are NOT currently mounted (a live mount shadows the flagged dir; clearing it would touch the wrong inode), mirroring how hidden bind-mount guards are left in place. Symlinks are resolved and the allowlist re-checked before clearing so a parent-component symlink cannot escape /mnt|/media|/run/media. - keep the guard directory + index while any immutable target is still pending (mounted/unresolved/failed) so a later run can finish the job; remove them only when nothing is pending and no bind-mount guards remain. - surface the skip-while-mounted case at Info level and point users at the manual recovery (lsattr -d / chattr -i) when the guard dir was deleted. PVE guard apply now uses the existing mountGuardMkdirAll / mountGuardIsPathOnRootFilesystem seams (default behavior unchanged) so the record wiring can be tested without a writable /mnt/pve. Adversarially reviewed (correctness/security/test-rigor), mutation-proven.
Correct the mount-guard docs to match the new cleanup behavior: - distinguish the two guard types: a bind-mount guard is shadowed by a later real mount and discarded on reboot / by --cleanup-guards, whereas the `chattr +i` fallback sets an immutable flag on the directory inode that persists across reboots and is now recorded under /var/lib/proxsave/guards/chattr-targets for cleanup to reverse. - fix the "real mount overlays the guard" wording, which was only true for the bind-mount guard. - explain that --cleanup-guards clears recorded immutable flags only on not-currently-mounted targets, the unmount→cleanup→remount procedure for mounted ones, and the manual lsattr/chattr -i recovery when the guard directory was deleted by hand. - add a TROUBLESHOOTING entry for "mountpoint read-only after restore". Touches RESTORE_GUIDE.md, RESTORE_TECHNICAL.md, CLI_REFERENCE.md, TROUBLESHOOTING.md.
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
Address the two github-advanced-security (gosec) findings on mount_guard_chattr_index.go structurally, without #nosec: - G301: tighten the guard base dir from 0o755 to 0o750. It is root-owned host state (the 0o600 index + bind-mount guard subdirs); nothing non-root needs to traverse it. - G304: read the index through an *os.Root on its directory (root.ReadFile(base)), mirroring readLockFileContent / normalizeTextFile. This confines the read to the guard directory at the syscall level so a symlink or `..` in the basename cannot redirect it, and removes the raw variable sink. Behavior is unchanged (missing/unreadable index -> nil).
#239) * fix(restore): clear chattr +i mount-guard fallback in --cleanup-guards During restore, when a storage mountpoint is offline, ProxSave applies a read-only bind-mount guard; if the bind mount fails it falls back to `chattr +i` on the mountpoint directory so PVE/PBS cannot silently write into the root filesystem. `--cleanup-guards` previously only undid the bind-mount guards: the immutable flag was left set, persists across reboots, and had to be cleared by hand. Record each chattr-guarded mountpoint in a host-side index (/var/lib/proxsave/guards/chattr-targets) and have --cleanup-guards reverse it: - record on a successful `chattr +i` at both apply sites (PBS + PVE), written atomically on the real filesystem, deduped, with strict target validation (datastore mount root only; reject newline/control chars). - clearImmutableGuards reads the index and runs `chattr -i`, but only on targets that are NOT currently mounted (a live mount shadows the flagged dir; clearing it would touch the wrong inode), mirroring how hidden bind-mount guards are left in place. Symlinks are resolved and the allowlist re-checked before clearing so a parent-component symlink cannot escape /mnt|/media|/run/media. - keep the guard directory + index while any immutable target is still pending (mounted/unresolved/failed) so a later run can finish the job; remove them only when nothing is pending and no bind-mount guards remain. - surface the skip-while-mounted case at Info level and point users at the manual recovery (lsattr -d / chattr -i) when the guard dir was deleted. PVE guard apply now uses the existing mountGuardMkdirAll / mountGuardIsPathOnRootFilesystem seams (default behavior unchanged) so the record wiring can be tested without a writable /mnt/pve. Adversarially reviewed (correctness/security/test-rigor), mutation-proven. * docs(restore): document --cleanup-guards clearing the chattr +i fallback Correct the mount-guard docs to match the new cleanup behavior: - distinguish the two guard types: a bind-mount guard is shadowed by a later real mount and discarded on reboot / by --cleanup-guards, whereas the `chattr +i` fallback sets an immutable flag on the directory inode that persists across reboots and is now recorded under /var/lib/proxsave/guards/chattr-targets for cleanup to reverse. - fix the "real mount overlays the guard" wording, which was only true for the bind-mount guard. - explain that --cleanup-guards clears recorded immutable flags only on not-currently-mounted targets, the unmount→cleanup→remount procedure for mounted ones, and the manual lsattr/chattr -i recovery when the guard directory was deleted by hand. - add a TROUBLESHOOTING entry for "mountpoint read-only after restore". Touches RESTORE_GUIDE.md, RESTORE_TECHNICAL.md, CLI_REFERENCE.md, TROUBLESHOOTING.md. * fix(orchestrator): resolve gosec G301/G304 on the chattr guard index Address the two github-advanced-security (gosec) findings on mount_guard_chattr_index.go structurally, without #nosec: - G301: tighten the guard base dir from 0o755 to 0o750. It is root-owned host state (the 0o600 index + bind-mount guard subdirs); nothing non-root needs to traverse it. - G304: read the index through an *os.Root on its directory (root.ReadFile(base)), mirroring readLockFileContent / normalizeTextFile. This confines the read to the guard directory at the syscall level so a symlink or `..` in the basename cannot redirect it, and removes the raw variable sink. Behavior is unchanged (missing/unreadable index -> nil).
Problem
During restore, when a storage mountpoint is offline, ProxSave applies a read-only bind-mount guard; if the bind mount can't be created it falls back to
chattr +ion the mountpoint directory, so PVE/PBS can't silently write into the root filesystem (/).--cleanup-guardspreviously only undid the bind-mount guards. Thechattr +ifallback was left in place — and unlike a bind mount, the immutable flag persists across reboots and is never cleared automatically, so a user had to runchattr -iby hand (and often didn't know to).Change
Record each chattr-guarded mountpoint in a host-side index (
/var/lib/proxsave/guards/chattr-targets) and make--cleanup-guardsreverse it:chattr +iat both apply sites (PBS + PVE), written atomically on the real filesystem (the index is host state, not part of the staged restore tree), deduped, with strict validation (datastore mount root only; reject newline/control chars).clearImmutableGuardsreads the index and runschattr -i, but only on targets that are not currently mounted — a live mount shadows the flagged directory, so clearing it would touch the wrong inode (mirrors how hidden bind-mount guards are left in place). Symlinks are resolved and the allowlist re-checked before clearing, so a parent-component symlink can't escape/mnt|/media|/run/media.lsattr -d/chattr -i) when the guard directory was deleted by hand.PVE guard apply now uses the existing
mountGuardMkdirAll/mountGuardIsPathOnRootFilesystemseams (default behavior unchanged) so the record wiring can be tested without a writable/mnt/pve.Behavior notes
--cleanup-guardsjust removes the leftover guard.--cleanup-guards(orchattr -i) → remount.lsattr -d/chattr -i.Tests & validation
internal/orchestratorsuite green with-race -shuffle=on -count=2;cmd/...+internal/cli/...green.Docs
Updated
RESTORE_GUIDE.md,RESTORE_TECHNICAL.md,CLI_REFERENCE.md,TROUBLESHOOTING.md: distinguish bind-mount vschattr +iguard lifecycle (the latter survives reboot), fix the "real mount overlays the guard" wording, and document the cleanup / mounted-target / deleted-directory recovery procedures.Greptile Summary
This PR closes the gap where
--cleanup-guardsonly reversed bind-mount guards but leftchattr +iimmutable flags in place — a regression that persists across reboots and requires manual recovery. The fix records each successfully-appliedchattr +ifallback target in a host-side index (/var/lib/proxsave/guards/chattr-targets) and makes--cleanup-guardsreverse them with appropriate safety checks.recordImmutableGuardTargetafter a successfulchattr +i, writing atomically (temp-file + rename) with dedup, strict allowlist validation, and control-character rejection.clearImmutableGuardsre-validates each recorded target against the datastore-root allowlist, skips currently-mounted targets (the immutable flag is on the shadowed inode, not the live mount), resolves symlinks and re-checks the allowlist, and runschattr -ion the resolved path.Confidence Score: 5/5
Safe to merge — the chattr guard record/clear logic is well-contained, comprehensively tested, and fails safely at every step.
The change correctly closes a real lifecycle gap (immutable flags persisting after reboots). Guard application and cleanup are both best-effort and non-fatal, so recording or clearing failures degrade gracefully to the pre-PR manual-recovery path rather than aborting the restore. The allowlist double-check (once on the raw target, again on the resolved path) prevents symlink-escape. Test coverage is extensive and mutation-aware across all code paths. Two previously-noted edge cases (no fsync before rename; isMounted checked on the unresolved path before symlink resolution) are low-impact theoretical concerns that the PR explicitly acknowledges as acceptable trade-offs.
No files require special attention. The core logic in guards_cleanup.go and mount_guard_chattr_index.go is well-covered by the new test suite.
Important Files Changed
Sequence Diagram
%%{init: {'theme': 'neutral'}}%% sequenceDiagram participant R as Restore participant GA as Guard Apply (PBS/PVE) participant IDX as chattr-targets index participant CG as --cleanup-guards R->>GA: process offline storage mountpoint GA->>GA: attempt bind-mount guard alt bind-mount succeeds GA-->>R: bind-mount guard active else bind-mount fails → chattr fallback GA->>GA: chattr +i guardTarget GA->>IDX: recordImmutableGuardTarget(target) IDX->>IDX: atomic write (temp+rename, 0o600) end Note over R,IDX: Later, after storage is back online CG->>IDX: clearImmutableGuards: read index loop each recorded target CG->>CG: isConfirmableDatastoreMountRoot(target)? CG->>CG: isMounted(target)? alt currently mounted CG-->>CG: skip (pending++), keep index else not mounted CG->>CG: EvalSymlinks(target) → resolved CG->>CG: isConfirmableDatastoreMountRoot(resolved)? CG->>CG: chattr -i resolved end end alt "pending == 0 and no bind-mount guards" CG->>IDX: RemoveAll(mountGuardBaseDir) else "pending > 0" CG-->>CG: keep index for next run end%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%% sequenceDiagram participant R as Restore participant GA as Guard Apply (PBS/PVE) participant IDX as chattr-targets index participant CG as --cleanup-guards R->>GA: process offline storage mountpoint GA->>GA: attempt bind-mount guard alt bind-mount succeeds GA-->>R: bind-mount guard active else bind-mount fails → chattr fallback GA->>GA: chattr +i guardTarget GA->>IDX: recordImmutableGuardTarget(target) IDX->>IDX: atomic write (temp+rename, 0o600) end Note over R,IDX: Later, after storage is back online CG->>IDX: clearImmutableGuards: read index loop each recorded target CG->>CG: isConfirmableDatastoreMountRoot(target)? CG->>CG: isMounted(target)? alt currently mounted CG-->>CG: skip (pending++), keep index else not mounted CG->>CG: EvalSymlinks(target) → resolved CG->>CG: isConfirmableDatastoreMountRoot(resolved)? CG->>CG: chattr -i resolved end end alt "pending == 0 and no bind-mount guards" CG->>IDX: RemoveAll(mountGuardBaseDir) else "pending > 0" CG-->>CG: keep index for next run endComments Outside Diff (2)
internal/orchestrator/guards_cleanup.go, line 287-320 (link)The
isMounted(target)guard at line 261 checks the original (possibly symlink) path. AftercleanupResolveTargetreturns a differentresolvedpath, the code runschattr -i resolvedwithout verifying thatresolvedis also not currently mounted. On Linux,mountfollows symlinks: if/mnt/pve/storeis a symlink to/mnt/actual-store, mounting storage there records/mnt/actual-storein mountinfo, soisMounted("/mnt/pve/store")returnsfalseeven though the storage is live. The subsequentchattr -i /mnt/actual-storewould then act on the live mount's top-level inode rather than the shadowed underlying directory — exactly the scenarioisMountedwas added to prevent.Proxmox itself does not create symlinks for storage mountpoints, so this is not triggered in normal usage, but the defensive-programming rationale used elsewhere in this function applies equally here. Adding an
isMounted(resolved)check (with the same skip-and-pending logic) would close the gap.internal/orchestrator/mount_guard_chattr_index.go, line 497-517 (link)fsyncbefore renamewriteImmutableGuardIndexwrites to a temp file and atomically renames it, but never callstmp.Sync(). If the kernel crashes or loses power between the rename completing and the OS flushing dirty pages, the renamed file can be empty or contain zeros — the guard directory would exist but the index would be unreadable, silently returning 0 targets on the next--cleanup-guardsrun. The PR documentation correctly calls this "best-effort", and the failure mode (manualchattr -irequired) is the same as the pre-PR situation, so the trade-off is reasonable. Callingtmp.Sync()after the lastWriteand beforeClosewould eliminate the gap at the cost of an extra fdatasync.Reviews (2): Last reviewed commit: "fix(orchestrator): resolve gosec G301/G3..." | Re-trigger Greptile