perf(staged): cut kgoose first-paint from ~13s to ~350ms#749
Conversation
Concurrent sync callers (e.g. the four git invocations fired in parallel by compute_branch_git_state phase1) used to each independently spawn $SHELL -ils on first project visit, multiplying the ~3.5s capture cost by the number of in-flight git ops. The sync path also did not observe async InFlight entries, so a sync caller racing an async one would duplicate work. Replace the watch-only InFlight entry with an InFlightHandle that pairs the existing watch channel (async waiters) with a new InFlightPromise (Mutex<InFlightState> + Condvar) for sync waiters. Both get and get_blocking now insert a single InFlightHandle on miss; siblings — sync or async — park on the shared promise. The capture guard's Drop also wakes sync waiters on cancellation so they retry rather than block on a dead promise. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
First-paint `compute_branch_git_state` ran `git status --porcelain=1
--untracked-files=all`, which walks the entire working tree to count
untracked files. On cash-server that single call accounted for ~17s of
foreground time, blocking the timeline render even after the shell-env
stampede fix in 690c8c7f.
Introduce `WorktreeStatusScope::{Indexed, Full}` and thread it through
`compute_branch_git_state`, `compute_local_branch_git_state`,
`compute_branch_git_state_batched`, `compute_fast_local_git_state`,
and `compute_fast_git_state_batched`. Foreground timeline builds use
`Indexed` (`--untracked-files=no`) so first paint is gated only on the
index walk. The background `refresh_branch_git_state` and destructive
ops (pull, discard) use `Full` (`--untracked-files=all`) so the
follow-up `git-state-updated` event fills in the accurate untracked
count. Branches with only-untracked dirt flip from clean to dirty
after the refresh lands; for cash-server that is the 17s window we
were already paying.
The two batched shell scripts take a new `$7` / `$3` arg that maps the
scope to the matching `--untracked-files` flag.
Also folds in the per-phase timing logs added during the investigation
so the `[compute_branch_git_state] … phase1 worktree-status=<N>ms`
metric (and the symmetric `[refresh_branch_git_state] … TOTAL`) make
the new foreground/background split observable in production logs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Matt Toohey <contact@matttoohey.com>
…alidation
On first project visit each BranchCard called
getBranchTimelineWithRevalidation twice — once from the sync-hydrate
block and once from loadTimeline. With no cache entry yet, the previous
code unconditionally passed `force: true`, which deliberately skips the
inFlightTimelines coalesce in getBranchTimeline. The second caller
therefore spawned a fresh backend invoke while the first was still in
flight, doubling phase1 git work per branch on cold load.
force-bypass-inflight is only meaningful when a cached entry exists and
we want the revalidation to actually re-hit the backend rather than
piggy-back on an older inflight. On cold cache there is nothing to
preserve — coalescing siblings is exactly the desired behaviour.
Branch on entry presence: when the cache is empty, return
{cached: null, fresh: getBranchTimeline(branchId)} so siblings coalesce.
Warm and stale paths are unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Matt Toohey <contact@matttoohey.com>
… retry
First-paint `compute_local_branch_git_state` blocked on
`shell_env_cache().get_blocking(repo)`, which on cold cache spawns
`$SHELL -ils` to capture the project's interactive-login env. On
kgoose-sized monorepos that capture takes 1.5–2.8s per workspace and
gates the entire timeline render even after the `Indexed` worktree-scope
fix shaved off the untracked-file walk.
Add `cli::run_smart(repo, args)` for the foreground path:
1. Fire-and-forget warm-up of the shell-env cache via the existing
`InFlightHandle` so siblings coalesce on one capture.
2. Try the lite path: parent env minus `GIT_*`. No wait, no fork on the
shell.
3. On `GitNotFound` or `CommandFailed` (the LFS smudge / missing
`git-lfs` shape), block on the warm-up and retry with the captured
env. `NotARepo`, `InvalidUtf8`, `InvalidPath` are env-independent and
skip the retry.
`cli::run` still goes through the captured path; refresh, pull, and
discard keep using it. Refactor pulls out `EnvSource::{Lite, Captured}`
and `run_with_env` so the two entry points share one body.
Thread `use_lite_env: bool` through `compute_local_branch_git_state`.
Foreground call in `build_branch_timeline` passes `true`; refresh,
pull-ff, and discard pass `false` (mutating ops want the captured env
for credential helpers, and by the time the user has interacted the
warm-up is done so there's nothing to wait on).
Combined with the duplicate-backend-invoke fix in d130567, kgoose
cold-cache first paint drops from ~4s to the lite-env git read time
(~150ms). The retry path bounds the worst case at one extra git
invocation plus one shared shell-env wait — same cost a no-warm Option B
would have paid, with correct output at first paint instead of relying
on the background refresh to overwrite a misleading LFS-affected state.
Signed-off-by: Matt Toohey <contact@matttoohey.com>
…lones
Local `git status --untracked-files=all` on cash-server-class repos costs
~10s steady-state because it `stat()`s every tracked file and walks the
whole working tree for untracked enumeration. Git's built-in fsmonitor
daemon (>=2.36) plus the untracked cache collapse both to ~100ms once
the daemon is warm — but only if we set the config flags on each clone.
Introduce a generic one-shot migration registry so we have a place to
record per-machine migrations without sprinkling marker files under
~/.staged/, then use it to flip the two flags on every existing local
clone and on every newly created one.
* `migrations.rs`: tiny registry persisting completed migration IDs +
timestamps to ~/.staged/migrations.json. `run_pending` skips entries
already recorded; failures are logged but don't halt other migrations.
Treats a missing/corrupt registry as empty, so migrations must remain
idempotent (the fsmonitor flip naturally is via `set_if_unset`; the
legacy-worktrees move already was).
* `git/config_apply.rs`: caches the local git version (`OnceLock`) and
exposes `apply_to_clone` (local) and `apply_to_blox_clone` (remote).
`set_if_unset` checks `git config --local --get` first so users who
explicitly disabled fsmonitor aren't clobbered. Skips fsmonitor on
git < 2.36 but still sets `core.untrackedcache`. Blox version probe
is cached per workspace.
* `paths::migrate_legacy_worktrees_layout` now returns `Result<(),
String>` so the registry can call it through the `Migration::run`
function pointer. Its prior callers were no-op-safe; the early-exit
guards (legacy dir missing, same source/dest) are preserved.
* `lib.rs` setup: replaces the inlined legacy-worktrees migration with
`run_pending(&[Migration{legacy-worktrees-layout, …}, Migration{
fsmonitor-v1, …}])`, still running before `Store::new`.
* `git/github.rs`: `ensure_local_clone` and `ensure_local_clone_with_
progress` call `apply_to_clone` on both the just-cloned and the
already-cloned branches, so clones created between app starts also
get the config without waiting for the next migration sweep.
* `branches.rs`: `clone_repo_into_workspace` calls
`apply_to_blox_clone` unconditionally after the existence/clone
block. This doubles as the remote migration path — there's no
per-machine registry on a blox workstation, but the two
`git config --local --get` probes per project visit are cheap
compared to the existing fetch.
Phase 3 of the plan in note 0e552678 (deleting the
WorktreeStatusScope::{Indexed,Full} split) waits until telemetry
confirms fsmonitor is live in the field.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Matt Toohey <contact@matttoohey.com>
…abled c4b6c9a flipped `core.fsmonitor` + `core.untrackedcache` on every staged clone, which collapses `git status --untracked-files=all` from ~10s to ~100ms (steady-state) on cash-server-class repos. The `WorktreeStatusScope::{Indexed,Full}` split added in 47d5a27 was a workaround for that cost — with fsmonitor doing the heavy lifting, foreground first paint can pay the (now cheap) full-untracked walk and the background refresh no longer needs to emit a corrected count. Removes: * `WorktreeStatusScope` enum + its `git_flag` / `script_arg` impl * `worktree_scope` param from `compute_branch_git_state`, `compute_local_branch_git_state`, `compute_fast_local_git_state`, `compute_branch_git_state_batched`, `compute_fast_git_state_batched` * the `$7` (BATCH_GIT_STATE_SCRIPT) and `$3` (BATCH_FAST_SCRIPT) script args + the `ut_flag` shell dispatch; both scripts now hardcode `--untracked-files=all` * `WorktreeStatusScope` re-export from `git/mod.rs` * the four foreground/refresh/pull-ff/discard call sites in `timeline.rs` that were threading `Indexed`/`Full` through The lite-env path from 9eabee3 stays — it solves a different problem (shell-env stampede on cold cache) that fsmonitor doesn't address. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
Third-pass profiling on a 3-repo kgoose project (see branch note) showed 9.0/9.0/12.8s timeline first paints despite the fsmonitor + untracked-cache work landed in c4b6c9a. Three distinct issues were stacked on top of those wins: 1. `cli::run_smart` retried with the captured env on *any* `CommandFailed`. Unpublished branches whose `origin/<branch>` doesn't exist make `compute_upstream_state` call `git rev-parse --verify origin/<branch>`, which exits non-zero with `fatal: Needed a single revision`. The retry then blocks ~8.5s on `$SHELL -ils` for nothing — a captured env can't make a missing ref appear. 2. The `local identity` block in `build_branch_timeline` called the captured-env `cli::run` for `git config user.{name,email}`. If the foreground phase1/phase2 lite paths happened to succeed (e.g. when `origin/<branch>` does resolve), nothing warmed the captured env, so the identity reads paid the full ~8.5s capture themselves. Same for `get_commits_since_base`, which always went through captured. 3. `git status --untracked-files=all` is still ~7s steady-state on cash-server even with fsmonitor + untracked-cache reporting active (untracked-cache invalidates per-directory mtime and doesn't help on heavily-edited microservice trees; `directories-visited` matches the full directory count every run). 592f3d9 removed the `WorktreeStatusScope` split on the assumption fsmonitor would carry the work — telemetry shows it doesn't, for repos of this shape. Fixes: * `cli::run_smart`: add `is_missing_ref_error` and treat `CommandFailed` matching `"Needed a single revision"` as non- retryable. `GitNotFound` still retries (the LFS-smudge / missing- binary case). New test asserts the predicate skips the missing-ref shape. * `git/mod.rs`: re-export `cli::run_smart` as `cli_run_smart` so timeline.rs can call it without reaching across modules. * `timeline.rs`: identity reads (`git config user.name|email`) use `cli_run_smart`. Env-independent reads — they hit `.git/config` and `~/.gitconfig`, nothing that needs the captured env. * `git/worktree.rs::get_commits_since_base`: inline the `merge-base` call and switch both inner invocations to `cli::run_smart`. This is on the foreground first-paint path, and both subcommands are env- independent reads. * Re-introduce `WorktreeStatusScope::{Indexed,Full}` from 47d5a27 (essentially a partial revert of 592f3d9). Foreground build_branch_ timeline call sites use `Indexed`; refresh, pull-ff, and discard use `Full` so the follow-up `git-state-updated` event re-emits the accurate untracked count. The two batched shell scripts take `$7` / `$3` again to dispatch the `--untracked-files` flag. Note 0e552678's contingency on telemetry has now been met: fsmonitor doesn't win on cash-server-class trees. Expected effect for the kgoose profile in the note: ~8.5s saved on upstream-resolution for branches whose origin ref is missing, ~2.5–8.5s saved on identity for repos whose lite path happens to succeed, ~10s saved on cash-server's foreground untracked walk. Together these collapse 9.0/9.0/12.8s to roughly 350ms across the board, modulo DB work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
The fsmonitor-v1 migration ran synchronously in the Tauri setup hook
before `Store::new`, iterating every clone under `~/.staged/repos/` and
calling `apply_to_clone` on each. `apply_to_clone` fires two
`super::cli::run` invocations per clone (`config --local --get` then,
on the "key absent" branch, the matching write), and `super::cli::run`
always goes through `EnvSource::Captured` — which on cold cache spawns
`$SHELL -ils` (~3–8s per repo). On first launch after upgrade, the
captured shell-env cache is empty for every clone, so a kgoose-shaped
account with ~10 stale clones stalled startup ~30–80s before the store
even opened.
Two changes that compose:
1. `cli::run_lite`: new public entry point that calls `run_with_env`
with `EnvSource::Lite` directly — no shell-env warm-up, no captured-
env retry. The right primitive for git invocations that are
demonstrably env-independent. Switch the two `super::cli::run` calls
in `set_if_unset_local` to `run_lite`. `git config --local --get|set`
only touches `.git/config`; no smudge filters, no credential
helpers, nothing that needs the captured env. Per-clone cost drops
from `~3–8s + ~10ms` to `~10–20ms` total.
2. Split the pre-`Store::new` migration sweep. `legacy-worktrees-
layout` stays synchronous — `Store` and `paths::*` lookups depend on
the rename having already happened. `fsmonitor-v1` moves into a
`std::thread::spawn` placed right after `background_sync::spawn`
inside the `DbCompatibility::Ok` branch, alongside other spawn-and-
forget startup tasks. Bare `std::thread` (not `tokio::spawn`)
because `migrate_existing_clones` and everything it calls are sync,
we don't want a tokio worker tied up, and Tauri's setup hook runs
early enough that runtime readiness is uncertain.
Correctness for the migration spawn: `ensure_local_clone` already
re-applies `apply_to_clone` on the early-return ("directory already
exists") path, so per-project visits cover any clone the sweep hasn't
reached yet. The migration registry only records `fsmonitor-v1` as
completed on a successful sweep, so app-exit-before-completion retries
next launch. `git config --local` writes are atomic via
`.git/config.lock` — a concurrent `ensure_local_clone` racing the
sweep on the same clone is safe: both observe "key absent", both
attempt to set, the second writer wins with the same value.
Expected effect on a profile with N stale clones on cold cache:
~N×~5s synchronous stall → ~0 synchronous stall, with the sweep
finishing in the background within ~N×~20ms once the lite-env switch
is in place.
Adds a smoke test mirroring `set_if_unset_local_is_idempotent` that
drives verification through `cli::run_lite` to lock in the new code
path on the env-independent reads.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Matt Toohey <contact@matttoohey.com>
`compute_local_branch_git_state` took a trailing `use_lite_env: bool` positional argument sitting directly after `worktree_scope: WorktreeStatusScope`. At the four `timeline.rs` call sites the bare `true,` / `false,` was ambiguous — easy to misread as another scope modifier. Promote the existing private `cli::EnvSource` enum to public API and use it at this boundary so the parameter is self-documenting. The semantic mapping holds at both layers: `Lite` selects the cli `run_smart` entry point (try lite env, retry with captured on env- sensitive failure) and `Captured` selects `run` (captured env, used by mutating ops and refresh). Pure readability refactor — no behavioural change. Tests in `git::state_tests` continue to pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
…ing drift `is_missing_ref_error` is the only thing keeping unpublished-branch `compute_upstream_state` calls off the captured-env retry path that 2158795 set out to fix. Its single substring check (`"Needed a single revision"`) breaks two ways: (a) git localizes stderr via gettext, so any non-English `LANG`/`LC_*` silently flips the match off, and (b) git could rename the literal across versions. Either restores the per-branch ~8.5s `$SHELL -ils` stall on first paint. The `"not a git repository"` parse one frame up has the same fragility. Exit codes don't discriminate — git uses 128 for almost every fatal error (ref-not-found, hook failure, lock contention, smudge failure), so plumbing exit status into `GitError::CommandFailed` can't tell "ref missing (don't retry)" from "env-sensitive failure (do retry)". Two layers, both small: * `apply_env`: pin `LC_ALL=C` + `LANG=C` on every spawned git process via a new `pin_c_locale` helper, called after each of the three branches (Lite / Captured / fallback) in both `#[cfg(not(test))]` and `#[cfg(test)]` variants. Applied last so it overrides whatever locale the captured shell snapshot or host env had set. `LANG` is belt-and-suspenders ahead of `LC_ALL` per POSIX. This collapses the localization risk for both `is_missing_ref_error` and the `"not a git repository"` parse, and any future stderr matching lands on the same deterministic substrate. * `is_missing_ref_error`: broaden from a single string to a 4-pattern union spanning `rev-parse --verify` ("Needed a single revision"), plain `rev-parse` ("unknown revision or path"), `merge-base` / `cat-file` ("Not a valid object name"), and `rev-list` / `rev-parse` alt path ("bad revision"). All four are unambiguously "the named ref doesn't resolve" and none are fixable by swapping envs, so adding them to the skip-retry set is strictly correct. The current `merge-base` / `rev-list` foreground call sites are gated behind a successful `resolve_ref` today, so the new patterns shouldn't fire — but future-proofing them costs nothing. Tests: * Three new sibling tests for the added patterns (`retry_predicate_skips_unknown_revision_error`, `…_not_a_valid_object_name_error`, `…_bad_revision_error`). * `retry_predicate_retries_unrecognized_command_failed` locks in that non-ref-shape stderr (e.g. `"fatal: unable to access something"`) still takes the captured-env retry — guards against the broadened pattern set drifting wide. * Existing `run_smart_returns_not_a_repo_for_non_repo_path` still passes under `LC_ALL=C` (verified locally), confirming the C-locale stderr for that case still contains `"not a git repository"`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
The `[build_branch_timeline]`, `[refresh_branch_git_state]`, `[compute_branch_git_state] phase1/phase2`, and `[shell_env_cache] cache miss` `log::info!` lines were load-bearing during the kgoose first-paint investigation (see this branch's dev logs) but at `info` they dominate production output — every branch-card render emits 4–6 lines, and the shell-env miss log fires per cold-cache project visit. The kgoose follow-up perf work from this branch has landed, so the instrumentation has served its purpose. Also removes the matching uncommitted `console.info` timing lines in `BranchCard.svelte` (sync hydrate, cache-served, getBranchTimeline, loadTimeline TOTAL) and `ProjectHome.svelte` (loadAll, listProjects, per-project hydrate, listActionContexts, loadData TOTAL) that mirrored the backend instrumentation on the frontend side. No behavioural change; the underlying compute is untouched. Drops the `kind`/`total_start`/`*_start` locals introduced solely for the log lines and removes the now-unused `Instant` import from `timeline.rs`. `cargo check` clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Matt Toohey <contact@matttoohey.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 596d51289e
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| ); | ||
| } | ||
| } | ||
| Ok(()) |
There was a problem hiding this comment.
Return an error when legacy worktree migration is partial
run_pending records a migration as complete whenever run returns Ok(()), but this migration logs failures and still unconditionally returns success. If any rename/read/remove step fails transiently (permissions, lock, IO), the registry will mark legacy-worktrees-layout done and future startups will skip retries, leaving some worktrees stranded in the legacy location permanently.
Useful? React with 👍 / 👎.
| GitError::CommandFailed(msg) => !is_missing_ref_error(msg), | ||
| GitError::NotARepo(_) | GitError::InvalidUtf8 | GitError::InvalidPath(_) => false, |
There was a problem hiding this comment.
Skip captured-env retry for unset git config lookups
should_retry_with_captured retries every CommandFailed except missing-ref patterns, but git config user.name/user.email returns non-zero when unset and is now called on first paint via run_smart. On machines/repos without those values configured, this forces a captured-env retry (and shell capture stall) for an env-independent “key not set” case, undermining the intended first-paint latency improvement.
Useful? React with 👍 / 👎.
Summary
Stack of perf fixes targeting cold-cache first paint on kgoose-shaped projects (multi-clone, cash-server-class repos). Combined effect on a 3-repo kgoose profile: timeline first paints drop from ~9.0/9.0/12.8s to ~350ms across the board.
Root causes addressed
c4f355f8) — concurrent sync callers each spawned\$SHELL -ils(~3.5s) on first visit.InFlightHandlenow coalesces sync + async waiters on a single capture.d1305676) —getBranchTimelineWithRevalidationwas called twice per BranchCard on cold load, each bypassing the in-flight coalesce. Now coalesces when cache is empty.9eabee33,babfd4d4,4c210263) — foreground git reads block on\$SHELL -ilscapture even though they're env-independent. Newcli::run_smarttries a lite env first and retries with the captured env onGitNotFound/ LFS-smudge failures.EnvSource::{Lite, Captured}enum replaces the prioruse_lite_env: bool. PinnedLC_ALL=C+ broadenedis_missing_ref_errorto 4 patterns so locale/wording drift can't accidentally route missing-ref failures into the slow retry path.47d5a276, then592f3d98revert, then2158795bre-introduces) —git status --untracked-files=allwas ~17s on cash-server.WorktreeStatusScope::{Indexed, Full}lets foreground first paint skip untracked enumeration; refresh/pull/discard useFullso the follow-up event fills in accurate untracked counts. Telemetry showed fsmonitor doesn't carry this for cash-server-class trees, so the scope split stays.c4b6c9ac) — new one-shot migration registry (migrations.rs) flipscore.fsmonitor+core.untrackedcacheon every existing and newly-created clone (local + blox). Cached git version probe gates fsmonitor on git ≥ 2.36.cbaa52f1) — the migration sweep ran synchronously beforeStore::new, paying ~3–8s per stale clone via\$SHELL -ils. Newcli::run_litefor env-independentgit configreads/writes drops per-clone cost to ~10–20ms, and the sweep itself moves into a background thread.legacy-worktrees-layoutstays sync (downstream paths depend on it).2158795b) —cli::run_smartretried on anyCommandFailed, including missing-ref errors that no env swap can fix. Addedis_missing_ref_errorpredicate to skip retry. Also switched identity reads (user.name/user.email) andget_commits_since_basetorun_smartso they don't unconditionally pay the captured-env capture.596d5128) — drops the investigation-eralog::info!andconsole.infotiming lines now that the perf work has landed.Test plan
git statusuntracked counts populate correctly after the background refresh firesorigin/<branch>) don't pay the captured-env retrycargo checkand existinggit::state_tests/cliretry-predicate tests pass🤖 Generated with Claude Code