vscode: stop dual-watch dupes from chatSessions + transcripts #85

Merged
KaluJo merged 4 commits into main from fix/vscode-dual-watch-dedup on May 18, 2026

KaluJo (Collaborator) commented May 18, 2026

Summary

VS Code Copilot Chat 0.45+ writes the same conversation to both
.../GitHub.copilot-chat/transcripts/<sid>.jsonl (legacy) and
.../chatSessions/<sid>.jsonl (modern) during the upgrade window. The
watcher registered notify watches on both directories, dispatched each
prompt through two parsers, and emitted two rows whose captured_at
strings differed only in formatting — so the V2 content hash diverged
and the unique constraint on prompts.content_hash couldn't dedup
them. Result: bundle review renders the same prompt twice (e.g.
prompts 8 + 9 both showing "the water missed the seed / 3 tools" on
the same session).

Two layers of fix:

  • Prefer chatSessions/ when both exist. WorkspaceMatch.transcript_dir
    is now Option<PathBuf> and is None when chat_sessions_dir
    already exists on disk at scan time. The three watch sites in
    watcher.rs (run, rescan_workspaces, handle_new_dir) skip the
    legacy watch when the option is None, and process_file adds a
    runtime guard for the upgrade-window edge case where the legacy
    notify watch was registered before chatSessions/ appeared.
    find_workspace resolves the hash dir via
    chat_sessions_dir.parent() so it no longer depends on
    transcript_dir being populated.
  • Normalise captured_at in the V2 hash. prompt_content_hash_v2
    (and therefore prompt_id_v2) now parses the timestamp via
    chrono::DateTime::parse_from_rfc3339 and reformats to canonical
    UTC ISO with SecondsFormat::Millis before hashing. Unparseable
    strings fall through unchanged so any pre-existing v2 hashes
    computed over non-RFC3339 inputs stay stable.
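
A minimal sketch of the first layer (prefer chatSessions/ when both exist), using only the standard library. The directory layout under the workspace hash dir, the `match_workspace`, `dirs_to_watch`, and `hash_dir` helpers, and the struct shape are assumptions made for illustration; only the field names `chat_sessions_dir` and `transcript_dir` and the `Option<PathBuf>` type come from the description above.

```rust
use std::path::{Path, PathBuf};

/// Simplified stand-in for the real `WorkspaceMatch` (shape is an assumption).
struct WorkspaceMatch {
    chat_sessions_dir: PathBuf,
    /// `None` when `chatSessions/` already exists at scan time.
    transcript_dir: Option<PathBuf>,
}

/// Hypothetical scan step: decide whether the legacy directory is watched.
fn match_workspace(workspace_hash_dir: &Path) -> WorkspaceMatch {
    let chat_sessions_dir = workspace_hash_dir.join("chatSessions");
    // Assumed layout for the legacy directory; the real prefix is elided in the PR.
    let legacy = workspace_hash_dir
        .join("GitHub.copilot-chat")
        .join("transcripts");
    let transcript_dir = if chat_sessions_dir.is_dir() {
        None // modern dir wins, so the legacy watch is skipped
    } else {
        Some(legacy)
    };
    WorkspaceMatch { chat_sessions_dir, transcript_dir }
}

/// Directories a watch site would register for this workspace.
fn dirs_to_watch(m: &WorkspaceMatch) -> Vec<&Path> {
    let mut dirs = vec![m.chat_sessions_dir.as_path()];
    if let Some(legacy) = &m.transcript_dir {
        dirs.push(legacy.as_path());
    }
    dirs
}

/// The hash dir is derived from `chat_sessions_dir`, not from `transcript_dir`.
fn hash_dir(m: &WorkspaceMatch) -> Option<&Path> {
    m.chat_sessions_dir.parent()
}
```

Because `hash_dir` goes through `chat_sessions_dir.parent()`, it keeps working when `transcript_dir` is `None`.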

DB cleanup for already-double-written rows ships separately in the
functions repo (20260518000000_dedupe_vscode_prompts.sql).

Test plan

  • cargo test --workspace — 106 + 4 + 1 + 1 tests pass, including
    the new v2_hash_normalizes_captured_at_format covering
    millis-vs-micros, Z-vs-offset parity, and parse-failure fallback.
  • cargo clippy --all-targets -- -D warnings clean.
  • cargo fmt --check clean.
  • Watcher unit test for the dual-dir scan decision skipped:
    scan_workspaces depends on dirs::home_dir() and projects::load(),
    and stubbing both is more plumbing than the fix is worth. The
    runtime guard in process_file plus the V2 hash test give enough
    coverage in practice.
  • After this lands, run the cleanup migration in the functions
    repo against the live DB.

Made with Cursor

VS Code Copilot Chat 0.45+ writes the same conversation to both the
legacy `GitHub.copilot-chat/transcripts/<sid>.jsonl` and the modern
`chatSessions/<sid>.jsonl` during the upgrade window. The watcher used
to register notify watches on both directories, dispatch each prompt
through two parsers, and emit two prompt rows whose `captured_at`
strings differed only in formatting — different formatting → different
content_hash → no dedup against the unique constraint → bundle review
renders the same prompt twice (e.g. prompts 8 and 9 both showing the
same text + tool count on the same session).

Two layers of fix:

1. Prefer chatSessions/ when both exist. `WorkspaceMatch.transcript_dir`
   becomes `Option<PathBuf>` and is `None` whenever `chat_sessions_dir`
   already exists on disk at scan time. The watcher's three watch sites
   (`run`, `rescan_workspaces`, `handle_new_dir`) skip the legacy watch
   when the option is `None`, and `process_file` adds a runtime guard
   for the upgrade-window edge case where the legacy notify watch was
   registered before `chatSessions/` appeared. `find_workspace` now
   resolves the hash directory via `chat_sessions_dir.parent()` so it
   no longer relies on `transcript_dir` being populated.

2. Normalise `captured_at` in the V2 hash. `prompt_content_hash_v2`
   (and therefore `prompt_id_v2`) now parses the timestamp via
   `chrono::DateTime::parse_from_rfc3339` and reformats to canonical
   UTC ISO with `SecondsFormat::Millis` before hashing. Unparseable
   strings fall through unchanged so any pre-existing v2 hashes
   computed over non-RFC3339 inputs stay stable. New unit test covers
   the millis-vs-micros / Z-vs-offset parity plus the parse-failure
   fallback.
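
A rough illustration of the normalisation in item 2, written against the standard library only so it stands alone. The PR itself uses chrono's `parse_from_rfc3339` and `SecondsFormat::Millis`; every function name below is hypothetical, the parser accepts only a strict subset of RFC 3339 (no leap seconds), and this sketch truncates fractional seconds to milliseconds.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date.
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400);
    let mp = (m + 9) % 12; // March = 0
    let doy = (153 * mp + 2) / 5 + d - 1;
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146_097 + doe - 719_468
}

/// Inverse of `days_from_civil`: (year, month, day).
fn civil_from_days(z: i64) -> (i64, i64, i64) {
    let z = z + 719_468;
    let era = z.div_euclid(146_097);
    let doe = z.rem_euclid(146_097);
    let yoe = (doe - doe / 1460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    let mp = (5 * doy + 2) / 153;
    let d = doy - (153 * mp + 2) / 5 + 1;
    let m = if mp < 10 { mp + 3 } else { mp - 9 };
    (yoe + era * 400 + if m <= 2 { 1 } else { 0 }, m, d)
}

/// Parse an all-digit field; rejects signs and empty strings.
fn num(s: &str) -> Option<i64> {
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    s.parse().ok()
}

/// Unix milliseconds for `YYYY-MM-DDTHH:MM:SS[.frac](Z|+HH:MM|-HH:MM)`.
fn parse_rfc3339_millis(raw: &str) -> Option<i64> {
    if !raw.is_ascii() || raw.len() < 20 {
        return None;
    }
    let b = raw.as_bytes();
    let t_ok = b[10] == b'T' || b[10] == b't';
    if b[4] != b'-' || b[7] != b'-' || !t_ok || b[13] != b':' || b[16] != b':' {
        return None;
    }
    let (y, mo, d) = (num(&raw[0..4])?, num(&raw[5..7])?, num(&raw[8..10])?);
    let (h, mi, s) = (num(&raw[11..13])?, num(&raw[14..16])?, num(&raw[17..19])?);
    if !(1..=12).contains(&mo) || !(1..=31).contains(&d) || h > 23 || mi > 59 || s > 59 {
        return None;
    }
    let rest = &raw[19..];
    // Optional fraction: keep the first three digits, padding with zeros.
    let (frac_ms, rest) = if let Some(r) = rest.strip_prefix('.') {
        let n = r.bytes().take_while(u8::is_ascii_digit).count();
        if n == 0 {
            return None;
        }
        let digits: String = r[..n].chars().chain("00".chars()).take(3).collect();
        (num(&digits)?, &r[n..])
    } else {
        (0, rest)
    };
    let offset_secs = match rest {
        "Z" | "z" => 0,
        _ if rest.len() == 6 && rest.as_bytes()[3] == b':' => {
            let sign: i64 = match rest.as_bytes()[0] {
                b'+' => 1,
                b'-' => -1,
                _ => return None,
            };
            sign * (num(&rest[1..3])? * 3600 + num(&rest[4..6])? * 60)
        }
        _ => return None,
    };
    let secs = days_from_civil(y, mo, d) * 86_400 + h * 3600 + mi * 60 + s - offset_secs;
    Some(secs * 1000 + frac_ms)
}

/// Canonical UTC form with millisecond precision; unparseable input is returned unchanged.
fn normalize_captured_at(raw: &str) -> String {
    let Some(ms) = parse_rfc3339_millis(raw) else {
        return raw.to_string();
    };
    let (secs, milli) = (ms.div_euclid(1000), ms.rem_euclid(1000));
    let (days, sod) = (secs.div_euclid(86_400), secs.rem_euclid(86_400));
    let (y, m, d) = civil_from_days(days);
    format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}.{:03}Z",
        y, m, d, sod / 3600, sod % 3600 / 60, sod % 60, milli
    )
}
```

Hashing the output of `normalize_captured_at` instead of the raw string means a micros-with-offset timestamp and a millis-with-`Z` timestamp for the same instant feed identical bytes into the hash, while a non-RFC3339 string hashes exactly as it did before.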

The DB cleanup for the rows already double-written before this fix
ships in `functions/supabase/migrations/20260518000000_dedupe_vscode_prompts.sql`.

The watch-site decision has no unit test: `scan_workspaces` depends on
`dirs::home_dir()` and `projects::load()`, and stubbing both would
take more plumbing than the fix is worth. The runtime guard in
`process_file` plus the V2 hash test give enough coverage.

Co-authored-by: Cursor <cursoragent@cursor.com>
KaluJo added a commit to pcr-developers/functions that referenced this pull request May 18, 2026
VS Code Copilot Chat 0.45+ wrote the same conversation to both the
legacy `transcripts/<sid>.jsonl` and the modern `chatSessions/<sid>.jsonl`
during the upgrade window, and the CLI watcher dispatched each prompt
through two parsers whose `captured_at` formatting differed (raw VS
Code string vs `SecondsFormat::Millis`). Different formatting → different
content_hash → no dedup against `prompts.content_hash` UNIQUE → bundle
review renders the same prompt twice.

The CLI fix landed in `pcr-developers/cli#85` (prefer chatSessions/
when both exist + normalise `captured_at` in `prompt_content_hash_v2`).
This migration cleans up the rows already written before that ships.

It partitions vscode-source rows by (session_id, prompt_text), keeps
the OLDEST in each group (almost always the one with the complete
streaming-emitted tool_calls / response_text), and deletes the rest.
A pre-DELETE DO block raises a NOTICE with the duplicate count for
operator visibility. Wrapped in BEGIN/COMMIT and idempotent — re-running
on a deduped table is a no-op.
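
The keep-oldest rule can be illustrated in memory. This is not the migration itself (which is SQL); it is a small Rust sketch of the same grouping logic, and the `PromptRow` struct, its `created_at_ms` ordering column, and the `dedupe_keep_oldest` name are all assumptions.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical, trimmed-down view of a vscode-source prompt row.
#[derive(Debug, Clone, PartialEq)]
struct PromptRow {
    id: u32,
    session_id: String,
    prompt_text: String,
    created_at_ms: i64, // assumed ordering column
}

/// Keep the oldest row per (session_id, prompt_text); drop the rest.
fn dedupe_keep_oldest(rows: Vec<PromptRow>) -> Vec<PromptRow> {
    let mut oldest: HashMap<(String, String), PromptRow> = HashMap::new();
    for row in rows {
        let key = (row.session_id.clone(), row.prompt_text.clone());
        match oldest.entry(key) {
            Entry::Vacant(slot) => {
                slot.insert(row);
            }
            Entry::Occupied(mut slot) => {
                // Strictly older rows replace the kept one; ties keep the first seen.
                if row.created_at_ms < slot.get().created_at_ms {
                    slot.insert(row);
                }
            }
        }
    }
    let mut kept: Vec<PromptRow> = oldest.into_values().collect();
    kept.sort_by_key(|r| r.id); // deterministic output order
    kept
}
```

Running the function on its own output returns the same rows, which mirrors the idempotence described above.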

Run after the CLI fix is deployed to user machines:
  supabase db push   # from /functions

Co-authored-by: Cursor <cursoragent@cursor.com>
KaluJo and others added 3 commits May 18, 2026 02:43
`pcr show <draft>` was printing file_context.changed_files and
file_context.relevant_files with surrounding quotes (`"src/foo.rs"`)
because the printer used `format!("{}", value)` on `serde_json::Value`,
which serializes string nodes with their JSON quoting. Extract the
inner `as_str()` first so paths render bare.

Co-authored-by: Cursor <cursoragent@cursor.com>
`truncate_diff` (push) and `get_git_diff` (shared) sliced raw bytes at
the 50 KB cap. When the cap landed inside a multi-byte UTF-8
codepoint — non-ASCII filenames, accented identifiers in a hunk —
`&str[..MAX]` panicked with "byte index N is not a char boundary".
Walk back to the nearest boundary with a small `floor_char_boundary`
helper (re-implementation of the unstable `str` method) and reuse it
from both call sites. Covered by three unit tests.
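
A minimal sketch of such a helper, assuming the signature below (the actual helper and the name `truncate_at_boundary` may differ). It relies only on `str::is_char_boundary`, which is stable.

```rust
/// Largest index `<= max` that lies on a UTF-8 char boundary of `s`.
fn floor_char_boundary(s: &str, max: usize) -> usize {
    if max >= s.len() {
        return s.len();
    }
    let mut idx = max;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(idx) {
        idx -= 1;
    }
    idx
}

/// Truncate to at most `max` bytes without splitting a codepoint.
fn truncate_at_boundary(s: &str, max: usize) -> &str {
    &s[..floor_char_boundary(s, max)]
}
```

With `"héllo"`, where `é` occupies bytes 1 and 2, a cap of 2 falls inside the codepoint and the slice walks back to `"h"` instead of panicking.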

Co-authored-by: Cursor <cursoragent@cursor.com>
`_unused_project` in `commands/project_context.rs` was an orphaned
`pub fn` (and the only reason `Project` was imported there). Its
companion `let _ = sql;` discard in `store::diff_events::
get_diff_events_in_window` was likewise leftover from an earlier
refactor — the SQL strings can stay inlined in their branches.
No behavior change.

Co-authored-by: Cursor <cursoragent@cursor.com>
@KaluJo KaluJo merged commit 7d94e90 into main May 18, 2026
2 checks passed
@KaluJo KaluJo deleted the fix/vscode-dual-watch-dedup branch May 18, 2026 14:49
KaluJo added a commit that referenced this pull request May 18, 2026
Adds a best-effort background update check that prints a soft
"X is available — run: brew upgrade pcr" notice at the end of every
interactive `pcr` command when a newer `pcr-dev` ships on npm.

Modeled on the `update-notifier` npm package and `cargo`'s own
behaviour:

- Background thread fires off a 3-second-timeout GET against
  https://registry.npmjs.org/pcr-dev/latest at the *start* of the
  command, runs concurrently with the command itself, and writes the
  result to `~/.pcr-dev/update-check.json` regardless of whether the
  foreground command has exited. The thread is intentionally not
  joined — network failures, captive portals, slow DNS never delay
  the user's primary signal.
- At the *end* of the command, the cached file is read and a one-line
  notice is printed to stderr if (a) the cached version is greater
  than `CARGO_PKG_VERSION` under naive `major.minor.patch` semver,
  and (b) we haven't shown the notice in the last hour (so back-to-
  back `pcr log; pcr show` doesn't double-print).
- Suggested upgrade command is install-method-aware: inspects
  `current_exe()` for `/Cellar/`, `/opt/homebrew/`, or `/node_modules/`
  and prints `brew upgrade pcr`, `npm i -g pcr-dev@latest`, or a
  generic `https://pcr.dev/install` link respectively.
- Hard skips: `--json` output, the hidden `hook` + `mcp` subcommands
  (they're stdio JSON-RPC / Stop-hook channels), `CI=*` env, and
  `PCR_NO_UPDATE_CHECK=1` for users who want to opt out.
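
The install-method detection could look roughly like the sketch below. The function name `upgrade_hint` is hypothetical, it takes the executable path as a string rather than calling `current_exe()` so that it is easy to test, and mapping both `/Cellar/` and `/opt/homebrew/` to the brew command is an assumption about how the three markers pair with the three messages.

```rust
/// Pick an upgrade suggestion from the path of the running executable.
fn upgrade_hint(exe_path: &str) -> &'static str {
    if exe_path.contains("/Cellar/") || exe_path.contains("/opt/homebrew/") {
        "brew upgrade pcr"
    } else if exe_path.contains("/node_modules/") {
        "npm i -g pcr-dev@latest"
    } else {
        // Unknown install method: fall back to the generic link.
        "https://pcr.dev/install"
    }
}
```

A caller would pass something like `std::env::current_exe()` converted with `to_string_lossy()`, and skip the notice entirely if that call fails.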

6 new unit tests cover semver comparison (including prerelease
suffixes), forward-compatible cache deserialisation (older payloads
without `last_notice_unix` decode cleanly), and the quiet-subcommand
skip list. Workspace tests: 134 passed (was 128 on this main baseline).
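
For reference, a naive `major.minor.patch` comparison can be sketched as follows. The names `parse_version` and `is_newer` are hypothetical, and ignoring any prerelease or build suffix is an assumption; the actual handling of prerelease suffixes may differ.

```rust
/// Parse "1.2.3", "v1.2.3", or "1.2.3-beta.1" into (major, minor, patch).
fn parse_version(v: &str) -> Option<(u64, u64, u64)> {
    let core = v.trim().trim_start_matches('v');
    // Assumption: anything after '-' or '+' is ignored for comparison.
    let core = core.split(|c| c == '-' || c == '+').next()?;
    let mut parts = core.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// True only when both versions parse and `latest` is strictly greater.
fn is_newer(latest: &str, current: &str) -> bool {
    match (parse_version(latest), parse_version(current)) {
        (Some(l), Some(c)) => l > c,
        _ => false, // best-effort: unparseable input never triggers a notice
    }
}
```

Comparing numeric tuples rather than strings is what makes `0.2.10` rank above `0.2.9`.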

Also adds CHANGELOG.md (none previously existed in this repo) seeded
with an Unreleased section that catalogues this feature plus the
in-flight fixes from PRs #85 (vscode dual-watch dedup) and #86
(watcher correctness + perf) so they have a single grep-able home
when those PRs merge.

Independent of PR #85 / #86 — touches `lib.rs`, `entry.rs`, and a
new file. Safe to merge in any order.

Made with [Cursor](https://cursor.com)

Co-authored-by: Cursor <cursoragent@cursor.com>
KaluJo added a commit that referenced this pull request May 18, 2026
…md (#87)

* feat: "newer version available" notice on pcr runs + start CHANGELOG.md

* update_check: handle pcr_dir() -> Result<PathBuf> from #86

PR #86 changed `config::pcr_dir()` to return `Result<PathBuf>` so
that auth + SQLite + watcher state never silently fall back to
`/tmp` when neither `$HOME` nor `%USERPROFILE%` resolves. The
update-notifier is best-effort, so it absorbs the Err the same way
it absorbs every other failure: collapse to `None` and silently
skip the cache operation. The foreground command never sees the
error.

Cache helper signatures:
  cache_path() -> PathBuf            // before
  cache_path() -> Option<PathBuf>    // after

load_cache + save_cache add a `let-else` guard at the top. Tests
unchanged — all 6 update_check::tests still pass, workspace 153
passing / 0 failing.
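
The shape of that change, as a hedged sketch: `pcr_dir` here is a stand-in that takes the home directory as a parameter (the real function takes none), the error type is simplified to `String`, and `load_cache` returns the raw file contents rather than a deserialised struct.

```rust
use std::path::PathBuf;

/// Stand-in for `config::pcr_dir()`; errors when no home dir resolves.
fn pcr_dir(home: Option<&str>) -> Result<PathBuf, String> {
    home.map(|h| PathBuf::from(h).join(".pcr-dev"))
        .ok_or_else(|| "no home directory".to_string())
}

/// After the change: the Err collapses to `None`.
fn cache_path(home: Option<&str>) -> Option<PathBuf> {
    pcr_dir(home).ok().map(|dir| dir.join("update-check.json"))
}

/// `let-else` guard at the top: no path means the cache step is skipped.
fn load_cache(home: Option<&str>) -> Option<String> {
    let Some(path) = cache_path(home) else {
        return None;
    };
    // A missing or unreadable file is also treated as "no cache".
    std::fs::read_to_string(path).ok()
}
```

Every failure mode ends in `None`, so the foreground command never has to handle an update-check error.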

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
@KaluJo KaluJo mentioned this pull request May 18, 2026
KaluJo added a commit that referenced this pull request May 18, 2026
Bumps the workspace to 0.3.0 — the 0.x minor (rather than 0.2.10
patch) is motivated by the breaking signature change to
`pcr_core::config::pcr_dir()` from #86 (returns `Result<PathBuf>`
instead of `PathBuf`). The CLI surface (`pcr <cmd>` flags / exit
codes / output format) is unchanged.

Version touchpoints:

  * `Cargo.toml` workspace.package.version → 0.3.0
  * `crates/pcr-napi/package.json` version + all 4 optionalDependencies
  * `crates/pcr-napi/npm/{darwin-arm64,darwin-x64,linux-x64-gnu,
    win32-x64-msvc}/package.json` versions
  * `README.md` TUI mock version stamp
  * `CHANGELOG.md` `[Unreleased]` promoted to `[0.3.0] — 2026-05-18`
    with full release notes catalogued by PR (#85, #86, #87, #88)
    and grouped Added / Changed / Fixed / Tests.

Workspace verification:

  * `cargo fmt --all --check` clean
  * `cargo clippy --workspace --all-targets -- -D warnings` clean
  * `cargo test --workspace` — 153 passing, 0 failing (was 128 on
    the v0.2.9 baseline; +25 across the 4 merged PRs)
  * `cargo build -p pcr-cli --release` → `pcr 0.3.0 (rust)`

After this lands on `main`, the release commit is tagged `v0.3.0`
locally and pushed; that triggers the release workflow which
publishes npm + builds binaries + dispatches the homebrew formula
update.

Co-authored-by: Cursor <cursoragent@cursor.com>