From 343ba6228d184a8999a42e485782245ade05e79c Mon Sep 17 00:00:00 2001 From: atheate Date: Thu, 28 May 2026 13:27:16 +0200 Subject: [PATCH 1/3] [Update] /implement-extensions-batch to spawn one team for the whole batch Replaces the per-file 4N-agent spawn pattern with a single named team (researcher, implementer, tester, reviewer) addressable via SendMessage. Each agent handles all N files in the batch in its own context; role boundaries and ACLs are preserved per file kind. Cuts live-agent count from 4N to 4 regardless of batch size and eliminates the per-file prompt-overhead duplication that scaled linearly with N. Co-Authored-By: Claude Opus 4 (1M context) --- .../commands/implement-extensions-batch.md | 332 ++++++++++++------ .claude/team-templates/extension-impl.md | 207 +++++++++++ 2 files changed, 432 insertions(+), 107 deletions(-) diff --git a/.claude/commands/implement-extensions-batch.md b/.claude/commands/implement-extensions-batch.md index 18a47a58..f7057d88 100644 --- a/.claude/commands/implement-extensions-batch.md +++ b/.claude/commands/implement-extensions-batch.md @@ -5,10 +5,15 @@ argument-hint: [ ...] (2–6 Extension file na # /implement-extensions-batch -Apply the **existing `/implement-extensions` 4-role team workflow** across N -files (`$ARGUMENTS`) in one run. The team itself is unchanged — researcher, -implementer, tester, reviewer per file. What this command adds on top of the -single-file flow: +Run **ONE** 4-role team (researcher, implementer, tester, reviewer) over N +files (`$ARGUMENTS`) in a single batch. Each agent handles **all N files** in +its own context — not one team per file. Role boundaries are unchanged: the +implementer still cannot touch test fixtures, the tester still cannot touch +production files, the reviewer is still read-only. What changes is the ACL +size: each role's allowed-write set is now the N files of its kind in the +batch instead of one. + +What this command does on top of the single-file flow: 1. **Pre-flight validation** of every file + its GitHub issue, before any state change. @@ -17,14 +22,18 @@ single-file flow: upstream tracking set so the user can immediately open a pull request after committing. 3. **Assigns every related GitHub issue to the invoking user** (`@me`). -4. **Parallelises agent spawns across files** wherever their target files are - disjoint. -5. **Single consolidated regression sweep** instead of one per file. +4. **Spawns one team via `TeamCreate` + 4 named agents.** Researcher first, + then implementer + tester in parallel, then reviewer. Iterative fixes route + via `SendMessage` to the same named agent — no fresh spawns. +5. **Single consolidated regression sweep** dispatched to the still-running + tester. 6. **Loops the issue-checklist sync** per file at the end. The team template (role prompts) at `.claude/team-templates/extension-impl.md` -(v2, repo-tracked) is the source of truth for the per-file behaviour. This -command body is the batch orchestration glue. +is the source of truth for both the single-file role prompts AND the +batch-mode addenda. This command body is the batch orchestration glue. + +**Total live-agent count: 4** (vs. 4N in the previous per-file design). ## Path conventions @@ -40,7 +49,7 @@ Repo-relative with forward slashes throughout. Tools that require absolute paths - The researcher notes file per batch member: `.team-notes/-extensions-spec.md`. - Sibling test fixtures whose `Throws.TypeOf()` assertions now fail because one of the batch's implementations unblocked them (consolidated - regression sweep, see step 10). + regression sweep, see step 12). **MUST NOT modify** the same things the single-file command refuses to touch: other production files in `SysML2.NET/Extend/` or `SysML2.NET/Core/`, @@ -54,30 +63,32 @@ batch. ## Pre-flight: detect orchestrator plan mode If the orchestrator session is itself in **plan mode** at the moment -`/implement-extensions-batch` runs, spawned sub-agents (researchers, implementers, -testers, reviewers) inherit that state. The Agent tool's `mode: "acceptEdits"` -parameter does NOT override the inherited plan-mode state on the current Claude -Code build — sub-agents will respect the `` that declares plan -mode and refuse to apply any edits, even though their prompts tell them to. - -Symptom: every sub-agent reports "ready to execute on exit from plan mode" and -writes its work to its own per-agent plan file at +`/implement-extensions-batch` runs, the four named teammates (researcher, +implementer, tester, reviewer) inherit that state. The Agent tool's +`mode: "acceptEdits"` parameter does NOT override the inherited plan-mode +state on the current Claude Code build — sub-agents will respect the +`` that declares plan mode and refuse to apply any edits, +even though their prompts tell them to. + +Symptom: each named teammate reports "ready to execute on exit from plan +mode" and writes its work to its own per-agent plan file at `C:\Users\\.claude\plans\-agent-.md` instead of to the -target file. +target files. -Before spawning any sub-agent, check whether plan mode is active in the +Before spawning the team, check whether plan mode is active in the orchestrator session. If it is: 1. **Stop** and surface the situation to the user with `AskUserQuestion`. Two options: - **Exit plan mode first** (user toggles their harness off plan mode, then re-invokes the command). Cleanest. - - **Proceed in degraded mode**: spawn researchers as normal (they only need - to write to `.team-notes/`, which the orchestrator can copy into place - from their per-agent plan files if blocked). For Phase IT (implementers + - testers) and Phase RV (reviewers), the orchestrator applies the - production / test edits itself, reading each implementer's plan file to - extract the verbatim code. Reviewers can still run read-only. + - **Proceed in degraded mode**: spawn the researcher as normal (it only + needs to write to `.team-notes/`, which the orchestrator can split out + of its per-agent plan file if blocked). For Phase IT (implementer + + tester) and Phase RV (reviewer), the orchestrator applies the + production / test edits itself, reading each agent's plan file to + extract the verbatim per-file code blocks. Reviewer can still run + read-only. 2. If the user picks degraded mode, set an internal `PLAN_MODE_DEGRADED=true` flag for the run and follow the per-phase divergences in the **Notes for @@ -134,10 +145,30 @@ For each file in the (possibly reduced) batch: enclosing methods. If 0, drop the file from the batch and inform the user (already implemented). - Apply the same complexity-grading rubric from `/implement-extensions` step 3.5 - to that file's method list. Record `(complexity, per_role_model_picks)`. + to that file's method list. Record `(complexity, per_file_per_role_model_picks)`. If the batch becomes empty after pruning, abort cleanly. +### 3.5. Batch-wide model rollup (one model per role for the whole team) + +Each role is spawned ONCE for the whole batch, so the orchestrator picks ONE +model per role using the **worst-case complexity across the batch**: + +- If any file in the batch is graded `complex` → that role's complex-tier model. +- Else if any file is graded `standard` → that role's standard-tier model. +- Else (all `trivial`) → that role's trivial-tier model. + +Rationale: the per-role model is fixed at spawn time and applies uniformly to +every file the agent edits in that run. Picking a tier weaker than the worst +file would under-equip the agent on that file. The user can still override +"all Sonnet" / "all Opus" / "Custom" at the step-5 sanity check. + +Per-role asymmetry across roles is still encouraged (e.g. trivial impl across +all files but a heavy regression-sweep load → Opus tester). + +Record the rolled-up picks as +`(batch_researcher_model, batch_implementer_model, batch_tester_model, batch_reviewer_model)`. + ### 4. Pre-flight git checks - `git status --porcelain` must be empty. Refuse to proceed otherwise — the user @@ -148,13 +179,16 @@ If the batch becomes empty after pruning, abort cleanly. Use `AskUserQuestion` to present: -- The final batch composition (files + issues + complexity + per-role model - picks per file). +- The final batch composition (files + issues + per-file complexity grade for + transparency). +- The **batch-wide rolled-up picks** from step 3.5: + `researcher=, implementer=, tester=, reviewer=`. - The proposed branch name (see step 6). - Questions: 1. **Proceed with this batch composition?** (Yes / No / drop specific files) - 2. **Use the per-file dynamic model selection?** (Yes / override "all Sonnet" - / "all Opus" / custom) + 2. **Use the rolled-up batch-wide model selection?** (Yes / override "all + Sonnet" / "all Opus" / custom — Custom lets the user override individual + roles) If user picks "drop specific files" or overrides models, apply and re-confirm. @@ -207,41 +241,82 @@ Idempotent — re-assigning is a no-op on `gh`. Report success/failure per issue on failure, log and continue (an unassignable issue is not a blocker for the implementation itself). -### 8. Phase R — Spawn researchers in parallel +### 8. Create the named team (one-shot, before any Phase) + +``` +TeamCreate({team_name: "batch-extensions-impl-"}) +``` + +where `` is the same dashed-issue-number string used in the +branch name from step 6 (deduplicates across concurrent batches). The team +hosts the four named teammates spawned in the following phases. + +Set `{{ORCHESTRATOR_NAME}}` = your own orchestrator identifier (the value +named teammates SendMessage back to). The named teammates remain addressable +for the rest of the run; iterative fixes route via `SendMessage to: ""`, +not via fresh `Agent` spawns. -**One orchestrator message containing N `Agent(...)` calls** (one per file). Each: +### 9. Phase R — Spawn the single researcher +ONE `Agent(...)` call: + +- `name: "researcher"` - `subagent_type: "general-purpose"` -- `model: ` per the per-file step-3 grade (Haiku trivial / - Sonnet standard / Opus complex, or user override from step 5). +- `team_name`: the team from step 8. +- `model`: `` from step 3.5 (or user override from step 5). - Foreground (no `run_in_background`). -- Prompt: the v2 researcher prompt from `.claude/team-templates/extension-impl.md` - with that file's `{{PLACEHOLDERS}}` substituted + the file's method list. +- Prompt: the **batch-mode researcher prompt** from + `.claude/team-templates/extension-impl.md` — i.e. the single-file researcher + body PLUS the "Batch-mode operation" addendum, with `{{BATCH_FILES}}` + expanded to the numbered list of (interface, paths, method-list) tuples, + one per file in the batch. + +Wait for the researcher's `spec ready` SendMessage. Then **read each +`.team-notes/-extensions-spec.md`** yourself to verify coverage, +spec-text-only flags, stub-blocker flags. Surface ambiguities to the user +before continuing. + +The researcher agent **stays addressable** for the rest of the run — the +implementer / tester / reviewer may need a clarification on one file's OCL +later, which the orchestrator can route via `SendMessage to: "researcher"`. + +### 10. Phase IT — Spawn the single implementer + single tester in parallel -After all N return, **read each notes file** yourself to verify coverage + -spec-text-only flags + stub-blocker flags. +**One orchestrator message containing TWO `Agent(...)` calls**, both +foreground: -### 9. Phase IT — Spawn implementers + testers in parallel +1. Implementer: + - `name: "implementer"`, `team_name`: same team. + - `model`: ``. + - Prompt: batch-mode implementer prompt with `{{BATCH_FILES}}` expanded. +2. Tester: + - `name: "tester"`, `team_name`: same team. + - `model`: ``. + - Prompt: batch-mode tester prompt with `{{BATCH_FILES}}` expanded AND + the **parallel-mode caveat** clearly stated (`dotnet build` only; + MUST NOT run `dotnet test`). -**One orchestrator message containing 2N `Agent(...)` calls** — one implementer -+ one tester per file. All foreground. +The two agents edit disjoint files (N production files vs. N test fixtures), +so concurrent edits are safe. -Each implementer prompt is the v2 implementer prompt with the file's placeholders -+ the **parallel-mode caveat** clearly stated (see template). Each tester prompt -is the v2 tester prompt with the same caveat — they MUST run `dotnet build` only -and MUST NOT run `dotnet test` (production lacks parallel-turn edits in their -disk view). +Wait for both `dev complete` and `tests complete` SendMessages. -### 10. Phase V — Orchestrator verification (sequential) +### 11. Phase V — Orchestrator verification (sequential) -After all 2N agents return, run sequentially in the orchestrator turn: +Run sequentially in the orchestrator turn: 1. **One build of production**: ```bash dotnet build SysML2.NET/SysML2.NET.csproj --nologo --verbosity quiet ``` - On failure, identify which file's production diff caused it, re-dispatch - that file's implementer. Iterate. + On failure, attribute the error to its source file(s) by reading the + compiler output, then: + ``` + SendMessage({to: "implementer", message: "Build failed; fix in place. + Files + errors: …"}) + ``` + The same implementer agent retains its context and patches the diffs. + Iterate until the build is green. 2. **One consolidated targeted test run**, OR-joining every fixture in the batch: ```bash @@ -249,12 +324,17 @@ After all 2N agents return, run sequentially in the orchestrator turn: --filter "FullyQualifiedName~ExtensionsTestFixture|FullyQualifiedName~ExtensionsTestFixture|..." \ --nologo --verbosity quiet ``` - For each failure, attribute it to the correct file: - - OCL mistranslation in production → re-dispatch THAT file's implementer. - - Wrong test assertion → re-dispatch THAT file's tester. + For each failure, attribute it and route via `SendMessage`: + - **OCL mistranslation in production** → + `SendMessage({to: "implementer", message: "(file, method, + observed-vs-expected). Fix in place."})` + - **Wrong test assertion** → + `SendMessage({to: "tester", message: "(file, method, + observed-vs-expected). Fix in place."})` + Both agents keep their context; do NOT spawn fresh per-fix Agent calls. Iterate until 0 failures across the batch. -### 11. Phase S — Consolidated regression sweep +### 12. Phase S — Consolidated regression sweep ```bash dotnet test SysML2.NET.sln --no-build --nologo --verbosity quiet @@ -264,24 +344,44 @@ For each `Expected: But was: no exception` failure, identify which file in the batch unblocked it (grep the failing test for `For Later: depends on …` references; or trace by the targeted stub's signature). -Dispatch regression-sweep testers (Sonnet by default) per touched sibling -fixture, in parallel when the sibling fixtures are disjoint. Use the -**expand-don't-replace** brief from `/implement-extensions` step 8 (filter -discrimination + predicate completeness + owned vs inherited + null-projection -guard). +Send the consolidated brief to the still-running tester: + +``` +SendMessage({to: "tester", message: "Regression sweep brief. For each +sibling fixture below, expand-don't-replace per the four-axis checklist +(filter discrimination + predicate completeness + owned vs inherited + +null-projection guard). Sibling fixtures and their failing tests: +- SysML2.NET.Tests/Extend/ExtensionsTestFixture.cs: + failing tests: [...], exercising production OCL: ..."}) +``` + +The tester's ACL extends to those sibling fixtures for this dispatch only. Iterate until the full solution test run is 0 failures. -### 12. Phase RV — Reviewers in parallel +### 13. Phase RV — Spawn the single reviewer + +ONE `Agent(...)` call: + +- `name: "reviewer"`, `team_name`: same team, `model: `. +- Prompt: batch-mode reviewer prompt with `{{BATCH_FILES}}` expanded. + +The reviewer walks each file's `(notes, production, tests + any +regression-swept sibling fixtures touched by this batch)` triple and returns +one `OK` or `NEEDS FIX` verdict with per-file findings grouped. -**One orchestrator message containing N `Agent(...)` calls** — one reviewer per -file. Each scoped to ONE file's `(notes file, production file, test fixture, -regression-swept sibling tests that this file's implementation touched)`. +For each NEEDS-FIX finding: +- Production-code finding → + `SendMessage({to: "implementer", message: "(file, line, concern). Fix."})` +- Test finding → + `SendMessage({to: "tester", message: "(file, line, concern). Fix."})` -For each "NEEDS FIX" verdict, dispatch the implementer or tester for that file -back. Other files' results are still reported in the final summary. +Re-run Phase V's build + consolidated targeted test after fixes, then +re-dispatch the reviewer if its prior verdict was NEEDS FIX: +`SendMessage({to: "reviewer", message: "Findings actioned; please +re-verify."})`. The reviewer keeps its context. -### 13. Phase IS — Issue checklist sync (sequential, looped) +### 14. Phase IS — Issue checklist sync (sequential, looped) For each `(file, issue_number)` pair in the batch, run the **identical** step-11 logic from `/implement-extensions`: @@ -290,14 +390,14 @@ step-11 logic from `/implement-extensions`: 2. Locate `### Checklist` section. 3. Enumerate `Compute*` methods in the production file. Tick the ones whose bodies no longer throw `NotSupportedException` AND whose `Verify{Method}` - passed in step 10's last `dotnet test` run. + passed in step 11's last `dotnet test` run. 4. Append any new methods (signature not present in existing checklist) in declaration order. 5. Stitch new body (touch ONLY the Checklist section). 6. Push via `gh issue edit --body-file `. 7. Re-fetch + diff to verify only the Checklist section changed. -### 14. Final summary +### 15. Final summary Print to the user: @@ -335,61 +435,79 @@ Print to the user: | Ambiguous issue | Step 2 | `AskUserQuestion` for an explicit issue number per file. | | Dirty working tree | Step 4 | Abort, ask user to commit/stash. | | Branch already exists | Step 6 | Abort; ask user to pick a different batch or delete the stale branch. | -| `git push -u origin ` fails (network, auth, branch protection) | Step 6 | Log + continue (non-blocking; surface clearly in step 14 final summary with a manual re-push command). | +| `git push -u origin ` fails (network, auth, branch protection) | Step 6 | Log + continue (non-blocking; surface clearly in step 15 final summary with a manual re-push command). | | `gh issue edit --add-assignee` fails for one issue | Step 7 | Log + continue (non-blocking; implementation still proceeds). | -| Production build fails after implementer | Step 10.1 | Attribute to the file, re-dispatch that implementer. | -| Targeted test fails | Step 10.2 | Attribute (OCL vs test bug), re-dispatch correct role. | -| Sibling test failure in regression sweep | Step 11 | Dispatch a regression-sweep tester per touched fixture; in scope. | -| Reviewer NEEDS FIX | Step 12 | Re-dispatch implementer or tester for that file only; other files' results still reported. | +| `TeamCreate` fails | Step 8 | Abort with a clear error; the named-teammate flow requires a live team. Do NOT silently fall back to per-file `Agent` calls — that defeats the cost-reduction goal of this command. | +| Production build fails after implementer | Step 11.1 | Attribute to the source file(s); `SendMessage to: "implementer"` with the failing file + error. Same agent, no re-spawn. | +| Targeted test fails | Step 11.2 | Attribute (OCL vs test bug); `SendMessage to: "implementer"` or `to: "tester"` with the `(file, method, observed-vs-expected)` triple. Same agent, no re-spawn. | +| Sibling test failure in regression sweep | Step 12 | `SendMessage to: "tester"` with the consolidated regression brief (all touched sibling fixtures in one dispatch). Tester's ACL extends to those fixtures for this dispatch only. | +| Reviewer NEEDS FIX | Step 13 | `SendMessage to: "implementer"` or `to: "tester"` with the per-file finding; then `SendMessage to: "reviewer"` to re-verify. Same agents throughout. | +| Named teammate becomes unresponsive (timed-out SendMessage) | Phases IT / V / S / RV | Fall back to a fresh `Agent(...)` spawn for that role only, replaying the spawn prompt PLUS the most recent context the orchestrator has (notes file paths, prior deviation reports). The team-name stays alive for the other roles. | | One file's implementation fails after branch + assignment | Any step ≥ 6 | Keep the branch; surface in final summary; user decides whether to retry via `/implement-extensions` for that single file or revert. | -| Sub-agent inherits orchestrator plan mode and refuses to edit | Phases R / IT / RV | Surface to user via `AskUserQuestion`. Either exit plan mode and retry, or proceed in degraded mode (orchestrator copies researcher specs from agent plan files into `.team-notes/`, applies implementer + tester code from agent plan files, runs reviewers as read-only). | +| Sub-agent inherits orchestrator plan mode and refuses to edit | Phases R / IT / RV | Surface to user via `AskUserQuestion`. Either exit plan mode and retry, or proceed in degraded mode (orchestrator splits the researcher's per-agent plan file into `.team-notes/-extensions-spec.md` per file, applies the implementer + tester per-file code blocks from their plan files, runs reviewer as read-only). | | Agent's `mode: "acceptEdits"` parameter does not override inherited plan mode | Phases IT / RV | Known limitation of this Claude Code build. The orchestrator must apply the edits itself in degraded mode (see "Pre-flight: detect orchestrator plan mode"). | ## Parallelism caps (orchestrator self-enforced) -- N ≤ 6 files per batch. -- Phase R: N parallel agents. -- Phase IT: 2N parallel agents (max 12 at N=6). -- Phase RV: N parallel agents. -- Regression sweep dispatch (step 11): batch parallel by touched fixture - filename; if more than 6 fixtures need expansion, serialise above 6. +- N ≤ 6 files per batch (unchanged — bounds the per-agent context size, not + the live-agent count). +- Phase R: **1** agent (the researcher). +- Phase IT: **2** agents in parallel (implementer ∥ tester), via one + orchestrator message containing two `Agent(...)` calls. +- Phase RV: **1** agent (the reviewer). +- Regression sweep dispatch (step 12): handled inside the still-running + tester via `SendMessage`. No extra spawn. +- **Total live-agent count: 4**, regardless of batch size. ## Notes for the orchestrator (you, the main agent) - The team-template role prompts at `.claude/team-templates/extension-impl.md` - are the **source of truth** for per-file behaviour. Substitute the - file-specific placeholders fresh for each agent spawn; do not let prompts - leak across files. -- All paths in agent prompts must be repo-relative with forward slashes (per - the convention used throughout the existing single-file command). -- Researcher is **mandatory** per file, even when the file has been seen before - via `/implement-extensions`. Researchers are cheap and produce the contract - the implementer/tester/reviewer read. -- Reviewers are **mandatory** per file — cheap insurance against subtle OCL - mistranslation. Even when the file is trivial spec-text-only. + are the **source of truth**. The batch-mode addendum inside each role's + prompt block is what activates one-team-for-all-files behaviour — it is + the single-file prompt PLUS the "Batch-mode operation" section, with + `{{BATCH_FILES}}` expanded. +- All paths in agent prompts must be repo-relative with forward slashes. +- Researcher is **mandatory** for the batch, even when individual files have + been seen before via `/implement-extensions`. One researcher produces all + N notes files; the implementer/tester/reviewer read them. +- Reviewer is **mandatory** for the batch — cheap insurance against subtle + OCL mistranslation across all N files. - The branch and the assignments persist even on partial failure. Be explicit in the final summary about which files succeeded vs which need follow-up. - Do NOT auto-commit. The user reviews `git diff` and commits manually. - If the user supplies a single file, route them to `/implement-extensions` with the same filename rather than creating a degenerate 1-file "batch" - branch. + team. +- **Prefer SendMessage over fresh Agent spawns inside the iteration loop.** + The named teammates retain their context (notes already read, prior + decisions, OCL chains they've already traced). A fresh `Agent(...)` call + for a single-file fix throws that context away and re-pays the prompt + cost. The only reason to spawn fresh is the "named teammate becomes + unresponsive" row in the failure-handling table. - **Plan-mode degraded mode** (`PLAN_MODE_DEGRADED=true`): - - **Phase R**: researchers will write to per-agent plan files instead of - `.team-notes/`. After each returns, copy its plan file from - `C:\Users\\.claude\plans\-agent-.md` into - `.team-notes/-extensions-spec.md` and verify content matches the - schema the template expects. - - **Phase IT**: implementers + testers will write to per-agent plan files, - NOT to the production / test fixtures. The orchestrator reads each - agent's plan file, extracts the verbatim code, and applies the edits - itself via `Edit` / `Write`. Build + targeted tests still run in - Phase V. Do NOT mark Phase IT complete on the sub-agent's "ready to - execute" message alone — only after the orchestrator has applied each - file's diffs and the build is green. - - **Phase RV**: reviewers operate read-only, so plan mode does not block - them. No degradation needed. + - **Phase R**: the single researcher writes its multi-file spec to ONE + per-agent plan file under + `C:\Users\\.claude\plans\-agent-.md` instead of to + the N `.team-notes/` files. The orchestrator reads that plan file, + splits it per `## ` section, and writes each + `.team-notes/-extensions-spec.md`. Verify each split section matches + the schema in `.claude/team-templates/extension-impl.md`. + - **Phase IT**: the single implementer writes verbatim production code for + all N files to its per-agent plan file (with `## ` section markers + + code fences). Likewise the single tester for all N test fixtures. The + orchestrator reads each plan file, extracts the per-file code blocks, + and applies the edits itself via `Edit` / `Write`. Build + targeted + tests still run in Phase V. Do NOT mark Phase IT complete on a "ready + to execute" message alone — only after the orchestrator has applied + each file's diffs and the build is green. + - **Phase S (regression sweep)**: same as Phase IT — the still-running + tester writes its per-fixture expanded tests to per-agent-plan + appendices; orchestrator splits and applies. + - **Phase RV**: reviewer is read-only, so plan mode does not block it. + No degradation needed. - **Sanity check**: in degraded mode, the orchestrator does roughly 2× the - work it would in normal mode. Budget for it — do not silently fall + work it would in normal mode (it now applies the edits the sub-agents + would otherwise apply themselves). Budget for it — do not silently fall behind on Phase V verification just because Phase IT cost more turns. - The Agent tool's `mode` parameter cannot reliably escape inherited plan mode on this Claude Code build. The orchestrator MUST detect plan mode at diff --git a/.claude/team-templates/extension-impl.md b/.claude/team-templates/extension-impl.md index d8fa1a2d..b2f4dc88 100644 --- a/.claude/team-templates/extension-impl.md +++ b/.claude/team-templates/extension-impl.md @@ -53,6 +53,22 @@ Don't use this template when: | `{{METHOD_LIST}}` | Bullet list of methods to implement, in dependency order | (per-task) | | `{{ORCHESTRATOR_NAME}}` | Team-lead name for SendMessage | `team-lead` | +### Batch-mode placeholders (used by `/implement-extensions-batch`) + +When this template is instantiated by `/implement-extensions-batch`, each role +agent is spawned **once for the whole batch** instead of once per file. The +per-file slot values (`{{TARGET_INTERFACE}}`, `{{PRODUCTION_FILE}}`, +`{{TEST_FILE}}`, `{{NOTES_FILE}}`, `{{METHOD_LIST}}`, …) are replaced by a single +batch-wide placeholder: + +| Placeholder | Description | Example value | +|---|---|---| +| `{{BATCH_FILES}}` | Numbered list of files in the batch, each carrying the same per-file slot values listed above plus its individual `{{METHOD_LIST}}`. The agent walks the list sequentially in its single context. | `1. Foo — interface IFoo, prod SysML2.NET/Extend/FooExtensions.cs, tests …, notes .team-notes/foo-extensions-spec.md, methods […]\n2. Bar — …` | + +Shared placeholders (`{{REFERENCE_PRODUCTION_FILE}}`, `{{REFERENCE_TEST_FILE}}`, +`{{TEAM_NAME}}`, `{{ORCHESTRATOR_NAME}}`) are unchanged — they are the same +across every file in the batch. + ## Workflow ``` @@ -80,6 +96,48 @@ Don't use this template when: fixtures that now fail get updated as part of the same PR. ``` +### Batch-mode workflow (one team for all N files) + +When invoked through `/implement-extensions-batch`, the workflow above runs +**once for the whole batch** instead of once per file: + +``` +1. Researcher → walks {{BATCH_FILES}} and writes ONE + .team-notes/-extensions-spec.md per file IN ITS SINGLE + AGENT RUN. Reads {{REFERENCE_PRODUCTION_FILE}} and + {{REFERENCE_TEST_FILE}} ONCE up front, not per file. + +2. Implementer → walks {{BATCH_FILES}} and implements each + SysML2.NET/Extend/Extensions.cs in its single agent run. + Reads each file's notes file before editing it. ACL = + the set of production files listed in {{BATCH_FILES}}; + refuses every other path. + +3. Tester → walks {{BATCH_FILES}} and rewrites each + SysML2.NET.Tests/Extend/ExtensionsTestFixture.cs. ACL = + the set of test fixtures listed in {{BATCH_FILES}}, PLUS + (after the orchestrator's regression-sweep SendMessage) + any sibling *ExtensionsTestFixture.cs touched by the new + implementations. + +4. Reviewer → walks {{BATCH_FILES}} and applies the OCL-translation + + test-fixture checklists per file; READ-ONLY. + +5. Orchestrator → after Phases R / IT, runs ONE consolidated build, ONE + consolidated targeted test (filter joining every fixture + in the batch), then the regression sweep. On any failure, + it sends a SendMessage to the relevant named teammate + (`researcher`, `implementer`, or `tester`) with the + failing-file list — the same agent keeps its context and + fixes in place. Fresh Agent spawns are not used inside + the iteration loop. +``` + +The role boundaries are **identical** to single-file mode: the implementer +never edits test files, the tester never edits production files, the reviewer +is read-only. The only thing that changes is the size of the per-role allowed +file set (one file → N files in the batch). + ## Plan-mode-aware prompting (added 2026-05-28) If the orchestrator session is in plan mode when this template is used, sub-agents @@ -315,6 +373,33 @@ Send a SendMessage to `{{ORCHESTRATOR_NAME}}` with summary `spec ready` and a li of methods where the OCL is ambiguous, missing, or transitively depends on other stubbed extensions. +## Batch-mode operation + +When the orchestrator passes `{{BATCH_FILES}}` instead of a single +`{{NOTES_FILE}}` / `{{PRODUCTION_FILE}}` set, you are the SOLE researcher for +the whole batch. Operate as follows: + +1. **Read shared context ONCE** at the start of your run, not per file: + - `{{REFERENCE_PRODUCTION_FILE}}` (the canonical NamespaceExtensions.cs). + - The OCL operator translation idioms in the team template (already in your + prompt). + The XMI files (`Resources/KerML_only_xmi.uml`, + `Resources/SysML_only_xmi.uml`) and the spec text files are also shared — + keep a single open mental map across files; do not re-summarize each. +2. **Iterate `{{BATCH_FILES}}` sequentially.** For each entry — which carries + its own `{{TARGET_INTERFACE}}`, `{{TARGET_METACLASS_NAME}}`, + `{{SUBJECT_PARAM}}`, `{{PRODUCTION_FILE}}`, `{{NOTES_FILE}}`, and + `{{METHOD_LIST}}` — perform your single-file research workflow above and + write that entry's `{{NOTES_FILE}}`. +3. **Edit ACL = the set of `{{NOTES_FILE}}` paths listed in `{{BATCH_FILES}}`.** + Every other path is read-only. If a tool call would violate this, abort and + message `{{ORCHESTRATOR_NAME}}`. +4. **Final SendMessage** to `{{ORCHESTRATOR_NAME}}`: + `spec ready, files: N, ambiguous-OCL: {Foo: [methods], …}, + spec-text-only: {Foo: [methods], …}, stub-blocker-flagged-in: {Foo: [methods], …}`. +5. **Remain addressable** for the rest of the run. The implementer, tester, or + reviewer may SendMessage you a clarification request on one file's OCL. + Begin. ``` @@ -401,6 +486,45 @@ Send a SendMessage to `{{ORCHESTRATOR_NAME}}` with: Begin by reading `{{NOTES_FILE}}`, then the method `` blocks in {{PRODUCTION_FILE}}, then the reference template ({{REFERENCE_PRODUCTION_FILE}}), then making the edits. + +## Batch-mode operation + +When the orchestrator passes `{{BATCH_FILES}}` instead of a single +`{{PRODUCTION_FILE}}` / `{{NOTES_FILE}}` set, you are the SOLE implementer for +the whole batch. Operate as follows: + +1. **Read shared context ONCE** at the start of your run: + - `{{REFERENCE_PRODUCTION_FILE}}` (the canonical NamespaceExtensions.cs). + - The OCL operator translation table is already in your prompt. +2. **Edit ACL = the set of `{{PRODUCTION_FILE}}` paths listed in + `{{BATCH_FILES}}`.** You may Edit/Write ONLY those production files. + - Refuse every test fixture path (tester's territory). + - Refuse every other production file in `SysML2.NET/Extend/` or + `SysML2.NET/Core/` (scope-discipline rule). + - Refuse auto-generated POCOs and code-gen templates. + If a tool call would violate this, abort and message `{{ORCHESTRATOR_NAME}}`. +3. **Iterate `{{BATCH_FILES}}` in a sensible order.** Independent metaclasses + first (no cross-file references), files whose OCL transitively reads a + sibling-batch metaclass last. For each entry — which carries its own + `{{TARGET_INTERFACE}}`, `{{SUBJECT_PARAM}}`, `{{PRODUCTION_FILE}}`, + `{{NOTES_FILE}}`, and `{{METHOD_LIST}}` — read that entry's `{{NOTES_FILE}}` + first, then implement each method in its `{{PRODUCTION_FILE}}` per the + single-file workflow above. +4. **Build once at the end**: + ```bash + dotnet build SysML2.NET/SysML2.NET.csproj + ``` + Not per file — the build is shared. On failure, identify which file's + diff broke it and iterate. +5. **Final SendMessage** to `{{ORCHESTRATOR_NAME}}`: + `dev complete, files: N, order: [Foo, Bar, …], + deviations-per-file: {Foo: [...], Bar: [...]}, + upstream-stubs-touched: [methods you couldn't fully implement and how you + handled them]`. +6. **Remain addressable.** When the orchestrator's targeted-test run surfaces + an OCL mistranslation in any file's production, it will SendMessage you + the `(file, method, observed-vs-expected)` triple. Fix in place — your + context (notes already read, prior decisions) is preserved. ``` --- @@ -512,6 +636,50 @@ Send a SendMessage to `{{ORCHESTRATOR_NAME}}` with: Begin by reading `{{NOTES_FILE}}` (each method has a "Test plan" section), the production methods you need to test, the reference fixture (`{{REFERENCE_TEST_FILE}}`), the current `{{TEST_FILE}}`, then making the edits. + +## Batch-mode operation + +When the orchestrator passes `{{BATCH_FILES}}` instead of a single +`{{TEST_FILE}}` / `{{NOTES_FILE}}` set, you are the SOLE tester for the whole +batch. Operate as follows: + +1. **Read shared context ONCE** at the start of your run: + - `{{REFERENCE_TEST_FILE}}` (the canonical NamespaceExtensionsTestFixture.cs). + - NUnit conventions in `TESTING.md` at the repo root. +2. **Edit ACL = the set of `{{TEST_FILE}}` paths listed in `{{BATCH_FILES}}`.** + - Refuse every production file under `SysML2.NET/Extend/` or + `SysML2.NET/Core/` (implementer's territory and the scope-discipline rule). + - Refuse auto-generated POCOs and code-gen templates. + - **Regression-sweep extension**: once the orchestrator sends you the + regression-sweep brief (after the full-solution `dotnet test`), the ACL + extends to the listed sibling `*ExtensionsTestFixture.cs` files for that + dispatch only. + If a tool call would violate the current ACL, abort and message + `{{ORCHESTRATOR_NAME}}`. +3. **Iterate `{{BATCH_FILES}}` sequentially.** For each entry — which carries + its own `{{TARGET_INTERFACE}}`, `{{TARGET_METACLASS_NAME}}`, + `{{TEST_FILE}}`, `{{NOTES_FILE}}`, and `{{METHOD_LIST}}` — read that + entry's `{{NOTES_FILE}}` (the "Test plan" sections) and rewrite its + `{{TEST_FILE}}` per the single-file workflow above. +4. **Parallel-mode caveat is unchanged**: while you are spawned in parallel + with the implementer, you run `dotnet build` on the test project ONLY and + MUST NOT run `dotnet test`. The orchestrator runs the consolidated targeted + test after both of you return. +5. **Final SendMessage** to `{{ORCHESTRATOR_NAME}}`: + `tests complete, files: N, + stub-blocker-pattern-applied-in: {Foo: [methods], …}, + weak-populated-case-in: {Foo: [methods, reason], …}`. +6. **Remain addressable** for two follow-up dispatches: + - **Wrong assertion fix**: if the orchestrator's targeted-test run blames + a failing assertion on the test code rather than the production OCL, + it will SendMessage you the `(file, method, observed-vs-expected)` + triple. Fix in place. + - **Regression sweep**: after Phase V the orchestrator sends you the list + of sibling fixtures whose `Throws.TypeOf()` + assertions now fail. Apply the **expand-don't-replace** edit pattern + (filter discrimination + predicate completeness + owned vs inherited + + null-projection guard) per sibling fixture. The ACL extension above + applies for this dispatch. ``` --- @@ -628,6 +796,27 @@ Produce a concise report to `{{ORCHESTRATOR_NAME}}` (via SendMessage) with: Do NOT modify any code yourself. The orchestrator (or the implementer/tester re-dispatched by the orchestrator) will action your findings. + +## Batch-mode operation + +When the orchestrator passes `{{BATCH_FILES}}` instead of a single +`(notes, production, tests)` triple, you are the SOLE reviewer for the whole +batch. Operate as follows: + +1. **Edit ACL = none.** You are READ-ONLY across the entire repo — same as + single-file mode. The size of `{{BATCH_FILES}}` does not loosen this. +2. **Iterate `{{BATCH_FILES}}` sequentially.** For each entry, walk its + `(notes file, production file, test file)` triple and apply the full + single-file OCL-translation checklist + test-fixture checklist above. +3. **Final SendMessage** to `{{ORCHESTRATOR_NAME}}`: + - Top line: `OK` or `NEEDS FIX` (batch-wide verdict; NEEDS FIX wins if any + file fails). + - Body: per-file findings grouped as + `{Foo: [{file, line, concern, suggested fix}, …], + Bar: [...], …}`. + For files with no findings, list `Foo: OK`. +4. The orchestrator routes individual findings to `implementer` or `tester` + via SendMessage; you do not edit and you do not re-run. ``` --- @@ -657,6 +846,24 @@ In a fresh conversation, when the user asks to implement methods in another For low-friction invocation, use the slash command at `.claude/commands/implement-extensions.md` which automates steps 1–7. +### Instantiation for a batch (one team for N files) + +When `/implement-extensions-batch` invokes this template, the per-role prompts +are augmented with the "Batch-mode operation" section already documented inside +each role prompt above: + +1. `TeamCreate({team_name: "batch-extensions-impl-"})` once. +2. Spawn FOUR named teammates (`researcher`, `implementer`, `tester`, + `reviewer`) via `Agent` calls, each with the role's full prompt (single-file + body + Batch-mode operation addendum) and `{{BATCH_FILES}}` expanded. +3. Sequence: researcher → (implementer ∥ tester) → orchestrator verification + → regression-sweep brief via `SendMessage to: "tester"` → reviewer. +4. Iterate fixes via `SendMessage` to the named teammate — do not spawn fresh + `Agent` calls inside the iteration loop. + +See `.claude/commands/implement-extensions-batch.md` for the end-to-end +orchestration. + ## Provenance v2 distilled from the `type-extensions-impl` task (33 methods in From 5591289e1bc9dafb1c21f87d65a32996339963f8 Mon Sep 17 00:00:00 2001 From: atheate Date: Thu, 28 May 2026 14:44:31 +0200 Subject: [PATCH 2/3] Fix #111 #116 #118 #119 #130 #132 --- .../ConjugationExtensionsTestFixture.cs | 43 ++++++++++++++----- .../CrossSubsettingExtensionsTestFixture.cs | 39 ++++++++++++----- .../DifferencingExtensionsTestFixture.cs | 43 ++++++++++++++----- .../Extend/DisjoiningExtensionsTestFixture.cs | 43 ++++++++++++++----- .../FeatureChainingExtensionsTestFixture.cs | 39 ++++++++++++----- .../FeatureInvertingExtensionsTestFixture.cs | 39 ++++++++++++----- SysML2.NET/Extend/ConjugationExtensions.cs | 5 ++- .../Extend/CrossSubsettingExtensions.cs | 5 ++- SysML2.NET/Extend/DifferencingExtensions.cs | 5 ++- SysML2.NET/Extend/DisjoiningExtensions.cs | 5 ++- .../Extend/FeatureChainingExtensions.cs | 5 ++- .../Extend/FeatureInvertingExtensions.cs | 5 ++- 12 files changed, 204 insertions(+), 72 deletions(-) diff --git a/SysML2.NET.Tests/Extend/ConjugationExtensionsTestFixture.cs b/SysML2.NET.Tests/Extend/ConjugationExtensionsTestFixture.cs index df1874de..70a61182 100644 --- a/SysML2.NET.Tests/Extend/ConjugationExtensionsTestFixture.cs +++ b/SysML2.NET.Tests/Extend/ConjugationExtensionsTestFixture.cs @@ -1,38 +1,61 @@ -// ------------------------------------------------------------------------------------------------- +// ------------------------------------------------------------------------------------------------- // -// +// // Copyright 2022-2026 Starion Group S.A. -// +// // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at -// +// // http://www.apache.org/licenses/LICENSE-2.0 -// +// // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. -// +// // // ------------------------------------------------------------------------------------------------ namespace SysML2.NET.Tests.Extend { using System; - + using NUnit.Framework; - + using SysML2.NET.Core.POCO.Core.Types; + using SysML2.NET.Core.POCO.Root.Elements; + using SysML2.NET.Core.POCO.Root.Namespaces; + using SysML2.NET.Extensions; + + using Type = SysML2.NET.Core.POCO.Core.Types.Type; [TestFixture] public class ConjugationExtensionsTestFixture { [Test] - public void ComputeOwningType_ThrowsNotSupportedException() + public void VerifyComputeOwningType() { - Assert.That(() => ((IConjugation)null).ComputeOwningType(), Throws.TypeOf()); + Assert.That(() => ((IConjugation)null).ComputeOwningType(), Throws.TypeOf()); + + var emptyConjugation = new Conjugation(); + + Assert.That(emptyConjugation.ComputeOwningType(), Is.Null); + + var owningType = new Type(); + var conjugation = new Conjugation(); + + owningType.AssignOwnership(conjugation); + + Assert.That(conjugation.ComputeOwningType(), Is.SameAs(owningType)); + + var nonTypeConjugation = new Conjugation(); + var nonTypeOwner = new Namespace(); + + ((IContainedRelationship)nonTypeConjugation).OwningRelatedElement = nonTypeOwner; + + Assert.That(nonTypeConjugation.ComputeOwningType(), Is.Null); } } } diff --git a/SysML2.NET.Tests/Extend/CrossSubsettingExtensionsTestFixture.cs b/SysML2.NET.Tests/Extend/CrossSubsettingExtensionsTestFixture.cs index 8c22229f..cfe15e65 100644 --- a/SysML2.NET.Tests/Extend/CrossSubsettingExtensionsTestFixture.cs +++ b/SysML2.NET.Tests/Extend/CrossSubsettingExtensionsTestFixture.cs @@ -1,38 +1,57 @@ -// ------------------------------------------------------------------------------------------------- +// ------------------------------------------------------------------------------------------------- // -// +// // Copyright 2022-2026 Starion Group S.A. -// +// // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at -// +// // http://www.apache.org/licenses/LICENSE-2.0 -// +// // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. -// +// // // ------------------------------------------------------------------------------------------------ namespace SysML2.NET.Tests.Extend { using System; - + using NUnit.Framework; - + using SysML2.NET.Core.POCO.Core.Features; + using SysML2.NET.Core.POCO.Root.Elements; + using SysML2.NET.Core.POCO.Root.Namespaces; [TestFixture] public class CrossSubsettingExtensionsTestFixture { [Test] - public void ComputeCrossingFeature_ThrowsNotSupportedException() + public void VerifyComputeCrossingFeature() { - Assert.That(() => ((ICrossSubsetting)null).ComputeCrossingFeature(), Throws.TypeOf()); + Assert.That(() => ((ICrossSubsetting)null).ComputeCrossingFeature(), Throws.TypeOf()); + + var crossSubsetting = new CrossSubsetting(); + + Assert.That(crossSubsetting.ComputeCrossingFeature(), Is.Null); + + var feature = new Feature(); + + ((IContainedRelationship)crossSubsetting).OwningRelatedElement = feature; + + Assert.That(crossSubsetting.ComputeCrossingFeature(), Is.SameAs(feature)); + + var nonFeatureOwner = new Namespace(); + var nonFeatureCrossSubsetting = new CrossSubsetting(); + + ((IContainedRelationship)nonFeatureCrossSubsetting).OwningRelatedElement = nonFeatureOwner; + + Assert.That(nonFeatureCrossSubsetting.ComputeCrossingFeature(), Is.Null); } } } diff --git a/SysML2.NET.Tests/Extend/DifferencingExtensionsTestFixture.cs b/SysML2.NET.Tests/Extend/DifferencingExtensionsTestFixture.cs index 7c5f31ef..89c8a331 100644 --- a/SysML2.NET.Tests/Extend/DifferencingExtensionsTestFixture.cs +++ b/SysML2.NET.Tests/Extend/DifferencingExtensionsTestFixture.cs @@ -1,38 +1,61 @@ -// ------------------------------------------------------------------------------------------------- +// ------------------------------------------------------------------------------------------------- // -// +// // Copyright 2022-2026 Starion Group S.A. -// +// // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at -// +// // http://www.apache.org/licenses/LICENSE-2.0 -// +// // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. -// +// // // ------------------------------------------------------------------------------------------------ namespace SysML2.NET.Tests.Extend { using System; - + using NUnit.Framework; - + using SysML2.NET.Core.POCO.Core.Types; + using SysML2.NET.Core.POCO.Root.Elements; + using SysML2.NET.Core.POCO.Root.Namespaces; + using SysML2.NET.Extensions; + + using Type = SysML2.NET.Core.POCO.Core.Types.Type; [TestFixture] public class DifferencingExtensionsTestFixture { [Test] - public void ComputeTypeDifferenced_ThrowsNotSupportedException() + public void VerifyComputeTypeDifferenced() { - Assert.That(() => ((IDifferencing)null).ComputeTypeDifferenced(), Throws.TypeOf()); + Assert.That(() => ((IDifferencing)null).ComputeTypeDifferenced(), Throws.TypeOf()); + + var emptyDifferencing = new Differencing(); + + Assert.That(emptyDifferencing.ComputeTypeDifferenced(), Is.Null); + + var owningType = new Type(); + var differencing = new Differencing(); + + owningType.AssignOwnership(differencing); + + Assert.That(differencing.ComputeTypeDifferenced(), Is.SameAs(owningType)); + + var nonTypeDifferencing = new Differencing(); + var nonTypeOwner = new Namespace(); + + ((IContainedRelationship)nonTypeDifferencing).OwningRelatedElement = nonTypeOwner; + + Assert.That(nonTypeDifferencing.ComputeTypeDifferenced(), Is.Null); } } } diff --git a/SysML2.NET.Tests/Extend/DisjoiningExtensionsTestFixture.cs b/SysML2.NET.Tests/Extend/DisjoiningExtensionsTestFixture.cs index 15969bde..53b971be 100644 --- a/SysML2.NET.Tests/Extend/DisjoiningExtensionsTestFixture.cs +++ b/SysML2.NET.Tests/Extend/DisjoiningExtensionsTestFixture.cs @@ -1,38 +1,61 @@ -// ------------------------------------------------------------------------------------------------- +// ------------------------------------------------------------------------------------------------- // -// +// // Copyright 2022-2026 Starion Group S.A. -// +// // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at -// +// // http://www.apache.org/licenses/LICENSE-2.0 -// +// // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. -// +// // // ------------------------------------------------------------------------------------------------ namespace SysML2.NET.Tests.Extend { using System; - + using NUnit.Framework; - + using SysML2.NET.Core.POCO.Core.Types; + using SysML2.NET.Core.POCO.Root.Elements; + using SysML2.NET.Core.POCO.Root.Namespaces; + using SysML2.NET.Extensions; + + using Type = SysML2.NET.Core.POCO.Core.Types.Type; [TestFixture] public class DisjoiningExtensionsTestFixture { [Test] - public void ComputeOwningType_ThrowsNotSupportedException() + public void VerifyComputeOwningType() { - Assert.That(() => ((IDisjoining)null).ComputeOwningType(), Throws.TypeOf()); + Assert.That(() => ((IDisjoining)null).ComputeOwningType(), Throws.TypeOf()); + + var emptyDisjoining = new Disjoining(); + + Assert.That(emptyDisjoining.ComputeOwningType(), Is.Null); + + var owningType = new Type(); + var disjoining = new Disjoining(); + + owningType.AssignOwnership(disjoining); + + Assert.That(disjoining.ComputeOwningType(), Is.SameAs(owningType)); + + var nonTypeDisjoining = new Disjoining(); + var nonTypeOwner = new Namespace(); + + ((IContainedRelationship)nonTypeDisjoining).OwningRelatedElement = nonTypeOwner; + + Assert.That(nonTypeDisjoining.ComputeOwningType(), Is.Null); } } } diff --git a/SysML2.NET.Tests/Extend/FeatureChainingExtensionsTestFixture.cs b/SysML2.NET.Tests/Extend/FeatureChainingExtensionsTestFixture.cs index ac46c8a0..da462958 100644 --- a/SysML2.NET.Tests/Extend/FeatureChainingExtensionsTestFixture.cs +++ b/SysML2.NET.Tests/Extend/FeatureChainingExtensionsTestFixture.cs @@ -1,38 +1,57 @@ -// ------------------------------------------------------------------------------------------------- +// ------------------------------------------------------------------------------------------------- // -// +// // Copyright 2022-2026 Starion Group S.A. -// +// // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at -// +// // http://www.apache.org/licenses/LICENSE-2.0 -// +// // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. -// +// // // ------------------------------------------------------------------------------------------------ namespace SysML2.NET.Tests.Extend { using System; - + using NUnit.Framework; - + using SysML2.NET.Core.POCO.Core.Features; + using SysML2.NET.Core.POCO.Root.Elements; + using SysML2.NET.Core.POCO.Root.Namespaces; [TestFixture] public class FeatureChainingExtensionsTestFixture { [Test] - public void ComputeFeatureChained_ThrowsNotSupportedException() + public void VerifyComputeFeatureChained() { - Assert.That(() => ((IFeatureChaining)null).ComputeFeatureChained(), Throws.TypeOf()); + Assert.That(() => ((IFeatureChaining)null).ComputeFeatureChained(), Throws.TypeOf()); + + var featureChaining = new FeatureChaining(); + + Assert.That(featureChaining.ComputeFeatureChained(), Is.Null); + + var feature = new Feature(); + + ((IContainedRelationship)featureChaining).OwningRelatedElement = feature; + + Assert.That(featureChaining.ComputeFeatureChained(), Is.SameAs(feature)); + + var nonFeatureOwner = new Namespace(); + var nonFeatureChaining = new FeatureChaining(); + + ((IContainedRelationship)nonFeatureChaining).OwningRelatedElement = nonFeatureOwner; + + Assert.That(nonFeatureChaining.ComputeFeatureChained(), Is.Null); } } } diff --git a/SysML2.NET.Tests/Extend/FeatureInvertingExtensionsTestFixture.cs b/SysML2.NET.Tests/Extend/FeatureInvertingExtensionsTestFixture.cs index 655ce32d..cdedbea6 100644 --- a/SysML2.NET.Tests/Extend/FeatureInvertingExtensionsTestFixture.cs +++ b/SysML2.NET.Tests/Extend/FeatureInvertingExtensionsTestFixture.cs @@ -1,38 +1,57 @@ -// ------------------------------------------------------------------------------------------------- +// ------------------------------------------------------------------------------------------------- // -// +// // Copyright 2022-2026 Starion Group S.A. -// +// // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at -// +// // http://www.apache.org/licenses/LICENSE-2.0 -// +// // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. -// +// // // ------------------------------------------------------------------------------------------------ namespace SysML2.NET.Tests.Extend { using System; - + using NUnit.Framework; - + using SysML2.NET.Core.POCO.Core.Features; + using SysML2.NET.Core.POCO.Root.Elements; + using SysML2.NET.Core.POCO.Root.Namespaces; [TestFixture] public class FeatureInvertingExtensionsTestFixture { [Test] - public void ComputeOwningFeature_ThrowsNotSupportedException() + public void VerifyComputeOwningFeature() { - Assert.That(() => ((IFeatureInverting)null).ComputeOwningFeature(), Throws.TypeOf()); + Assert.That(() => ((IFeatureInverting)null).ComputeOwningFeature(), Throws.TypeOf()); + + var featureInverting = new FeatureInverting(); + + Assert.That(featureInverting.ComputeOwningFeature(), Is.Null); + + var feature = new Feature(); + + ((IContainedRelationship)featureInverting).OwningRelatedElement = feature; + + Assert.That(featureInverting.ComputeOwningFeature(), Is.SameAs(feature)); + + var nonFeatureOwner = new Namespace(); + var nonFeatureInverting = new FeatureInverting(); + + ((IContainedRelationship)nonFeatureInverting).OwningRelatedElement = nonFeatureOwner; + + Assert.That(nonFeatureInverting.ComputeOwningFeature(), Is.Null); } } } diff --git a/SysML2.NET/Extend/ConjugationExtensions.cs b/SysML2.NET/Extend/ConjugationExtensions.cs index c4d9e121..cdae72ce 100644 --- a/SysML2.NET/Extend/ConjugationExtensions.cs +++ b/SysML2.NET/Extend/ConjugationExtensions.cs @@ -42,10 +42,11 @@ internal static class ConjugationExtensions /// /// the computed result /// - [System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverage] internal static IType ComputeOwningType(this IConjugation conjugationSubject) { - throw new NotSupportedException("Create a GitHub issue when this method is required"); + return conjugationSubject == null + ? throw new ArgumentNullException(nameof(conjugationSubject)) + : conjugationSubject.OwningRelatedElement as IType; } } diff --git a/SysML2.NET/Extend/CrossSubsettingExtensions.cs b/SysML2.NET/Extend/CrossSubsettingExtensions.cs index 0f0b1c12..5c101ba3 100644 --- a/SysML2.NET/Extend/CrossSubsettingExtensions.cs +++ b/SysML2.NET/Extend/CrossSubsettingExtensions.cs @@ -43,10 +43,11 @@ internal static class CrossSubsettingExtensions /// /// the computed result /// - [System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverage] internal static IFeature ComputeCrossingFeature(this ICrossSubsetting crossSubsettingSubject) { - throw new NotSupportedException("Create a GitHub issue when this method is required"); + return crossSubsettingSubject == null + ? throw new ArgumentNullException(nameof(crossSubsettingSubject)) + : crossSubsettingSubject.OwningRelatedElement as IFeature; } } diff --git a/SysML2.NET/Extend/DifferencingExtensions.cs b/SysML2.NET/Extend/DifferencingExtensions.cs index ce196ae0..c27820c4 100644 --- a/SysML2.NET/Extend/DifferencingExtensions.cs +++ b/SysML2.NET/Extend/DifferencingExtensions.cs @@ -42,10 +42,11 @@ internal static class DifferencingExtensions /// /// the computed result /// - [System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverage] internal static IType ComputeTypeDifferenced(this IDifferencing differencingSubject) { - throw new NotSupportedException("Create a GitHub issue when this method is required"); + return differencingSubject == null + ? throw new ArgumentNullException(nameof(differencingSubject)) + : differencingSubject.OwningRelatedElement as IType; } } diff --git a/SysML2.NET/Extend/DisjoiningExtensions.cs b/SysML2.NET/Extend/DisjoiningExtensions.cs index 561d40d6..bbd9b2de 100644 --- a/SysML2.NET/Extend/DisjoiningExtensions.cs +++ b/SysML2.NET/Extend/DisjoiningExtensions.cs @@ -42,10 +42,11 @@ internal static class DisjoiningExtensions /// /// the computed result /// - [System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverage] internal static IType ComputeOwningType(this IDisjoining disjoiningSubject) { - throw new NotSupportedException("Create a GitHub issue when this method is required"); + return disjoiningSubject == null + ? throw new ArgumentNullException(nameof(disjoiningSubject)) + : disjoiningSubject.OwningRelatedElement as IType; } } diff --git a/SysML2.NET/Extend/FeatureChainingExtensions.cs b/SysML2.NET/Extend/FeatureChainingExtensions.cs index c5ea648e..6ca2adf2 100644 --- a/SysML2.NET/Extend/FeatureChainingExtensions.cs +++ b/SysML2.NET/Extend/FeatureChainingExtensions.cs @@ -42,10 +42,11 @@ internal static class FeatureChainingExtensions /// /// the computed result /// - [System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverage] internal static IFeature ComputeFeatureChained(this IFeatureChaining featureChainingSubject) { - throw new NotSupportedException("Create a GitHub issue when this method is required"); + return featureChainingSubject == null + ? throw new ArgumentNullException(nameof(featureChainingSubject)) + : featureChainingSubject.OwningRelatedElement as IFeature; } } diff --git a/SysML2.NET/Extend/FeatureInvertingExtensions.cs b/SysML2.NET/Extend/FeatureInvertingExtensions.cs index 6d40ccdb..4fba099a 100644 --- a/SysML2.NET/Extend/FeatureInvertingExtensions.cs +++ b/SysML2.NET/Extend/FeatureInvertingExtensions.cs @@ -42,10 +42,11 @@ internal static class FeatureInvertingExtensions /// /// the computed result /// - [System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverage] internal static IFeature ComputeOwningFeature(this IFeatureInverting featureInvertingSubject) { - throw new NotSupportedException("Create a GitHub issue when this method is required"); + return featureInvertingSubject == null + ? throw new ArgumentNullException(nameof(featureInvertingSubject)) + : featureInvertingSubject.OwningRelatedElement as IFeature; } } From ad878ae8e58cc0f63847746d04f928b3f5af8841 Mon Sep 17 00:00:00 2001 From: atheate Date: Thu, 28 May 2026 15:22:24 +0200 Subject: [PATCH 3/3] update flow on command --- .../commands/implement-extensions-batch.md | 137 +++++++++--------- .claude/commands/implement-extensions.md | 118 +++++++-------- .claude/team-templates/extension-impl.md | 70 +++++---- 3 files changed, 160 insertions(+), 165 deletions(-) diff --git a/.claude/commands/implement-extensions-batch.md b/.claude/commands/implement-extensions-batch.md index f7057d88..bfb0b851 100644 --- a/.claude/commands/implement-extensions-batch.md +++ b/.claude/commands/implement-extensions-batch.md @@ -60,39 +60,35 @@ stub-blocker test pattern (see template) when an in-scope test would otherwise need to traverse a still-stubbed upstream method that is NOT part of the current batch. -## Pre-flight: detect orchestrator plan mode - -If the orchestrator session is itself in **plan mode** at the moment -`/implement-extensions-batch` runs, the four named teammates (researcher, -implementer, tester, reviewer) inherit that state. The Agent tool's -`mode: "acceptEdits"` parameter does NOT override the inherited plan-mode -state on the current Claude Code build — sub-agents will respect the -`` that declares plan mode and refuse to apply any edits, -even though their prompts tell them to. - -Symptom: each named teammate reports "ready to execute on exit from plan -mode" and writes its work to its own per-agent plan file at -`C:\Users\\.claude\plans\-agent-.md` instead of to the -target files. - -Before spawning the team, check whether plan mode is active in the -orchestrator session. If it is: - -1. **Stop** and surface the situation to the user with `AskUserQuestion`. Two - options: - - **Exit plan mode first** (user toggles their harness off plan mode, then - re-invokes the command). Cleanest. - - **Proceed in degraded mode**: spawn the researcher as normal (it only - needs to write to `.team-notes/`, which the orchestrator can split out - of its per-agent plan file if blocked). For Phase IT (implementer + - tester) and Phase RV (reviewer), the orchestrator applies the - production / test edits itself, reading each agent's plan file to - extract the verbatim per-file code blocks. Reviewer can still run - read-only. - -2. If the user picks degraded mode, set an internal `PLAN_MODE_DEGRADED=true` - flag for the run and follow the per-phase divergences in the **Notes for - the orchestrator** block below. +## Pre-flight: plan mode IS the pre-execution approval gate (Gate 0) + +If the orchestrator session is in **plan mode** when `/implement-extensions-batch` is invoked, use plan mode as the natural pre-execution approval gate. Do NOT spawn any sub-agent, do NOT create any branch, do NOT assign any issue. Instead: + +1. **Stay in plan mode.** Do all the READ-ONLY pre-flight work that steps 1 → 4 of this command require: + - Validate every input file exists; refuse missing files. + - For each file, look up the GitHub issue via `gh issue list … --search " in:body"`. Surface ambiguities (0 or >1 match) via `AskUserQuestion`. + - Enumerate stub `Compute*` methods per file (`Grep "throw new NotSupportedException"`); drop files with zero stubs. + - Grade complexity per file (step 3.5 rubric); roll up per-role models (step 3.5 rollup logic). + - Read `git status --porcelain` (refuse on dirty tree). + - `git fetch origin development` (read-only). + - Run `git ls-remote --exit-code origin ` and `git branch --list ` to check the branch isn't already taken. If it is, surface via `AskUserQuestion` (different branch / abort). + +2. **Write the plan file** (at the path Claude Code provides in the plan-mode system reminder) with the standard structure (Context, Recommended approach, Critical files, Verification). The "Recommended approach" section is a concise description of the proposed execution: + - Batch composition (files + GitHub issue numbers + per-file complexity grade + stub counts). + - Proposed branch name + base ref + remote-tracking intent. + - Team name (`TeamCreate({team_name: "..."})`). + - Per-role model picks (rolled up from step 3.5). Note that `ExitPlanMode` is binary (approve / reject), so the user cannot override models from inside the plan-approval UI. Two override paths are supported: (a) reject the plan and re-invoke with an explicit model preference in the prompt, OR (b) accept the plan and the orchestrator fires a follow-up `AskUserQuestion` (Yes / "all Sonnet" / "all Opus" / Custom) BEFORE step 6's branch creation. + - Workflow shape: Phase R → Phase R-A (Gate R-A approval) → Phase IT → V → S → RV → IS. + - Explicit mention that Phase R-A will pause for `AskUserQuestion` after the researcher returns its `spec ready` (so the user knows there is exactly ONE more structural checkpoint after this one). + +3. **Call `ExitPlanMode`.** The user reviews the plan in the standard plan-approval UI: + - **Approve (unchanged)** → plan mode exits, orchestrator proceeds (model-override `AskUserQuestion` per step 2 above, then step 6 → 7 → 8 → Phase R → Phase R-A → Phase IT → V → S → RV → IS). + - **Approve (edited)** → the tool result includes a `## Approved Plan (edited by user):` block with the user's modifications. The orchestrator MUST treat the EDITED plan content as the source of truth — re-parse composition / branch / models from it, NOT from the originally-proposed plan. If the user dropped a file by removing it from the composition list, the orchestrator continues with the reduced set. If the user changed the branch name or model picks inline, the orchestrator honors that. (Standard Claude Code plan-edit-on-approval pattern; see the `` after every `ExitPlanMode` approval.) + - **Reject** → nothing executes. No branch, no team, no issue assignments. The plan file remains for the user's reference. + +If the orchestrator session is **NOT in plan mode** when the command is invoked, continue with the existing step 5 sanity-check (`AskUserQuestion` for composition + models + branch name). Gate 0 only fires in plan mode; otherwise the existing step 5 plays the same role. + +**Important — tool-level prompts are NOT auto-allowed.** Gate 0 (and Gate R-A in step 9.5) govern STRUCTURAL approval only. Individual tool calls — `Bash(dotnet build *)`, `Bash(git push *)`, `TeamCreate`, `Agent(...)`, etc. — continue to surface per the user's `settings.json` and harness defaults. The user has chosen to keep handling those prompts manually; the gates do not bypass them. ## Workflow @@ -271,14 +267,34 @@ ONE `Agent(...)` call: expanded to the numbered list of (interface, paths, method-list) tuples, one per file in the batch. -Wait for the researcher's `spec ready` SendMessage. Then **read each -`.team-notes/-extensions-spec.md`** yourself to verify coverage, -spec-text-only flags, stub-blocker flags. Surface ambiguities to the user -before continuing. +Wait for the researcher's `spec ready` SendMessage. Then **read each `.team-notes/-extensions-spec.md`** yourself to verify coverage, spec-text-only flags, and stub-blocker flags. Proceed to step 9.5 (Phase R-A approval gate) — DO NOT spawn implementer + tester directly. + +The researcher agent **stays addressable** for the rest of the run — the implementer / tester / reviewer may need a clarification on one file's OCL later, which the orchestrator can route via `SendMessage to: "researcher"`. The researcher is also the recipient of step 9.5's "Abort — research again with feedback" branch. + +### 9.5. Phase R-A — Researcher-plan approval gate (MANDATORY, every run) + +Before spawning implementer + tester, the orchestrator MUST render an inline spec preview and ask for explicit approval. This is non-skippable, even when the researcher reports zero ambiguities / zero spec-text-only flags / zero stub-blockers. It is the only chance the user gets to inspect the per-method derivation plan before any code is written to disk. + +1. **Render** an inline preview in the chat response, one block per file. For each file, pull from its `.team-notes/-extensions-spec.md`: + - File path + GitHub issue number (link to issue if practical). + - Per method: + - Signature line (`internal static Compute(this I )`). + - Derivation source tag: `OCL in XMI`, `OCL in `, or `spec-text only`. + - The suggested C# code block from the notes — single fenced block, cap ≤ 8 lines. + - Dependencies summary (sibling derived properties used; upstream stubs hit). + - Stub-blocker flag (if any) — signals that the populated case will need the stub-blocker test pattern. + - Use a compact Markdown format. Cap total preview at ~80 lines per file to stay scannable. If a notes file's suggested code is unusually long, link the notes-file path and quote only the first ~5 lines. + +2. **Ask** via `AskUserQuestion` (single question, 3 options): + - **Approve — spawn implementer + tester now** *(Recommended)*. Proceeds to step 10. + - **Drop specific files from the batch**. User picks which file(s) to exclude; orchestrator regenerates the preview against the remaining files and re-asks. Drop-and-continue does NOT recreate the branch (already pushed) and does NOT unassign the dropped issues (idempotent enough; surface this in the final summary). + - **Abort — research again with feedback**. Orchestrator forwards the user's free-form `Other` text to the still-addressable `researcher` via `SendMessage`, waits for a fresh `spec ready`, then re-runs step 9.5. + +3. **On Approve**, proceed to step 10. Do NOT re-ask until the run completes. + +4. **On Abort (user rejects without requesting re-research)**, stop the orchestration. Branch + issue assignments remain. Notes files remain. The user can revert manually via git if desired. -The researcher agent **stays addressable** for the rest of the run — the -implementer / tester / reviewer may need a clarification on one file's OCL -later, which the orchestrator can route via `SendMessage to: "researcher"`. +Rationale: the previous "surface ambiguities" instruction was conditional and got skipped on clean researcher returns. This gate runs *every time*, so the user always sees the researcher's contract before any code is committed to disk. The gate governs STRUCTURAL approval only — individual tool-permission prompts that the user's `settings.json` requires (e.g. `Bash(dotnet build *)`) continue to surface during Phase IT / V / S / RV / IS. ### 10. Phase IT — Spawn the single implementer + single tester in parallel @@ -443,15 +459,19 @@ Print to the user: | Sibling test failure in regression sweep | Step 12 | `SendMessage to: "tester"` with the consolidated regression brief (all touched sibling fixtures in one dispatch). Tester's ACL extends to those fixtures for this dispatch only. | | Reviewer NEEDS FIX | Step 13 | `SendMessage to: "implementer"` or `to: "tester"` with the per-file finding; then `SendMessage to: "reviewer"` to re-verify. Same agents throughout. | | Named teammate becomes unresponsive (timed-out SendMessage) | Phases IT / V / S / RV | Fall back to a fresh `Agent(...)` spawn for that role only, replaying the spawn prompt PLUS the most recent context the orchestrator has (notes file paths, prior deviation reports). The team-name stays alive for the other roles. | +| Sub-agent deadlocked on harness UI permission prompt the user cannot action | Phases IT / V / S / RV | Orchestrator can take over the deadlocked sub-agent's remaining work directly (build / test / gh issue edit run from the orchestrator's own permission scope). Send `shutdown_request` to the parked sub-agent. Surface to the user clearly in the final summary. (Precedent: 2026-05-28 batch run for issues #111/#116/#118/#119/#130/#132 — implementer + tester parked on `dotnet build` prompts; orchestrator completed Phases V → S → RV → IS itself.) | | One file's implementation fails after branch + assignment | Any step ≥ 6 | Keep the branch; surface in final summary; user decides whether to retry via `/implement-extensions` for that single file or revert. | -| Sub-agent inherits orchestrator plan mode and refuses to edit | Phases R / IT / RV | Surface to user via `AskUserQuestion`. Either exit plan mode and retry, or proceed in degraded mode (orchestrator splits the researcher's per-agent plan file into `.team-notes/-extensions-spec.md` per file, applies the implementer + tester per-file code blocks from their plan files, runs reviewer as read-only). | -| Agent's `mode: "acceptEdits"` parameter does not override inherited plan mode | Phases IT / RV | Known limitation of this Claude Code build. The orchestrator must apply the edits itself in degraded mode (see "Pre-flight: detect orchestrator plan mode"). | +| User rejects plan at Gate 0 (Pre-flight, plan-mode invocation) | Pre-flight | Stop. No branch, no team, no issue assignments. Plan file remains for reference. | +| User picks "Abort — research again with feedback" at Gate R-A (Phase R-A) | Step 9.5 | Forward the user's free-form text to the still-addressable `researcher` via `SendMessage`. Wait for a fresh `spec ready`. Re-run step 9.5. | +| User picks "Abort" outright at Gate R-A | Step 9.5 | Stop orchestration. Branch + issue assignments remain (user reverts via git / `gh issue edit --remove-assignee` if desired). Researcher's `.team-notes/*.md` files remain for later reference. | +| User picks "Drop specific files" at Gate R-A | Step 9.5 | Recompute the batch composition against the reduced set. Regenerate the inline preview. Re-ask via `AskUserQuestion`. Branch name + issue assignments are NOT recomputed (already pushed / assigned); surface the mismatch in the final summary. | ## Parallelism caps (orchestrator self-enforced) - N ≤ 6 files per batch (unchanged — bounds the per-agent context size, not the live-agent count). - Phase R: **1** agent (the researcher). +- **Phase R-A (step 9.5): no agents in flight.** The user-facing AskUserQuestion gate blocks all sub-agent spawning until the user approves. The researcher is idle but addressable for re-research dispatches. - Phase IT: **2** agents in parallel (implementer ∥ tester), via one orchestrator message containing two `Agent(...)` calls. - Phase RV: **1** agent (the reviewer). @@ -484,32 +504,7 @@ Print to the user: for a single-file fix throws that context away and re-pays the prompt cost. The only reason to spawn fresh is the "named teammate becomes unresponsive" row in the failure-handling table. -- **Plan-mode degraded mode** (`PLAN_MODE_DEGRADED=true`): - - **Phase R**: the single researcher writes its multi-file spec to ONE - per-agent plan file under - `C:\Users\\.claude\plans\-agent-.md` instead of to - the N `.team-notes/` files. The orchestrator reads that plan file, - splits it per `## ` section, and writes each - `.team-notes/-extensions-spec.md`. Verify each split section matches - the schema in `.claude/team-templates/extension-impl.md`. - - **Phase IT**: the single implementer writes verbatim production code for - all N files to its per-agent plan file (with `## ` section markers - + code fences). Likewise the single tester for all N test fixtures. The - orchestrator reads each plan file, extracts the per-file code blocks, - and applies the edits itself via `Edit` / `Write`. Build + targeted - tests still run in Phase V. Do NOT mark Phase IT complete on a "ready - to execute" message alone — only after the orchestrator has applied - each file's diffs and the build is green. - - **Phase S (regression sweep)**: same as Phase IT — the still-running - tester writes its per-fixture expanded tests to per-agent-plan - appendices; orchestrator splits and applies. - - **Phase RV**: reviewer is read-only, so plan mode does not block it. - No degradation needed. - - **Sanity check**: in degraded mode, the orchestrator does roughly 2× the - work it would in normal mode (it now applies the edits the sub-agents - would otherwise apply themselves). Budget for it — do not silently fall - behind on Phase V verification just because Phase IT cost more turns. -- The Agent tool's `mode` parameter cannot reliably escape inherited plan mode - on this Claude Code build. The orchestrator MUST detect plan mode at - pre-flight and pick the degraded-mode branch deliberately rather than - assuming `mode: "acceptEdits"` will work. +- **Plan mode is handled by Gate 0 at the top of this file** — the orchestrator writes the proposed-execution plan to the plan file, calls `ExitPlanMode`, and proceeds on approval. The orchestrator never spawns sub-agents while plan mode is active, so the previous "degraded mode" workaround is no longer needed and has been removed. If for some reason plan mode re-engages mid-run (e.g. the user toggles it back on after Gate 0 approval), the orchestrator pauses any pending sub-agent spawns and surfaces the state to the user. +- **Gate R-A (step 9.5) is mandatory every run.** It is the only structural checkpoint between the researcher returning `spec ready` and the implementer + tester being spawned. Do not skip it even when the researcher reports zero ambiguities — the user explicitly asked for an unconditional gate so they can review per-method derivations before code is written. +- **Tool-level prompts (`Bash(...)`, `Edit(...)`, `Write(...)`, `TeamCreate`, `Agent(...)`) still surface per the user's `settings.json`.** Gate 0 and Gate R-A govern STRUCTURAL approval only. The user has explicitly chosen to keep handling tool-level prompts manually rather than auto-allowing them via permission rules or hooks. +- **Sub-agent-side deadlocks**: if an implementer / tester gets parked on a harness UI permission prompt the user cannot action (precedent: 2026-05-28 batch run for issues #111/#116/#118/#119/#130/#132), the orchestrator should take over the deadlocked work directly from its own permission scope (running `dotnet build` / `dotnet test` / `gh issue edit` itself). Send `shutdown_request` to the parked sub-agent. Surface clearly in the final summary. diff --git a/.claude/commands/implement-extensions.md b/.claude/commands/implement-extensions.md index 78cad4a5..39deb76c 100644 --- a/.claude/commands/implement-extensions.md +++ b/.claude/commands/implement-extensions.md @@ -45,38 +45,33 @@ This is the user-memory `feedback_scope_discipline.md` rule. Even when an adjace stub blocks dependent test coverage, surface the blocker; do not silently expand scope. Use the stub-blocker test pattern (see template). -## Pre-flight: detect orchestrator plan mode - -If the orchestrator session is itself in **plan mode** at the moment -`/implement-extensions` runs, spawned sub-agents (researcher, implementer, tester, -reviewer) inherit that state. The Agent tool's `mode: "acceptEdits"` parameter -does NOT override the inherited plan-mode state on the current Claude Code build -— sub-agents will respect the `` that declares plan mode and -refuse to apply any edits, even though their prompts tell them to. - -Symptom: every sub-agent reports "ready to execute on exit from plan mode" and -writes its work to its own per-agent plan file at -`C:\Users\\.claude\plans\-agent-.md` instead of to the -target file. - -Before spawning any sub-agent, check whether plan mode is active in the -orchestrator session. If it is: - -1. **Stop** and surface the situation to the user with `AskUserQuestion`. Two - options: - - **Exit plan mode first** (user toggles their harness off plan mode, then - re-invokes the command). Cleanest. - - **Proceed in degraded mode**: spawn the researcher as normal (it only - needs to write to `.team-notes/`, which the orchestrator can copy into - place from the agent's plan file if blocked). For the step-6 implementer - + tester pair and the step-9 reviewer, the orchestrator applies the - production / test edits itself, reading each sub-agent's plan file to - extract the verbatim code. The reviewer is already read-only and runs - unaffected. - -2. If the user picks degraded mode, set an internal `PLAN_MODE_DEGRADED=true` - flag for the run and follow the per-step divergences in the **Notes for - the orchestrator** block at the bottom of this file. +## Pre-flight: plan mode IS the pre-execution approval gate (Gate 0) + +If the orchestrator session is in **plan mode** when `/implement-extensions` is invoked, use plan mode as the natural pre-execution approval gate. Do NOT spawn any sub-agent. Instead: + +1. **Stay in plan mode.** Do all the READ-ONLY pre-flight work that steps 1 → 3.5 of this command require: + - Validate the input file exists (per step 1's normalization rule); refuse if missing. + - Auto-derive the per-file paths (per step 2): production, test fixture, target interface, subject param, notes file, team name. + - Look up the GitHub issue via `gh issue list --search " in:body"`. Surface ambiguity (0 or >1 match) via `AskUserQuestion`. + - Enumerate stub `Compute*` methods (per step 3); abort if 0 stubs. + - Grade complexity, pick per-role models (per step 3.5). + - Read `git status --porcelain` (read-only; refuse on dirty tree if you intend the run to be clean). + +2. **Write the plan file** (at the path Claude Code provides in the plan-mode system reminder) with the standard structure (Context, Recommended approach, Critical files, Verification). The "Recommended approach" section is a concise description of the proposed execution: + - The one production file + its derived test fixture + notes file + GitHub issue number. + - Per-role model picks (from step 3.5). Note that `ExitPlanMode` is binary (approve / reject), so the user cannot override models from inside the plan-approval UI. Two override paths are supported: (a) reject the plan and re-invoke with an explicit model preference in the prompt, OR (b) accept the plan and the orchestrator fires a follow-up `AskUserQuestion` BEFORE step 5's researcher spawn. + - Stub method count + complexity grade. + - Workflow shape: step 5 (researcher) → step 5.5 (Gate R-A approval) → step 6 (implementer + tester) → step 7 (verification) → step 8 (regression sweep) → step 9 (reviewer) → step 10 (final summary) → step 11 (issue checklist sync). + - Explicit mention that step 5.5 will pause for `AskUserQuestion` after the researcher returns its `spec ready` (so the user knows there is exactly ONE more structural checkpoint after this one). + +3. **Call `ExitPlanMode`.** The user reviews the plan in the standard plan-approval UI: + - **Approve (unchanged)** → plan mode exits, orchestrator proceeds (model-override `AskUserQuestion` per step 2 above, then step 5 → 5.5 → 6 → 7 → 8 → 9 → 10 → 11). + - **Approve (edited)** → the tool result includes a `## Approved Plan (edited by user):` block with the user's modifications. The orchestrator MUST treat the EDITED plan content as the source of truth — re-parse the (single) file path + model picks from it, NOT from the originally-proposed plan. (Standard Claude Code plan-edit-on-approval pattern; see the `` after every `ExitPlanMode` approval.) + - **Reject** → nothing executes. The plan file remains for the user's reference. + +If the orchestrator session is **NOT in plan mode** when the command is invoked, continue with the existing step 4 sanity-check (`AskUserQuestion` for scope + models). Gate 0 only fires in plan mode; otherwise the existing step 4 plays the same role. + +**Important — tool-level prompts are NOT auto-allowed.** Gate 0 (and Gate R-A in step 5.5) govern STRUCTURAL approval only. Individual tool calls — `Bash(dotnet build *)`, `Bash(git push *)`, `Agent(...)`, etc. — continue to surface per the user's `settings.json` and harness defaults. The user has chosen to keep handling those prompts manually; the gates do not bypass them. ## Workflow @@ -220,8 +215,31 @@ The researcher MUST: - Flag any method whose OCL transitively reads a still-stubbed sibling `Compute*` so the tester knows to use the stub-blocker pattern. -After the researcher finishes, read `{{NOTES_FILE}}` yourself to verify it -covers all methods + flags spec-text-only and stub-blocker cases. +After the researcher finishes, read `{{NOTES_FILE}}` yourself to verify it covers all methods + flags spec-text-only and stub-blocker cases. Proceed to step 5.5 (Phase R-A approval gate) — DO NOT spawn implementer + tester directly. + +### 5.5. Phase R-A — Researcher-plan approval gate (MANDATORY, every run) + +Before spawning implementer + tester, the orchestrator MUST render an inline spec preview and ask for explicit approval. This is non-skippable, even when the researcher reports zero ambiguities. It is the only chance the user gets to inspect the per-method derivation plan before any code is written to disk. + +1. **Render** an inline preview in the chat response. Pull from `{{NOTES_FILE}}`: + - File path + GitHub issue number (link to issue if practical). + - Per method: + - Signature line (`internal static Compute(this {{TARGET_INTERFACE}} {{SUBJECT_PARAM}})`). + - Derivation source tag: `OCL in XMI`, `OCL in `, or `spec-text only`. + - The suggested C# code block from the notes — single fenced block, cap ≤ 8 lines. + - Dependencies summary (sibling derived properties used; upstream stubs hit). + - Stub-blocker flag (if any) — signals that the populated case will need the stub-blocker test pattern. + - Use a compact Markdown format. Cap total preview at ~80 lines. + +2. **Ask** via `AskUserQuestion` (single question, 2 options — "drop files" collapses out in single-file mode): + - **Approve — spawn implementer + tester now** *(Recommended)*. Proceeds to step 6. + - **Abort — research again with feedback**. Orchestrator forwards the user's free-form `Other` text to the still-addressable `researcher` via `SendMessage`, waits for a fresh `spec ready`, then re-runs step 5.5. + +3. **On Approve**, proceed to step 6. Do NOT re-ask until the run completes. + +4. **On Abort (user rejects without requesting re-research)**, stop the orchestration. The notes file remains for the user's reference. + +Rationale: the previous "After the researcher finishes, read `{{NOTES_FILE}}` yourself to verify" instruction was an internal check by the orchestrator, with no user-visible gate. This gate runs *every time*, so the user always sees the researcher's contract before any code is committed to disk. The gate governs STRUCTURAL approval only — individual tool-permission prompts that the user's `settings.json` requires (e.g. `Bash(dotnet build *)`) continue to surface during steps 6 / 7 / 8 / 9 / 11. ### 6. Spawn the implementer and tester in parallel @@ -454,31 +472,7 @@ unresolved findings are separately surfaced in the final-summary report. The implementation state of the file; unresolved findings are separately surfaced in the final-summary report. The `gh issue edit` push must touch ONLY the `### Checklist` section — verify with a re-fetch + diff before reporting "done". -- **Plan-mode degraded mode** (`PLAN_MODE_DEGRADED=true` — set in the pre-flight - step at the top of this file): - - **Step 5 (researcher)**: the researcher will write its spec to a per-agent - plan file under `C:\Users\\.claude\plans\-agent-.md` - instead of `.team-notes/-extensions-spec.md`. After it returns, copy - the agent plan file into `.team-notes/-extensions-spec.md` and verify - the content matches the schema in `.claude/team-templates/extension-impl.md`. - - **Step 6 (implementer + tester)**: both will write their final production - / test code to per-agent plan files, NOT to `{{PRODUCTION_FILE}}` / - `{{TEST_FILE}}`. The orchestrator reads each agent's plan file, extracts - the verbatim code, and applies the edits itself via `Edit` / `Write`. - Step 7's `dotnet build` + targeted `dotnet test` then runs against the - orchestrator-applied diffs as normal. Do NOT mark step 6 complete on the - sub-agent's "ready to execute" message alone — only after the orchestrator - has applied each file's diffs and the build is green. - - **Step 8 (regression sweep)**: same as step 6 — the regression-sweep - tester writes to a per-agent plan file; orchestrator applies the diffs to - each touched sibling fixture itself. - - **Step 9 (reviewer)**: read-only, so plan mode does not block it. No - degradation needed. - - **Sanity check**: in degraded mode, the orchestrator does roughly 2× the - work it would in normal mode (it now applies the edits the sub-agents - would otherwise apply themselves). Budget for it — do not silently fall - behind on step-7 verification just because step 6 cost more turns. -- The Agent tool's `mode` parameter cannot reliably escape inherited plan mode - on this Claude Code build. The orchestrator MUST detect plan mode at - pre-flight and pick the degraded-mode branch deliberately rather than - assuming `mode: "acceptEdits"` will work. +- **Plan mode is handled by Gate 0 at the top of this file** — the orchestrator writes the proposed-execution plan to the plan file, calls `ExitPlanMode`, and proceeds on approval. The orchestrator never spawns sub-agents while plan mode is active, so the previous "degraded mode" workaround is no longer needed and has been removed. +- **Gate R-A (step 5.5) is mandatory every run.** It is the only structural checkpoint between the researcher returning `spec ready` and the implementer + tester being spawned. Do not skip it even when the researcher reports zero ambiguities — the user explicitly asked for an unconditional gate so they can review per-method derivations before code is written. +- **Tool-level prompts (`Bash(...)`, `Edit(...)`, `Write(...)`, `Agent(...)`) still surface per the user's `settings.json`.** Gate 0 and Gate R-A govern STRUCTURAL approval only. The user has explicitly chosen to keep handling tool-level prompts manually rather than auto-allowing them via permission rules or hooks. +- **Sub-agent-side deadlocks**: if an implementer or tester gets parked on a harness UI permission prompt the user cannot action, the orchestrator should take over the deadlocked work directly from its own permission scope (running `dotnet build` / `dotnet test` / `gh issue edit` itself). Send `shutdown_request` to the parked sub-agent. Surface clearly in the final summary. diff --git a/.claude/team-templates/extension-impl.md b/.claude/team-templates/extension-impl.md index b2f4dc88..5b7ba789 100644 --- a/.claude/team-templates/extension-impl.md +++ b/.claude/team-templates/extension-impl.md @@ -79,6 +79,12 @@ across every file in the batch. in the recent TypeExtensions task) and surfaces transitive stub-blockers before the implementer hits them. +1.5 Orchestrator → renders inline per-file spec preview, asks user via + AskUserQuestion to approve before any code is written. + MANDATORY every run, even when researcher reports zero + ambiguities. See "Phase R-A" / step 5.5 in + .claude/commands/implement-extensions.md. + 2. Implementer → implements all methods in {{PRODUCTION_FILE}}, sliced by dependency tier (Tier 1: direct OfType filters; Tier 2: chains over Tier 1; Tier 3: depends on operations; Tier 4: @@ -107,6 +113,14 @@ When invoked through `/implement-extensions-batch`, the workflow above runs AGENT RUN. Reads {{REFERENCE_PRODUCTION_FILE}} and {{REFERENCE_TEST_FILE}} ONCE up front, not per file. +1.5 Orchestrator → renders inline per-file spec preview (one block per + batch file), asks user via AskUserQuestion to approve + before any code is written. Options: Approve / Drop + specific files / Abort + research again. MANDATORY every + run, even when researcher reports zero ambiguities. See + "Phase R-A" / step 9.5 in + .claude/commands/implement-extensions-batch.md. + 2. Implementer → walks {{BATCH_FILES}} and implements each SysML2.NET/Extend/Extensions.cs in its single agent run. Reads each file's notes file before editing it. ACL = @@ -123,14 +137,14 @@ When invoked through `/implement-extensions-batch`, the workflow above runs 4. Reviewer → walks {{BATCH_FILES}} and applies the OCL-translation + test-fixture checklists per file; READ-ONLY. -5. Orchestrator → after Phases R / IT, runs ONE consolidated build, ONE - consolidated targeted test (filter joining every fixture - in the batch), then the regression sweep. On any failure, - it sends a SendMessage to the relevant named teammate - (`researcher`, `implementer`, or `tester`) with the - failing-file list — the same agent keeps its context and - fixes in place. Fresh Agent spawns are not used inside - the iteration loop. +5. Orchestrator → after Phases R / R-A / IT, runs ONE consolidated build, + ONE consolidated targeted test (filter joining every + fixture in the batch), then the regression sweep. On any + failure, it sends a SendMessage to the relevant named + teammate (`researcher`, `implementer`, or `tester`) with + the failing-file list — the same agent keeps its context + and fixes in place. Fresh Agent spawns are not used + inside the iteration loop. ``` The role boundaries are **identical** to single-file mode: the implementer @@ -138,30 +152,22 @@ never edits test files, the tester never edits production files, the reviewer is read-only. The only thing that changes is the size of the per-role allowed file set (one file → N files in the batch). -## Plan-mode-aware prompting (added 2026-05-28) - -If the orchestrator session is in plan mode when this template is used, sub-agents -will inherit it and cannot apply edits. The Agent-tool `mode: "acceptEdits"` -parameter does NOT override the inherited state on the current Claude Code build. - -Role prompts in this template are written so that the orchestrator can fall back -to applying edits itself when this happens: - -- **Researcher** prompts always direct the agent to write the spec to its - declared `{{NOTES_FILE}}` location. In plan-mode-degraded runs the agent will - write to a per-agent plan file under - `C:\Users\\.claude\plans\-agent-.md` instead; the - orchestrator then copies it to `.team-notes/`. -- **Implementer / tester** prompts must emit the verbatim production / test code - in their text response (or, equivalently, in their per-agent plan file) so it - survives a plan-mode block. The orchestrator extracts and applies it via - `Edit` / `Write`. -- **Reviewer** prompts are already read-only and unaffected by plan mode. - -The `/implement-extensions-batch` command body documents the degraded-mode flow -end-to-end in its "Pre-flight: detect orchestrator plan mode" section. The -single-file `/implement-extensions` command should also adopt the same flow — -see that file's notes section. +## Plan-mode-aware prompting (reworked 2026-05-28) + +Plan mode is now the natural pre-execution approval gate for both `/implement-extensions` and `/implement-extensions-batch` — the orchestrator stays in plan mode, does read-only pre-flight, writes the proposed-execution plan to the plan file, calls `ExitPlanMode`, and proceeds on approval. The "degraded mode" workaround documented in the previous version of this section (orchestrator applies edits on behalf of sub-agents) is no longer needed because the orchestrator never spawns sub-agents while plan mode is active. + +See the **"Pre-flight: plan mode IS the pre-execution approval gate (Gate 0)"** section in each command file: +- `.claude/commands/implement-extensions.md` +- `.claude/commands/implement-extensions-batch.md` + +The structural approval workflow ships with **two gates** now: + +| Gate | Fires | Approves | UI | +|---|---|---|---| +| Gate 0 (pre-execution) | At invocation, if orchestrator is in plan mode | Composition + branch + team + model picks | `ExitPlanMode` plan-approval UI | +| Gate R-A (post-researcher) | After researcher returns `spec ready`, before implementer/tester spawn | The per-method derivation plan inline-rendered from each `.team-notes/-extensions-spec.md` | `AskUserQuestion` with 2–3 options | + +Tool-level permission prompts (`Bash(dotnet build *)`, `Bash(git push *)`, `TeamCreate`, `Agent(...)`, etc.) continue to surface per the user's `settings.json`. The gates govern STRUCTURAL approval only; they do not auto-allow individual tool calls. ## Hard scope-discipline rule (v2)