refactor(dream): remove learning pipeline + per-task skills with dynamic model loading #238
Removes the Devflow self-learning pipeline (workflow/procedural detection, learning markers, eval-learning/eval-reinforce hooks, devflow learn CLI, learningCounts HUD component) while preserving the decisions, memory, knowledge, and curation pipelines intact.
Key changes:
- Migrations: purge-learning-pipeline-v1 (per-project) + purge-learning-global-v1 (global) clean up existing installs on next devflow init
- Hooks: delete eval-learning + eval-reinforce; remove Section 1.75 (LEARNED BEHAVIORS) from session-start-context; convert the learning) case in dream-collect-tasks to unconditional delete-on-sight (R1: orphaned markers never reach spawn emitter)
- TypeScript: remove DreamConfig.learning, devflow learn command, learningCounts HUD component and supporting types; update manifest schema; remove getLearning* path helpers
- Tests: delete learning-specific test files; update all shared tests to remove learning field references; add migration tests (TDD)
- Docs: update CLAUDE.md, README.md to reflect removal
Existing installs self-heal: migrations remove .devflow/learning/, dream markers, config keys, and auto-generated self-learning artifacts. Stale learningCounts entries in user HUD configs degrade gracefully (unknown component IDs are silently skipped).
Applies ADR-001 (clean-break philosophy with explicit migrations), ADR-002 (clean house), PF-004 (migration idempotency), PF-007 (source-first hooks).
Phase B: lift the four Dream task procedures into dedicated per-task skills
(dream-memory/decisions/knowledge/curation) and rewrite both the Dream agent
body and the SessionStart spawn directive to load skills dynamically with a
per-task model map (memory=haiku, knowledge=sonnet, decisions/curation=opus).
Key changes:
- Add shared/skills/dream-{memory,decisions,knowledge,curation}/SKILL.md: each
carries the verbatim task procedure lifted from the former dream.md body.
allowed-tools includes Bash/Write/Edit (deliberate exception — these skills
materialize artifacts). Includes bounded retry+backoff for .reinforce.lock
(decisions) and .decisions.lock (curation): 9 attempts, ~30s total, explicit
cap — no unbounded waits, no silent write loss (AC-C6/AC-P4).
- Rewrite shared/agents/dream.md: body is now plumbing-only (claim, heartbeat,
multi-marker merge, error discipline). Per-task procedure loaded via Skill
tool: "load devflow:dream-<task> and follow it". Supports "decisions then
curation" sequential combined spawn.
- Rewrite session-start-context Section 2 directive: per-task Agent() calls with
hardcoded model map; decisions+curation co-pending → exactly ONE opus spawn;
unknown task types silently skipped (AC-F4/AC-C4).
- Register all four new skills in plugin.json and plugins.ts (core-skills).
Add bare names to LEGACY_SKILL_NAMES for future cleanup migrations.
No learning remnants in any file. All 1464 tests green, zero warnings.
CO-AUTHORED-BY: Claude <noreply@anthropic.com>
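The per-task model map and the "decisions + curation co-pending means one opus spawn" rule described in this commit can be sketched in TypeScript. This is an illustration only: the real directive is emitted by the session-start-context shell hook, and `planSpawns`, `Spawn`, and `TASK_MODEL` are hypothetical names that do not appear in the PR.

```typescript
type Model = "haiku" | "sonnet" | "opus";

interface Spawn {
  tasks: string[]; // skills the spawned agent runs, in order
  model: Model;
}

// Hardcoded task-to-model map as stated in the commit message.
const TASK_MODEL = new Map<string, Model>([
  ["memory", "haiku"],
  ["knowledge", "sonnet"],
  ["decisions", "opus"],
  ["curation", "opus"],
]);

// Hypothetical planner: one spawn per known task, except that decisions
// and curation share a single opus spawn ("decisions then curation").
function planSpawns(pending: readonly string[]): Spawn[] {
  // Unknown task types are dropped here, mirroring "silently skipped".
  const known = new Set(pending.filter((t) => TASK_MODEL.has(t)));
  const spawns: Spawn[] = [];

  for (const task of ["memory", "knowledge"]) {
    if (known.has(task)) {
      spawns.push({ tasks: [task], model: TASK_MODEL.get(task)! });
    }
  }

  // Fixed order keeps decisions ahead of curation in the combined spawn.
  const opusTasks = ["decisions", "curation"].filter((t) => known.has(t));
  if (opusTasks.length > 0) {
    spawns.push({ tasks: opusTasks, model: "opus" });
  }
  return spawns;
}
```

Under these assumptions, a pending set of memory, decisions, and curation yields two spawns rather than three, which is what keeps the two opus tasks from contending on `.decisions.lock`.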
…ts.toThrow() assertion (was auto-awaited at teardown, masking the check; fails under Vitest 3)
- init.ts: add eval-learning/eval-reinforce to LEGACY_HOOK_FILES so upgrading users get the now-dead learning eval modules swept (copyDirectory is an additive merge and never removes orphaned installed hooks; applies ADR-002 clean house)
- includes Simplifier polish: drop redundant heartbeat note from skill preambles (explicit touch step retained), comment fixes
…kill restructure
- hooks KB: per-task spawn model, agent=plumbing+skill-load, 4 dream skills, lock hardening, learning pipeline removed
- cli-rules KB: devflow learn removed, dream skills registered in core-skills
- index.json: bump lastUpdated + referencedFiles (clears staleness)
…vflow/
Rewrite .devflow/.gitignore to ignore-by-default: only curated team knowledge
is committed (decisions/{decisions.md,pitfalls.md}, features/{index.json,
<slug>/KNOWLEDGE.md}). Everything else (docs/, memory/, dream/, learning/,
locks, runtime state, manifest, scratch) is per-developer/transient → ignored.
Untrack the ~975 already-committed files this newly ignores (973 docs/ history,
2 stray .create-result.json scratch files, redundant nested features/.gitignore).
Files remain on disk; only removed from git tracking.
… policy
Switch the generated .devflow/.gitignore template from an explicit blocklist (per-file ignore entries) to an ignore-by-default allowlist (*) that re-includes only curated team knowledge: decisions/decisions.md, decisions/pitfalls.md, features/index.json, and features/<slug>/KNOWLEDGE.md. Any new file under .devflow/ is automatically ignored without requiring a template update.
Add sync-devflow-gitignore-v2 per-project migration to re-sync existing projects to the new template (v1 already ran on all machines, so a new ID is required). Migration is idempotent: no-op when content already matches the new policy.
Applies PF-004 / PF-001: migration idempotency, edit source hooks not installed copies.
Co-Authored-By: Claude <noreply@anthropic.com>
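An ignore-by-default allowlist of this kind might be generated along the following lines. The PR does not show the actual template, so the pattern list below is an assumption based on standard gitignore negation rules (a parent directory has to be re-included before any file inside it can be), and `buildDevflowGitignore` is a hypothetical name.

```typescript
// Hypothetical generator for an allowlist-style .devflow/.gitignore.
// The exact lines are assumed; only the re-included paths come from the PR.
function buildDevflowGitignore(): string {
  const lines = [
    "# Ignore everything under .devflow/ by default",
    "*",
    "# Keep this file itself (assumption)",
    "!.gitignore",
    "# Curated team knowledge: re-include directories first, then files",
    "!decisions/",
    "!decisions/decisions.md",
    "!decisions/pitfalls.md",
    "!features/",
    "!features/index.json",
    "!features/*/",
    "!features/*/KNOWLEDGE.md",
  ];
  return lines.join("\n") + "\n";
}
```

Because the leading `*` matches any new path, a file added under .devflow/ later stays ignored unless a `!` line is added for it, which is the property the commit relies on.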
PR #238 Code Review: Inline Findings Summary
Merge Recommendation: APPROVED_WITH_CONDITIONS. Code quality is strong, but 3 BLOCKING documentation issues must be resolved before merge.
Blocking Issues (Must Fix Before Merge)
1. Stale skill count in CLAUDE.md (Line 75)
File: …
This PR adds 4 new dream-* skills, making the actual total 45. The count must be updated to reflect the new skills being added in this PR.
Fix: Change …
2. Stale skill count in README.md (Line 52)
File: …
The literal count now disagrees with CLAUDE.md after this PR. Recommend clarifying the intent: are these …
Fix: Either clarify "41 expert-backed skills" or reconcile to 45.
3. Four new dream-* skills missing mandatory Iron Law
Files: …
Confidence: 90%
CLAUDE.md explicitly requires: "Each skill has one non-negotiable Iron Law in its …
Fix: Add …
Should-Fix Issues (High Priority, Non-Blocking)
4. Stale orchestrator comment in dream-evaluate (Line 7)
File: …
    # Orchestrator: sources eval-helpers + 4 feature modules after shared setup.
After removing …
Fix: Change …
5. Temp file leak on failed rename in migrations.ts (Lines 795–797)
File: …
    const tmp = configPath + '.tmp.' + process.pid;
    await fs.writeFile(tmp, JSON.stringify(config, null, 2) + '\n', { encoding: 'utf-8', mode: 0o600 });
    await fs.rename(tmp, configPath);
If …
Fix: Wrap in try/catch to clean up on failure:
    const tmp = configPath + '.tmp.' + process.pid;
    await fs.writeFile(tmp, JSON.stringify(config, null, 2) + '\n', { encoding: 'utf-8', mode: 0o600 });
    try {
      await fs.rename(tmp, configPath);
    } catch (err) {
      await fs.rm(tmp, { force: true }).catch(() => {});
      throw err;
    }
6. Orphaned documentation files after learning pipeline removal
Files: …
The learning pipeline ( …
Fix: …
7. Stale CLI documentation in docs/cli-reference.md
File: …
The entire …
Fix: Delete the …
Summary
Pre-existing issues flagged for context (not blocking, unchanged lines):
All changes outside blocking/should-fix categories are solid. The architecture refactor is clean, test coverage is excellent, and the learning pipeline removal is complete with proper migrations.
Authored by Claude Code
…unt comment, remove dead json-helper cases
- ensure-devflow-init: remove learning from the fast-path existence check and from the mkdir -p list; the learning pipeline was deleted and its migration now removes .devflow/learning/, so keeping it here would re-create the directory on every session and permanently defeat the fast-path once the directory is absent. Fast-path now checks dream/ which the mkdir block always creates. applies ADR-002.
- dream-evaluate: correct header comment "4 feature modules" to "3 feature modules" (eval-decisions, eval-knowledge, eval-curation; the learning eval modules were removed in this PR).
- json-helper.cjs: delete dead learning-created and learning-new case blocks and their usage-comment lines; their sole callers were the json_learning_created and json_learning_new shell functions in json-parse, removed with the learning pipeline. Zero remaining callers confirmed by grep. applies ADR-002.
…, add step-6 test
- Issue 1: purge-learning-pipeline-v1 step 2 outer catch swallowed non-ENOENT errors such as EACCES, ENOTDIR, and I/O failures; add rethrow to match every sibling step in the file; avoids PF-004 where a buggy run is recorded as success and never re-runs
- Issue 2: _dropLearningKeyFromConfig orphaned the .tmp.<pid> file when fs.rename() threw; wrap rename in try/catch that unlinks the temp on failure before rethrowing
- Issue 3: step 6 had no coverage after learn.test.ts was deleted; add one behavior test covering auto-generated skill removal and user-skill preservation
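The Issue 1 fix (treat "already gone" as success, rethrow everything else) follows a common pattern for idempotent cleanup steps. A minimal sketch, assuming Node-style errors that carry a `code` property; `removeIgnoringMissing` is a hypothetical helper, not the migration's actual code:

```typescript
// Shape of Node.js system errors: an Error with an optional string code.
type CodedError = Error & { code?: string };

// Runs a removal callback. A missing target (ENOENT) counts as a no-op so
// the step stays idempotent; any other failure propagates so the migration
// is not recorded as successful.
function removeIgnoringMissing(remove: () => void): boolean {
  try {
    remove();
    return true; // something was removed
  } catch (err) {
    if ((err as CodedError).code === "ENOENT") {
      return false; // nothing to remove, which is fine
    }
    throw err; // EACCES, ENOTDIR, I/O errors, and anything unexpected
  }
}
```

Swallowing every error in the catch block is what lets a failed run look like a completed one; narrowing the catch to ENOENT keeps reruns cheap without hiding real failures.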
…eam-* per-task skills now has an ## Iron Law section matching the house style (blockquote, bold rule name, 1-3 lines) used by all other skills in shared/skills/:
- dream-memory: synthesize from the captured turns queue only
- dream-decisions: decisions-append owns all numbering, never hand-edit IDs
- dream-knowledge: refresh from the live codebase, never from stale context
- dream-curation: deprecate, never delete; append-only invariant is absolute
…md model note
- git rm docs/self-learning.md (orphaned; README already delinked it)
- docs/working-memory.md: remove Self-Learning sibling-system section + dead link
- docs/cli-reference.md: drop --learn/--no-learn init row and devflow learn command block
- docs/reference/file-organization.md: remove eval-learning + eval-reinforce hook entries; fix dream-evaluate description from "5 feature modules" to "3 feature modules (eval-decisions, eval-knowledge, eval-curation)"; update skill count 42→45
- CLAUDE.md: update skill count 42→45 in Project Structure comment
- README.md: update skill count 41→45; preserve expert-material wording for the 41 expert-backed skills, note 4 procedural Dream skills separately
- shared/agents/dream.md: remove stale "userSignals" from input-cap line (only dialog-pairs remain); add model-note blockquote explaining sonnet default vs per-task opus override from session-start-context
Remaining self-learning mentions in CLAUDE.md Migrations section are intentional: they describe the purge-legacy-knowledge-v3 discriminator string and the path that purge-learning-pipeline-v1 removes (accurate migration history, not live references).
applies ADR-002
…ap exceeded
Restructure dream-collect-tasks into a two-pass pipeline:
Pass 1 scans ALL .json markers unconditionally (no mtime, no basename
subprocess — uses parameter expansion ${f##*/} and ${base%%.*}):
- Deletes learning.* markers always (orphan sweep, R1).
- Deletes disabled-feature markers (memory/decisions/knowledge) regardless
of count or cap position, closing the latent correctness gap where markers
past position 50 were never swept.
- curation and unknown types pass through unchanged (no flag gates them).
- Accumulates kept markers in a newline-separated string; tracks count.
Cap selection (between passes):
- count <= 50: use candidates directly — zero get_mtime calls (AC-11).
- count > 50: compute mtime per candidate, sort oldest-first, head -50 —
the only path that spawns stat subprocesses (AC-12).
Pass 2 is now deletion-free: only handles .retries cleanup (JUST_RECOVERED
guard preserved) and type accumulation into _DREAM_TASKS.
Signature (dream_collect_tasks DREAM_DIR MEM DEC KNOW) and _DREAM_TASKS
output-variable contract are unchanged (AC-8). Tests added: AC-1–AC-5,
AC-11–AC-12, plus the beyond-cap disabled-marker correctness regression.
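The two-pass structure above can be modelled in TypeScript to make the cap behaviour easier to follow. This is a sketch of the described logic, not a port of the shell hook: `selectMarkers`, `deriveMarkerType`, and the `<type>.<suffix>.json` filename shape are assumptions, and the mtime lookup is injected so the "no stat calls at or under the cap" property is visible.

```typescript
interface EnabledFeatures {
  memory: boolean;
  decisions: boolean;
  knowledge: boolean;
}

// Mirrors the shell expansions ${f##*/} (strip directories) and
// ${base%%.*} (strip everything from the first dot onward).
function deriveMarkerType(filePath: string): string {
  const base = filePath.slice(filePath.lastIndexOf("/") + 1);
  const dot = base.indexOf(".");
  return dot === -1 ? base : base.slice(0, dot);
}

// Pass 1 plus cap selection. Returns which markers survive and which
// would be deleted; pass 2 (retries cleanup, type accumulation) is omitted.
function selectMarkers(
  paths: readonly string[],
  enabled: EnabledFeatures,
  getMtime: (path: string) => number,
  cap = 50,
): { kept: string[]; deleted: string[] } {
  const kept: string[] = [];
  const deleted: string[] = [];

  for (const p of paths) {
    const type = deriveMarkerType(p);
    const disabled =
      (type === "memory" && !enabled.memory) ||
      (type === "decisions" && !enabled.decisions) ||
      (type === "knowledge" && !enabled.knowledge);
    // learning.* is always swept; curation and unknown types pass through.
    if (type === "learning" || disabled) deleted.push(p);
    else kept.push(p);
  }

  // Common case: no mtime lookups at all.
  if (kept.length <= cap) return { kept, deleted };

  // Over the cap: look up mtimes once per candidate, keep the oldest `cap`.
  const oldestFirst = kept
    .map((p) => ({ p, mtime: getMtime(p) }))
    .sort((a, b) => a.mtime - b.mtime)
    .slice(0, cap)
    .map((entry) => entry.p);
  return { kept: oldestFirst, deleted };
}
```

Running the sweep before the cap is applied is what closes the gap mentioned above: a disabled-feature marker is deleted regardless of where it would have sorted.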
The lock serializes merge-observation writes to decisions-log.jsonl.
"reinforce" was learning-pipeline vocabulary; "observations" accurately
describes what the lock protects.
Behavioral site:
- shared/skills/dream-decisions/SKILL.md: LOCK assignment, prose mention,
error message (3 occurrences). `npm run build` regenerates the gitignored
plugins/ copy.
Comment / doc accuracy (no behavior change):
- scripts/hooks/json-helper.cjs: D53 comment (lock name in description)
- src/cli/utils/migrations.ts: sidecar-rmdir safety comment
- .devflow/features/hooks/KNOWLEDGE.md: lines 148, 174, 276
Intentionally left as-is (semantic "reinforce", not the lock name):
tests/learning/merge-observation.test.ts describe/test names,
scripts/hooks/lib/staleness.cjs:21, json-helper.cjs "ID-keyed reinforce op"
comment.
AC-7: .reinforce.lock nowhere in shared/, scripts/, src/, or .devflow/features/.
AC-10: dream-decisions skill remains the sole external lock acquirer;
merge-observation in json-helper.cjs is lock-agnostic (D53).
…n break)
The rename-sidecar-to-dream-v1 migration already moves sidecar/config.json → dream/config.json (overwrite:true) at devflow init time. The per-release transitional fallback is now fully removable per ADR-001.
Shell hooks (6): remove the 3-line dream-fallback if-block from each hook that read sidecar/config.json when dream/config.json was absent. Each hook now reads $DREAM_DIR/config.json directly. The comment noting the fallback was also removed. `grep -rn sidecar scripts/hooks/` is empty (AC-6).
TS layer (dream-config.ts): drop the legacy catch-fallback (the inner try/catch that read sidecar/config.json). Remove the now-unused `import * as path` (only that legacy line used path.join). readConfig now: read dream/config.json → coerce → return DEFAULT_CONFIG on any failure (AC-9).
Tests:
- tests/dream-config.test.ts: delete 2 M5 tests asserting the removed fallback behavior. Add AC-9 test: only sidecar/config.json present (no dream/config.json) → readConfig returns DEFAULT_CONFIG (proves fallback gone).
- tests/shell-hooks.test.ts: delete 3 D37 fallback describe blocks (7 tests) that asserted the removed behavior. Delete createLegacySidecarConfig helper (now unused after test removal).
Untouched: migrations (rename-sidecar-to-dream-v1, purge-learning sidecar-key drop). Migration code is sanctioned, harmless, and tested.
applies ADR-001 (clean break philosophy)
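The "read → coerce → DEFAULT_CONFIG on any failure" flow can be sketched as below. The field types and default values are assumptions (the PR only states that the config has `memory`, `decisions`, and `knowledge` keys and that a legacy `learning` key is dropped), and the names `coerceDreamConfig`, `parseDreamConfig`, and `ASSUMED_DEFAULTS` are hypothetical stand-ins for the repo's own helpers.

```typescript
// Assumed shape: three boolean feature flags.
interface DreamConfigSketch {
  memory: boolean;
  decisions: boolean;
  knowledge: boolean;
}

// Assumed defaults; the real DEFAULT_CONFIG values are not shown in the PR.
const ASSUMED_DEFAULTS: DreamConfigSketch = {
  memory: true,
  decisions: true,
  knowledge: true,
};

// Copies only the known keys, so a legacy `learning` key (or any other
// unknown key) is dropped without an error.
function coerceDreamConfig(raw: unknown): DreamConfigSketch {
  if (typeof raw !== "object" || raw === null) return { ...ASSUMED_DEFAULTS };
  const record = raw as Record<string, unknown>;
  const pick = (key: keyof DreamConfigSketch): boolean => {
    const value = record[key];
    return typeof value === "boolean" ? value : ASSUMED_DEFAULTS[key];
  };
  return {
    memory: pick("memory"),
    decisions: pick("decisions"),
    knowledge: pick("knowledge"),
  };
}

// `text` is the file content, or null when dream/config.json is absent.
// There is deliberately no second location to fall back to.
function parseDreamConfig(text: string | null): DreamConfigSketch {
  if (text === null) return { ...ASSUMED_DEFAULTS };
  try {
    return coerceDreamConfig(JSON.parse(text));
  } catch {
    return { ...ASSUMED_DEFAULTS }; // malformed JSON also yields defaults
  }
}
```

With the fallback gone, a project that only has the legacy file behaves exactly like one with no config at all until the migration runs, which is the case the new AC-9 test pins down.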
… all post-removal stale references to the learning pipeline:
- docs/reference/file-organization.md: drop deleted learn.ts from source tree, remove "learning" from dream-evaluate marker types, update HUD component count 14 -> 15 (verified: 15 files in components/)
- docs/working-memory.md: remove the entire learning/ directory block
- CLAUDE.md: bump shared agent count 15 -> 16, add dream to the roster, extend Model Strategy with Dream per-task override map (haiku=memory, sonnet=knowledge, opus=decisions/curation)
- shared/agents/dream.md: fix CRITICAL model map: memory spawn uses haiku not sonnet (matches session-start-context:199)
- scripts/hooks/dream-collect-tasks: extract _derive_marker_type helper so the naming rule has one source of truth (no logic change)
- Issue 1 (FALSE_POSITIVE, already fixed in e96e48b: model note now matches code; memory=haiku is correct in both dream.md and session-start-context)
- Issue 3: complete _derive_marker_type rollout: pass 2 in dream-collect-tasks was missed in the prior commit; apply helper there so both passes share one source of truth
- Issue 4: delete dead filter-observations op (no production callers; only learning-pipeline tests called it) and orphaned artifactName helper from json-helper.cjs; remove the corresponding test describe block; also remove "learning," from gitignore comment in ensure-devflow-init (fixes pre-existing test failure: heredoc vs project-paths.cjs mismatch)
- Issue 5a: fix lock-backoff comment in dream-decisions + dream-curation from "~30s total" to "~47s total" (9 attempts with 8 sleeps between them: 1+2+4+8+8+8+8+8 = 47s)
- Issue 5b: remove obsolete `learning = 1800s` threshold from dream-recover header
Verification: bash -n clean on all edited hooks; node -e require() on json-helper.cjs; all 155 shell-hooks tests pass.
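The ~47s figure in Issue 5a can be checked with a small calculation. The sketch below assumes one sleep after each failed attempt except the last (8 sleeps for 9 attempts), doubling from 1s and capped at 8s; `backoffSchedule` is a hypothetical helper, since the real loop lives in the skill's shell procedure.

```typescript
// Returns the sleep durations (in seconds) between attempts.
function backoffSchedule(
  attempts: number,
  initialSeconds: number,
  capSeconds: number,
): number[] {
  const delays: number[] = [];
  let delay = initialSeconds;
  // No sleep after the final attempt, hence attempts - 1 entries.
  for (let i = 0; i < attempts - 1; i++) {
    delays.push(delay);
    delay = Math.min(delay * 2, capSeconds);
  }
  return delays;
}

const delays = backoffSchedule(9, 1, 8);
const totalSeconds = delays.reduce((sum, d) => sum + d, 0);
console.log(delays.join("+"), "=", totalSeconds); // prints "1+2+4+8+8+8+8+8 = 47"
```

The earlier "~30s" comment would be off by more than half under this schedule, which is why the comment was corrected rather than the loop.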
- Issue 1 (security): replace hand-rolled temp+rename in _dropLearningKeyFromConfig with writeFileAtomicExclusive (D34: O_EXCL/TOCTOU-safe; sibling migrations already used the helper)
- Issue 2 (regression): document D37 clone-after-marker edge case in dream-config.ts readConfig: explains the bounded silent DEFAULT_CONFIG fallback, tradeoff, and recovery path per ADR-001 clean-break
- Issue 3 (consistency): remove stale 'learning,' from .devflow/.gitignore prose comment in both project-paths.ts and its project-paths.cjs mirror
- Issue 4 (testing): confirm commands branch in cleanSelfLearningArtifacts is dead (migration step 5 deletes the dir before step 6 calls helper); remove the dead branch; add 8 focused unit tests covering skills-scan, devflow: prefix skip, missing-dir ENOENT path, and return contract
Co-Authored-By: Claude <noreply@anthropic.com>
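For readers unfamiliar with the O_EXCL approach named in Issue 1, the general technique looks roughly like this. This is not the repo's `writeFileAtomicExclusive`, whose signature is not shown in the PR; `writeJsonAtomically` is a hypothetical synchronous sketch using Node's `wx` flag, which fails if the temp path already exists.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Writes JSON to a temp file created exclusively, then renames it over the
// target. The rename is atomic on the same filesystem; the exclusive create
// prevents writing through a pre-existing file or symlink at the temp path.
function writeJsonAtomically(configPath: string, value: unknown): void {
  const tmp = `${configPath}.tmp.${process.pid}`;
  fs.writeFileSync(tmp, JSON.stringify(value, null, 2) + "\n", {
    encoding: "utf-8",
    mode: 0o600,
    flag: "wx", // O_CREAT | O_EXCL: throws EEXIST instead of overwriting
  });
  try {
    fs.renameSync(tmp, configPath);
  } catch (err) {
    // Do not leave the temp file behind if the rename fails.
    fs.rmSync(tmp, { force: true });
    throw err;
  }
}

// Usage against a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "atomic-write-demo-"));
const demoPath = path.join(demoDir, "config.json");
writeJsonAtomically(demoPath, { memory: true });
```

Compared with the hand-rolled version quoted in the review, the only behavioural differences are the exclusive create and the cleanup on a failed rename.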
Resolve the deferred complexity finding on session-start-context Section 2 (~125-line inline block doing 4 jobs). Extract the per-task spawn-directive builder into dream_build_spawn_directive (sourced from dream-collect-tasks), which sets the _DREAM_DIRECTIVE global — mirroring the existing dream_collect_tasks -> _DREAM_TASKS contract. Communicating via a global (not stdout) preserves exact directive bytes; command substitution would strip trailing newlines. Decouples the builder from the $CONTEXT accumulator, the reviewer's actual concern. Behavior byte-identical (verified: 155 shell-hooks tests + direct render spot-check). applies ADR-008.
Summary
This PR removes the Devflow self-learning pipeline end-to-end and restructures the Dream subsystem from a single sequential agent into N per-task agents with clean context and a hardcoded task→model map.
Delivered in two commits to allow atomic review of each concern:
- cf853e5: Learning pipeline removal end-to-end
- 98ffa97: Per-task skills + dynamic model loading
Changes
Phase A: Learning Pipeline Removal
What was removed:
- `scripts/hooks/eval-learning` and `eval-reinforce`: SessionEnd learning batch accumulator and artifact reinforcement
- `src/cli/commands/learn.ts`: full `devflow learn` CLI command (5 subcommands)
- `src/cli/hud/learning-counts.ts` + `components/learning-counts.ts`: HUD learning counts display
- `tests/learn.test.ts`, `tests/learning/hud-counts.test.ts`, `tests/learning/capacity-thresholds.test.ts`, `tests/learning/migration.test.ts`: all learning-specific test files
What was updated:
- `DreamConfig` interface: now `{memory, decisions, knowledge}` (no `learning` field); `coerceConfig` silently drops legacy `learning` key
- `scripts/hooks/dream-collect-tasks`: `learning)` case is unconditional `rm` (delete orphaned markers on sight, not just when disabled)
- `scripts/hooks/session-start-context`: removed Section 1.75 (LEARNED BEHAVIORS) and all `_SC2_LEARN_EN` / `_LEARNING_DREAM` references
- `src/cli/utils/migrations.ts`: added `purge-learning-pipeline-v1` (per-project) and `purge-learning-global-v1` (global) migrations: clean up `eval-learning`, `eval-reinforce` installed hooks, orphaned learning log + config + state files, and the `learning` key from dream configs
Kept intentionally:
- `src/cli/utils/learning-cleanup.ts`: `cleanSelfLearningArtifacts` is called by the per-project migration
- `src/cli/utils/observations.ts` + `observation-io.ts`: decisions pipeline uses shared `LearningObservation` type and `readObservations`
Phase B: Per-Task Skills + Dynamic Model Loading
New skills (`shared/skills/dream-{memory,decisions,knowledge,curation}/SKILL.md`):
- Each carries the task procedure lifted from the former `dream.md` body: logic is byte-for-byte identical, only hosting changed (applies ADR-008: LLM judgment in skills, plumbing in scripts)
- `allowed-tools: Read, Bash, Write, Edit, Glob, Grep`: deliberate exception to the read-only skill default; these skills materialize artifacts (same posture as `quality-gates` and `git`)
- Replaces the `mkdir || sleep 1 || exit 1` pattern in both the decisions skill (`.observations.lock`) and curation skill (`.decisions.lock`) with a bounded retry+backoff loop: 9 attempts, exponential backoff doubling from 1s capped at 8s, ~47s total. On exhaustion the task fails cleanly (leaves `.processing` for dream-recover) and never silently drops a write. No unbounded loops.
- Curation acquires `.decisions.lock` EXACTLY ONCE across its read-modify-write; never calls `decisions-append` while holding it (avoids the deadlock documented in KNOWLEDGE.md)
Rewritten Dream agent (`shared/agents/dream.md`):
- … `skills:` frontmatter alongside existing `devflow:apply-decisions` and `devflow:apply-feature-knowledge`
Rewritten SessionStart spawn directive (`scripts/hooks/session-start-context` Section 2):
- Replaces `Agent(subagent_type="Dream")` with per-task `Agent()` calls using a hardcoded task→model map: `memory=haiku`, `knowledge=sonnet`, `decisions=opus`, `curation=opus`
- `decisions`+`curation` co-pending → exactly ONE `opus` spawn whose prompt instructs running decisions skill then curation skill sequentially (prevents concurrent lock contention on `.decisions.lock`)
- Unknown task types are silently skipped (`dream-collect-tasks` should never emit them)
- `bash -n` clean; `set -e` intentionally absent (existing no-abort discipline, applies ADR-009/PF-008)
Registration:
- All four skills registered in `plugins/devflow-core-skills/.claude-plugin/plugin.json` and `src/cli/plugins.ts` (core-skills is the correct home: skills install universally regardless of plugin selection)
- Bare names added to `LEGACY_SKILL_NAMES` for future cleanup migrations
Breaking Changes
None for end users. The `devflow learn` command is removed; users who call it will get a "command not found" error. The two new migrations clean up all installed artifacts automatically on `devflow init`.
Token Cost Characterization (AC-P1)
The new design spawns N per-task agents (up to 3: haiku for memory, sonnet for knowledge, opus for decisions+curation) instead of one sequential agent handling all tasks in a single context. Expected higher per-cycle token cost, justified by:
Reviewer Focus Areas
- `shared/skills/dream-decisions/SKILL.md`: bounded retry loop for `.observations.lock`; verify cap is 9 attempts, backoff doubles 1→2→4→8 (capped), no unbounded paths
- `shared/skills/dream-curation/SKILL.md`: verify lock acquired once, Edit calls happen between acquire/release, `decisions-append` not called while holding the lock
- `scripts/hooks/session-start-context` lines 183–245: verify decisions+curation branch emits exactly ONE opus spawn; verify memory/knowledge branches emit their correct models; verify unknown types fall through to the "no known types" log path
- `shared/agents/dream.md`: confirm no learning remnants; confirm the "decisions then curation" combined prompt is unambiguous
Follow-up: 3 deferred tech-debt items resolved
Resolves the 3 items deferred during `/code-review` + `/resolve` on this branch.
- `dream-collect-tasks`: restructured into one first-pass scan; `get_mtime` (`stat` subprocess) and `basename` are no longer spawned per-marker in the common case. `get_mtime` is now invoked only when the marker count exceeds the cap (50). Also moved the `learning.*` orphan sweep and disabled-feature deletion into the unconditional first pass, closing a latent gap where markers past position 50 were never swept. `curation`/unknown types always pass through. Signature and `_DREAM_TASKS` contract unchanged.
- `.reinforce.lock` → `.observations.lock`: "reinforce" was learning-era vocabulary; the lock serializes `merge-observation` writes.
- `sidecar/config.json` fallback (clean break): removed the runtime fallback from 6 hooks and `dream-config.ts` `readConfig`; the `rename-sidecar-to-dream-v1` migration already handles upgrades (ADR-001 clean-break).
All quality gates green: `npm run build`, `tsc --noEmit`, full `vitest` (1475 tests), `bash -n` on all edited hooks.