test(stackone-defender): QA fixture regression suite#23
Conversation
Pins production decisions on 12 canonical fixtures (benign/realistic/tricky) through the same PromptDefense config the daemon loads. Runs offline via node:test — no daemon, no network. CI can run this on every PR to catch defender behavior regressions before they reach QA. One known-FP override (research-note-on-injection.md quotes the canonical attack string verbatim) is pinned to current behavior; the test will flip when the underlying model rescues it. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
No issues found across 14 files
Auto-approved: This PR only adds a test suite with fixture files and a single test script; no production code or configuration is modified, so the blast radius is limited to the test pipeline and cannot affect runtime behavior.
Re-trigger cubic
There was a problem hiding this comment.
Pull request overview
Adds an offline Node.js regression test suite for the stackone-defender plugin that runs a canonical set of QA fixtures through @stackone/defender using the production daemon config, asserting allowed decisions to catch behavior drift.
Changes:
- Added
qa-fixtures.test.mjsto execute all fixtures undertests/fixtures/{benign,realistic,tricky}and assert decisions (with one pinned override). - Added 12 new fixture files spanning benign content, realistic attacks, and tricky FP-bait content.
- Added an
npm testscript for running the Node test runner.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| plugins/security/stackone-defender/tests/qa-fixtures.test.mjs | New regression suite that loads production config and asserts allowed decisions per fixture bucket/override. |
| plugins/security/stackone-defender/package.json | Adds npm test script to run the new test suite. |
| plugins/security/stackone-defender/tests/fixtures/benign/sourdough-recipe.md | Benign fixture expected to be allowed. |
| plugins/security/stackone-defender/tests/fixtures/benign/lms-training-modules.txt | Benign fixture expected to be allowed. |
| plugins/security/stackone-defender/tests/fixtures/benign/hiking-trail.md | Benign fixture expected to be allowed. |
| plugins/security/stackone-defender/tests/fixtures/benign/git-log.txt | Benign fixture expected to be allowed. |
| plugins/security/stackone-defender/tests/fixtures/realistic/support-ticket.txt | Realistic injection fixture expected to be blocked. |
| plugins/security/stackone-defender/tests/fixtures/realistic/slack-thread.txt | Realistic injection fixture expected to be blocked. |
| plugins/security/stackone-defender/tests/fixtures/realistic/document-summary.md | Realistic injection fixture expected to be blocked. |
| plugins/security/stackone-defender/tests/fixtures/tricky/research-note-on-injection.md | Tricky FP-bait fixture with override pinned to current blocked behavior. |
| plugins/security/stackone-defender/tests/fixtures/tricky/release-notes-2.5.md | Tricky FP-bait fixture expected to be allowed. |
| plugins/security/stackone-defender/tests/fixtures/tricky/incident-postmortem.md | Tricky FP-bait fixture expected to be allowed. |
| plugins/security/stackone-defender/tests/fixtures/tricky/employee-policy.md | Tricky FP-bait fixture expected to be allowed. |
| plugins/security/stackone-defender/tests/fixtures/tricky/api-response-listing.json | Tricky structured-output fixture expected to be allowed. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Sort and file-filter fixtures so test order is deterministic across filesystems and unexpected non-file entries don't break the suite. - Pin engines.node >=22 to document the --test-force-exit requirement. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
0 issues found across 2 files (changes from recent commits).
Auto-approved: This PR adds only test fixtures and a regression test suite that runs the existing defender library against canonical examples; no production code, config, or dependencies are modified, and the blast radius is limited to CI assertion changes.
Re-trigger cubic
Summary
Adds an automated regression test for the
stackone-defenderplugin that pins production scan decisions on a canonical set of 12 fixtures. Replaces the manual-checklist portion of QA handoff with a CI-runnable assertion.benign/fixtures → expectedallowed: truerealistic/fixtures (real injection attacks) → expectedallowed: falsetricky/fixtures (FP-bait content discussing/quoting attacks) → expectedallowed: true, with one known-FP override pinned to current behaviorThe test loads
@stackone/defenderdirectly using the samedefender-daemon.config.jsonthe daemon consumes in production. Runs offline — no daemon, no socket, no network.How to run
Currently green locally: 12/12 pass in ~0.5s after warmup.
What this replaces in the QA handoff
QA no longer needs to manually verify "does defender catch the demo fixtures?" — CI does it. What's left for QA is the behavioral layer:
Test plan
npm install && npm testfromplugins/security/stackone-defender/passes 12/12tests/fixtures/realistic/causes the suite to assert againstallowed: falseautomatically@stackone/defenderand re-running surfaces any behavior drift🤖 Generated with Claude Code
Summary by cubic
Adds a fast, offline regression test suite for the
stackone-defenderplugin that pins production decisions on 12 canonical fixtures to catch behavior drift in CI. Uses the daemon’sdefender-daemon.config.json, adds annpm testscript, and pins Node >=22.New Features
@stackone/defenderwith production config; no daemon, socket, or network.tests/fixtures/*; assertions adapt automatically.Bug Fixes
Written for commit 17eaf7c. Summary will update on new commits. Review in cubic