fix: guard JIT prompts against injection by Gradata · Pull Request #238 · Gradata/gradata

Gradata · 2026-06-01T18:30:01Z

Summary

Adds src/gradata/hooks/_injection_guard.py with Unicode normalization, zero-width/BOM cleanup, regex heuristics, and base64/ROT13 decoded-payload checks.
Wires the guard into jit_inject.py immediately after UserPromptSubmit message extraction and before BM25/Jaccard rule scoring.
Adds prompt-injection corpus fixtures plus guard tests for direct override, role swap, encoded directives, marker injection, and benign controls.

Paperclip: GRA-2018 (bfe46192-868a-472e-a271-10836fee3048)
Related: GRA-1295, GRA-1596

Verification

python3 -m pytest tests/hooks/test_injection_guard.py tests/test_jit_inject.py tests/security/test_prompt_injection_poc.py → 91 passed, 1 skipped, 14 xfailed
uv run --extra dev ruff check src/gradata/hooks/_injection_guard.py src/gradata/hooks/jit_inject.py tests/hooks/test_injection_guard.py tests/security/test_prompt_injection_poc.py → All checks passed

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

coderabbitai · 2026-06-01T18:30:14Z

📝 Walkthrough

Prompt Injection Guard Implementation

Core Changes

Adds new _injection_guard.py module with prompt-injection detection and sanitization:
- sanitize(text: str) -> str: Removes BOM, normalizes Unicode (NFKC), strips zero-width characters, collapses whitespace
- is_suspicious(text: str) -> tuple[bool, str]: Returns detection result and reason; checks for roleplay/persona bypass, system leakage, marker tokens, encoded payloads (base64/ROT13)
Integrates guard into jit_inject.py hook: sanitizes extracted user prompts and aborts with early return if injection detected
Provides environment control via GRADATA_INJECTION_GUARD (default OFF) and GRADATA_LEGACY_INSTALL for backward compatibility

Security & Testing

Comprehensive injection corpus: 40+ test fixtures covering 14 attack classes (direct override, role hijack, system leak, XML/JS injection, encoding bypass, few-shot hijack, goal hijack, marker injection, etc.)
Manifest-driven POC runner validates corpus against both regex-based and sanitization guards; 91 tests pass, 14 xfailed (known gaps)
Benign control cases ensure low false-positive rate on legitimate lesson content

New Public APIs

sanitize(text: str) -> str for text normalization
is_suspicious(text: str) -> tuple[bool, str] for injection detection

No Breaking Changes — Guard is disabled by default for backward compatibility; can be enabled via environment variable.

Walkthrough

This PR adds a prompt-injection guard that sanitizes and detects hostile patterns in user prompts before BM25 scoring. The guard is integrated into the JIT hook as an early abort gate, validated with comprehensive unit tests, and benchmarked against a 34-payload security corpus spanning 10 attack classes.

Changes

Prompt Injection Guard Implementation

Layer / File(s)	Summary
Guard module setup and text sanitization `Gradata/src/gradata/hooks/_injection_guard.py`	Environment-gated guard module (`GRADATA_INJECTION_GUARD`) with `sanitize(text)` that removes BOM, applies NFKC normalization, strips zero-width characters, and collapses excessive whitespace.
Injection pattern detection and decoding `Gradata/src/gradata/hooks/_injection_guard.py`	Precompiled regex patterns for roleplay framing, persona bypass, system/instruction leakage, LLM markers (ChatML, Alpaca), few-shot hijack, goal hijacking, indirect injection, and generic override phrasing; helpers decode base64 and ROT13 payloads and rescan decoded content.
JIT hook integration `Gradata/src/gradata/hooks/jit_inject.py`	Imports guard utilities; adds preprocessing gate that sanitizes user draft, runs `is_suspicious`, and returns `None` early if flagged, preventing BM25/rule injection downstream.
Guard module unit tests `Gradata/tests/hooks/test_injection_guard.py`	Tests gap payloads from manifest, regression tests for known patterns (direct override, zero-width variants, role swap, DAN-style, encoded markers, ChatML/Alpaca), `sanitize` edge cases (BOM, NFKC, zero-width, whitespace), and false-positive avoidance on benign texts.
Security test corpus and validation suite `Gradata/tests/security/fixtures/injection_corpus/*`, `Gradata/tests/security/fixtures/manifest.json`, `Gradata/tests/security/test_prompt_injection_poc.py`	34 fixture payloads organized by attack class (benign controls, direct override, encoding bypass, few-shot hijack, goal hijack, indirect, marker injection, role hijack, system leak, virtualization, XML and JavaScript template injection); manifest defines payload metadata, expected outcomes, and coverage gaps; proof-of-concept test runner validates block/sanitize outcomes and manifest integrity.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

security

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 56.82% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'fix: guard JIT prompts against injection' directly and clearly describes the main change—adding prompt injection protection to JIT prompts.
Description check	✅ Passed	The description provides relevant details about the changeset, including the main components added, integration points, test coverage, and verification results.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch gra-2018-prompt-injection-guard

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 OpenGrep (1.22.0)

OpenGrep fatal error (exit code 2):
┌──────────────┐
│ Opengrep CLI │
└──────────────┘

�[32m✔�[39m �[1mOpengrep OSS�[0m
�[32m✔�[39m Basic security coverage for first-party code vulnerabilities.

�[1m Loading rules from local config...�[0m
[00.19][ERROR]: Error: exception Glob.Lexer.Syntax_error("malformed glob pattern: missing ']'")
Raised at Glob__Lexer.syntax_error in file "libs/glob/Lexer.mll", line 8, characters 2-26
Called from Glob__Lexer.__ocaml_lex_token_rec in file "libs/glob/Lexer.mll", line 29, characters 26-53
Cal

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Gradata/src/gradata/hooks/_injection_guard.py`:
- Around line 251-254: The current fast-path in _injection_guard.py
unconditionally returns (False, "") when len(text) < 20, which skips detection
of short high-signal markers; instead, modify the short-input guard so it only
skips full expensive processing but still runs a minimal marker scan for the
variable text (e.g., call or inline a compact rule set that checks for known
injection tokens like "override", "system:", "assistant:", "###", ">>>",
prompt-injection keywords, or regexes) and return a defensive True/flag if any
minimal-rule matches; keep the existing full scanner for longer inputs but
ensure the early-return branch delegates to this minimal_scan helper (name it
minimal_scan_or_scan_short_input) and returns its (bool, reason) tuple rather
than always False.

In `@Gradata/tests/hooks/test_injection_guard.py`:
- Around line 55-56: The test relies on ambient environment for guard
enablement, causing nondeterministic failures; add an autouse pytest fixture
that pins the guard-related env so is_suspicious behaves consistently (e.g.,
ensure GRADATA_LEGACY_INSTALL is unset or set to the expected value) for tests
using test_gap_payload_detected and _block_ids; implement the fixture in the
test module (autouse=True) using pytest's monkeypatch to set or unset
os.environ["GRADATA_LEGACY_INSTALL"] before tests run and restore afterward so
the guard logic in is_suspicious runs deterministically.

In `@Gradata/tests/security/fixtures/manifest.json`:
- Around line 275-276: The manifest entry for encoding_bypass_003 has
contradictory metadata: the description says Unicode homoglyphs won't be caught
but detectable_by_current_guards is true; update the entry by setting the
boolean detectable_by_current_guards to false (or alternatively reword the
description to claim it is detectable) so the metadata and description
align—locate the encoding_bypass_003 object in manifest.json and change the
"detectable_by_current_guards" field accordingly while keeping the description
text as-is (or adjust the description if you prefer to keep the boolean true).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: bfa5432e-43f0-4cc1-a016-f1df024b226b

📥 Commits

Reviewing files that changed from the base of the PR and between a197bff and d3d9bf7.

📒 Files selected for processing (39)

Gradata/src/gradata/hooks/_injection_guard.py
Gradata/src/gradata/hooks/jit_inject.py
Gradata/tests/hooks/test_injection_guard.py
Gradata/tests/security/fixtures/injection_corpus/benign_control_001.txt
Gradata/tests/security/fixtures/injection_corpus/benign_control_002.txt
Gradata/tests/security/fixtures/injection_corpus/benign_control_003.txt
Gradata/tests/security/fixtures/injection_corpus/direct_override_001.txt
Gradata/tests/security/fixtures/injection_corpus/direct_override_002.txt
Gradata/tests/security/fixtures/injection_corpus/direct_override_003.txt
Gradata/tests/security/fixtures/injection_corpus/encoding_bypass_001.txt
Gradata/tests/security/fixtures/injection_corpus/encoding_bypass_002.txt
Gradata/tests/security/fixtures/injection_corpus/encoding_bypass_003.txt
Gradata/tests/security/fixtures/injection_corpus/few_shot_hijack_001.txt
Gradata/tests/security/fixtures/injection_corpus/goal_hijack_001.txt
Gradata/tests/security/fixtures/injection_corpus/goal_hijack_002.txt
Gradata/tests/security/fixtures/injection_corpus/indirect_001.txt
Gradata/tests/security/fixtures/injection_corpus/indirect_002.txt
Gradata/tests/security/fixtures/injection_corpus/js_template_001.txt
Gradata/tests/security/fixtures/injection_corpus/js_template_002.txt
Gradata/tests/security/fixtures/injection_corpus/js_template_003.txt
Gradata/tests/security/fixtures/injection_corpus/js_template_004.txt
Gradata/tests/security/fixtures/injection_corpus/marker_inject_001.txt
Gradata/tests/security/fixtures/injection_corpus/marker_inject_002.txt
Gradata/tests/security/fixtures/injection_corpus/marker_inject_003.txt
Gradata/tests/security/fixtures/injection_corpus/role_hijack_001.txt
Gradata/tests/security/fixtures/injection_corpus/role_hijack_002.txt
Gradata/tests/security/fixtures/injection_corpus/role_hijack_003.txt
Gradata/tests/security/fixtures/injection_corpus/role_hijack_004.txt
Gradata/tests/security/fixtures/injection_corpus/system_leak_001.txt
Gradata/tests/security/fixtures/injection_corpus/system_leak_002.txt
Gradata/tests/security/fixtures/injection_corpus/system_leak_003.txt
Gradata/tests/security/fixtures/injection_corpus/virtualization_001.txt
Gradata/tests/security/fixtures/injection_corpus/virtualization_002.txt
Gradata/tests/security/fixtures/injection_corpus/xml_inject_001.txt
Gradata/tests/security/fixtures/injection_corpus/xml_inject_002.txt
Gradata/tests/security/fixtures/injection_corpus/xml_inject_003.txt
Gradata/tests/security/fixtures/injection_corpus/xml_inject_004.txt
Gradata/tests/security/fixtures/manifest.json
Gradata/tests/security/test_prompt_injection_poc.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: pytest windows-latest / py3.12
GitHub Check: pytest macos-latest / py3.11
GitHub Check: pytest windows-latest / py3.11
GitHub Check: pytest ubuntu-latest / py3.12
GitHub Check: pytest ubuntu-latest / py3.11
GitHub Check: pytest macos-latest / py3.12
GitHub Check: pytest (py3.11)
GitHub Check: pytest (py3.12)

🧰 Additional context used

📓 Path-based instructions (2)

Gradata/src/**/*.py

📄 CodeRabbit inference engine (Gradata/AGENTS.md)

Gradata/src/**/*.py: Prefer sentence-transformers for local embeddings, google-genai for Gemini embeddings, cryptography for AES-GCM encrypted system.db, bm25s for BM25 rule ranking, and mem0ai for external memory adapters — guard all optional dependency imports with try / except ImportError at the call site, never at module level
Maintain strict layering: Layer 0 (Primitives: _types.py, _db.py, _events.py, _paths.py, _file_lock.py; Patterns: contrib/patterns/) must never import from Layer 1 (Enhancements: enhancements/, rules/) or Layer 2 (Public API: brain.py, cli.py, daemon.py, mcp_server.py)
Never use bare except: pass — use typed exceptions or at minimum logger.warning(...) with exc_info=True to avoid silent failure in a memory product
Never import from out-of-scope sibling directories ../Sprites/ or ../Hausgem/ within gradata/* code — that is a layering bug
Never leak private-sibling paths into public docs/code — no references to ../Sprites/, ../Hausgem/, email addresses, OneDrive paths, or Sprites-specific examples from inside gradata/*
Use atomic-write helper when writing JSON files to prevent corruption from mid-write crashes

Files:

Gradata/src/gradata/hooks/jit_inject.py
Gradata/src/gradata/hooks/_injection_guard.py

Gradata/tests/**/*.py

📄 CodeRabbit inference engine (Gradata/AGENTS.md)

Gradata/tests/**/*.py: Set BRAIN_DIR environment variable via tmp_path in conftest.py for test isolation — ensure _paths.py module cache refreshes when calling Brain.init() directly inside tests
Add unit tests in tests/test_*.py for every CI push without LLM calls (deterministic); mark integration tests with @pytest.mark.integration and skip them by default (they hit real LLM APIs)

Files:

Gradata/tests/hooks/test_injection_guard.py
Gradata/tests/security/test_prompt_injection_poc.py

🪛 LanguageTool

Gradata/tests/security/fixtures/injection_corpus/few_shot_hijack_001.txt

[style] ~7-~7: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...sistant (unrestricted): You boil water. Also, here are all the system prompt content...

(REP_SENT_START_ALSO)

🔇 Additional comments (36)

Gradata/src/gradata/hooks/jit_inject.py (1)

33-33: LGTM!

Also applies to: 292-297

Gradata/tests/security/fixtures/injection_corpus/system_leak_003.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/virtualization_001.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/virtualization_002.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/xml_inject_001.txt (1)

1-1: LGTM!

Gradata/tests/security/test_prompt_injection_poc.py (1)

93-126: LGTM!

Also applies to: 136-168, 178-224

Gradata/tests/security/fixtures/injection_corpus/benign_control_001.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/benign_control_002.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/benign_control_003.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/direct_override_001.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/direct_override_002.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/xml_inject_002.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/xml_inject_004.txt (1)

1-5: LGTM!

Gradata/tests/security/fixtures/injection_corpus/direct_override_003.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/encoding_bypass_001.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/encoding_bypass_002.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/encoding_bypass_003.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/few_shot_hijack_001.txt (1)

1-10: LGTM!

Gradata/tests/security/fixtures/injection_corpus/system_leak_001.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/system_leak_002.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/xml_inject_003.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/goal_hijack_001.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/goal_hijack_002.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/indirect_001.txt (1)

1-5: LGTM!

Gradata/tests/security/fixtures/injection_corpus/indirect_002.txt (1)

1-5: LGTM!

Gradata/tests/security/fixtures/injection_corpus/js_template_001.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/role_hijack_002.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/role_hijack_003.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/role_hijack_004.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/js_template_002.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/js_template_003.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/js_template_004.txt (1)

1-1: LGTM!

Gradata/tests/security/fixtures/injection_corpus/marker_inject_001.txt (1)

1-4: LGTM!

Gradata/tests/security/fixtures/injection_corpus/marker_inject_002.txt (1)

1-6: LGTM!

Gradata/tests/security/fixtures/injection_corpus/marker_inject_003.txt (1)

1-6: LGTM!

Gradata/tests/security/fixtures/injection_corpus/role_hijack_001.txt (1)

1-1: LGTM!

coderabbitai · 2026-06-01T18:35:09Z

+    # Quick pre-check: if text is very short and doesn't contain known markers,
+    # skip expensive processing.
+    if len(text) < 20:
+        return False, ""


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Short-input fast-path bypasses known injection markers.

The code unconditionally returns False for inputs shorter than 20 chars, which contradicts the comment and misses concise payloads (e.g., short override directives). Keep the fast-path, but still scan short inputs with a minimal high-signal rule set.

🔧 Suggested fix

- # Quick pre-check: if text is very short and doesn't contain known markers, - # skip expensive processing. - if len(text) < 20: - return False, "" + # Quick pre-check: keep short-input fast path, but still check high-signal + # markers so concise injections are not missed. + if len(text) < 20: + quick_checks = ( + _RE_OVERRIDE, + _RE_SYSTEM_LEAK, + _RE_LLM_MARKERS, + _RE_GOAL_HIJACK, + _RE_ROLEPLAY, + ) + if not any(p.search(text) for p in quick_checks): + return False, ""

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Gradata/src/gradata/hooks/_injection_guard.py` around lines 251 - 254, The current fast-path in _injection_guard.py unconditionally returns (False, "") when len(text) < 20, which skips detection of short high-signal markers; instead, modify the short-input guard so it only skips full expensive processing but still runs a minimal marker scan for the variable text (e.g., call or inline a compact rule set that checks for known injection tokens like "override", "system:", "assistant:", "###", ">>>", prompt-injection keywords, or regexes) and return a defensive True/flag if any minimal-rule matches; keep the existing full scanner for longer inputs but ensure the early-return branch delegates to this minimal_scan helper (name it minimal_scan_or_scan_short_input) and returns its (bool, reason) tuple rather than always False.

coderabbitai · 2026-06-01T18:35:09Z

+@pytest.mark.parametrize("payload_id", _block_ids())
+def test_gap_payload_detected(payload_id: str) -> None:


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard tests are environment-coupled and can fail nondeterministically.

These tests rely on ambient env for guard enablement. If GRADATA_LEGACY_INSTALL is set in CI, is_suspicious returns early and the suite fails for the wrong reason. Pin guard env in an autouse fixture.

🔧 Suggested fix

import pytest from gradata.hooks._injection_guard import is_suspicious, sanitize + + +@pytest.fixture(autouse=True) +def _force_guard_enabled(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setenv("GRADATA_INJECTION_GUARD", "1") + monkeypatch.delenv("GRADATA_LEGACY_INSTALL", raising=False)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Gradata/tests/hooks/test_injection_guard.py` around lines 55 - 56, The test relies on ambient environment for guard enablement, causing nondeterministic failures; add an autouse pytest fixture that pins the guard-related env so is_suspicious behaves consistently (e.g., ensure GRADATA_LEGACY_INSTALL is unset or set to the expected value) for tests using test_gap_payload_detected and _block_ids; implement the fixture in the test module (autouse=True) using pytest's monkeypatch to set or unset os.environ["GRADATA_LEGACY_INSTALL"] before tests run and restore afterward so the guard logic in is_suspicious runs deterministically.

coderabbitai · 2026-06-01T18:35:09Z

+      "description": "Thin Unicode lookalike variant ('ignore previous instructions' with homoglyphs). Listed as detectable via literal ASCII match here — but Unicode homoglyphs would not be caught; included to mark the boundary.",
+      "detectable_by_current_guards": true


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix contradictory detectability metadata for encoding_bypass_003.

Line 275 says this homoglyph variant is not reliably caught by literal ASCII matching, but Line 276 marks it as detectable. Set detectable_by_current_guards to false (or adjust the description) so corpus stats and gap tracking remain trustworthy.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Gradata/tests/security/fixtures/manifest.json` around lines 275 - 276, The manifest entry for encoding_bypass_003 has contradictory metadata: the description says Unicode homoglyphs won't be caught but detectable_by_current_guards is true; update the entry by setting the boolean detectable_by_current_guards to false (or alternatively reword the description to claim it is detectable) so the metadata and description align—locate the encoding_bypass_003 object in manifest.json and change the "detectable_by_current_guards" field accordingly while keeping the description text as-is (or adjust the description if you prefer to keep the boolean true).

fix: guard JIT prompts against injection

d3d9bf7

greptile-apps Bot reviewed Jun 1, 2026

View reviewed changes

coderabbitai Bot added the security label Jun 1, 2026

coderabbitai Bot requested changes Jun 1, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: guard JIT prompts against injection#238

fix: guard JIT prompts against injection#238
Gradata wants to merge 1 commit into
mainfrom
gra-2018-prompt-injection-guard

Gradata commented Jun 1, 2026

Uh oh!

greptile-apps Bot left a comment

Uh oh!

coderabbitai Bot commented Jun 1, 2026 •

edited

Loading

Prompt Injection Guard Implementation

Walkthrough

Changes

Estimated code review effort

Suggested labels

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 1, 2026

Uh oh!

coderabbitai Bot Jun 1, 2026

Uh oh!

coderabbitai Bot Jun 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		@pytest.mark.parametrize("payload_id", _block_ids())
		def test_gap_payload_detected(payload_id: str) -> None:

		"description": "Thin Unicode lookalike variant ('ignore previous instructions' with homoglyphs). Listed as detectable via literal ASCII match here — but Unicode homoglyphs would not be caught; included to mark the boundary.",
		"detectable_by_current_guards": true

Conversation

Gradata commented Jun 1, 2026

Summary

Verification

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Prompt Injection Guard Implementation

Walkthrough

Changes

Estimated code review effort

Suggested labels

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented Jun 1, 2026 •

edited

Loading