Rule proposal: canvas-text-annotate — flag canvas/video text surfaces invisible to DOM walkers

### Category
New defense rule (speculative / research-direction)

### What problem does this solve?
Every rule that currently ships defends a DOM-readable surface — text nodes, attributes, structured data, accessibility-tree content. None defend against text that's rendered to `<canvas>` or video frames and only becomes "text" to the agent after OCR. Multimodal computer-use agents (Claude Computer Use, OpenAI Operator/Atlas, Browser Use's screenshot mode) increasingly run vision on rendered pages — at which point every DOM-side defense is bypassed by painting the payload to a canvas instead of writing it as DOM text.

The image-based prompt-injection research line establishes that vision-language encoders do not distinguish "image content the user wants to show" from "instructions embedded inside an image", which is the same architectural problem the indirect-prompt-injection rules already address for text. Reported attack-success rates start at roughly 64% under stealth constraints and are higher under more permissive threat models.

### Proposed solution
Three escalating routes (numbered 1 to 3 below); pick whichever clears the false-positive (FP) bar:

1. **Presence annotation only.** Annotate the page when any `<canvas>` larger than a size threshold is rendered, or when a `<video>` with autoplay is present. This gives the agent a coarse "vision-only content lives here" signal so it can weight DOM content as the primary source.
2. **OCR-side check.** Off the main thread (OffscreenCanvas + worker), rasterize the canvas, run a lightweight OCR pass (e.g. Tesseract.js / WASM, Apache 2.0), and apply the existing prompt-injection pattern set to the recognized text. Replace the canvas with a placeholder when a pattern matches (this option crosses from `annotate` into `redact` territory and should likely be a distinct sibling rule).
3. **Pixel heuristic.** Skip OCR entirely. Annotate canvases that render as full-page or fill the viewport (≥ a configurable fraction). Canvas-as-content is rare outside attack surfaces and a small set of legitimate apps (Figma, Excalidraw, Google Docs canvas renderer, games).

Per repo convention ("defenses against prompt injection should strip the content, not just label it"), route 2 is the only one that can credibly *replace* matched canvas content with a placeholder. Routes 1 and 3 can only annotate. v1 is the annotate-only floor; route 2 is the principled endpoint if it ever clears the cost bar.
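As an illustration of the pixel heuristic, here is a minimal sketch of the viewport-coverage check. The `Rect` and `Viewport` shapes, the function names, and the 0.5 default are assumptions for illustration only (the default mirrors the 50% figure suggested under "Controlling false positives"); none of this is existing repo code.

```typescript
// Hypothetical shapes. In a content script these values would come from
// canvas.getBoundingClientRect() and window.innerWidth / window.innerHeight.
interface Rect { left: number; top: number; width: number; height: number; }
interface Viewport { width: number; height: number; }

// Fraction (0..1) of the viewport area covered by the visible part of the rect.
function viewportCoverage(rect: Rect, viewport: Viewport): number {
  const viewportArea = viewport.width * viewport.height;
  if (viewportArea <= 0) return 0;
  // Clip the rect to the viewport so scrolled-off portions do not count.
  const visibleWidth = Math.max(
    0,
    Math.min(rect.left + rect.width, viewport.width) - Math.max(rect.left, 0),
  );
  const visibleHeight = Math.max(
    0,
    Math.min(rect.top + rect.height, viewport.height) - Math.max(rect.top, 0),
  );
  return (visibleWidth * visibleHeight) / viewportArea;
}

// Assumed default threshold; intended to be configurable.
const DEFAULT_MIN_COVERAGE = 0.5;

// True when the canvas covers at least minCoverage of the viewport.
function shouldAnnotateCanvas(
  rect: Rect,
  viewport: Viewport,
  minCoverage: number = DEFAULT_MIN_COVERAGE,
): boolean {
  return viewportCoverage(rect, viewport) >= minCoverage;
}
```

Clipping to the viewport before computing the ratio means a large canvas that is mostly scrolled out of view does not trip the threshold, which lines up with the visibility requirement discussed for off-screen canvases.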

### Alternatives considered
- **Strip the canvas.** Wrong threat model — many canvases are legitimate (charts, games, design tools). The defense is signalling that the content is invisible to DOM-side rules.
- **Defer to the agent's vision-encoder defenses.** Reasonable, but worth shipping a coarse content-script signal in the meantime.
- **Server-side image proxy that OCRs and re-renders.** Out of scope for an extension.

### Controlling false positives
The dominant FP risk is annotating legitimate canvas-heavy apps. Without strong gating this rule fires on every Figma/Excalidraw/Google Docs page.

- **Origin allowlist for known canvas apps.** Skip the rule entirely on `*.figma.com`, `excalidraw.com`, `docs.google.com`, `*.tldraw.com`, `*.adobe.com`, `miro.com`, `*.canva.com`, `lucid.app`, `*.notion.so` (canvas-rendered tables), `*.codesandbox.io`, common game-host origins (`*.itch.io`, `*.poki.com`), and major CAD/3D tools. Treat the allowlist as a maintenance surface, similar to how `roach-motel-annotate` and `disguised-ad-flag` lean on curated site data.
- **Size threshold.** Only consider canvases that fill ≥50% of the viewport (configurable). A 200×80 chart sparkline is not the attack surface.
- **Stable across mutations.** Canvases redrawn every animation frame (games, video, real-time data viz) should not re-trigger annotation; debounce / annotate-once-per-load.
- **Off-screen canvases excluded.** Many libraries (chart.js, three.js) maintain off-screen render buffers. Require visibility via `IntersectionObserver` before considering.
- **Phrase the annotation carefully.** "Vision-readable content present that DOM-side defenses do not cover" — not "potential injection". Matches the same precise-statement posture as `bot-cloaking-annotate`.
- **For route 2 (OCR), reuse the existing prompt-injection pattern set with whole-string matching.** Same precision bar as `prompt-injection-redact` — avoid matching axis labels in a chart that happen to contain instruction-shaped substrings.
- **For route 2, OCR confidence threshold.** Tesseract.js exposes per-word confidence; require confidence above a floor (e.g., 70) before feeding to the pattern matcher. Garbage-OCR-as-injection is the most embarrassing FP mode.
- **Default-off, experimental.** Same posture as `bot-cloaking-annotate`. Not a default-on rule under any realistic threshold today.
- **Telemetry first.** Before considering default-on, gather per-host hit counts on real browsing data; promote allowlist entries based on observed false-positive sites, same way `schema-trust-sanitize` documents its known-syndicator short-circuit list.
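Two of the gates above, the origin allowlist and the OCR confidence floor, can be sketched as pure helpers. This is a hedged illustration, not existing repo code: the function names and the `OcrWord` shape are hypothetical (a simplified stand-in for per-word OCR output), and treating `*.example.com` as also matching the bare `example.com` is an assumption about how the allowlist should behave.

```typescript
// Match a hostname against one allowlist pattern.
// Assumption: "*.figma.com" matches any subdomain and the apex "figma.com";
// a pattern without "*." must match the hostname exactly.
function hostMatches(hostname: string, pattern: string): boolean {
  const host = hostname.toLowerCase();
  const pat = pattern.toLowerCase();
  if (pat.startsWith("*.")) {
    const base = pat.slice(2);
    // The leading "." prevents "evilfigma.com" from matching "*.figma.com".
    return host === base || host.endsWith("." + base);
  }
  return host === pat;
}

// True when the rule should be skipped entirely for this hostname.
function isAllowlistedHost(hostname: string, patterns: readonly string[]): boolean {
  return patterns.some((p) => hostMatches(hostname, p));
}

// Hypothetical, simplified per-word OCR result; confidence on a 0..100 scale.
interface OcrWord { text: string; confidence: number; }

// Keep only words at or above the confidence floor and join them into the
// string that would be handed to the pattern matcher.
function confidentText(words: readonly OcrWord[], floor: number = 70): string {
  return words
    .filter((w) => w.confidence >= floor)
    .map((w) => w.text)
    .join(" ");
}
```

Keeping both checks as pure functions makes them easy to unit-test separately from canvas rasterization and the OCR worker.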

### Prior art / references
- Image-Based Prompt Injection overview — Christian Schneider, *Multimodal prompt injection: attacks in images, audio, and video.* https://christian-schneider.net/blog/multimodal-prompt-injection/
- Cloud Security Alliance research note on image prompt injection against multimodal LLMs (2026). https://labs.cloudsecurityalliance.org/research/csa-research-note-image-prompt-injection-multimodal-llm-2026/
- *Mind Mapping Prompt Injection* (MDPI Electronics, 2025). https://www.mdpi.com/2079-9292/14/10/1907
- [Tesseract.js](https://github.com/naptha/tesseract.js) (Apache 2.0) — viable in-browser OCR if going down route 2.

Tagged Impact L / Complexity H.
