Category
New defense rule
What problem does this solve?
Even when a page is "clean" (no injection, no dark patterns), sheer length is a defense surface. Liu et al. (TACL 2023), Lost in the Middle: How Language Models Use Long Contexts, document a U-shaped accuracy curve: LLM retrieval and reasoning degrade sharply when relevant information sits in the middle of a long context window, even on models advertised as long-context.
A page with 200 comments or 80 reviews pushes the agent's task instructions and the page payload out to the edges of the context and fills the middle with low-signal entries. The agent then (a) is more likely to miss an answer buried mid-page and (b) is more susceptible to mid-context injection that benefits from positional dilution. The existing comments-redact and reviews-redact rules remove these surfaces entirely. But when the comments are the task ("summarize the top criticism of this product"), removal is wrong: the agent needs the content, just not every one of several hundred entries.
Proposed solution
On pages whose visible text exceeds a budget (proposed: 50k chars; tunable), collapse:
- Comment threads past the first N entries
- Review lists past the first N entries
- Reply chains past the first N levels of depth
into the same click-to-reveal placeholder shape that cross-origin-frame-redact, comments-redact, and irrelevant-sections-redact already use. Agents that need the tail can reveal it explicitly; the default behavior preserves the head of the list (typically the highest-quality entries on engagement-sorted platforms).
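The collapse step described above can be sketched on a simplified in-memory thread model. This is a minimal illustration under stated assumptions, not the rule's implementation: `Entry`, `Placeholder`, `collapse`, and `apply_rule` are hypothetical names, and the values of `HEAD_N` and `MAX_DEPTH` are placeholders, since the proposal leaves N unspecified. Only the 50k-character budget comes from the text.

```python
from dataclasses import dataclass, field
from typing import List, Union

CHAR_BUDGET = 50_000  # proposed page-level trigger from the text; tunable
HEAD_N = 10           # assumption: the proposal does not fix N
MAX_DEPTH = 2         # assumption: reply levels kept before collapsing


@dataclass
class Placeholder:
    """Stand-in for a click-to-reveal region (hypothetical shape)."""
    hidden_count: int  # entries hidden, including nested replies
    reason: str        # "head-n" or "depth"


@dataclass
class Entry:
    text: str
    replies: List[Union["Entry", Placeholder]] = field(default_factory=list)


def count_entries(entries) -> int:
    """Count entries and all of their nested replies."""
    return sum(1 + count_entries(e.replies) for e in entries)


def collapse(entries, head_n=HEAD_N, max_depth=MAX_DEPTH, depth=1):
    """Keep the first head_n entries per level and max_depth levels of replies."""
    kept = []
    for e in entries[:head_n]:
        if depth >= max_depth and e.replies:
            # Replies below the allowed depth become one placeholder.
            replies = [Placeholder(count_entries(e.replies), "depth")]
        else:
            replies = collapse(e.replies, head_n, max_depth, depth + 1)
        kept.append(Entry(e.text, replies))
    tail = entries[head_n:]
    if tail:
        # The tail is hidden behind a placeholder, not deleted.
        kept.append(Placeholder(count_entries(tail), "head-n"))
    return kept


def apply_rule(visible_chars, entries, head_n=HEAD_N,
               max_depth=MAX_DEPTH, budget=CHAR_BUDGET):
    """Collapse only when the page's visible text exceeds the budget."""
    if visible_chars <= budget:
        return entries
    return collapse(entries, head_n, max_depth)
```

A real rule would operate on DOM nodes and keep the hidden content attached to the placeholder so a reveal restores it; the sketch only shows the selection logic and the length gate.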
Alternatives considered
- Lower the cap in comments-redact / reviews-redact. Doesn't help: those rules are all-or-nothing today; this proposal is the "keep the head" variant.
- Reader-mode extraction (Readability.js). Already part of the prior-art lineage; Readability picks a main article and discards comments wholesale. We want the keep-some-comments shape.
- Generic LLM trim. Same shape as irrelevant-sections-redact, but the trigger here is length, not engagement-rail recognition, so no LLM call is needed for a "keep first N" heuristic.
Controlling false positives
- Default-off. Until per-host hit/skip data confirms that the head-N preserves task-relevant content, this rule should ship default-off, the same posture as irrelevant-sections-redact. Users opt in when their workflow tolerates the trade-off.
- Per-host denylist. Never apply the rule on sites where the tail is structurally load-bearing: Hacker News (highly-rated child comments outvalue top-level), GitHub issue threads (resolution often in the last reply), Reddit (AskScience-style threads with cited replies), public-comment portals like regulations.gov, and court records.
- Preserve elements with structural significance markers. Reddit "best answer" flags, Stack Overflow "Accepted" badges, GitHub "Marked as answer", aria-label*="solved", and rows with engagement scores above a per-host percentile. The head-N count should not blindly drop a flagged answer because it sits at position 47.
- High length threshold. 50k visible chars is roughly 12k tokens — well above the budget where Lost-in-the-Middle starts to bite for current frontier models. Tuning low risks redacting pages that fit comfortably in context.
- Reveal-on-demand contract. Collapsed regions become click-to-reveal placeholders, not deletions. An agent that detects the placeholder shape can decide to expand, the same affordance as cross-origin-frame-redact.
- Per-section quotas, not page-wide. Apply N independently to each thread/list so a page with multiple distinct conversation surfaces (an article with comments + a sidebar of reviews) doesn't lose representation in one to keep room for the other.
- Skip when the page is the agent's task surface. If the user's task implies "summarize all reviews", the agent can't tell us — but a per-host denylist for the platforms where this is the common ask (review-aggregator sites like Trustpilot, Amazon SERP review pages once the user navigates to "see all reviews") gets most of the way there.
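Several of the guards above (default-off, per-host denylist, preserved significance markers, per-section quotas) compose into one selection pass. The sketch below is an illustration under assumptions, not the rule's implementation: `Row`, `host_denied`, `select_visible`, and `plan` are hypothetical names, the hostnames in `DENYLIST` are illustrative stand-ins for the sites named in the list, and `score_floor` stands in for a per-host percentile cutoff that would have to be computed elsewhere.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

# Assumption: illustrative hostnames for sites named in the list above.
DENYLIST = {"news.ycombinator.com", "github.com", "regulations.gov"}


@dataclass
class Row:
    text: str
    flagged: bool = False  # e.g. carries an "Accepted" badge or similar marker
    score: int = 0         # engagement score, if the host exposes one


def host_denied(url, denylist=DENYLIST):
    """True when the URL's host is a denylisted domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in denylist)


def select_visible(rows, head_n, score_floor=None):
    """Indices kept visible: the head N plus flagged or high-score rows."""
    keep = []
    for i, row in enumerate(rows):
        significant = row.flagged or (
            score_floor is not None and row.score >= score_floor
        )
        if i < head_n or significant:
            keep.append(i)
    return keep


def plan(url, sections, head_n, enabled=False, score_floor=None):
    """Map each section name to the row indices that stay visible.

    The quota is applied to each section independently, so a long comment
    thread cannot consume the allowance of a separate review list.
    """
    if not enabled or host_denied(url):
        # Default-off or denylisted host: every row stays visible.
        return {name: list(range(len(rows))) for name, rows in sections.items()}
    return {
        name: select_visible(rows, head_n, score_floor)
        for name, rows in sections.items()
    }
```

With `head_n=2` and a flagged row at position 47 of a 50-row thread, `plan` keeps indices 0, 1, and 46 for that section, while a separate three-row review list keeps its own first two rows.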
Prior art / references
Tagged Impact M / Complexity H.