feat(mcp-web-fetch): expose domain filters#383
Conversation
代码评审报告: feat(mcp-web-fetch): expose domain filters风险等级: 低 级联分析
问题发现无 blocking findings。 新增字段可选、默认 行级发现
Karpathy 评审
缺失覆盖
|
|
Repo-guard non-blocking note acknowledged. I am not adding extra runtime validation in this PR because the existing |
There was a problem hiding this comment.
Pull request overview
This PR updates the @frontagent/mcp-web-fetch MCP tool to expose Anthropic-compatible domain filters (allowed_domains, blocked_domains) and wires them through to the existing engine host filtering (allowHosts, denyHosts), with added no-network security tests to prove rejections happen before any fetch attempt.
Changes:
- Extended the
web_fetchMCP input schema with optionalallowed_domains/blocked_domainsstring-array parameters. - Mapped
allowed_domains -> allowHostsandblocked_domains -> denyHostsin the tool handler. - Added schema and handler-level “no network” tests validating early rejection and that
fetchis never invoked.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| packages/mcp-web-fetch/src/tools.ts | Adds schema fields and maps the new tool args into engine options (allowHosts / denyHosts). |
| packages/mcp-web-fetch/src/tools.security.test.ts | Adds tests covering schema exposure and pre-fetch rejection behavior for allow/block filters. |
| docs/superpowers/plans/2026-07-01-web-fetch-domain-filters.md | Documents an implementation plan for exposing the domain filters and verifying behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| it('exposes allowed_domains and blocked_domains schema parameters', () => { | ||
| expect(webFetchSchema.inputSchema.properties.allowed_domains).toEqual({ | ||
| type: 'array', | ||
| items: { type: 'string' }, | ||
| description: '可选允许域名列表;映射到底层引擎 allowHosts,仅允许抓取这些主机名。', | ||
| }); | ||
| expect(webFetchSchema.inputSchema.properties.blocked_domains).toEqual({ | ||
| type: 'array', | ||
| items: { type: 'string' }, | ||
| description: '可选阻止域名列表;映射到底层引擎 denyHosts,拒绝抓取这些主机名。', | ||
| }); | ||
| expect(webFetchSchema.inputSchema.required).toEqual(['url']); | ||
| }); |
There was a problem hiding this comment.
Fixed in 671cbd0: schema assertions now use toMatchObject so descriptions and future additive schema fields do not make the test brittle.
| allowed_domains: { | ||
| type: 'array', | ||
| items: { type: 'string' }, | ||
| description: '可选允许域名列表;映射到底层引擎 allowHosts,仅允许抓取这些主机名。', | ||
| }, |
There was a problem hiding this comment.
Fixed in 671cbd0: allowed_domains and blocked_domains now declare minItems: 1, and the handler rejects empty provided lists before any fetch attempt. Added a no-network regression test for empty allowed_domains.
There was a problem hiding this comment.
代码评审报告: feat(mcp-web-fetch): expose domain filters
风险等级: 低
处理建议: 评论
决策摘要: 变更范围收敛、契约扩展安全、验收标准基本覆盖,可在补齐 engine 端过滤真实生效的验证后合并;当前测试对拒绝措辞的断言略脆弱且未验证 allowHosts/denyHosts 实际传入 engine。
级联分析
- 变更符号:
webFetchSchema,allWebFetchSchemas,handleWebFetchTool - 受影响流程:
web_fetchMCP 工具调用 →fetchUrlengine 路径(host filter 已有) - 变更集外调用方:
handleWebFetchTool上游仅tools.security.test.ts(1 个),webFetchSchema直接调用方 0;新增字段为可选,向后兼容,未破坏既有调用方 - 置信度: medium(GitNexus 报告 LOW/affected 0,与 diff 一致;但 engine 端
allowHosts/denyHosts的实际拒绝语义未在本 PR diff 中呈现,依赖既有 engine 行为的推断)
问题发现
-
[低] 测试未验证过滤参数真正到达 engine,且未区分“字段透传”与“engine 拒绝”
- 证据:
tools.security.test.ts:70-84、:87-102只断言result.success === false且fetchMock未被调用。但 handler 在把allowHosts/denyHosts传给fetchUrl之前,fetchUrl内部对https://example.com/本身可能因其它 URL safety 检查失败,从而 fetch mock 同样不会被调用——测试无法区分“确因 allowlist/denylist 拒绝”还是“因无关原因拒绝”。 - 受影响调用方/流程:
web_fetch→fetchUrl。若未来 engine 忽略allowHosts/denyHosts,这些测试仍可能因巧合通过。 - 最小可行修复: 断言
result.error明确匹配 allowlist/denylist 拒绝措辞(当前 allow 用例已用/not in allowlist|allow/i,但/allow/i过宽),或增加一个 positive 反向用例:allowed_domains: ['example.com']+ 允许 fetch mock 返回,验证 fetch 被调用,证明 allowHosts 确实透传且未误拒。
- 证据:
-
[低] 拒绝措辞断言与 engine 实际错误文案耦合,易脆
- 证据:
tools.security.test.ts:78用/not in allowlist|allow/i,:96用/denylist|blocked|denied/i。这些正则依赖 engine 内部错误字符串,本 PR 未固定该契约。engine 文案一旦调整(非本 PR 控制),测试会 flakey。 - 最小可行修复: 若 engine 错误文案属于稳定契约,在 engine 层补一条注释/常量引用;否则放宽为断言拒绝发生 + fetch 未调用 + 一个 positive 透传用例,减少对具体措辞的依赖。
- 证据:
行级发现
- [packages/mcp-web-fetch/src/tools.ts:71] 用
args.allowed_domains as string[] | undefined直接 cast,未校验数组元素为 string,非法输入(如[123])会静默透传到 engine;建议在此处对元素类型做最小校验或依赖 schema 校验层明确生效。 - [packages/mcp-web-fetch/src/tools.ts:74] 空数组被显式拒绝,但 schema 已声明
minItems: 1——若 MCP 层已按 schema 校验,此 handler 分支为冗余防御;若 schema 未强制校验,则说明不能依赖 schema,第 71 行的元素类型也同样不可信。请明确 schema 校验是否在调用前生效,并保持两处假设一致。 - [packages/mcp-web-fetch/src/tools.security.test.ts:78]
/allow/i过于宽泛,可能匹配无关错误文案;收紧为/not in allowlist/i以确保断言的是 allowlist 拒绝。
Karpathy 评审
- 假设: handler 假设 MCP 层对 schema
minItems/type已校验,但又在 handler 内重复做空数组校验,说明假设不确定;需明确校验边界归属。 - 简洁性: 实现为最小 adapter 映射,未引入不必要抽象,符合 issue scope,良好。
- 结构质量: 逻辑保留在既有 URL safety/engine 路径,schema+handler 仅做参数透传,无层级泄漏或重复 helper;无结构退化。
- 变更范围: diff 聚焦,未触碰 URL safety policy、private IP、redirect、clamp、web_search、JS 渲染或 auto-approval,符合 issue non-goals。plan 文档为附带产物,非代码风险。
- 验证: 覆盖 schema 存在性、allow/block 拒绝、空 allowlist 拒绝;缺 positive 透传用例与 denylist 边界(空 blocked_domains 已在 handler 处理但无对应测试)。
缺失覆盖
- 缺 positive 用例:
allowed_domains命中允许主机时 fetch 应被调用,证明 allowHosts 未误拒且确实透传到 engine。 - 缺
blocked_domains: []的测试:handler 有对应拒绝分支(tools.ts:79-84)但无测试覆盖。 - 缺 allowed 与 blocked 同时提供时的行为验证(优先级/叠加语义),若属预期使用场景应补一条。
| try { | ||
| switch (toolName) { | ||
| case 'web_fetch': { | ||
| const allowHosts = args.allowed_domains as string[] | undefined; |
There was a problem hiding this comment.
用 args.allowed_domains as string[] | undefined 直接 cast,未校验数组元素为 string,非法输入(如 [123])会静默透传到 engine;建议在此处对元素类型做最小校验或依赖 schema 校验层明确生效。
| const allowHosts = args.allowed_domains as string[] | undefined; | ||
| const denyHosts = args.blocked_domains as string[] | undefined; | ||
|
|
||
| if (Array.isArray(allowHosts) && allowHosts.length === 0) { |
There was a problem hiding this comment.
空数组被显式拒绝,但 schema 已声明 minItems: 1——若 MCP 层已按 schema 校验,此 handler 分支为冗余防御;若 schema 未强制校验,则说明不能依赖 schema,第 71 行的元素类型也同样不可信。请明确 schema 校验是否在调用前生效,并保持两处假设一致。
| throw new Error('network attempted'); | ||
| }); | ||
| globalThis.fetch = fetchMock as unknown as typeof fetch; | ||
|
|
There was a problem hiding this comment.
/allow/i 过于宽泛,可能匹配无关错误文案;收紧为 /not in allowlist/i 以确保断言的是 allowlist 拒绝。
Linked Issue Or Context
Summary
allowed_domainsandblocked_domainsparameters to theweb_fetchMCP schema.allowed_domains-> engineallowHostsandblocked_domains-> enginedenyHosts.Impact Scope
@frontagent/mcp-web-fetchMCP tool schema/handler plus focused tests.GitNexus Impact Summary
packages/mcp-web-fetch/src/tools.ts,packages/mcp-web-fetch/src/tools.security.test.tsimpact(webFetchSchema, upstream)returned LOW with direct callers 0, affected processes 0, modules 0.impact(handleWebFetchTool, upstream)returned LOW with direct upstream 1 (tools.security.test.ts), affected processes 0, modules 0.detect_changes(scope=staged)reported changed_count 3, changed_files 3, affected_count 0, risk_level low; changed symbolswebFetchSchema,allWebFetchSchemas,handleWebFetchTool; affected_processes none.pnpm agent:bootstrap;pnpm quality:predev; focused mcp-web-fetch tests;pnpm quality:precommit;pnpm quality:localbefore push and again via pre-push hook.Verification
pnpm --dir packages/mcp-web-fetch test -- --run src/tools.security.test.tspnpm --dir packages/mcp-web-fetch testpnpm --dir packages/mcp-web-fetch typecheckpnpm quality:precommitpnpm quality:localChecklist
pnpm quality:precommit, or explained why it could not run.pnpm quality:localfor critical skeleton changes, or explained why it could not run.