fix: add _handle_qa error handling, workflow_dispatch simulate mode, cache env improvements#101
Conversation
… mode, improve cache env vars Three fixes in one PR: 1. Error handling: _handle_qa now wraps CSM_QA.from_env() and qa_engine.ask() in try/except, posting user-facing error replies on failure instead of crashing the workflow. Also added error handling to _handle_join and _handle_other. This aligns with discussion_bot.py's defensive pattern (lines 932-936). 2. Simulate mode: workflow_dispatch now supports discussion_number=0 (made optional with default '0') to simulate a complete classify+intent flow without a real Discussion. In simulate mode: - QA: uses ROUTER_COMMENT_BODY as question, calls CSM_QA.ask(), prints result - JOIN: prints simulated condition report - OTHER: prints guide message All handlers skip Discussion API calls when discussion_number==0. 3. Workflow env: added GH_TOKEN to QA step (for wiki clone auth), ROUTER_COMMENT_BODY+ROUTER_EVENT_TYPE to QA/JOIN/OTHER steps (required for simulate mode and consistent logging).
nevstop
left a comment
There was a problem hiding this comment.
Review: fix/router-qa-error-handling-and-simulate
Verdict: minor issues, OK to ship after fixing the blocking item
Blocking
scripts/router.pyL676 (new):_handle_qa现在使用 router 自己的fetch_discussion获取 discussion,但该函数的 GraphQL 查询缺少author { login }字段(对比discussion_bot.pyL252 中的fetch_discussion包含此字段)。导致 L732discussion.get(author)始终返回None,SKIP_AUTHORS检查永久失效 —nevstop的 discussions 将不再被跳过。
修复: 在 router 的fetch_discussionGraphQL 查询中添加author { login }(与 discussion_bot 版本对齐)。
Should-fix
scripts/router.pyL655–690 (simulate 分支): 模拟模式从os.environ.get(ROUTER_COMMENT_BODY)读取问题,但main()已经将--comment-body参数解析到args.comment_body。更一致的做法是将comment_body作为参数传入_handle_qa(与_handle_join接收comment_author的模式对齐),避免同一个值从两种途径获取。scripts/router.pyL935–936:_handle_join的GitHubGraphQL(token)新增了 try/exceptValueError,但只捕获了ValueError。如果GitHubGraphQL.__init__抛出其他异常(如RuntimeError),仍会崩溃。建议放宽为except Exception,与其他 handler 保持一致。- 测试覆盖: 新增的 simulate 分支和 error-handling 分支没有单元测试。建议至少添加
test_simulate_qa_no_comment_body和test_handle_qa_init_failure两个 case。
Nits
scripts/router.pyL664: 模拟分支的from csm_llm_qa import CSM_QA # noqa: F811与非模拟分支的导入重复。两个分支互斥所以无害,但# noqa: F811提示这不够优雅。可考虑将 CSM_QA 导入提升到函数顶部通用位置。.github/workflows/org-router.ymlL22:discussion_number默认值'0'作为 string 类型,但 router.py 中--discussion-number是type=int。GitHub Actions 传0给 shell 后python --discussion-number 0能正确解析为 int 0,所以实际没问题,但类型不一致值得注意。
验证:argparse的type=int会调用int(0)→0,无问题。
正向确认
- ✅
_handle_qa的三级 try/except(导入 / from_env / ask)正确对齐discussion_bot.py的防御模式 - ✅ 模拟模式下
main()跳过_build_classify_history(无真实 Discussion 可拉取)逻辑正确 - ✅
is_simulate标志传递给 handler 作为dry_run or is_simulate,确保模拟模式始终不发布回复 - ✅
GH_TOKEN添加到 QA step env 解决 wiki clone 认证问题 - ✅
_handle_other和_handle_join的异常保护合理,不会掩盖根因(均记录logger.exception)
…ent_body as param, widen exception catch
- fetch_discussion: add author { login } at discussion level (was missing,
causing SKIP_AUTHORS check to always fail)
- _handle_qa: accept comment_body as explicit parameter instead of reading
from os.environ (consistent with _handle_join pattern)
- _handle_join: widen GitHubGraphQL init catch from ValueError to Exception
Review 修复 — commit 6189f61🔴 Blocking → 已修复
🟡 Should-fix → 已修复
📝 Nits — 未修改
|
Re-review — commit 6189f61Verdict: ship as-is ✅ 已修复确认
无新问题
小注(不影响合入)
|
…boundLocalError The inside _handle_qa shadowed the module-level import, causing Python to treat GitHubGraphQL as a local variable throughout the entire function. This caused UnboundLocalError in the non-Q&A guidance path (L649/L654) which runs before the import. Fix: remove GitHubGraphQL from the local import — the module-level (L35) is sufficient since discussion_bot re-exports the same class.
测试结果 — commit 09a3790workflow_dispatch 测试(discussion #100,General 分类,dry_run=true):✅ 全部通过 https://github.com/NEVSTOP-LAB/.github/actions/runs/27927696285 发现的额外 bug 及修复测试 main 分支时发现 中存在 UnboundLocalError:
测试通过步骤
|
修复内容
1.
_handle_qa错误处理(根因修复)CSM_QA.from_env()和qa_engine.ask()现在被 try/except 包裹_handle_join和_handle_other同样加强了异常保护discussion_bot.py的防御模式2. workflow_dispatch 手动模拟模式
discussion_number改为可选(默认 0 = 模拟模式)comment_body作为问题,调用 CSM_QA 生成回答并打印到日志ROUTER_COMMENT_BODY、ROUTER_EVENT_TYPE环境变量GH_TOKEN确保 wiki 克隆有认证3. 缓存修复(已在基线 commit 2161106 中)
hashFiles('requirements-bot.txt')check-vs-populated步骤用 jq 校验commit_id防止过期缓存手动测试
discussion_number留空(默认 0)comment_body输入技术问题category_name填Q&Adry_run选trueCloses #99