Close remaining known-risks gaps (A2, D4, SPOF test coverage)#184
Conversation
Two items needed a behavior decision and got one: - A2: FetchHtml now calls Http::get()->throw(), treating any non-2xx response the same as a connection failure — neither writes to RawPage nor gets cached. Previously only connection exceptions were caught; a 404/500 response body would silently get written/cached. - D4: extracted Page update and Pak sync into UpdateOrCreatePageWithPaks, wrapping both in DB::transaction() so a SyncPak failure can no longer leave a Page with new title/text but stale pak associations visible to search. All three site Handlers now use this combined action. B3 (Notion re-sync throttling) stays unimplemented per decision — recorded in the ledger only, revisit if API quota becomes a problem. The rest is closing out every remaining SPOF row (single test) to OK (2+ tests) per the ledger's own correction conditions: A1/A3, B1/B2, C1/C2, D1/D3. Notably C1 had no direct web-route (non-JSON) test, and B2's null-URL delete guard and delete-side fault isolation had no test at all despite being implemented in the prior PR. known-risks.md updated to 🟢 OK across the board except B3 (still 🔴 Missing, by decision) and D2 (N/A, no draft/publish concept exists). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
There was a problem hiding this comment.
Code Review
This pull request introduces atomic page updates and pak synchronization using a new UpdateOrCreatePageWithPaks action wrapped in a database transaction, resolving a potential data consistency issue. It also updates FetchHtml to throw exceptions on non-2xx responses to prevent writing error pages to RawPage, and adds comprehensive feature tests. The review feedback suggests using DB::partialMock() instead of DB::shouldReceive() in tests to avoid breaking subsequent database queries, and recommends restricting HTTP retries to connection issues or 5xx server errors rather than retrying on permanent 4xx client errors.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
- UpdateOrCreatePageWithPaksTest: DB::shouldReceive('transaction')
replaces the whole database manager in the container, breaking the
real queries UpdateOrCreatePage/SyncPak issue inside the mocked
transaction. Switch to DB::partialMock() so only transaction() is
mocked and everything else still hits the real connection.
- FetchHtml: retry() previously retried on any throwable, including
4xx client errors that retrying can never fix (404/403 won't
resolve on attempt 2). Added a $when callback so only connection
exceptions and 5xx server errors are retried; 4xx fails immediately.
Added tests asserting the call count for both branches.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces transactional atomicity to the page extraction process by wrapping page updates and pak synchronization inside a single database transaction via the new UpdateOrCreatePageWithPaks action. Additionally, it improves HTTP scraping robustness in FetchHtml by throwing exceptions on non-2xx responses and only retrying on 5xx server errors or connection failures. A comprehensive suite of feature tests has been added to cover these scenarios, along with updates to the risk documentation. There are no review comments to address, and the changes look solid.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
Summary
Follow-up to #183, closing out the remaining open items in docs/known-risks.md.
Two items needed a behavior decision before implementing:
FetchHtmlcallsHttp::get()->throw(), so a 404/500 response is handled the same as a connection exception: it's caught by the Handler's existing try/catch and never written toRawPageor cached. (Decision: treat all non-2xx the same way, no 4xx/5xx distinction.)UpdateOrCreatePageWithPaks, wrapping the Page update and Pak sync in a singleDB::transaction(). Previously aSyncPakfailure could leave a Page with new title/text but stale/missing pak associations visible to search. All three site Handlers (Japan/Twitrans/Portal) now use this combined action instead of callingUpdateOrCreatePage+SyncPakseparately.B3 (Notion re-sync throttling) stays unimplemented per decision — recorded in the ledger only, to revisit if API quota becomes an actual problem.
The rest is closing every remaining 🟡 SPOF row (single test) to 🟢 OK by adding a second test, per the ledger's own stated correction conditions:
FeedActionTestproving the RSS feed path also doesn't eager-loadrawPage.Page::toFeedItem()itself, not justPageResource.docs/known-risks.md is now 🟢 OK across the board except B3 (🔴 Missing, by decision) and D2 (N/A — no draft/publish concept exists on
Page).Test plan
vendor/bin/pint --dirty— passedcomposer stan(PHPStan level 9) — no errorsvendor/bin/rector process --dry-run— no changes neededphp artisan test— 55 passed, 124 assertions, against an isolatedcs_testdatabase🤖 Generated with Claude Code