Unsupervisedcom · sansari · May 20, 2026 · May 20, 2026 · May 20, 2026 · May 20, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Added
 
 - `review_depth: lightweight` annotation for review blocks in `job.yml` step outputs and `.deepreview` rules — when set, the workflow's `common_job_info` preamble is omitted from review instruction files, reducing token overhead for trivial or reversible intermediate steps (closes #86)
+- Hung-reviewer retry policy in quality gate guidance: reviewers completing with 0 tool uses are retried up to `REVIEWER_MAX_RETRIES` (default: 1) times before being skipped with a "manual review recommended" note; fast-fails (elapsed < `REVIEWER_FAST_FAIL_SECONDS`, default: 30s) are retried immediately (closes #408)
 
 ### Changed
 

diff --git a/doc/specs/deepwork/jobs/JOBS-REQ-004-quality-review-system.md b/doc/specs/deepwork/jobs/JOBS-REQ-004-quality-review-system.md
@@ -73,3 +73,13 @@ The quality review system evaluates step outputs against defined quality criteri
 4. The synthetic task's instructions MUST be prefixed with the same preamble used for file-based reviews (workflow `common_job_info` and input context).
 5. String outputs with `None` values MUST be skipped.
 6. The synthetic task's `review_id` MUST incorporate the string value into its content hash so that distinct string values produce distinct cache keys (per REVIEW-REQ-009.1.7).
+
+### JOBS-REQ-004.9: Hung Reviewer Retry Policy
+
+1. The review guidance MUST include a clearly labelled hung-reviewer retry policy section.
+2. The guidance MUST define a reviewer as "hung" when it completes with 0 tool uses and produces no substantive output.
+3. The guidance MUST instruct the agent to retry a hung reviewer up to `REVIEWER_MAX_RETRIES` times (default: 1) before skipping.
+4. After exhausting retries the guidance MUST instruct the agent to: (a) call `mark_review_as_passed` with the review ID only (the tool accepts no comment argument), and (b) separately inform the user with a skip message — not silently pass the review without any attempt.
+5. The guidance MUST distinguish a fast-fail (elapsed time under `REVIEWER_FAST_FAIL_SECONDS`, default: 30) from a slow-hang (elapsed time at or above that threshold), advising an immediate retry for fast-fails and a brief-pause retry for slow-hangs.
+6. `REVIEWER_MAX_RETRIES` and `REVIEWER_FAST_FAIL_SECONDS` MUST be named module-level constants (not inline magic numbers) so they can be adjusted without editing the guidance prose.
+7. `_build_review_guidance()` MUST accept `max_retries` and `fast_fail_seconds` keyword arguments so callers can override the defaults in tests or alternative configurations.
diff --git a/src/deepwork/jobs/mcp/quality_gate.py b/src/deepwork/jobs/mcp/quality_gate.py
@@ -31,6 +31,19 @@
 
 logger = logging.getLogger("deepwork.jobs.mcp.quality_gate")
 
+# Hung-reviewer retry policy (configurable constants).
+#
+# A reviewer is considered "hung" when it completes with 0 tool uses —
+# a strong signal of API overload or a dropped connection.  The dispatch
+# guidance instructs Claude to retry hung reviewers up to
+# REVIEWER_MAX_RETRIES times before skipping and logging a warning.
+#
+# REVIEWER_FAST_FAIL_SECONDS: if a reviewer returns 0 tool uses *and*
+# elapsed time is below this threshold it almost certainly never received
+# an API response at all — retry immediately without any backoff.
+REVIEWER_MAX_RETRIES: int = 1
+REVIEWER_FAST_FAIL_SECONDS: int = 30
+
 
 def validate_json_schemas(
     outputs: dict[str, ArgumentValue],
@@ -480,8 +493,27 @@ def run_quality_gate(
     return guidance
 
 
-def _build_review_guidance(review_output: str) -> str:
-    """Build the complete review guidance including /review skill instructions."""
+def _build_review_guidance(
+    review_output: str,
+    max_retries: int = REVIEWER_MAX_RETRIES,
+    fast_fail_seconds: int = REVIEWER_FAST_FAIL_SECONDS,
+) -> str:
+    """Build the complete review guidance including hung-reviewer retry policy.
+
+    Args:
+        review_output: Formatted list of review tasks from format_for_claude.
+        max_retries: How many times to retry a hung reviewer before skipping.
+        fast_fail_seconds: Elapsed-time threshold below which a 0-tool-use
+            result is treated as a fast-fail and retried immediately.
+    """
+    retry_word = "retry" if max_retries == 1 else "retries"
+    retry_instruction = (
+        "Retry it once" if max_retries == 1 else f"Retry it up to {max_retries} times"
+    )
+    exhaust_condition = (
+        "If the retry also returns" if max_retries == 1 else f"If all {max_retries} retries return"
+    )
+    total_attempts = max_retries + 1
     return f"""Quality reviews are required before this step can advance.
 
 {review_output}
@@ -490,6 +522,18 @@ def _build_review_guidance(review_output: str) -> str:
 
 For each review task listed above, launch it as a parallel Agent. The task's prompt field points to an instruction file — read it and follow the review instructions.
 
+## Handling Hung Reviewers
+
+A reviewer has **hung** when it completes with **0 tool uses** — a signal of API overload or a dropped connection.  Hung reviewers must be retried; do **not** silently pass them.
+
+**Retry policy (max {max_retries} {retry_word} per reviewer)**:
+
+1. After each reviewer completes, check whether it made 0 tool uses (shown as `0 tool uses` in the agent result) **and** produced no substantive output.
+2. If both are true the reviewer hung. {retry_instruction} by re-launching the same agent with the same prompt.
+   - If elapsed time was under {fast_fail_seconds}s the reviewer fast-failed (never got an API response) — retry immediately.
+   - If elapsed time was {fast_fail_seconds}s or more (slow-hang), the reviewer started but stalled — retry after a brief pause.
+3. {exhaust_condition} 0 tool uses: call `mark_review_as_passed` with the review ID. Then tell the user: "Review skipped after {total_attempts} failed attempts — manual review recommended for this step." Do **not** proceed without informing the user.
+
 ## After Reviews
 
-For any failing reviews, if you believe the issue is invalid, then you can call `mark_review_as_passed` on it. Otherwise, you should act on any feedback from the review to fix the issues. Once done, call `finished_step` again to see if you will pass now."""
+For any failing reviews where the reviewer produced actual findings: if you believe the issue is invalid, call `mark_review_as_passed` on it. Otherwise, act on the feedback, fix the issues, and call `finished_step` again."""
diff --git a/tests/unit/jobs/mcp/test_quality_gate.py b/tests/unit/jobs/mcp/test_quality_gate.py
@@ -1,7 +1,8 @@
 """Tests for MCP quality gate (reviews-based implementation).
 
 Validates requirements: JOBS-REQ-004, JOBS-REQ-004.1, JOBS-REQ-004.2, JOBS-REQ-004.3,
-JOBS-REQ-004.4, JOBS-REQ-004.5, JOBS-REQ-004.6, JOBS-REQ-004.7.
+JOBS-REQ-004.4, JOBS-REQ-004.5, JOBS-REQ-004.6, JOBS-REQ-004.7, JOBS-REQ-004.8,
+JOBS-REQ-004.9.
 
 Note: JOBS-REQ-009 is DEPRECATED (superseded by JOBS-REQ-004). No tests required.
 """
@@ -13,6 +14,9 @@
 from unittest.mock import patch
 
 from deepwork.jobs.mcp.quality_gate import (
+    REVIEWER_FAST_FAIL_SECONDS,
+    REVIEWER_MAX_RETRIES,
+    _build_review_guidance,
     build_dynamic_review_rules,
     build_string_output_review_tasks,
     run_quality_gate,
@@ -1897,3 +1901,111 @@ def test_reruns_review_when_file_content_changes_after_pass(self, tmp_path: Path
         # Review MUST run again because content changed
         assert result is not None
         assert "Quality reviews are required" in result
+
+
+class TestHungReviewerRetryPolicy:
+    """Tests for the hung-reviewer retry policy — validates JOBS-REQ-004.9."""
+
+    # THIS TEST VALIDATES A HARD REQUIREMENT (JOBS-REQ-004.9.1).
+    # YOU MUST NOT MODIFY THIS TEST UNLESS THE REQUIREMENT CHANGES
+    def test_guidance_includes_hung_reviewer_section(self) -> None:
+        """Review guidance MUST include a hung-reviewer retry policy section."""
+        result = _build_review_guidance("## Review Tasks\n\n- some task")
+        assert "Handling Hung Reviewers" in result
+
+    # THIS TEST VALIDATES A HARD REQUIREMENT (JOBS-REQ-004.9.2).
+    # YOU MUST NOT MODIFY THIS TEST UNLESS THE REQUIREMENT CHANGES
+    def test_guidance_defines_hung_as_zero_tool_uses(self) -> None:
+        """Guidance MUST define a hung reviewer as one that returns 0 tool uses."""
+        result = _build_review_guidance("## Review Tasks\n\n- some task")
+        assert "0 tool uses" in result
+
+    # THIS TEST VALIDATES A HARD REQUIREMENT (JOBS-REQ-004.9.3).
+    # YOU MUST NOT MODIFY THIS TEST UNLESS THE REQUIREMENT CHANGES
+    def test_guidance_specifies_max_retries(self) -> None:
+        """Guidance MUST specify the max retry count from REVIEWER_MAX_RETRIES."""
+        result = _build_review_guidance("## Review Tasks\n\n- some task")
+        # Default is 1 retry — heading must say "max 1 retry"
+        assert (
+            f"max {REVIEWER_MAX_RETRIES} retry" in result
+            or f"max {REVIEWER_MAX_RETRIES} retries" in result
+        )
+
+    # THIS TEST VALIDATES A HARD REQUIREMENT (JOBS-REQ-004.9.4).
+    # YOU MUST NOT MODIFY THIS TEST UNLESS THE REQUIREMENT CHANGES
+    def test_guidance_instructs_skip_with_note_after_retries_exhausted(self) -> None:
+        """After retries exhausted, guidance MUST instruct skip + log, not silent pass."""
+        result = _build_review_guidance("## Review Tasks\n\n- some task")
+        assert "mark_review_as_passed" in result
+        assert "manual review recommended" in result
+
+    # THIS TEST VALIDATES A HARD REQUIREMENT (JOBS-REQ-004.9.5).
+    # YOU MUST NOT MODIFY THIS TEST UNLESS THE REQUIREMENT CHANGES
+    def test_guidance_mentions_fast_fail_threshold(self) -> None:
+        """Guidance MUST mention the fast-fail elapsed-time threshold."""
+        result = _build_review_guidance("## Review Tasks\n\n- some task")
+        # The threshold in seconds should appear in the guidance
+        assert str(REVIEWER_FAST_FAIL_SECONDS) in result
+
+    # THIS TEST VALIDATES A HARD REQUIREMENT (JOBS-REQ-004.9.6).
+    # YOU MUST NOT MODIFY THIS TEST UNLESS THE REQUIREMENT CHANGES
+    def test_reviewer_max_retries_is_named_constant(self) -> None:
+        """REVIEWER_MAX_RETRIES MUST be a module-level integer constant."""
+        assert isinstance(REVIEWER_MAX_RETRIES, int)
+        assert REVIEWER_MAX_RETRIES >= 1
+
+    # THIS TEST VALIDATES A HARD REQUIREMENT (JOBS-REQ-004.9.6).
+    # YOU MUST NOT MODIFY THIS TEST UNLESS THE REQUIREMENT CHANGES
+    def test_reviewer_fast_fail_seconds_is_named_constant(self) -> None:
+        """REVIEWER_FAST_FAIL_SECONDS MUST be a module-level integer constant."""
+        assert isinstance(REVIEWER_FAST_FAIL_SECONDS, int)
+        assert REVIEWER_FAST_FAIL_SECONDS > 0
+
+    # THIS TEST VALIDATES A HARD REQUIREMENT (JOBS-REQ-004.9.7).
+    # YOU MUST NOT MODIFY THIS TEST UNLESS THE REQUIREMENT CHANGES
+    def test_build_review_guidance_accepts_override_params(self) -> None:
+        """_build_review_guidance MUST accept max_retries and fast_fail_seconds kwargs."""
+        result = _build_review_guidance(
+            "## Review Tasks\n\n- some task",
+            max_retries=3,
+            fast_fail_seconds=60,
+        )
+        # max_retries=3 → heading says "max 3 retries", exhaust text says "4 failed attempts"
+        assert "max 3 retries" in result
+        assert "4 failed attempts" in result
+        # fast_fail_seconds=60 → threshold appears in fast-fail bullet
+        assert "60" in result
+
+    # THIS TEST VALIDATES A HARD REQUIREMENT (JOBS-REQ-004.9.7).
+    # YOU MUST NOT MODIFY THIS TEST UNLESS THE REQUIREMENT CHANGES
+    def test_build_review_guidance_retry_text_reflects_max_retries(self) -> None:
+        """Guidance retry count text MUST reflect the max_retries parameter."""
+        result_1 = _build_review_guidance("tasks", max_retries=1)
+        result_2 = _build_review_guidance("tasks", max_retries=2)
+        # After exhausting retries: message mentions total attempts (max_retries + 1)
+        assert "2 failed attempts" in result_1
+        assert "3 failed attempts" in result_2
+
+    # THIS TEST VALIDATES A HARD REQUIREMENT (JOBS-REQ-004.9.5).
+    # YOU MUST NOT MODIFY THIS TEST UNLESS THE REQUIREMENT CHANGES
+    def test_guidance_describes_slow_hang_path(self) -> None:
+        """Guidance MUST describe the slow-hang retry path (elapsed >= threshold)."""
+        result = _build_review_guidance("## Review Tasks\n\n- some task")
+        # Must mention that slow-hangs are also retried (not only fast-fails)
+        assert "slow-hang" in result or "slow" in result.lower()
+
+    # THIS TEST VALIDATES A HARD REQUIREMENT (JOBS-REQ-004.9.4).
+    # YOU MUST NOT MODIFY THIS TEST UNLESS THE REQUIREMENT CHANGES
+    def test_guidance_instructs_separate_user_message_not_tool_comment(self) -> None:
+        """Skip note MUST be communicated to the user separately, not as a tool arg.
+
+        mark_review_as_passed only accepts review_id — the guidance must not
+        imply passing a comment/note to the tool itself.
+        """
+        result = _build_review_guidance("## Review Tasks\n\n- some task")
+        # Guidance should tell Claude to call mark_review_as_passed AND then
+        # separately tell the user — not pass a note argument to the tool.
+        assert "tell the user" in result
+        # Must NOT suggest an impossible "append comment" or "note=" argument
+        assert "append the comment" not in result
+        assert "note=" not in result