fix: added BrokenProcessPool exception to cleanly end a collection#35

Closed
ktstrader wants to merge 1 commit into main from fix/BED-8691

ktstrader commented Jun 23, 2026

Contributor

Description

Change to src/openhound/scheduler/service.py:

  • Import BrokenProcessPool from concurrent.futures.process.
  • Add a private _reset_executor() helper that shuts down the broken pool without blocking and creates a fresh ProcessPoolExecutor with the same settings (max_workers=1, max_tasks_per_child=1).
  • Add a dedicated except BrokenProcessPool branch to _handle_completed_job, placed before the existing generic except Exception. It reports the in-flight job as FAILED to BHE and calls _reset_executor(). The existing finally block continues to clear future/job_running.
  • Wrap the executor.submit(...) call in _start_job with a try/except BrokenProcessPool. On failure it clears job_running/future, rebuilds the executor, and reports the job as FAILED to BHE so it does not remain in Running.
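
For readers without the diff open, the recovery flow described in the list above could look roughly like the sketch below. It is not the code in service.py: the class name `SchedulerSketch`, the `report` callback, the `executor_factory` parameter, the status strings, and the failure messages are all assumptions made for illustration. Only the method names `_reset_executor`, `_start_job`, `_handle_completed_job`, the `future`/`job_running` attributes, and the executor settings come from the description.

```python
from concurrent.futures import Future, ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def new_executor() -> ProcessPoolExecutor:
    # Settings named in the description; max_tasks_per_child needs Python 3.11+.
    return ProcessPoolExecutor(max_workers=1, max_tasks_per_child=1)


class SchedulerSketch:
    """Hypothetical stand-in for the scheduler service, not the real class."""

    def __init__(self, report, executor_factory=new_executor):
        # report(job_id, status, message) is an assumed reporting callback.
        self.report = report
        self.executor_factory = executor_factory
        self.executor = executor_factory()
        self.future: Future | None = None
        self.job_running = False
        self.job_id = None

    def _reset_executor(self):
        # Shut the broken pool down without blocking, then build a fresh one.
        self.executor.shutdown(wait=False)
        self.executor = self.executor_factory()

    def _start_job(self, job_id, fn, *args):
        self.job_id = job_id
        self.job_running = True
        try:
            self.future = self.executor.submit(fn, *args)
        except BrokenProcessPool:
            # Submission failed: clear local state, rebuild the pool, and
            # report FAILED so the job does not stay in Running.
            self.job_running = False
            self.future = None
            self._reset_executor()
            self.report(job_id, "FAILED", "process pool broke on submit")

    def _handle_completed_job(self):
        try:
            self.future.result()
            self.report(self.job_id, "COMPLETE", "")
        except BrokenProcessPool:
            # Listed before the generic branch because BrokenProcessPool is
            # itself an Exception subclass and would otherwise be caught there.
            self.report(self.job_id, "FAILED", "worker process died")
            self._reset_executor()
        except Exception as exc:
            self.report(self.job_id, "FAILED", str(exc))
        finally:
            # Runs on every path, matching the existing cleanup behaviour.
            self.future = None
            self.job_running = False
```

Passing the factory in is purely a convenience of this sketch: it lets a test substitute a fake pool and check that the executor instance was replaced without starting real worker processes.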

Two tests added to tests/test_bhe_job_scheduling.py:

  • test_poll_recovers_from_broken_process_pool — sets a BrokenProcessPool exception on the future, asserts that job_running/future are cleared, the executor instance is replaced, and BHE receives FAILED with the expected message.
  • test_start_job_recovers_when_submit_raises_broken_pool — monkeypatches executor.submit to raise BrokenProcessPool, asserts that both start_job and end_job were called on BHE, local state is cleared, and the executor instance is replaced.
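
Both tests rely on producing a BrokenProcessPool without actually killing a worker. A minimal sketch of the two techniques follows, under stated assumptions: `poll`, `make_broken_future`, and the two `check_*` functions are hypothetical names, and `unittest.mock` is swapped in here for pytest's monkeypatch fixture so the block runs on its own.

```python
from concurrent.futures import Future
from concurrent.futures.process import BrokenProcessPool
from unittest import mock


def poll(future, on_failed):
    """Hypothetical poll step: return True if the job finished cleanly."""
    try:
        future.result()
        return True
    except BrokenProcessPool as exc:
        on_failed(str(exc))
        return False


def make_broken_future(message="worker died"):
    # A Future that already holds BrokenProcessPool, so .result() raises it
    # without any real worker process being started.
    future = Future()
    future.set_exception(BrokenProcessPool(message))
    return future


def check_poll_reports_failure():
    failures = []
    assert poll(make_broken_future("worker died"), failures.append) is False
    assert failures == ["worker died"]


def check_submit_can_be_forced_to_fail():
    executor = mock.Mock()  # stands in for the real ProcessPoolExecutor
    executor.submit.side_effect = BrokenProcessPool("pool is broken")
    try:
        executor.submit(print, "never runs")
    except BrokenProcessPool as exc:
        assert str(exc) == "pool is broken"
    else:
        raise AssertionError("submit should have raised")
```

In a pytest suite the second technique would more likely be written with `monkeypatch.setattr` on the scheduler's executor, as the test name in the description suggests.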

Motivation and Context

Resolves: BED-8692

… there is a failure that causes the collection to break
ktstrader marked this pull request as ready for review June 23, 2026 22:39
ktstrader closed this Jun 23, 2026
ktstrader deleted the fix/BED-8691 branch June 23, 2026 22:41