Skip to content

fix: test-suite stability and drone camera pinhole#2545

Open
Danny024 wants to merge 4 commits into
dimensionalOS:mainfrom
Danny024:fix/test-stability-and-drone-pinhole
Open

fix: test-suite stability and drone camera pinhole#2545
Danny024 wants to merge 4 commits into
dimensionalOS:mainfrom
Danny024:fix/test-stability-and-drone-pinhole

Conversation

@Danny024

Copy link
Copy Markdown

Three independent fixes surfaced while running the test suite and the demo simulation. Each is its own commit.

1. fix(core) — flaky test_basic_deployment

The test asserted >= 8 messages after a fixed time.sleep(1). Under heavy parallel load (pytest -n auto) the publisher and LCM transport threads get starved and deliver fewer within that window, failing with assert 5 >= 8. It now polls up to a 15s deadline — still verifying end-to-end delivery through the deployment, without assuming a wall-clock throughput.

2. fix(tests) — resilient agent_setup teardown

When an agent test fails mid-run (e.g. the LLM is slow under load and finished_event.wait(60) times out), a single raising cleanup step skipped the remaining ones, leaking LCM transport threads. The autouse monitor_threads fixture then flagged the leak, turning one failure into a failure and a teardown error. A shared teardown_agent_setup() helper now runs every cleanup step (coordinator, transports, unsubs) best-effort and logs failures instead of propagating. Wired into both agent_setup fixtures.

3. fix(drone) — camera pinhole for the 3D view

The drone camera feed is logged at world/video, inside the 3D view's world origin, but no Pinhole was ever logged — so Rerun warned "2D visualizers require a pinhole ancestor to be shown in a 3D view" and the feed only rendered in the dedicated 2D pane. A static Pinhole is now logged at world/video, built from the camera intrinsics (pulled into a shared _CAMERA_INTRINSICS constant so they can't drift from the camera module).

Testing

  • test_basic_deployment passes under deliberate full-CPU starvation (previously failed assert 5 >= 8).
  • 3 navigation + 28 mcp tool-stream tests pass with clean teardown via the new helper.
  • drone-basic replay boots cleanly; the static pinhole logs at bridge start without error.
  • ruff check and ruff format --check clean on all changed files.

The test asserted >=8 messages after a fixed 1s sleep, but under heavy parallel load (pytest -n auto) the publisher/LCM threads get starved and deliver fewer, failing with 'assert 5 >= 8'. Poll up to a 15s deadline instead; this still verifies end-to-end delivery without assuming a wall-clock throughput.
@greptile-apps

greptile-apps Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR delivers three independent stability fixes: a flaky test poll-loop replacement, best-effort fixture teardown to prevent thread leaks under parallel load, and a missing Rerun Pinhole component for the drone camera feed.

  • test_basic_deployment replaces a fixed 1-second sleep with a 15-second early-exit poll loop, eliminating throughput-dependent flakiness under pytest -n auto while still asserting all three message counts post-loop.
  • agent_setup teardown extracts a teardown_agent_setup() helper that wraps every cleanup step in try/except, so a failing step no longer skips remaining ones and leaks LCM transport threads; both conftest files now delegate to it.
  • Drone pinhole logs a static rr.Pinhole at world/video built from the new _CAMERA_INTRINSICS constant, satisfying Rerun's requirement for a pinhole ancestor before a 2D image can render in a 3D view.

Confidence Score: 5/5

All three changes are narrowly scoped bug fixes with no shared mutable state, no API surface changes, and direct test coverage for the new teardown helper.

The poll-loop replacement is a strictly safer version of the original sleep; the teardown helper is covered by four unit tests and correctly handles every failure permutation; the pinhole constant and matrix are derived correctly from cx/cy yielding 1920×1080. No regressions are introduced.

No files require special attention.

Important Files Changed

Filename Overview
dimos/core/test_core.py Replaces fixed sleep(1) with a 15-second early-exit poll loop; assertions remain after the loop so timeouts still fail the test correctly.
dimos/utils/testing/agent_teardown.py New best-effort teardown helper; each step is isolated with try/except, transport labels include class name, and unsubscribe labels include index.
dimos/utils/testing/test_agent_teardown.py Unit tests for teardown_agent_setup covering happy path, coordinator failure, transport failure, None coordinator with empty collections.
dimos/agents/conftest.py Delegates fixture teardown to shared helper; removes 7-line inline cleanup block.
dimos/agents/mcp/conftest.py Same teardown delegation as dimos/agents/conftest.py — mirrors it identically.
dimos/robot/drone/blueprints/basic/drone_basic.py Adds _CAMERA_INTRINSICS constant and _static_camera_pinhole function; pinhole matrix and resolution (1920×1080) are correctly derived from cx/cy.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph Fix1["Fix 1 — test_basic_deployment"]
        A[robot.start / nav.start] --> B{Poll loop\nup to 15 s}
        B -- "all counts ≥ 8" --> C[break early]
        B -- deadline --> D[timeout — assertions fail]
        C --> E[assert mov/odom/lidar ≥ 8]
    end

    subgraph Fix2["Fix 2 — agent_setup teardown"]
        F[fixture yield] --> G[teardown_agent_setup]
        G --> H["_safe(coordinator.stop)"]
        H -- "raises?" --> H2[log error, continue]
        H --> I["_safe(transport.stop) × N"]
        I -- "raises?" --> I2[log error, continue]
        I --> J["_safe(unsub[i]) × M"]
        J -- "raises?" --> J2[log error, continue]
    end

    subgraph Fix3["Fix 3 — drone camera pinhole"]
        K[bridge start] --> L["_log_static()"]
        L --> M["rr.log('world/video', rr.Pinhole(...), static=True)"]
        M --> N[camera frustum visible in 3D view]
    end
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
    subgraph Fix1["Fix 1 — test_basic_deployment"]
        A[robot.start / nav.start] --> B{Poll loop\nup to 15 s}
        B -- "all counts ≥ 8" --> C[break early]
        B -- deadline --> D[timeout — assertions fail]
        C --> E[assert mov/odom/lidar ≥ 8]
    end

    subgraph Fix2["Fix 2 — agent_setup teardown"]
        F[fixture yield] --> G[teardown_agent_setup]
        G --> H["_safe(coordinator.stop)"]
        H -- "raises?" --> H2[log error, continue]
        H --> I["_safe(transport.stop) × N"]
        I -- "raises?" --> I2[log error, continue]
        I --> J["_safe(unsub[i]) × M"]
        J -- "raises?" --> J2[log error, continue]
    end

    subgraph Fix3["Fix 3 — drone camera pinhole"]
        K[bridge start] --> L["_log_static()"]
        L --> M["rr.log('world/video', rr.Pinhole(...), static=True)"]
        M --> N[camera frustum visible in 3D view]
    end
Loading

Reviews (3): Last reviewed commit: "test(tests): cover teardown_agent_setup ..." | Re-trigger Greptile

Comment thread dimos/utils/testing/agent_teardown.py Outdated
Comment on lines +55 to +56
for unsub in unsubs:
_safe(unsub, "unsubscribe")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Generic unsubscribe label hides which subscription failed

Every unsubscribe failure is logged with the same step="unsubscribe" label, so when a cleanup failure is reported you cannot tell whether it was the agent_transport or the finished_transport (or any others) that raised. Since the unsubs list is an ordered collection built from append calls in the fixture, a zero-cost improvement is to include the index or the callable's repr in the step name.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — applied in the latest push (folded into the teardown commit). Each unsubscribe is now labelled by index: _safe(unsub, f"unsubscribe[{i}]"), so a failing teardown step identifies which subscription raised.

Danny024 added 2 commits June 20, 2026 13:37
When an agent test fails mid-run (e.g. the LLM is slow under load and finished_event.wait(60) times out), a raising cleanup step skipped the rest, leaking LCM transport threads and turning the failure into a failure + teardown error. Add a shared teardown_agent_setup() that runs every step (coordinator, transports, unsubs) best-effort and logs rather than propagating. Each unsubscribe is labelled by index so a failing one is identifiable.
The camera image is logged at world/video, inside the 3D view's world origin, but no Pinhole was ever logged, so Rerun warned '2D visualizers require a pinhole ancestor to be shown in a 3D view' and the feed only rendered in the 2D pane. Register a static Pinhole at world/video built from the (now shared) camera intrinsics.
@Danny024 Danny024 force-pushed the fix/test-stability-and-drone-pinhole branch from 028e879 to df228ca Compare June 20, 2026 12:37
@Danny024

Copy link
Copy Markdown
Author

Hi @leshy @mustafab0 @paul-nechifor @spomichter — the ci and autofix.ci workflows on this PR are sitting in action_required (fork PR awaiting maintainer approval), so the test suite hasn't run yet. Could one of you approve the workflow run when you get a chance? The three commits are narrowly scoped (a flaky-test fix, an agent_setup teardown hardening, and an additive drone-camera Pinhole), and Greptile's one P2 note is already addressed. Thanks!

@codecov

codecov Bot commented Jun 22, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 65.51724% with 10 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
dimos/core/test_core.py 0.00% 5 Missing ⚠️
dimos/utils/testing/agent_teardown.py 81.25% 2 Missing and 1 partial ⚠️
dimos/robot/drone/blueprints/basic/drone_basic.py 50.00% 2 Missing ⚠️
@@            Coverage Diff             @@
##             main    #2545      +/-   ##
==========================================
- Coverage   70.81%   70.46%   -0.36%     
==========================================
  Files         862      863       +1     
  Lines       77475    77491      +16     
  Branches     6882     6881       -1     
==========================================
- Hits        54862    54601     -261     
- Misses      20818    21130     +312     
+ Partials     1795     1760      -35     
Flag Coverage Δ
OS-ubuntu-24.04-arm 62.93% <65.51%> (+<0.01%) ⬆️
OS-ubuntu-latest 65.75% <65.51%> (+<0.01%) ⬆️
Py-3.10 65.75% <65.51%> (+<0.01%) ⬆️
Py-3.11 65.74% <65.51%> (-0.01%) ⬇️
Py-3.12 ?
Py-3.13 65.74% <65.51%> (-0.01%) ⬇️
Py-3.14 65.76% <65.51%> (+<0.01%) ⬆️
Py-3.14t 65.74% <65.51%> (-0.01%) ⬇️
SelfHosted-macOS ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
dimos/agents/conftest.py 95.65% <100.00%> (+7.65%) ⬆️
dimos/agents/mcp/conftest.py 95.65% <100.00%> (+7.65%) ⬆️
dimos/robot/drone/blueprints/basic/drone_basic.py 66.66% <50.00%> (-3.34%) ⬇️
dimos/utils/testing/agent_teardown.py 81.25% <81.25%> (ø)
dimos/core/test_core.py 53.42% <0.00%> (-3.10%) ⬇️

... and 14 files with indirect coverage changes

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Unit-test the agent_setup teardown helper: all steps run in order, and a raising step (coordinator.stop, a transport, or an unsub) neither skips the remaining cleanup nor propagates. Covers the previously-untested except branch and the coordinator=None skip.
@Danny024

Copy link
Copy Markdown
Author

Note on the red ci-complete check

This is not caused by these changes — every real job passed:

  • All 7 tests matrix jobs (Python 3.10–3.14, x86 + arm64) ✅
  • lint, rust, docs-validate, md-babel, compute-ros-pin
  • Codecov: "All tests successful. No failed tests found."

ci-complete fails only because self-hosted-tests is marked required to succeed but is skipped on fork PRs (no self-hosted-runner access), which the gate counts as a failure:

📝 tests             → ✓ success [required to succeed]
📝 self-hosted-tests → ⬜ skipped [required to succeed]
##[error]Process completed with exit code 1

This would red-X any fork PR regardless of content. From the test side it's safe to merge — running the self-hosted suite internally (or merging past the gate) is a maintainer action.

Separately, pushed 6ef33337a adding unit tests for the teardown_agent_setup helper, covering the error-resilience branch Codecov flagged as uncovered.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant