
feat(robot): OpenPI policy harness, H.264 trace video, rollout batching against one agent#425

Open
lukass16 wants to merge 12 commits into v6 from v6-robot-3

lukass16 commented Jun 17, 2026

Issue

The v6 robot harness needed to drive real OpenPI policy servers, run concurrent rollouts efficiently, and stream camera data to traces without bloating each step with JPEG frames. Slow sim boots (e.g. Isaac Sim) also exceeded the default env connect timeout.

Solution

  • Add RemoteModel — WebSocket/msgpack client for OpenPI policy servers (lazy connect, supports actions / action response keys).
  • Add BatchedAgent / BatchedModel — coalesce concurrent ainfer() calls into stacked forwards for parallel rollouts.
  • Adopt OpenPI slash-delimited observation keys end-to-end; add OpenPIAdapter so a stock OpenPI server drives the harness with no agent changes.
  • Stream per-camera H.264/CMAF video via VideoStreamer (hud/agents/robot/video.py); numeric state stays on ObservationStep, frames go as VideoSegmentStep spans.
  • Raise RobotClient connect ready_timeout default to 240s for slow container boots.
  • Also includes Modal/Daytona eval runtime providers merged from lukass/modal-daytona-runtimes.

Outcome / Verification

  • Robot rollout against OpenPI policy server via RemoteModel + OpenPIAdapter
  • Concurrent rollouts via BatchedAgent(batch_size=N)
  • Trace shows video_segment spans with playable H.264 segments
  • Env connect succeeds on slow Isaac Sim boots

Note

Medium Risk
Substantial changes to the robot inference contract (batch shape, removed model reset/ensembler) and new threaded video encoding could affect existing LeRobot rollouts; cloud runtimes add operational dependencies but are isolated behind optional extras.

Overview
Robot policy harness gains OpenPI server support via RemoteModel (WebSocket client, lazy connect, configurable actions/action response key) and OpenPIAdapter (passes obs['data'] + prompt). Model is now stateless with a fixed [N, T, A] batch contract; per-episode chunking stays on RobotAgent (no model.reset()). LeRobotModel drops Ensembler / standalone lerobot_infer and asserts the batch dimension is preserved.

Concurrent eval: BatchedModel / BatchedAgent coalesce parallel ainfer calls into one stacked forward (not compatible with RemoteModel).

Traces: camera frames move off per-tick JPEGs in ObservationStep to background H.264/CMAF encoding (video.py, emitted as VideoSegmentStep spans; optional trace_id on Step.emit for encoder threads). RobotClient.get_control_rate() drives video FPS.

Placement: ModalRuntime and DaytonaRuntime providers; connect default ready_timeout raised to 240s. robot extra adds PyAV; optional modal / daytona extras in pyproject.toml.

Reviewed by Cursor Bugbot for commit 51786b6. Bugbot is set up for automated code reviews on this repo.

lukass16 added 8 commits June 17, 2026 01:53
Docker for slow envs like Isaac Sim publishes the port before @env.initialize finishes, so hello retries
can exceed 120s on slow container boots.
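
The timeout problem described above can be illustrated with a deadline-bounded readiness loop. This is a minimal sketch, not the harness's connect code: `wait_until_ready`, `probe`, `interval`, and the injectable `clock`/`sleep` are hypothetical names, and only the `ready_timeout` name and the 240s figure come from this PR.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    ready_timeout: float = 240.0,   # default value taken from the PR description
    interval: float = 1.0,          # hypothetical retry spacing
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry `probe` until it succeeds or `ready_timeout` seconds have passed."""
    deadline = clock() + ready_timeout
    while True:
        try:
            if probe():
                return True
        except ConnectionError:
            # The port may already be published while the server is still booting.
            pass
        if clock() >= deadline:
            return False
        sleep(interval)
```

With a slow simulator the probe may fail for minutes after the port opens, so the deadline has to cover the whole boot rather than just the container start.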
Add a weightless Model that queries a remote policy server over the OpenPI
msgpack/WebSocket protocol: the adapter builds the request dict, the server
owns all pre/post-processing + the forward, and infer() ships it and returns
the [T, A] chunk. connect() is lazy and idempotent (blocks until the server
is up); response_key covers "actions" (stock OpenPI) vs "action" (Cosmos).
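
The lazy, idempotent connect and the configurable response key can be sketched as follows. This is not the PR's RemoteModel: `LazyRemotePolicy` and the injected `connect` factory are hypothetical, and the real client's WebSocket and msgpack handling is deliberately left out so the sketch stays dependency-free.

```python
from typing import Any, Callable, Optional

Send = Callable[[dict], dict]


class LazyRemotePolicy:
    """Hypothetical stand-in for a remote-policy client (no real networking)."""

    def __init__(self, connect: Callable[[], Send], response_key: str = "actions"):
        self._connect = connect            # opens the transport, returns send(request) -> response
        self._send: Optional[Send] = None  # stays None until the first connect()
        self.response_key = response_key   # "actions" or "action", per the commit message

    def connect(self) -> None:
        # Idempotent: only the first call opens the transport.
        if self._send is None:
            self._send = self._connect()

    def infer(self, request: dict) -> Any:
        self.connect()                     # lazy: nothing is opened before the first infer()
        assert self._send is not None
        response = self._send(request)
        return response[self.response_key]
```

Injecting the transport keeps the connect-once behaviour testable without a running policy server.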
…erence

BatchedModel wraps any Model and coalesces concurrent ainfer() calls into a
single stacked forward: a lazily-started worker drains up to batch_size queued
calls (or flushes after max_wait_s for the suite tail), runs one inner.infer,
and scatters the [N, T, A] rows back to each caller.
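
A rough sketch of the coalescing pattern described above, using only asyncio. `CoalescingModel` and `infer_batch` are hypothetical names, rows are plain nested lists rather than arrays, and the real BatchedModel may differ in its queueing and error handling.

```python
import asyncio
from typing import Callable, List, Optional


class CoalescingModel:
    """Hypothetical sketch of call coalescing; not the PR's BatchedModel."""

    def __init__(self, infer_batch: Callable[[List[dict]], list],
                 batch_size: int, max_wait_s: float = 0.05):
        self._infer_batch = infer_batch    # one stacked forward over N observations
        self._batch_size = batch_size
        self._max_wait_s = max_wait_s
        self._queue: Optional[asyncio.Queue] = None
        self._worker: Optional[asyncio.Task] = None

    async def ainfer(self, obs: dict):
        if self._worker is None:           # lazily start the worker on first use
            self._queue = asyncio.Queue()
            self._worker = asyncio.create_task(self._run())
        fut = asyncio.get_running_loop().create_future()
        await self._queue.put((obs, fut))
        return await fut

    async def _run(self) -> None:
        loop = asyncio.get_running_loop()
        while True:
            batch = [await self._queue.get()]          # block for the first call
            deadline = loop.time() + self._max_wait_s
            while len(batch) < self._batch_size:
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self._queue.get(), remaining))
                except asyncio.TimeoutError:
                    break                              # flush a partial batch
            try:
                rows = self._infer_batch([obs for obs, _ in batch])
            except Exception as exc:                   # fail every waiting caller
                for _, fut in batch:
                    fut.set_exception(exc)
                continue
            for (_, fut), row in zip(batch, rows):
                fut.set_result(row)                    # scatter row i back to caller i
```

The `max_wait_s` flush matters at the end of a suite, when fewer than `batch_size` rollouts remain and a full batch would never arrive.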

BatchedAgent wraps a RobotAgent and shallow-clones it per run so each rollout
keeps isolated episode state while sharing the one batched model. Usage stays a
one-liner: BatchedAgent(agent, batch_size=8) with max_concurrent set to match.
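
The per-run shallow clone can be pictured like this. `EpisodeAgent`, its `pending` buffer, and `clone_for_run` are hypothetical; the point of the sketch is only that a shallow copy shares the model reference while mutable episode state has to be rebound on each clone.

```python
import copy


class EpisodeAgent:
    """Hypothetical agent: shares `model`, owns a per-episode `pending` buffer."""

    def __init__(self, model):
        self.model = model
        self.pending = []      # stand-in for per-episode action-chunk state


def clone_for_run(agent: EpisodeAgent) -> EpisodeAgent:
    clone = copy.copy(agent)   # shallow: attributes still reference the same objects
    clone.pending = []         # rebind mutable episode state so runs stay isolated
    return clone
```

Without the rebinding step, every clone would append to the same list, which is the kind of cross-rollout leakage the isolation is meant to prevent.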
Migrate the robot harness to OpenPI-standard, slash-delimited observation
keys end-to-end, and add a thin OpenPIAdapter so a generic OpenPI policy
server drives the harness with no agent code changes.
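
As an illustration of slash-delimited keys, the sketch below flattens a nested observation and builds a request the way the overview describes the adapter (obs['data'] plus a prompt). The helper names and the specific keys are assumptions made for the example; the PR does not list the actual key set.

```python
def flatten_obs(nested: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into slash-delimited keys, e.g. {'a': {'b': 1}} -> {'a/b': 1}."""
    flat = {}
    for key, value in nested.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_obs(value, path))
        else:
            flat[path] = value
    return flat


def build_request(obs: dict, prompt: str) -> dict:
    # Hypothetical adapter step: forward the already-flat obs["data"] and add the prompt.
    return {**obs["data"], "prompt": prompt}
```

If the environment already emits flat slash keys, the adapter has nothing to rename, which is consistent with the commit calling it thin.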
Replace per-tick JPEG observation images with per-camera H.264/CMAF video
streaming for robot traces:

- Add hud/agents/robot/video.py (SegmentEncoder/VideoStreamer): encode each
  camera on a background thread, emitting CMAF fragments as VideoSegmentStep
  spans without blocking the act loop.
- RobotAgent starts/finalizes the streamer at the env control rate; finalize
  in `finally` so a crashed run still leaves video.
- ObservationStep.from_obs records only numeric state now; camera frames travel
  as video.
- Step.emit accepts an explicit trace_id so the encoder thread (no contextvars
  trace context) attributes spans correctly.
- Add RobotClient.get_control_rate(); add "video_segment" RobotStepSource;
  add PyAV (av>=12) to the robot extra.
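
The threading shape of the list above (background worker, explicit trace_id, finalize in `finally`) might look roughly like the sketch below. It does no real H.264 encoding: `BackgroundSegmenter`, `emit`, and `frames_per_segment` are hypothetical, and a "segment" here is just a list of raw frames.

```python
import queue
import threading
from typing import Callable, Iterable, List


class BackgroundSegmenter:
    """Hypothetical stand-in for a per-camera encoder thread (no real encoding)."""

    def __init__(self, trace_id: str, emit: Callable[[str, List[bytes]], None],
                 frames_per_segment: int = 2):
        self._trace_id = trace_id        # captured up front instead of read from ambient context
        self._emit = emit
        self._n = frames_per_segment
        self._frames: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def push(self, frame: bytes) -> None:
        self._frames.put(frame)          # cheap hand-off; the caller never waits on encoding

    def finalize(self) -> None:
        self._frames.put(None)           # sentinel: flush and stop
        self._thread.join()

    def _run(self) -> None:
        segment: List[bytes] = []
        while True:
            frame = self._frames.get()
            if frame is None:
                break
            segment.append(frame)
            if len(segment) == self._n:
                self._emit(self._trace_id, segment)
                segment = []
        if segment:
            self._emit(self._trace_id, segment)   # flush the partial tail


def run_episode(frames: Iterable[bytes], streamer: BackgroundSegmenter) -> None:
    try:
        for frame in frames:
            streamer.push(frame)
    finally:
        streamer.finalize()              # a crashed run still flushes its video
```

Passing the trace id into the constructor mirrors the reason given for the Step.emit change: the worker thread should not depend on the caller's context being visible to it.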
Add ModalRuntime as a Provider alongside DockerRuntime: resolve image once
(from_name or lazy build), create an isolated Sandbox per rollout, expose
the env control channel over raw TCP, terminate on exit. Export from
hud.eval and add optional [modal] extra.
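
The lifecycle in this commit message (resolve once, one sandbox per rollout, terminate on exit) can be sketched without any cloud SDK. Everything below is hypothetical: `SandboxProviderSketch`, `resolve_image`, `create_sandbox`, and `provision` are illustrative names, not Modal's API or the PR's Provider interface.

```python
import asyncio
from contextlib import asynccontextmanager


class SandboxProviderSketch:
    """Hypothetical provider lifecycle; callables are injected instead of a real SDK."""

    def __init__(self, resolve_image, create_sandbox):
        self._resolve_image = resolve_image      # async () -> image handle
        self._create_sandbox = create_sandbox    # async (image) -> object with async terminate()
        self._image = None
        self._lock = asyncio.Lock()

    async def _ensure_image(self):
        async with self._lock:                   # concurrent rollouts resolve only once
            if self._image is None:
                self._image = await self._resolve_image()
        return self._image

    @asynccontextmanager
    async def provision(self):
        sandbox = await self._create_sandbox(await self._ensure_image())
        try:
            yield sandbox
        finally:
            await sandbox.terminate()            # runs even if the rollout raised
```

The lock is what keeps two rollouts that start together from both triggering an image build.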
…oxes

Add DaytonaRuntime as a Provider alongside ModalRuntime: resolve the snapshot once (building it from the image if missing), create an isolated sandbox per rollout, start the env server in a background session, reach it via an asyncssh local-forward (Daytona exposes only HTTPS previews, while connect dials tcp://), and delete the sandbox on exit. workdir defaults to /app to match the scaffolded Dockerfile.hud. Export from hud.eval and add optional [daytona] extra.
Comment thread hud/agents/robot/batching.py Outdated
Comment thread hud/agents/robot/batching.py
Comment thread hud/agents/robot/batching.py
Remove the per-episode model.reset() hook (Model/LeRobotModel/RemoteModel/
BatchedModel + agent.on_episode_start); per-episode state lives only on the
agent, so a shared BatchedModel can no longer clear one rollout's policy
state mid-episode. Document that RemoteModel is not batchable (OpenPI server
has no batched-request shape) on RemoteModel, BatchedModel, and BatchedAgent.
Comment thread hud/agents/robot/model.py Outdated
Comment thread hud/agents/robot/batching.py
…ship

Spell out on Model.infer/ainfer that implementations must keep the leading
batch dim N (ainfer indexes [0], BatchedModel scatters rows along it) and add
a one-line assert in LeRobotModel.infer. Document that BatchedAgent mutates the
passed-in agent in place, leaving it permanently batched.
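
A small sketch of the batch-dimension contract, using nested lists in place of arrays. `checked_infer` and `infer_one` are hypothetical helpers; the check only compares the leading length, which is weaker than a full shape assertion on an array.

```python
from typing import Callable, List

Chunk = List[List[float]]   # one [T, A] action chunk


def checked_infer(forward: Callable[[List[dict]], List[Chunk]],
                  obs_batch: List[dict]) -> List[Chunk]:
    out = forward(obs_batch)
    # Contract from the commit message: the output keeps the leading batch dim N.
    assert len(out) == len(obs_batch), f"expected N={len(obs_batch)} rows, got {len(out)}"
    return out


def infer_one(forward: Callable[[List[dict]], List[Chunk]], obs: dict) -> Chunk:
    # A single call is a batch of one; row 0 is this caller's chunk.
    return checked_infer(forward, [obs])[0]
```

A model that squeezes the batch dimension would hand `infer_one` a single timestep instead of a chunk, so failing loudly at the model boundary is cheaper than debugging it in a rollout.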

Co-authored-by: Cursor <cursoragent@cursor.com>
Comment thread hud/capabilities/robot.py Outdated
Clamp get_control_rate to max(1, round(...)) so sub-0.5 Hz contracts no longer
emit 0 FPS on VideoSegmentStep. Init _hooks_done before add_capability in
Environment.__init__. Load optional robot deps via importlib for pyright, add
shim-test ignores, and ruff-format flagged files.
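
The clamp in this commit message amounts to one line. The function name below is hypothetical; the expression `max(1, round(...))` is taken from the message itself.

```python
def control_rate_fps(rate_hz: float) -> int:
    """Round a control rate to an integer FPS, never returning less than 1."""
    # round() alone maps rates below 0.5 Hz to 0 (and 0.5 itself, since Python rounds ties to even).
    return max(1, round(rate_hz))
```

For example, a 0.4 Hz contract rounds to 0 without the clamp and to 1 with it, while 29.97 Hz becomes 30 either way.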

Co-authored-by: Cursor <cursoragent@cursor.com>

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 446a05b. Configure here.

Comment thread hud/eval/runtime.py
        except Exception:  # not found: build it under this name
            await daytona.snapshot.create(
                CreateSnapshotParams(name=self.snapshot_name, image=self._image)
            )

Daytona snapshot probe swallows errors

Medium Severity

DaytonaRuntime._ensure_snapshot treats any snapshot.get failure like a missing snapshot and always calls snapshot.create. Transient API or auth errors can trigger a redundant create attempt and mark the snapshot resolved, hiding the real failure until sandbox startup.
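
One way to address this kind of finding is to create the snapshot only on a genuine miss and let every other error propagate. The sketch below is an assumption-heavy illustration: `SnapshotNotFound` is a hypothetical exception (the SDK's real not-found error type is not shown in this thread), and `get`/`create` are injected callables rather than Daytona client methods.

```python
import asyncio
from typing import Awaitable, Callable


class SnapshotNotFound(Exception):
    """Hypothetical 'not found' error; the real SDK exception type is not shown here."""


async def ensure_snapshot(
    get: Callable[[str], Awaitable[object]],
    create: Callable[[str], Awaitable[object]],
    name: str,
) -> str:
    try:
        await get(name)
        return "existing"
    except SnapshotNotFound:
        await create(name)   # only a genuine miss triggers a build
        return "created"
    # Any other exception (auth failure, transient API error) propagates to the caller.
```

With the narrow `except`, an expired credential surfaces at the probe instead of being masked by a second failing call.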

Wrap long lines, move NDArray to TYPE_CHECKING, noqa intentional 0.0.0.0
bind in LocalRuntime, and reformat legacy shim test imports.

Co-authored-by: Cursor <cursoragent@cursor.com>