CodeRobo Preview

CodeRobo is a research preview of a language-model-driven visual manipulation stack built on top of the LIBERO benchmark. The goal of this preview is to make the system architecture, reusable method components, evaluation gates, and current evidence boundaries visible to the community without publishing private run artifacts, full prompt-generation material, credentials, or machine-local experiment state.

This is not a formal release. APIs, scripts, and results may still change.

What Is Included

libero/: the LIBERO benchmark package, assets, BDDL task definitions, lifelong-learning baselines, and Hydra configs.
libero_sdk/: a small Python SDK around LIBERO tasks, environments, robot control, perception wrappers, and skills.
lmvs/: the CodeRobo / LMVS policy stack, including LLM planning, action primitives, perception fusion, structured memory, evidence tracking, skill registries, and audit utilities.
scripts/: static checks, sweep runners, gate checkers, evidence summaries, and focused smoke tests.
docs/: curated architecture and evaluation notes for the preview.

Generated datasets, checkpoints, model caches, local run directories, Codex transcripts, complete prompt-generation files, and presentation build artifacts are intentionally excluded from the public preview.

Method Overview

CodeRobo separates robot manipulation into auditable modules:

Task and environment layer: LIBERO provides benchmark suites, BDDL task files, initial states, observations, and sparse success signals.
Observation and perception layer: RGB-D observations are converted into object candidates, relation evidence, placement regions, grasp proposals, and current-trial world evidence.
LLM planning layer: CodexBrain produces one structured high-level decision at a time from task language, visible evidence, action history, prompt-safe memory, and available tools.
Action layer: ActionAPI exposes generic manipulation primitives such as visual pick, servo-assisted grasp, placement, relation-aware placement, contact scan, recovery, and terminal decisions.
Memory and skill layer: semantic memory, skill performance records, and failure summaries are stored as typed records. Reusable memory is restricted to human-observable strategy and must not contain hidden simulator state, expert replay actions, or demo-derived absolute coordinates.
Audit layer: scripts verify that runtime code uses no simulator ground truth and no replay actions, and they check transcript completeness, Codex health, metadata consistency, same-task variation, and the evidence behind any broader task-family claim.
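
As an illustration of the memory restriction above, the following minimal sketch shows a typed record with a prompt-safety check. The names MemoryRecord, FORBIDDEN_KEYS, and is_prompt_safe, along with the key names in the deny-list, are hypothetical and are not taken from lmvs/; the actual record types and rules may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Hypothetical deny-list standing in for "hidden simulator state, expert
# replay actions, or demo-derived absolute coordinates".
FORBIDDEN_KEYS: Tuple[str, ...] = ("sim_state", "expert_action", "demo_xyz")


@dataclass(frozen=True)
class MemoryRecord:
    """A typed memory entry holding human-observable strategy only."""
    kind: str                      # e.g. "semantic", "skill", "failure"
    summary: str                   # short natural-language strategy note
    details: Dict[str, object] = field(default_factory=dict)


def is_prompt_safe(record: MemoryRecord) -> bool:
    """Return True when the record carries none of the forbidden keys."""
    return not any(key in record.details for key in FORBIDDEN_KEYS)


ok = MemoryRecord("semantic", "approach the bowl from above before grasping")
leaky = MemoryRecord("skill", "replayed demo", {"expert_action": [0.1, 0.2]})
print(is_prompt_safe(ok), is_prompt_safe(leaky))
```

A key-based check like this only catches records that label their contents honestly; it is a sketch of the idea, not a substitute for the audit scripts.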

The preview emphasizes transparent evidence over a single headline number. A reported success should be accompanied by the exact sweep command, row metadata, model-call transcripts when applicable, no-replay checks, and gate outputs.

Installation

The base LIBERO environment targets Python 3.8.

conda create -n coderobo python=3.8.13
conda activate coderobo
pip install -r requirements.txt
pip install -e .

For GPU simulation, set MuJoCo/EGL device variables before running rollouts:

export CUDA_VISIBLE_DEVICES=0
export MUJOCO_EGL_DEVICE_ID=0

Download LIBERO datasets only when you need demonstrations or benchmark rollouts:

python benchmark_scripts/download_libero_datasets.py --datasets libero_spatial --use-huggingface

Datasets are ignored by Git and should stay outside the preview source release.

Optional LLM Configuration

The LLM planner can use a local Codex CLI transport, a file-backed subagent bridge, or an OpenAI-compatible API endpoint depending on the script.

For API transport:

export CODEX_API_BASE_URL="https://your-compatible-endpoint.example/v1"
export CODEX_API_KEY="..."
export CODEX_API_MODEL="your-model-name"
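
Before launching a run with the API transport, it can help to confirm that all three variables are set. The sketch below is a hypothetical helper (load_api_config is not part of the repository) that reads them from the environment and fails early when one is missing, without printing the key.

```python
import os
from typing import Dict, Mapping, Optional

# Variable names as documented above.
REQUIRED_VARS = ("CODEX_API_BASE_URL", "CODEX_API_KEY", "CODEX_API_MODEL")


def load_api_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect the API transport settings, raising if any is unset or empty."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not source.get(name)]
    if missing:
        raise RuntimeError("Missing API transport variables: " + ", ".join(missing))
    return {name: source[name] for name in REQUIRED_VARS}


if __name__ == "__main__":
    try:
        config = load_api_config()
        # Report only non-secret fields.
        print("endpoint:", config["CODEX_API_BASE_URL"])
        print("model:", config["CODEX_API_MODEL"])
    except RuntimeError as err:
        print(err)
```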

Never commit API keys, endpoint credentials, private transcripts, or prompt archives. The preview repository documents the interfaces, not private service configuration.

Quick Checks

Run fast checks that do not require MuJoCo simulation:

python scripts/run_lmvs_static_checks.py
python scripts/check_no_runtime_gt.py lmvs

The first command compiles the LMVS/SDK/script code and runs focused script tests. The second scans the runtime policy code for hidden simulator ground truth usage.

Dry-Run a Matrix

To inspect commands without launching long robot evaluations:

python scripts/run_codex_40_task_experiment.py --dry-run \
  --suites libero_spatial,libero_object,libero_goal,libero_10 \
  --tasks 0,1,2,3,4,5,6,7,8,9 \
  --init-states 0,1,2
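
To estimate how large this matrix is before running it, the arguments can be expanded by hand. The sketch below assumes the script evaluates the full Cartesian product of suites, tasks, and initial states; that assumption is not confirmed by this README, so treat the count as an upper-bound estimate.

```python
from itertools import product

# Values copied from the dry-run command above.
suites = "libero_spatial,libero_object,libero_goal,libero_10".split(",")
tasks = [int(t) for t in "0,1,2,3,4,5,6,7,8,9".split(",")]
init_states = [int(i) for i in "0,1,2".split(",")]

# One row per (suite, task, init_state) combination.
rows = list(product(suites, tasks, init_states))

print(len(rows))  # 4 suites * 10 tasks * 3 init states = 120
print(rows[0])    # ('libero_spatial', 0, 0)
```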

Run a Small Sweep

A minimal simulation sweep writes a JSON summary under lmvs_runs/:

python scripts/run_libero_object_sweep.py \
  --suite libero_spatial \
  --tasks 0 \
  --init-states 0 \
  --max-turns 6 \
  --policy-mode codex-brain \
  --codex-tool-profile generic \
  --out lmvs_runs/spatial_t0_i0_preview.json

Depending on the selected transport, you may also need to log in to the Codex CLI or set the CODEX_API_* variables above.

Evidence and Results Summary

Current preview evidence supports the following conservative conclusions:

The architecture for LLM-guided visual closed-loop manipulation is in place: task discovery, visual evidence, generic action tools, structured memory, model-call transports, and gate scripts are implemented.
Strict generic-tool runs show meaningful local progress on LIBERO Spatial and selected Object/Goal cases, especially when tasks can be solved through visible object selection, servo grasping, and relation-aware placement.
The project should not yet be described as solving the full LIBERO benchmark or as generalizing broadly across task families. Hard cases remain around thin-object contact, object-container insertion, post-release verification, long-horizon recovery, and robustness under held-out initial states.
Evidence quality is treated as part of the method: a claimed result should pass no-runtime-ground-truth, no-replay, transcript, health, metadata, and variation gates before being reported.
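
The gating rule in the last point can be sketched as an all-or-nothing check. The gate identifiers and the functions failed_gates and is_reportable below are hypothetical labels for the six gates named above; they do not correspond to actual script names or output formats in scripts/.

```python
from typing import Dict, List

# Hypothetical identifiers for the six gates listed above.
GATES = (
    "no_runtime_gt",
    "no_replay",
    "transcript",
    "health",
    "metadata",
    "variation",
)


def failed_gates(results: Dict[str, bool]) -> List[str]:
    """List gates that failed; a gate with no recorded result counts as failed."""
    return [gate for gate in GATES if not results.get(gate, False)]


def is_reportable(results: Dict[str, bool]) -> bool:
    """A result is reportable only when every gate has passed."""
    return not failed_gates(results)


run = {gate: True for gate in GATES}
run["variation"] = False
print(is_reportable(run), failed_gates(run))
```

Treating a missing gate as a failure keeps an incomplete audit from being mistaken for a clean one.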

See docs/PREVIEW_RELEASE.md for the open-source boundary and docs/RESULTS_SUMMARY.md for a concise result statement.

Repository Hygiene for Preview

Before pushing to https://github.com/IRMVLab/CodeRobo, verify that the staged files do not include:

lmvs_runs/, lmvs_memory/, lmvs_demo_bundles/, sdk_output/, agent_io/
libero/datasets/, bert/, experiments/, hf_cache/, external_repos/
presentations/*/prompts/, generated PPTX files, generated slide images
API keys, private endpoint URLs, local absolute paths, or full model prompts

Recommended pre-push checks:

git status --short
rg -n --hidden -g '!.git/**' -g '!libero/datasets/**' -g '!bert/**' \
  -g '!experiments/**' -g '!presentations/**' \
  'sk-[A-Za-z0-9_-]+|API_KEY|SECRET|TOKEN|/(home|root)/|<local-path>' .
python scripts/run_lmvs_static_checks.py
python scripts/check_no_runtime_gt.py lmvs
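
For environments without rg, the same pattern can be applied from Python. This is a minimal sketch using the regular expression from the command above; scan_text is a hypothetical helper, not a script shipped in the repository, and it scans a string rather than walking the file tree.

```python
import re
from typing import List, Tuple

# Same alternation as the rg command above.
SECRET_PATTERN = re.compile(
    r"sk-[A-Za-z0-9_-]+|API_KEY|SECRET|TOKEN|/(home|root)/|<local-path>"
)


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, matched_text) pairs for every hit, 1-indexed."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in SECRET_PATTERN.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


sample = 'export CODEX_API_KEY="..."\ncache_dir = "/home/user/cache"\n'
for hit in scan_text(sample):
    print(hit)
```

The pattern also matches harmless variable names such as CODEX_API_KEY, so hits call for manual review rather than automatic rejection.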

Relation to LIBERO

This preview builds on LIBERO:

@article{liu2023libero,
  title={LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning},
  author={Liu, Bo and Zhu, Yifeng and Gao, Chongkai and Feng, Yihao and Liu, Qiang and Zhu, Yuke and Stone, Peter},
  journal={arXiv preprint arXiv:2306.03310},
  year={2023}
}

License

The inherited LIBERO code is distributed under the MIT License. Dataset assets and third-party model assets may have separate licenses; keep downloaded data, checkpoints, and external repositories outside the preview source tree unless their redistribution terms have been reviewed.
