Point-LIO: mirror FAST-LIO cleanups (no-YAML config, memory2 Recorder, frame scheme)#2559

Draft
jeff-hykin wants to merge 158 commits into main from jeff/feat/pointlio_native

@jeff-hykin

Member

Mirrors onto the Point-LIO native module the same cleanups already landed for FAST-LIO in #2498, so the two modules stay consistent.

Changes (pointlio only)

  • No config YAML: dropped config/default.yaml; all Point-LIO tuning is now typed fields on PointLioConfig passed to the binary as plain CLI args. C++ builds a PointLioParams struct from args (no yaml-cpp; dropped from CMakeLists.txt/flake).
  • Recorder inherits memory2.Recorder: PointlioRecorder drops its bespoke ts-alignment/stream-naming machinery and uses the shared base hooks (@pose_setter_for, stream_remapping, record_tf, msg.ts). ~170 lines removed.
  • Consistent frame scheme: frame_id (odom parent) / child_frame_id (odom child) / sensor_frame_id (pointcloud, default mid360_link); cloud stamped with the sensor frame.
  • pcap_to_db aligned with the fastlio tool (tuning CLI flags, rrd quick-look, span auto-stop); keeps main's lidar_max<=0 drain guard.
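
To illustrate the no-YAML bullet, here is a minimal sketch of flattening typed config fields into plain CLI args. ExampleLioConfig, to_cli_args, and the `--name value` flag spelling are assumptions for illustration only; the real PointLioConfig fields and flag format may differ.

```python
from dataclasses import dataclass, fields


@dataclass
class ExampleLioConfig:
    """Hypothetical stand-in for PointLioConfig; the fields are illustrative."""

    acc_cov: float = 0.1
    lidar_type: str = "livox"
    extrinsic_t: tuple = (0.0, 0.0, 0.0)
    sensor_frame_id: str = "mid360_link"


def to_cli_args(config: ExampleLioConfig) -> list:
    """Flatten every typed field into a `--name value` pair (assumed format)."""
    args = []
    for field in fields(config):
        value = getattr(config, field.name)
        if isinstance(value, (tuple, list)):
            # Sequences travel as one comma-separated token.
            value = ",".join(str(v) for v in value)
        args += [f"--{field.name}", str(value)]
    return args
```

Because the args are derived from the dataclass itself, adding a tuning field needs no separate YAML layout table; the binary only has to parse the matching flag.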

Testing

  • pointlio module + recorder + blueprints import clean; PointlioRecorder resolves to base Recorder.
  • Net diff vs main is pointlio-only (8 files); the base memory2.Recorder hooks these changes rely on already shipped in #2498 (FAST-LIO pcap record/replay + Virtual Livox).

jeff-hykin added 30 commits June 3, 2026 15:51
…lio_native flake/cmake consuming dimos-module-fastlio2 pointlio branch + Estimator/parameters sources)
…y path)

The fast-lio input was pinned to file:///Users/jeffhykin/... which only
exists on the Mac. Repoint to the dimensionalOS/dimos-module-fastlio2
pointlio branch on github so the flake builds on Linux. Same locked rev.

Rename the mirrored fastlio_blueprints.py to pointlio_blueprints.py and
wire it to PointLio (was incorrectly using FastLio2). Adds mid360-pointlio
and mid360-pointlio-voxels to the blueprint registry.
…race-fix flake relock

Adds the offline replay/inspection tooling for the pointlio_native module:
pcap_replay.hpp (streams a Livox pcap into the SDK callbacks), deterministic_clock
+ dual-thread replay options, the ruwik2_pt3 replay harness, the pcap_to_db tool
(append pointlio_odometry into an existing memory db at ~30Hz, streaming), validated
Point-LIO mid360.yaml, and the live smoke-test helper. flake.lock relocks onto the
fastlio2 pointlio rev carrying the mtx_buffer race fix; main.cpp comment corrected
to attribute the divergence fix to that lock (not the publish-rate gating).

Switches the flake's fast-lio input from path:/home/dimos/repos/dimos-module-fastlio2
to github:dimensionalOS/dimos-module-fastlio2/pointlio so the module builds on any
machine without a local fastlio2 clone. Relocked onto rev 7e5d88f (the mtx_buffer
race fix); verified pointlio_native builds from the github source.
…hful config

OUTPUT-model replay of the Jun-7 hand-shake bag diverged to ~25km (vs
hku-mars master's bounded ~5m). Two issues, both fixed here (paired with
the dimos-module-fastlio2 pointlio-branch curvature/deque fixes):

- main.cpp: heap-use-after-free thread race in bag-frame replay. The
  feeder thread drives run_main_iter via step() after each feed, but the
  main thread also drove run_main_iter, so both raced on the shared PCL
  measurement cloud (ASan: free in sync_packages, read-after-free in
  ImuProcess::Process). Force serial_replay in bag-frame mode so only the
  feeder thread touches the EKF.
- CMakeLists.txt + flake.nix: the iVox map backend needs glog; add the
  find_package/link and the nix buildInput. Drop ikd_Tree.cpp from
  sources (iVox replaces ikd-Tree).
- config/mid360.yaml (+ cpp/config duplicate): set satu_acc 5.5->3.0
  (master value — accel >=3g saturated, residual zeroed, keeps velocity
  bounded) and add ivox_grid_resolution=2.0 / ivox_nearby_type=6 (NEARBY6)
  matching the master config that produced the 5.028m baseline.
- replay_ruwik2_pt3.py: make MAX_WALL_SEC env-overridable.

After the fix the OUTPUT model stays bounded (peak |pos| 3.898m) on the
full bag; ASan reports zero memory errors.
…onsumes the iVox fix

The iVox-port + divergence fixes live in dimos-module-fastlio2 @ pointlio
commit 02d5066, which is committed locally but NOT pushed (per the task's
no-push constraint). The flake's fast-lio input was pointing at
github:dimensionalOS/dimos-module-fastlio2/pointlio, whose lock pinned the
pre-fix rev 7e5d88f. Since CMakeLists.txt already drops ikd_Tree.cpp (iVox
replaces ikd-Tree), building against that stale github source fails with
KD_TREE undefined-reference linker errors.

Repoint fast-lio at path:/home/dimos/repos/dimos-module-fastlio2 and bump
the lock so `nix build .#pointlio_native` consumes the local 02d5066 source.
Verified: build exits 0, no KD_TREE errors, 661176-byte binary.

NOTE: this bakes a machine-specific absolute path into the lock. Once
Repo A's pointlio branch is pushed, switch this input back to
github:dimensionalOS/dimos-module-fastlio2/pointlio and bump the lock to
02d5066 for portability.

Now that dimos-module-fastlio2 pointlio (02d5066, the iVox port + divergence
fix) is on origin, switch the fast-lio flake input from the local path: pin
back to github:dimensionalOS/dimos-module-fastlio2/pointlio and bump the lock
to rev 02d5066. This makes jeff/feat/pointlio_native self-contained and
portable (no machine-specific path in the lock).

Verified: nix build .#pointlio_native exits 0, no KD_TREE errors.
… add --force

Extends pcap_to_db.py to populate both pointlio_odometry and pointlio_lidar
streams (start-aligned onto the db's earliest ts) so pointlio can be compared
against fastlio in one recording. --force overwrites existing pointlio streams.
Untracks the machine-specific fastlio_test/setup_network symlinks (enhance
overlay) and drops the empty tools/__init__.py.
…o tools

- Consolidate config/ to one default.yaml (the tuned Mid-360/iVox config);
  remove the unused upstream sensor presets (avia, horizon, marsim, ouster64,
  velodyne, mid360). Module now defaults to default.yaml.
- Remove tools/replay_ruwik2_pt3.py (pcap_to_db.py covers offline replay) and
  tools/demo_live_test.py.
…egistry

- --db is now optional: with no existing db, build one from scratch (defaults
  to <pcap>.db next to the pcap); with an existing db, append + time-align as
  before.
- Rename the internal recorder Rec/RecConfig -> _Rec/_RecConfig so the
  blueprint-registry generator skips it (a tool helper isn't a public module);
  fixes test_all_blueprints_is_current.
- Docstring: document from-scratch generation using the ruwik2_part3 LFS sample.
… rm cpp/README + config copies; concise comments

- config: consolidate to a single config/default.yaml; drop the ROS-only
  lid_topic/imu_topic + odom frame-id/publish/pcd_save keys (parsed-but-unused
  by the native binary), and note which params are Mid-360-specific.
- delete cpp/config/{mid360.json,mid360.yaml} (duplicate) and cpp/README.md.
- tighten verbose comments in main.cpp, pcap_replay.hpp, timing.hpp
  (comments only; binary rebuilds clean).
… db name in docstring

120s clip (elapsed 55-175s) of the ruwik velocity-spike recording — the data
that diverges through FAST-LIO at the 0.5m pre-KF voxel and is bounded under
Point-LIO. Use via get_data('ruwik2_part3'); pcap_to_db --pcap <it> builds the
db from scratch.
….0 clock bug

- module.py docstring: Point-LIO (not FAST-LIO2) + correct import path.
- pcap_to_db.py: use 'is not None' (not 'or') for the ts fallback so a real
  sensor ts of 0.0 isn't replaced by wall time (which would misclassify the
  stream clock in _resolve_offset).
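
The ts-fallback fix above comes down to Python truthiness. A minimal sketch follows; the function names are hypothetical and not taken from pcap_to_db.py.

```python
import time


def pick_ts_buggy(sensor_ts):
    # `or` treats 0.0 as falsy, so a genuine sensor timestamp of 0.0
    # is silently replaced by wall time.
    return sensor_ts or time.time()


def pick_ts(sensor_ts):
    # Fall back to wall time only when no timestamp was supplied at all.
    return sensor_ts if sensor_ts is not None else time.time()
```

With the `is not None` form, a stream that really starts at 0.0 keeps its sensor clock instead of being mistaken for a wall-clock stream.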
… NIC

Reads a Livox Mid-360 pcap into RAM, rewrites packet timestamps to
current-time, and replays point/IMU/status onto a virtual network
interface at a configurable rate + delay. Synthesizes the Livox SDK2
control protocol (discovery + GetInternalInfo/FwType ACKs, CRC16/CRC32)
so an unmodified consumer (pointlio) handshakes with it as a real sensor.
Builds via nix (rustPlatform.buildRustPackage, cargoLock from Cargo.lock).
…bal_map

- Cloud now published in the sensor frame (mid360_link): use fastlio2
  get_body_cloud() (undistorted scan, no world registration) instead of
  inverse-transforming the world cloud. No transform in publish_lidar.
- Split frames: frame_id (mid360_link) on both the cloud + odometry
  headers; odom_parent_frame_id (odom) -> odom_frame_id (base_link) for
  the TF publish.
- Remove global_map / voxel_map.hpp entirely (deleted file, config,
  blueprint + pcap_to_db references).
- Bump fast-lio pin to fcbd1c2 (adds get_body_cloud).

get_data() at module level triggered a git-LFS download during blueprint
validation (test_blueprint_is_valid), which CI blocks via git-lfs-guard —
failing the whole test matrix. Default pcap to empty; resolve the capture
path at run time instead.
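
A minimal sketch of the deferred lookup described above. ReplayConfig and resolve_pcap are hypothetical names, `fetch` stands in for get_data, and falling back to the ruwik2_part3 sample is an assumption rather than confirmed behaviour.

```python
from dataclasses import dataclass


@dataclass
class ReplayConfig:
    # Empty default: importing or validating a blueprint touches no storage.
    pcap: str = ""


def resolve_pcap(config, fetch):
    """Resolve the capture path at run time; `fetch` stands in for get_data."""
    if config.pcap:
        return config.pcap
    # Assumed fallback sample; the download happens only on this call.
    return fetch("ruwik2_part3")
```

Constructing the config is now side-effect free, so blueprint validation in CI never reaches the LFS-backed fetch.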
Port the minimal pcap-replay subsystem from jeff/feat/go2_record into the
clean branch so FAST-LIO can run offline from a Mid-360 pcap, matching the
Point-LIO pcap_to_db workflow:

- cpp: pcap_replay.hpp + timing.hpp (header-only), main.cpp refactored so the
  main loop runs from either the live SDK or a pcap feeder thread, with an
  optional deterministic sensor-clock mode. Keeps the clean branch's
  velocity-cap (guarded set_max_velocity_norm_ms) and flake (velocity-cap
  fast-lio); does not pull go2_record's tcpdump record path.
- module.py: replay_pcap / replay_skip_until_ns / first_packet_marker /
  deterministic_clock config fields; skip network validation in replay mode.
- tools/pcap_to_db.py: replay a pcap through FastLio2 (real-time, non-
  deterministic) and append fastlio_odometry + fastlio_lidar into an existing
  memory2 db, time-aligned onto its clock. --force overwrites.
…FastLio2Recorder

- livox/pcap_recorder.py: standalone tcpdump pcap capture (LivoxPcapRecorder),
  decoupled from FAST-LIO. The lidar/SLAM module no longer owns packet capture.
- fastlio2/recorder.py: FastLio2Recorder records fastlio_odometry + fastlio_lidar
  and rewrites ONLY those streams' timestamps onto the db clock (promoted from
  pcap_to_db's inline recorder; fixes the ts==0.0 falsy-fallback bug).
- pcap_to_db.py now imports FastLio2Recorder instead of an inline copy.

FastLio2 no longer produces a global voxel map — odometry + registered lidar
only. Removed global_map Out, map config, and the mapping.GlobalPointcloud spec.
Updated all consumers (fastlio_blueprints, alfred_nav, g1_onboard, g1_nav_onboard,
mobile, pcap_to_db) to drop map args + the global_map remap. Nav blueprints lose
their fastlio map; full nav wants a separate mapper wired in (follow-up).
…tlio

The FAST-LIO core only exposed get_world_cloud (world-registered). Patch the
pinned source in the flake to add get_body_cloud (returns the undistorted scan
in the LiDAR/sensor frame, matching pointlio) and publish that instead; the
lidar header now carries the body/child frame. Consumers register the cloud via
the odometry pose — the recorders' @pose_setter_for already stamps each frame
with the latest odom pose, and the pcap_to_db .rrd now transforms body-frame
clouds to world by that pose. Guard TfRecorder's tf callback against the
teardown race (late LCM callback on a closing store).

KNOWN BREAK: the nav stack's registered_scan consumers expect world-frame —
they're now fed body-frame and need a separate world-registration step (TBD).
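
For reference, registering a body-frame cloud by the odometry pose is a rotate-then-translate per point. The sketch below is pure Python with hypothetical helper names; it assumes an (x, y, z, w) unit quaternion and ignores any sensor-to-body extrinsic.

```python
def quat_rotate(q, v):
    """Rotate vector v by the unit quaternion q = (x, y, z, w)."""
    x, y, z, w = q
    vx, vy, vz = v
    # t = 2 * cross(q_xyz, v)
    tx = 2.0 * (y * vz - z * vy)
    ty = 2.0 * (z * vx - x * vz)
    tz = 2.0 * (x * vy - y * vx)
    # v' = v + w * t + cross(q_xyz, t)
    return (
        vx + w * tx + (y * tz - z * ty),
        vy + w * ty + (z * tx - x * tz),
        vz + w * tz + (x * ty - y * tx),
    )


def body_to_world(points, position, orientation):
    """Apply the odometry pose (position + orientation) to body-frame points."""
    world = []
    for point in points:
        rx, ry, rz = quat_rotate(orientation, point)
        world.append((rx + position[0], ry + position[1], rz + position[2]))
    return world
```

A world-frame consumer such as the registered_scan path would need a step of this shape applied with the pose that matches each scan's timestamp.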
…ke patch)

get_body_cloud now lives on the fast-lio source branch jeff/feat/fastlio-body-cloud
(github.com/dimensionalOS/dimos-module-fastlio2 @ 26f18cf) instead of an in-tree
flake patch. Re-point the fast-lio input + relock; remove fast-lio-body-cloud.patch
and the fastlioSrc patched-source derivation. Binary rebuilt + e2e-verified.

- delete dead cpp/voxel_map.hpp (unincluded since global_map removal) + its README row
- remove dead max_velocity_norm_ms CLI arg in main.cpp (parsed, never forwarded)
- pcd_save_en defaults False (FAST-LIO writes .pcd to disk when on); drop the now-
  redundant pcd_save_en=False from the nav blueprints
- drop the '# --- ... ---' banner comments; tighten a few 3-line comments to 2
- render_config_yaml indexes the typed enum fields (no more # type: ignore[index])
- narrow broad excepts: TfRecorder.on_tf -> sqlite3.ProgrammingError; pcap_to_db
  get_data fallback -> (FileNotFoundError, RuntimeError, OSError)
- extrinsic_r default floats; import Any where used
- vendored core (dimos-module-fastlio2 @ 50367cb) now reads
  mapping.filter_size_surf/map from the YAML (was hardcoded 0.5); relock flake.
- main.cpp honors the publish flags (passed as CLI args): scan_publish_en gates
  the lidar output, scan_bodyframe_pub_en picks sensor/body vs world frame,
  dense_publish_en voxel-downsamples the cloud when off.
- nav blueprints set scan_bodyframe_pub_en=False so registered_scan is
  world-frame again (fixes the body-frame break).
- FastLio2Config drops the lid_topic/imu_topic/path_en/pcd_save_en/interval
  fields the fork never reads; keeps the live tuning + the 3 publish flags.
- remove the pointless virtual_now() wrapper + its unreachable failure branch.
…ame)

Drop the body/world toggle; the module always publishes the sensor/body-frame
cloud (get_body_cloud). Remove the field, the main.cpp arg + frame branch, and
the nav blueprints' scan_bodyframe_pub_en=False overrides.
…-pcap

- After writing <db>.rrd, open it in rerun (subprocess), unless --no-gui.
- A --pcap that isn't a local file is resolved via get_data (LFS), mirroring
  the --db fallback, so a get_data-style pcap path works directly.
… mount

- mobile / alfred_nav / unitree_g1_onboard drop their FastLio2 tuning overrides
  (acc_cov/gyr_cov/det_range/extrinsic_est_en/filter_size) and use the module
  defaults.
- preserve the removed FlowBase Mid-360 mount pose as FLOWBASE_MID360_MOUNT in
  the new dimos/hardware/drive_trains/flowbase/config.py for later use.
Add a 'FastLio2 tuning' arg group (--acc-cov, --filter-size-surf/map, --det-range,
--blind, --fov-degree, --lidar-type, --extrinsic-est-en, --scan/--dense-publish-en,
etc.) merged into the FastLio2Config overrides — they take precedence over the
--config YAML doc. Only flags that are set override anything.
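
A minimal argparse sketch of the "only flags that are set override anything" rule. The three flags are taken from the list above, but their float types, build_parser, and merge_overrides are assumptions for illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    tuning = parser.add_argument_group("FastLio2 tuning")
    # default=None marks "not set", so unset flags never clobber the base config.
    tuning.add_argument("--acc-cov", type=float, default=None)
    tuning.add_argument("--det-range", type=float, default=None)
    tuning.add_argument("--blind", type=float, default=None)
    return parser


def merge_overrides(base, namespace):
    """Overlay only the flags the user actually passed onto the base config."""
    overrides = {k: v for k, v in vars(namespace).items() if v is not None}
    return {**base, **overrides}
```

Using None as the sentinel, instead of the real default value, is what lets a deliberately passed value that happens to equal the default still count as an explicit override.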
The vendored core (dimos-module-fastlio2 @ a32c9f5) now takes a FastLioParams
struct instead of loading a YAML, so drop yaml-cpp (flake + CMake). module.py no
longer generates a throwaway YAML / config_path; the tuning fields are emitted as
plain CLI args (lidar_type/timestamp_unit as strings, extrinsics as comma lists),
and main.cpp reads them into FastLioParams. Also wire dense_publish_en to the
core's get_body_cloud_down (IESKF-downsampled scan) instead of a PCL voxel in
main.cpp.

Single space around assignment (was column-aligned, lisp-style) and collapse
over-aligned inline comments.
It was a [&]-capturing lambda called from exactly one place (a leftover of the
old replay/virtual-clock design). Inline the body into the while loop and drop
the now-redundant check_now copy.
… a lambda)

Extract the per-iteration body into a plain static function that takes the time
point + the rate-limit bookmarks/intervals/flags as explicit args, instead of a
[&]-capturing lambda.
…inery

Remove _existing_min_ts / _resolve_offset / _resolve_ts / time_offset and the
EPS tie-breaker. Records use the base Recorder's ts (msg.ts) directly — the
native modules always stamp a real epoch ts, so no re-basing is needed. Drop the
now-unused --time-offset from both pcap_to_db tools.
The README was redundant; the tuning comment referenced the removed YAML/_YAML_LAYOUT
(it's plain CLI args now).
Drop the jhist pointer and Go2-specific wording; describe acc_cov generically
(high vs low IMU-accel trust).
…streams

_prepare_streams is now 3 lines: delete the recorder's own two streams if present
(keeping any other streams), then record. Removes the force config + the refuse-
to-overwrite raise, and --force from both pcap_to_db tools.
…ream_name

Name the In ports as the db tables (fastlio_/pointlio_odometry/lidar) and wire
them to the module's odometry/lidar outputs with .remappings() in pcap_to_db —
matching the base Recorder convention (stream = port name). Removes _stream_name
and the odom/lidar_stream_name config; _prepare_streams just replaces our own
ports' streams. pcap_to_db drops the now-fixed --odom/lidar-stream-name args.
Move record_tf + @pose_setter_for into Recorder/RecorderConfig and delete
tf_recorder.py. Every Recorder now records the live tf stream by default and
supports per-stream pose setters; fastlio/pointlio recorders subclass Recorder
directly. Drop the now-redundant tf-recorder blueprint entry.
Mirror the fastlio no-yaml change: PointLioConfig tuning fields are passed to
the binary as CLI args (read into a PointLioParams struct in main.cpp) instead
of being rendered to a throwaway YAML read via --config_path. Removes the
yaml-cpp dependency from the flake + CMake and the config-file plumbing from the
module. Requires the matching dimos-module-fastlio2 pointlio-branch change
(flake.lock bumped to 288e357).
{port_name: db_stream_name} to control the recorded stream/table name without
subclassing — conceptually what .remappings() expresses, but the active
remappings aren't readily accessible from inside the module.
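
The {port_name: db_stream_name} mapping can be read as a lookup with the port name as the fallback. A small sketch follows; stream_name_for is a hypothetical helper and the port names are illustrative.

```python
def stream_name_for(port_name, stream_remapping):
    """Return the db stream/table name for a port, defaulting to the port name."""
    return stream_remapping.get(port_name, port_name)


# Illustrative mapping: module output ports recorded under pointlio_* tables.
EXAMPLE_REMAPPING = {
    "odometry": "pointlio_odometry",
    "lidar": "pointlio_lidar",
}
```

Ports missing from the mapping keep the base convention of stream name equals port name.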
… frame

Both modules now expose the same three frames: frame_id (fixed odom, the
odometry header + TF parent), child_frame_id (moving body, the odometry child +
TF child), and sensor_frame_id (the lidar's own frame, stamped on the published
point cloud). get_body_cloud is the undistorted scan in the sensor frame, so the
cloud was previously mislabeled — fastlio stamped it with the body frame and
pointlio reused frame_id for both the cloud and the odometry header. pointlio's
body_start_frame_id/body_frame_id are folded into this scheme. Drops a stale PGO
comment.
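
The three-frame scheme can be summarised as which frame lands on which message. In the sketch below FrameScheme and headers are hypothetical; mid360_link is the default named in the PR, while odom and base_link are assumed example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameScheme:
    frame_id: str = "odom"                # fixed odom: odometry header + TF parent
    child_frame_id: str = "base_link"     # moving body: odometry child + TF child
    sensor_frame_id: str = "mid360_link"  # lidar's own frame: point cloud header


def headers(frames):
    """Show which frame each published message carries under this scheme."""
    return {
        "odometry": {
            "frame_id": frames.frame_id,
            "child_frame_id": frames.child_frame_id,
        },
        "tf": {"parent": frames.frame_id, "child": frames.child_frame_id},
        "pointcloud": {"frame_id": frames.sensor_frame_id},
    }
```

The point cloud never reuses frame_id or child_frame_id, which is the mislabeling the commit corrects.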
Use the lidar's concrete frame name rather than the generic FRAME_SENSOR.
…_remapping

Move the replace-on-append logic into the base Recorder so it drops exactly the
streams it is about to rewrite: the remapped In-port streams (via _stream_name)
AND the tf stream. Previously each recorder's _prepare_streams deleted only the
raw In-port names, so re-running into an existing db appended a second full copy
of the tf tree and would orphan remapped streams. Also guard _record_tf for
non-pubsub tf backends, and drop the now-dead world/global_map_fastlio rerun
overrides (FastLio2's global_map port was removed in this PR).
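
A minimal sketch of the replace-on-append selection described above: drop only the streams about to be rewritten. streams_to_replace and the "tf" stream name are assumptions, not the base Recorder's actual API.

```python
def streams_to_replace(in_ports, stream_remapping, existing, tf_stream="tf", record_tf=True):
    """Pick the existing db streams that this recording run will rewrite."""
    # Remapped In-port streams, falling back to the raw port name.
    targets = {stream_remapping.get(port, port) for port in in_ports}
    if record_tf:
        # The tf stream is rewritten too, so it must be dropped as well.
        targets.add(tf_stream)
    # Streams owned by other recorders are left untouched.
    return sorted(targets & set(existing))
```

Selecting by remapped name rather than raw port name avoids both failure modes in the commit message: a duplicated tf tree and orphaned remapped streams.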
…_stream_name

Braces on every if/for/while (inline single-statement bodies), collapse the
awkwardly-wrapped calls/signatures to one line, and break the long run_main_iter
def/call with the closing paren on its own line. Inline get_publish_ts (and the
Recorder._stream_name helper, now config.stream_remapping.get at the use sites);
add the missing braces to pointlio's parse_doubles too.
The pose-setter dict is typed dict[str, Any] (to avoid evaluating the Pose
forward-ref at class-definition time), so cast its result back to Pose | None.
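
A sketch of the typing pattern this commit describes: a loosely typed setter table whose result is narrowed with typing.cast. The Pose class, pose_setters, and latest_pose here are hypothetical placeholders, not the Recorder's real attributes.

```python
from typing import Any, Optional, cast


class Pose:
    """Placeholder for the real Pose message type."""


# Typed loosely so the Pose annotation is never evaluated when the table is built.
pose_setters: dict[str, Any] = {}


def latest_pose(stream: str) -> Optional[Pose]:
    setter = pose_setters.get(stream)
    if setter is None:
        return None
    # cast() is a no-op at run time; it only tells the type checker what comes back.
    return cast(Optional[Pose], setter())
```

The cast restores static checking for callers without changing run-time behaviour.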
The pointlio improvements (no-yaml config, Recorder-subclass rework,
frame_id scheme) move to jeff/feat/pointlio_native via merge; this PR
carries only the fastlio2 work + shared memory2 Recorder base.
# Conflicts:
#	dimos/hardware/sensors/lidar/fastlio2/tools/pcap_to_db.py
#	dimos/memory2/module.py

2 participants