Clarification on OpenDriveVLA collision evaluation metrics #40

@hawk249

Thank you for releasing OpenDriveVLA and the official evaluation code. We have been studying the model, and the work has been very helpful for our VLM-AD planning research.

In our local evaluation of the official OpenDriveVLA-0.5B checkpoint on nuScenes val, the log reports "Processed total 6019 samples, gt collision: 36". The UniAD-style metrics are L2 = 0.21/0.61/1.24/0.68 and Collision = 0.00/0.17/0.63/0.27%; the STP-3-style metrics are L2 = 0.15/0.32/0.57/0.35 and Collision = 0.01/0.07/0.22/0.10%.
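One reading of these four-number groups is that the first three values are per-horizon results and the fourth is their average. The short check below tests only that arithmetic on the numbers quoted from the log. The dictionary keys and the helper name `mean_gap` are illustrative and not taken from the OpenDriveVLA code. That the first three entries correspond to separate planning horizons is an assumption.

```python
# Values copied from the evaluation log quoted above.
# Assumption: each tuple is (horizon 1, horizon 2, horizon 3, average).
reported = {
    "UniAD-style L2": (0.21, 0.61, 1.24, 0.68),
    "UniAD-style Collision (%)": (0.00, 0.17, 0.63, 0.27),
    "STP-3-style L2": (0.15, 0.32, 0.57, 0.35),
    "STP-3-style Collision (%)": (0.01, 0.07, 0.22, 0.10),
}


def mean_gap(values):
    """Absolute gap between the mean of the first three entries and the fourth."""
    *per_horizon, last = values
    return abs(sum(per_horizon) / len(per_horizon) - last)


gaps = {name: mean_gap(values) for name, values in reported.items()}
for name, gap in gaps.items():
    print(f"{name}: |mean(first three) - fourth| = {gap:.4f}")
```

For all four groups the gap is below 0.01, which is consistent with the fourth value being an average of the first three, computed before rounding to two decimals.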

We would like to make sure we interpret these numbers correctly. In particular, could you kindly clarify the official collision evaluation protocol used here? For example, should the reported collision rate be understood as endpoint collision, averaged per-step collision, or any-collision-within-horizon? We also noticed that the evaluation appears to use the precomputed planing_gt_segmentation_val; could you clarify the BEV grid resolution, the ego-box handling, and whether this protocol is intended to be directly comparable with OmniDrive-style planning collision metrics?
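To make the three interpretations concrete, here is a minimal sketch of how each would be computed from per-step collision flags. It is not taken from the OpenDriveVLA evaluation code. The function name `collision_rates`, the `(num_samples, num_steps)` boolean layout, and the toy data with 6 steps per trajectory are all assumptions for illustration.

```python
import numpy as np


def collision_rates(collided, horizon_idx):
    """Aggregate per-step collision flags in three different ways.

    collided: boolean array of shape (num_samples, num_steps), True where the
        planned ego box overlaps an obstacle at that step (hypothetical layout).
    horizon_idx: index of the last step belonging to the horizon of interest.
    """
    collided = np.asarray(collided, dtype=bool)
    window = collided[:, : horizon_idx + 1]
    return {
        # Collision only at the final step of the horizon.
        "endpoint": float(collided[:, horizon_idx].mean()),
        # Mean over every (sample, step) pair up to the horizon.
        "per_step_mean": float(window.mean()),
        # Fraction of samples with at least one collision up to the horizon.
        "any_within_horizon": float(window.any(axis=1).mean()),
    }


# Toy data: 4 trajectories, 6 steps each (assumed, not from the real benchmark).
flags = np.array(
    [
        [0, 0, 0, 0, 0, 1],  # collides only at the last step
        [0, 0, 1, 0, 0, 0],  # collides mid-horizon, clear at the end
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
    ],
    dtype=bool,
)
rates = collision_rates(flags, horizon_idx=5)
print(rates)
```

On this toy input the three definitions give different rates (0.25 for endpoint, 2/24 for the per-step mean, 0.5 for any-within-horizon), which is why the choice matters when comparing against numbers reported under another protocol.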
