[RUNTIME][FFI][RELAX] Add vm.builtin.shape_to_tensor builtin #19849
cbalint13 wants to merge 1 commit into
Code Review
This pull request replaces the Python-based runtime function relax.run.shape_to_tensor with a C++ implementation vm.builtin.shape_to_tensor in the Virtual Machine builtins. It updates the operator registration, the constant folding pass, and the VM builtin lowering pass to use this new builtin, and adds a corresponding test. Feedback points out a potential runtime type mismatch in fold_constant.cc where arr (an Array<int64_t>) is passed directly to vm.builtin.shape_to_tensor which expects ffi::Shape. It is recommended to explicitly construct a ffi::Shape from arr's elements.
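The type mismatch raised in the review can be pictured with a small standard-library model. This is only a sketch: `Int64Array`, `Shape`, `Tensor1D`, `ShapeToTensor`, `ToShape`, and `CallSucceeds` are hypothetical stand-ins written for illustration, and `std::any` merely imitates a type-erased call boundary. None of it is the TVM FFI API or the code in this PR.

```cpp
#include <any>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins (assumptions for illustration, not TVM types).
struct Int64Array { std::vector<int64_t> items; };  // plays the role of Array<int64_t>
struct Shape { std::vector<int64_t> dims; };        // plays the role of ffi::Shape
struct Tensor1D { std::vector<int64_t> data; };     // a 1-D int64 tensor

// Type-erased "builtin": the argument type is checked only at call time.
// It accepts a Shape and returns a 1-D tensor holding the dimensions.
Tensor1D ShapeToTensor(const std::any& arg) {
  // Throws std::bad_any_cast if the caller passed anything but a Shape.
  const Shape& shape = std::any_cast<const Shape&>(arg);
  return Tensor1D{shape.dims};
}

// Explicit conversion, as the review recommends: build a Shape from
// the array's elements before crossing the type-erased boundary.
Shape ToShape(const Int64Array& arr) { return Shape{arr.items}; }

// Reports whether a call with the given argument passes the type check.
bool CallSucceeds(const std::any& arg) {
  try {
    ShapeToTensor(arg);
    return true;
  } catch (const std::bad_any_cast&) {
    return false;
  }
}
```

In this model, passing an `Int64Array` directly compiles but fails at call time, while `ShapeToTensor(ToShape(arr))` succeeds. Whether the real FFI converts between the two types implicitly is not something this sketch can show.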
Implements the vm.builtin.shape_to_tensor builtin in the runtime.
Adds a hook for Relax-to-builtin lowering, alongside the existing vm.builtin.tensor_to_shape builtin.
Replaces the Python relax.run.shape_to_tensor variant of the function.
Issue
Attempted to do inference with a C++-only deployment (no public API yet) and discovered the following issue:
The exported DSO contains a relax.run.shape_to_tensor call that works in a Python environment but not in the pure runtime:
Solution
The Python-only relax.run.shape_to_tensor is replaced with the more appropriate vm.builtin.shape_to_tensor.
Results
[x] The exported DSO is instantiable in a pure C++ environment that has only tvm_runtime.so + tvm_ffi.so.
[x] Inference speed improved by almost 2x (for an RNN with a small input size), even in a Python program (full environment):
Before
After