feat: add runtime backend API foundation#14

Merged
voltjia merged 1 commit into master from feat/add-runtime-backend-api-foundation
Jul 3, 2026

Conversation

@voltjia voltjia commented Jul 3, 2026

Collaborator

Summary

  • Add the non-graph runtime API surface introduced by CPU runtime extraction to the remaining runtime backends.
  • Wire CUDA-compatible backends to their corresponding runtime APIs for host allocation, stream-ordered allocation/free where supported, mem info, memset async, stream, and event operations.
  • Add conservative unsupported stubs for Ascend/Cambricon APIs without confirmed direct equivalents, and for Moore async allocation/free on the current Moore SDK.
  • Extend the platform-adaptive native runtime and dispatch tests so the new API surface is exercised on each enabled backend.
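
The "conservative unsupported stub" idea from the list above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `Status`, `Runtime`, `FullBackend`, `LimitedBackend`, `MallocAsync`, and `FreeAsync` are hypothetical names rather than InfiniRT's actual API, and host `malloc`/`free` stand in for vendor runtime calls. The default template reports "not supported", and only a backend with a confirmed equivalent overrides it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical status codes; the PR does not show InfiniRT's real error type.
enum class Status { kSuccess, kOutOfMemory, kNotSupported };

// Hypothetical backend tags standing in for the WITH_* platforms.
struct FullBackend {};     // models a backend with a direct vendor equivalent
struct LimitedBackend {};  // models a backend whose SDK lacks the call

// Primary template: the conservative default reports "not supported"
// and nulls the output pointer so callers never see a stale address.
template <typename Backend>
struct Runtime {
  static Status MallocAsync(void** ptr, std::size_t /*size*/) {
    assert(ptr != nullptr && "output pointer must be valid");
    *ptr = nullptr;
    return Status::kNotSupported;
  }
  static Status FreeAsync(void* /*ptr*/) { return Status::kNotSupported; }
};

// Specialization: host malloc/free stand in for the vendor runtime calls.
template <>
struct Runtime<FullBackend> {
  static Status MallocAsync(void** ptr, std::size_t size) {
    assert(ptr != nullptr && "output pointer must be valid");
    *ptr = std::malloc(size);
    return *ptr != nullptr ? Status::kSuccess : Status::kOutOfMemory;
  }
  static Status FreeAsync(void* ptr) {
    std::free(ptr);
    return Status::kSuccess;
  }
};

// Exercises both paths; returns true when each backend behaves as modelled.
inline bool DemoStubBehaviour() {
  void* p = nullptr;
  bool ok = Runtime<FullBackend>::MallocAsync(&p, 64) == Status::kSuccess;
  ok = ok && p != nullptr;
  ok = ok && Runtime<FullBackend>::FreeAsync(p) == Status::kSuccess;

  void* q = reinterpret_cast<void*>(0x1);  // sentinel the stub must overwrite
  ok = ok && Runtime<LimitedBackend>::MallocAsync(&q, 64) == Status::kNotSupported;
  ok = ok && q == nullptr;
  ok = ok && Runtime<LimitedBackend>::FreeAsync(q) == Status::kNotSupported;
  return ok;
}
```

With this shape, every backend exposes the same function names, and a caller can tell "unsupported" apart from "succeeded" by inspecting the returned status.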

Motivation

This splits the backend runtime API foundation out of #9 so that #9 can focus on graph runtime APIs only. It follows the surface already added for CPU in #8 and now uses the shared platform-adaptive test mechanism from #15.

Closes N/A

Type of Change

  • feat - new feature / new operator / new platform
  • fix - bug fix
  • perf - performance improvement (no behavioral change)
  • refactor - code restructuring without behavior change
  • test - adding or fixing tests only
  • docs - documentation only
  • build / ci - build system or CI configuration
  • chore - tooling, formatting, or other non-code changes
  • Breaking change (requires a ! in the Conventional Commits prefix or a BREAKING CHANGE: footer)

Platforms Affected

  • CPU (WITH_CPU)
  • NVIDIA (WITH_NVIDIA)
  • Iluvatar (WITH_ILUVATAR)
  • MetaX (WITH_METAX)
  • Cambricon (WITH_CAMBRICON)
  • Moore (WITH_MOORE)
  • Ascend (WITH_ASCEND)
  • PyTorch C++ bindings (WITH_TORCH)
  • Build system / CMake / CI
  • Python bindings / user-facing API

Smoke Test Result

ssh nvidia
image: accelerator-dev/nvidia:latest
cmake -B build -DWITH_CPU=ON -DWITH_NVIDIA=ON -DINFINI_RT_BUILD_TESTING=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build -j$(nproc)
ctest --output-on-failure

100% tests passed, 0 tests failed out of 7

Test Results on Supported Platforms

| Platform | Affected | Build / Smoke Result | Full Result / Notes |
| --- | --- | --- | --- |
| NVIDIA | yes | passed | CPU+NVIDIA CMake/CTest passed on ssh nvidia with accelerator-dev/nvidia:latest |
| Iluvatar | yes | passed | CPU+Iluvatar CMake/CTest passed on ssh iluvatar with accelerator-dev/iluvatar:latest |
| MetaX | yes | passed | CPU+MetaX CMake/CTest passed on ssh metax with accelerator-dev/metax:latest; installed CMake in-container for ctest |
| Cambricon | yes | passed | CPU+Cambricon CMake/CTest passed on ssh cambricon with accelerator-dev/cambricon:latest |
| Moore | yes | passed | CPU+Moore CMake/CTest passed on ssh moore with accelerator-dev/moore:latest; async alloc/free are expected to be unsupported on the current SDK |
| Ascend | yes | passed | CPU+Ascend CMake/CTest output passed on ssh ascend with accelerator-dev/ascend:latest; the SSH wrapper returned non-zero due to a known host issue |

Representative NVIDIA `ctest` output
Test project /workspace/InfiniRT/build
    Start 1: test_smoke
1/7 Test #1: test_smoke .......................   Passed    0.00 sec
    Start 2: test_core
2/7 Test #2: test_core ........................   Passed    0.01 sec
    Start 3: test_cpu_runtime
3/7 Test #3: test_cpu_runtime .................   Passed    0.00 sec
    Start 4: test_nvidia_runtime
4/7 Test #4: test_nvidia_runtime ..............   Passed    0.53 sec
    Start 5: test_runtime_dispatch
5/7 Test #5: test_runtime_dispatch ............   Passed    0.47 sec
    Start 6: test_install
6/7 Test #6: test_install .....................   Passed    0.01 sec
    Start 7: test_install_consumer
7/7 Test #7: test_install_consumer ............   Passed    1.52 sec

100% tests passed, 0 tests failed out of 7

Benchmark / Performance Impact

N/A

Notes for Reviewers

  • This PR intentionally does not add graph APIs; those remain in #9 (feat: add graph runtime api).
  • Ascend/Cambricon get the same public surface but return non-success for APIs without a confirmed direct runtime equivalent, rather than pretending unsupported async/event operations succeeded.
  • Moore's current SDK does not declare musaFreeAsync, so this PR exposes the API but reports async allocation/free as unsupported for Moore.
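
One way a platform-adaptive test could encode the expectations in these notes is sketched below. It is an illustrative assumption, not the test code from this PR: `Status`, `Observation`, `Passes`, `Failures`, and the API names used in the usage note are all hypothetical. The rule it models is that a supported API must succeed, while an unsupported one must report non-success rather than pretend to succeed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical status codes; InfiniRT's real return type is not shown here.
enum class Status { kSuccess, kNotSupported, kError };

// One observed API call on a given backend (all names are illustrative).
struct Observation {
  std::string api;
  bool expected_supported;  // what the test believes the backend offers
  Status observed;          // what the call actually returned
};

// A call passes when a supported API succeeds, or when an unsupported API
// reports non-success instead of silently succeeding.
inline bool Passes(const Observation& o) {
  return o.expected_supported ? o.observed == Status::kSuccess
                              : o.observed != Status::kSuccess;
}

// Returns the names of APIs whose behaviour contradicts the expectation.
inline std::vector<std::string> Failures(const std::vector<Observation>& obs) {
  std::vector<std::string> failed;
  for (const Observation& o : obs) {
    if (!Passes(o)) failed.push_back(o.api);
  }
  return failed;
}

// Convenience check for test bodies: aborts if any expectation is violated.
inline void ExpectAllPass(const std::vector<Observation>& obs) {
  assert(Failures(obs).empty());
}
```

As a usage example, feeding in `{"FreeAsync", false, Status::kSuccess}` would be flagged by `Failures`, because a backend that is expected to lack async free must not report success for it.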

@voltjia voltjia force-pushed the feat/add-runtime-backend-api-foundation branch 2 times, most recently from 31b8520 to faef619 Compare July 3, 2026 05:46
@voltjia voltjia force-pushed the feat/add-runtime-backend-api-foundation branch from faef619 to 4a1d37c Compare July 3, 2026 05:51
@voltjia voltjia merged commit fef94da into master Jul 3, 2026
4 checks passed
@voltjia voltjia deleted the feat/add-runtime-backend-api-foundation branch July 3, 2026 06:02
voltjia added a commit that referenced this pull request Jul 3, 2026
* feat!: align runtime API and add runtime dispatch (#11)

* Align runtime API with generated wrappers

* Add default runtime dispatch specialization

* Refactor runtime dispatch namespace

* Use Abseil status for runtime device API

* Revert "Use Abseil status for runtime device API"

This reverts commit a26ddff.

* Address runtime dispatch review feedback

* Keep runtime API list in generator

* Add TensorView constructor guard test

* Align runtime memcpy kind constants with CUDA API

* Use CUDA-style runtime memcpy constants

* Use CUDA-style runtime memcpy constants

* Move TensorView tests back into core test

* Remove standalone TensorView test target

* Remove standalone TensorView test file

* Use fully qualified runtime API names in README

* style: format runtime dispatch test

* feat: refactor InfiniCore CPU runtime to InfiniRT (#8)

Co-authored-by: Jiacheng Huang <huangjiacheng0709@outlook.com>

* feat: add platform-adaptive runtime tests (#15)

* feat: add runtime backend API foundation (#14)

---------

Co-authored-by: spike-zhu <74974704+spike-zhu@users.noreply.github.com>