A reference implementation of std::execution ([exec]), the C++26 model for asynchronous and parallel programming.
stdexec lets you express asynchronous work as composable, lazy sender pipelines that can run on threads, thread pools, GPUs, or any custom execution context, with structured concurrency guarantees.
Warning
stdexec is experimental and tracks an evolving standard. APIs may change without notice. NVIDIA does not guarantee fitness for any particular purpose.
- Example
- Features
- Compiler support
- Installation
- Quick start
- GPU support
- Examples gallery
- Documentation
- Building tests and examples
- IDE support
- Resources
- Contributing
- Citation
- License
Run three pieces of work concurrently on the system thread pool. Try it live on godbolt.
#include <stdexec/execution.hpp>
#include <cstdio>
#include <utility> // std::move

namespace ex = stdexec;

int main() {
  auto sched = ex::get_parallel_scheduler();
  auto fun = [](int i) { return i * i; };

  // Build a lazy pipeline: three squares, computed in parallel.
  auto work = ex::when_all(ex::on(sched, ex::just(0) | ex::then(fun)),
                           ex::on(sched, ex::just(1) | ex::then(fun)),
                           ex::on(sched, ex::just(2) | ex::then(fun)));

  // Launch the work and wait for the result.
  auto [i, j, k] = ex::sync_wait(std::move(work)).value();
  std::printf("%d %d %d\n", i, j, k); // prints "0 1 4"
}
- C++26 reference implementation of `std::execution` (P2300).
- Header-only, no external dependencies.
- Composable algorithms: `then`, `let_value`, `when_all`, `bulk`, `split`, `transfer`, `upon_*`, ...
- Structured concurrency primitives: `async_scope`, `task`, `finally`, `when_any`, `repeat_n`, ...
- Pluggable schedulers: system parallel scheduler, static thread pool, Linux `io_uring` context, NVIDIA GPU contexts, your own.
- GPU offload via `nvexec` schedulers (`nvc++` compiler).
- Coroutine interop: senders are awaitable; awaitables are senders.
- Generic extensions (`<exec/...>`) for primitives not (yet) in the standard.
| Compiler | Minimum version | Notes |
|---|---|---|
| GCC | 12 | |
| Clang | 16 | |
| MSVC | 14.43 | |
| Xcode (Apple Clang) | 16 | |
| nvc++ | 25.9 | required for GPU support |
Requires -std=c++20 or later.
Note
stdexec does not yet support NVIDIA's nvcc compiler.
Pick whichever fits your project.
CPM fetches and configures stdexec automatically from your CMakeLists.txt:
CPMAddPackage(
NAME stdexec
GITHUB_REPOSITORY NVIDIA/stdexec
GIT_TAG main # or a specific tag
)
target_link_libraries(my_target PRIVATE STDEXEC::stdexec)
Clone alongside your project and add it as a subdirectory:
git clone https://github.com/NVIDIA/stdexec.git
add_subdirectory(stdexec)
target_link_libraries(my_target PRIVATE STDEXEC::stdexec)
A conanfile.py is provided for use with the Conan package manager.
Starting with NVHPC SDK 22.11, stdexec is bundled with nvc++. Pass --experimental-stdpar to put stdexec headers on the include path. Add -stdpar=gpu for GPU features. See the godbolt example.
stdexec is header-only, so adding -I<stdexec root>/include to your compile command is sufficient. Using the CMake target is recommended because it sets the required compile flags.
A minimal CMakeLists.txt using CPM:
cmake_minimum_required(VERSION 3.25.0)
project(stdexec_example LANGUAGES CXX)
include(CPM.cmake) # see https://github.com/cpm-cmake/CPM.cmake#adding-cpm
CPMAddPackage(
NAME stdexec
GITHUB_REPOSITORY NVIDIA/stdexec
GIT_TAG main
)
add_executable(example example.cpp)
target_link_libraries(example PRIVATE STDEXEC::stdexec)
stdexec ships GPU schedulers in `<nvexec/...>` for use with `nvc++ -stdpar=gpu`:
| Scheduler | Header | Description |
|---|---|---|
| `nvexec::stream_scheduler` | `<nvexec/stream_context.cuh>` | Single-GPU scheduler (device 0). |
| `nvexec::multi_gpu_stream_scheduler` | `<nvexec/multi_gpu_context.cuh>` | Multi-GPU scheduler across all visible devices. |
Live example: https://godbolt.org/z/h7rh5qGhj
The examples/ directory contains runnable programs demonstrating the library.
| Example | What it shows |
|---|---|
| `hello_world.cpp` | The "hello world" of senders. |
| `hello_coro.cpp` | Awaiting a sender from a coroutine. |
| `then.cpp` | Writing a `then` algorithm from scratch. |
| `retry.cpp` | Writing a `retry` algorithm from scratch. |
| `scope.cpp` | Structured concurrency with `async_scope`. |
| `io_uring.cpp` | Async I/O via the Linux `io_uring` context. |
| `sudoku.cpp` | A parallel sudoku solver. |
| `server_theme/` | Server-style patterns (`let_value`, `split`, `bulk`, `transfer`). |
| `nvexec/` | GPU schedulers, including the Maxwell solver. |
Full documentation: https://nvidia.github.io/stdexec
- User guide: https://nvidia.github.io/stdexec/user/ (source)
- Reference: https://nvidia.github.io/stdexec/reference/ (source)
- Developer docs: https://nvidia.github.io/stdexec/developer/ (source)
- Contributing to docs: `docs/CONTRIBUTING-docs.md`
- The proposal: [exec], `std::execution`
The library is organized into three namespaces:
| Namespace | Headers | Contents |
|---|---|---|
| `::stdexec` | `<stdexec/...>` | Things in (or proposed for) the C++ standard. |
| `::exec` | `<exec/...>` | Generic additions and extensions. |
| `::nvexec` | `<nvexec/...>` | NVIDIA-specific schedulers and customizations. |
The library itself is header-only; these steps are only needed if you want to build the test suite or the examples.
cmake -S . -B build -G Ninja
cmake --build build
ctest --test-dir build
To select a specific compiler:
cmake -S . -B build/clang -DCMAKE_CXX_COMPILER=$(which clang++)
cmake --build build/clang
To use libc++ with Clang:
cmake -S . -B build/libcxx \
-DCMAKE_CXX_COMPILER=$(which clang++) \
-DCMAKE_CXX_FLAGS=-stdlib=libc++
cmake --build build/libcxx
A VSCode extension is available that colorizes compiler diagnostics from stdexec, making the long template error messages much easier to read. Source and configuration: https://github.com/ericniebler/buildoutputcolorizer.
- P2300 - `std::execution` - the proposal accepted into C++26.
- Working with Asynchrony Generically: A Tour of Executors (Part 2) - comprehensive introduction.
- From Zero to Sender/Receiver in ~60 Minutes - live-coding a toy sender/receiver from scratch.
- A Unifying Abstraction for Async in C++ - concepts behind P2300.
- Structured Concurrency - what structured concurrency means and why.
- Structured Networking in C++ - what a P2300-style networking library could look like.
- What are Senders Good For, Anyway? - wrapping a C-style async API in a sender.
- A Universal Async Abstraction for C++ - an introduction to senders.
- A Universal I/O Abstraction for C++ - senders meet `io_uring`.
- Executors: a Change of Perspective - on the computational completeness of senders.
- Structured Concurrency in C++ - how senders manifest structured concurrency.
- HPCWire: New C++ Sender Library Enables Portable Asynchrony.
Contributions are welcome. Before opening a PR, please review:
- `CODE_OF_CONDUCT.md`
- `MAINTAINERS.md`
- `docs/CONTRIBUTING-docs.md` for documentation contributions.
Bug reports and feature requests belong in GitHub Issues; design discussion in GitHub Discussions.
If you reference stdexec in academic work, please cite the standards proposal:
@techreport{P2300,
author = {Niebler, Eric and Shoop, Kirk and Baker, Lewis and Dominiak, Michał and
Evtushenko, Georgy and Teodorescu, Lucian Radu and Howes, Lee and Garland,
Michael and Lelbach, Bryce Adelstein},
title = {{P2300R10}: \texttt{std::execution}},
institution = {ISO/IEC JTC1/SC22/WG21},
year = {2024},
url = {https://wg21.link/p2300}
}
stdexec is licensed under the Apache License 2.0 with LLVM Exceptions. See LICENSE.txt for the full text.