Kernel Lab

Kernel Lab is a small research playground for learning and comparing operator implementations across three stages:

Torch reference implementation
Triton implementation
CUDA extension implementation

Every operator follows the same workflow: correctness first, then benchmarks, then profiling.

Design goals

Keep the code readable enough for study and iteration.
Make local macOS development possible without requiring a GPU.
Keep the remote Linux + NVIDIA validation path obvious and repeatable.

Repository layout

kernel_lab/
├── ops/
│   ├── registry.py
│   ├── references/
│   ├── triton/
│   ├── cuda/
│   └── common/
├── tests/
├── benchmarks/
├── scripts/
└── docs/
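
The layout lists ops/registry.py but does not show its contents. As a rough sketch only, assuming the registry maps an operator name and a backend name to a callable, it could look like the following. The names register, get, and the toy "scale" operator are hypothetical and are not taken from the repository.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry: maps (operator name, backend name) to a callable.
_REGISTRY: Dict[Tuple[str, str], Callable] = {}


def register(op: str, backend: str) -> Callable[[Callable], Callable]:
    """Decorator that records an implementation under (op, backend)."""
    def decorator(fn: Callable) -> Callable:
        key = (op, backend)
        if key in _REGISTRY:
            raise ValueError(f"duplicate registration for {key}")
        _REGISTRY[key] = fn
        return fn
    return decorator


def get(op: str, backend: str) -> Callable:
    """Look up an implementation, listing the known backends on failure."""
    try:
        return _REGISTRY[(op, backend)]
    except KeyError:
        known = sorted(b for (o, b) in _REGISTRY if o == op)
        raise KeyError(f"no {backend!r} backend for {op!r}; known: {known}") from None


@register("scale", "reference")
def scale_reference(xs, factor):
    # Toy operator used only to exercise the registry.
    return [x * factor for x in xs]
```

With a lookup like this, tests and benchmarks could select a backend by name and fail clearly when a Triton or CUDA version has not been written yet.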

Development loop

Implement one operator at a time.

Start in kernel_lab/ops/references/ with the Torch baseline.
Add the Triton version in kernel_lab/ops/triton/.
Add the CUDA binding and kernels in kernel_lab/ops/cuda/.
Validate with pytest.
Compare with the benchmark scripts.
Profile on the Linux + NVIDIA server with ncu or nsys.
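
The validation step in the loop above amounts to comparing a candidate implementation against the reference within a tolerance. The sketch below illustrates that idea in plain Python so it runs without Torch or a GPU. The function names and tolerances are assumptions for illustration, and softmax_candidate merely stands in for a Triton or CUDA kernel.

```python
import math
from typing import Callable, List, Sequence


def softmax_reference(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_candidate(xs: Sequence[float]) -> List[float]:
    """Stand-in for an optimized kernel: same math via log-sum-exp."""
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [math.exp(x - lse) for x in xs]


def assert_matches(
    candidate: Callable[[Sequence[float]], List[float]],
    reference: Callable[[Sequence[float]], List[float]],
    inputs: Sequence[Sequence[float]],
    rel_tol: float = 1e-9,
    abs_tol: float = 1e-12,
) -> None:
    """Fail with the offending input and index if the outputs disagree."""
    for xs in inputs:
        got, want = candidate(xs), reference(xs)
        assert len(got) == len(want), f"length mismatch for {xs}"
        for i, (g, w) in enumerate(zip(got, want)):
            assert math.isclose(g, w, rel_tol=rel_tol, abs_tol=abs_tol), (
                f"mismatch at index {i} for input {xs}: {g} vs {w}"
            )


# Include large and widely spread values, where a naive softmax would overflow.
CASES = [[0.0, 1.0, 2.0], [1000.0, 1000.0], [-50.0, 0.0, 50.0]]
assert_matches(softmax_candidate, softmax_reference, CASES)
```

A real test would likely compare tensors rather than lists and loosen the tolerances for float16 or bfloat16 inputs.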

Quick start

Install the package in editable mode:

pip install -e ".[dev]"

Run the default tests:

pytest

Run a baseline benchmark on CPU:

python benchmarks/bench_softmax.py --backend reference --device cpu
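
The internals of bench_softmax.py are not shown here. As a minimal sketch of what a CPU timing loop might do, assuming a warmup phase followed by repeated timed calls, consider the following. The bench helper and its parameters are hypothetical.

```python
import statistics
import time
from typing import Callable, Dict


def bench(fn: Callable[[], object], warmup: int = 3, repeats: int = 10) -> Dict[str, float]:
    """Time fn() and return the median and minimum in milliseconds."""
    for _ in range(warmup):
        fn()  # warm caches and any lazy initialization
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        # On a GPU backend, synchronize here (e.g. torch.cuda.synchronize())
        # before reading the clock, since kernel launches are asynchronous.
        samples.append((time.perf_counter() - start) * 1e3)
    return {"median_ms": statistics.median(samples), "min_ms": min(samples)}


result = bench(lambda: sum(i * i for i in range(10_000)))
print(f"median {result['median_ms']:.3f} ms, min {result['min_ms']:.3f} ms")
```

Reporting the median alongside the minimum makes a single noisy run less likely to distort the comparison between backends.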

On a Linux + NVIDIA machine, build the CUDA extension with:

python setup.py build_ext --inplace

Current sample operators

softmax
rmsnorm
rope
swiglu (placeholder)
attention_toy (placeholder)

The Triton and CUDA directories currently provide templates and integration points, not finished optimized kernels.
