v0.10.0#34
Open
matajoh wants to merge 1 commit into
Open
Conversation
A result-reading and documentation release. `Cown.unwrap()` replaces ad-hoc context-manager reads of behavior results with a single quiescence-guarded call lowered to the C capsule, and the test suite moves wholesale to the `quiesce()` + `unwrap()` pattern. `Matrix` gains arg-reductions (`argmin` / `argmax`) and an explicit PRNG `seed`, and its matmul kernel is re-ordered for cache-friendly auto-vectorization (bit-for-bit identical output). The legacy `notice_sync` shim is removed in favour of `quiesce(noticeboard=True)` for reads and the new `notice_seed` for synchronous main-thread seeding. **New Features** - **`notice_seed(key, value)`** — a synchronous, main-interpreter-only noticeboard write that commits under the noticeboard mutex *before it returns*, so every behavior scheduled afterwards observes it. Unlike the fire-and-forget `notice_write`, it gives read-your-writes ordering for installing read-mostly configuration before scheduling the behaviors that read it, and it starts the runtime if called first — so seeding can be a program's first bocpy call with no explicit `start()`. It is a plain overwrite and does not provide `notice_update`'s read-modify-write atomicity. Calling it from a worker raises `RuntimeError`. - **`Cown.unwrap()`** — return the cown's stored value, or re-raise a captured behavior exception on the caller's thread (Rust `Result::unwrap` shape). Acquires the cown for the read and requires global quiescence (`quiesce` / `wait`) first, raising `RuntimeError` otherwise so a result is never read while its producer is still in flight. Lowered to a C-level `CownCapsule.unwrap`, so a behavior that returns a `Cown` (surfacing downstream as a bare `CownCapsule`) unwraps the same way without rewrapping. A captured exception is cleared on read, so it is not re-reported when the cown is dropped and a second `unwrap()` returns `None`. - **`Matrix.argmin(axis=None)` / `Matrix.argmax(axis=None)`** — index of the minimum / maximum element, first occurrence on ties. Flat (`axis=None`) returns a row-major `int`; `axis=0` / `axis=1` return per-column / per-row index vectors. NaN elements are skipped unless the running extreme starts at NaN, which pins the result to that position (this differs from NumPy, which propagates NaN). - **`Matrix.seed(value)`** — classmethod seeding the process-global C PRNG used by `normal()` / `uniform()`, making subsequent draws reproducible when generation stays on a single thread. - **`Matrix` pickling** — `Matrix` now supports `pickle` (all protocols) and `copy.deepcopy` via `__reduce__`, so a matrix nested in a pickled container (dict, list, …) round-trips with its neighbours instead of raising `TypeError`. Serialization copies the raw, native-endian, row-major `double` buffer in one block, so the cost is linear in the element count with no per-element Python object churn and every value (including `NaN`, `±inf`, `-0.0`, and subnormals) is preserved bit-for-bit. The current interpreter must own the matrix: pickling one that has been released into a `Cown` raises `RuntimeError`. The encoding is native-endian, so a pickle is not portable across architectures of differing byte order. - **`examples/fanout_benchmark.py`** — a dispatch-rate microbenchmark for the fanout workload (a producer that allocates fresh consumer cowns it does not hold and dispatches one `@when` each), surfacing per-worker queue contention (`enqueue_cas_retries`) as the gating signal. Complements the chain workload in `examples/benchmark.py`. **Improvements** - **matmul cache-friendly reorder** — `impl_matmul` is re-ordered from `ijk` to `ikj` so the inner loop walks contiguous rows of the right-hand operand and the output, enabling compiler auto-vectorization. Output is bit-for-bit identical (each inner product still accumulates `k` in ascending order); measured ~2.9–3.2× faster on square shapes, ~1.5–1.8× on rectangular ones. A bitwise-reproducibility regression test pins the accumulation order. **Bug Fixes** A warm welcome and thank-you to first-time contributor **Shivanand Mishra** (@xemishra), who tracked down and fixed a subtle transpiler bug this release — exactly the kind of sharp-eyed catch that makes the project better. - **`@when` result assignment dropped for module-level behaviors** (microsoft#30, thanks @xemishra) — a behavior defined at module level transpiled without its result cown, so the exported module silently dropped the return value and downstream behaviors could not schedule over it. Fixed, with a regression test guarding the exported-module shape. - **Nested `@when` capture** — the transpiler now correctly surfaces a nested `@when`'s free names as the outer behavior's captures and resolves its cown arguments in the outer frame, instead of leaving them to Python's closure machinery where they could not be reached from the worker interpreter. - **`Matrix` range/return checks** — added overflow and return-value checks on the `range_read` path uncovered while migrating the matrix tests. **Breaking Changes** - **`notice_sync` removed** — the noticeboard-sync shim is gone from `bocpy.__all__`. Use `quiesce(noticeboard=True)` instead, which blocks until in-flight behaviors complete and returns a noticeboard snapshot without tearing the runtime down. **Documentation** - Removed the `notice_sync` references from `noticeboard` and the type stubs; documented the NaN tie-break behavior of `argmin` / `argmax`; corrected the happens-after example in the `thinking-in-boc` skill to order across genuinely unrelated data; added a `fanout_benchmark.py` section to the examples README. **Tests** - Migrated `test_boc.py`, `test_noticeboard.py`, and the scheduler / pinned-pump suites to the `quiesce()` + `Cown.unwrap()` pattern. Added matmul bitwise-reproducibility and `argmin` / `argmax` NaN regression tests. **Dependencies** - Bumped the `github-actions` group (microsoft#31, microsoft#27, dependabot): `actions/checkout` 6.0.2 → 6.0.3 and `pypa/cibuildwheel` 3.4.1 → 4.0.0. **Internal** - Large comment scrub across the C extensions, Python runtime, scripts, and tests, followed by a remediation pass that restored load-bearing rationale (memory-ordering fences, UAF guards, deliberate-leak notes, and the vendored Apache-2.0 provenance header) as condensed summaries. - Ignored Sphinx-related updates in `dependabot.yml` to keep the docs toolchain pinned. Signed-off-by: Matthew A Johnson <matjoh@microsoft.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This release addresses a slew of smaller issues and repo maintenance items.
Cown.unwrap()replaces ad-hoc context-manager reads of behavior results with a single quiescence-guarded call lowered to the C capsule, and the test suite moves wholesale to thequiesce()+unwrap()pattern.Matrixgains arg-reductions (argmin/argmax) and an explicit PRNGseed, and its matmul kernel is re-ordered for cache-friendly auto-vectorization (bit-for-bit identical output). The legacynotice_syncshim is removed in favour ofquiesce(noticeboard=True)for reads and the newnotice_seedfor synchronous main-thread seeding.New Features
notice_seed(key, value)— a synchronous, main-interpreter-only noticeboard write that commits under the noticeboard mutex before it returns, so every behavior scheduled afterwards observes it. Unlike the fire-and-forgetnotice_write, it gives read-your-writes ordering for installing read-mostly configuration before scheduling the behaviors that read it, and it starts the runtime if called first — so seeding can be a program's first bocpy call with no explicitstart(). It is a plain overwrite and does not providenotice_update's read-modify-write atomicity. Calling it from a worker raisesRuntimeError.Cown.unwrap()— return the cown's stored value, or re-raise a captured behavior exception on the caller's thread (RustResult::unwrapshape). Acquires the cown for the read and requires global quiescence (quiesce/wait) first, raisingRuntimeErrorotherwise so a result is never read while its producer is still in flight. Lowered to a C-levelCownCapsule.unwrap, so a behavior that returns aCown(surfacing downstream as a bareCownCapsule) unwraps the same way without rewrapping. A captured exception is cleared on read, so it is not re-reported when the cown is dropped and a secondunwrap()returnsNone.Matrix.argmin(axis=None)/Matrix.argmax(axis=None)— index of the minimum / maximum element, first occurrence on ties. Flat (axis=None) returns a row-majorint;axis=0/axis=1return per-column / per-row index vectors. NaN elements are skipped unless the running extreme starts at NaN, which pins the result to that position (this differs from NumPy, which propagates NaN).Matrix.seed(value)— classmethod seeding the process-global C PRNG used bynormal()/uniform(), making subsequent draws reproducible when generation stays on a single thread.Matrixpickling —Matrixnow supportspickle(all protocols) andcopy.deepcopyvia__reduce__, so a matrix nested in a pickled container (dict, list, …) round-trips with its neighbours instead of raisingTypeError. Serialization copies the raw, native-endian, row-majordoublebuffer in one block, so the cost is linear in the element count with no per-element Python object churn and every value (includingNaN,±inf,-0.0, and subnormals) is preserved bit-for-bit. The current interpreter must own the matrix: pickling one that has been released into aCownraisesRuntimeError. The encoding is native-endian, so a pickle is not portable across architectures of differing byte order.examples/fanout_benchmark.py— a dispatch-rate microbenchmark for the fanout workload (a producer that allocates fresh consumer cowns it does not hold and dispatches one@wheneach), surfacing per-worker queue contention (enqueue_cas_retries) as the gating signal. Complements the chain workload inexamples/benchmark.py.Improvements
impl_matmulis re-ordered fromijktoikjso the inner loop walks contiguous rows of the right-hand operand and the output, enabling compiler auto-vectorization. Output is bit-for-bit identical (each inner product still accumulateskin ascending order); measured ~2.9–3.2× faster on square shapes, ~1.5–1.8× on rectangular ones. A bitwise-reproducibility regression test pins the accumulation order.Bug Fixes
A warm welcome and thank-you to first-time contributor Shivanand Mishra (@xemishra), who tracked down and fixed a subtle transpiler bug this release — exactly the kind of sharp-eyed catch that makes the project better.
@whenresult assignment dropped for module-level behaviors (fix(transpiler): @when result assignment silently dropped in exported module #30, thanks @xemishra) — a behavior defined at module level transpiled without its result cown, so the exported module silently dropped the return value and downstream behaviors could not schedule over it. Fixed, with a regression test guarding the exported-module shape.@whencapture — the transpiler now correctly surfaces a nested@when's free names as the outer behavior's captures and resolves its cown arguments in the outer frame, instead of leaving them to Python's closure machinery where they could not be reached from the worker interpreter.Matrixrange/return checks — added overflow and return-value checks on therange_readpath uncovered while migrating the matrix tests.Breaking Changes
notice_syncremoved — the noticeboard-sync shim is gone frombocpy.__all__. Usequiesce(noticeboard=True)instead, which blocks until in-flight behaviors complete and returns a noticeboard snapshot without tearing the runtime down.Documentation
notice_syncreferences fromnoticeboardand the type stubs; documented the NaN tie-break behavior ofargmin/argmax; corrected the happens-after example in thethinking-in-bocskill to order across genuinely unrelated data; added afanout_benchmark.pysection to the examples README.Tests
test_boc.py,test_noticeboard.py, and the scheduler / pinned-pump suites to thequiesce()+Cown.unwrap()pattern. Added matmul bitwise-reproducibility andargmin/argmaxNaN regression tests.Dependencies
github-actionsgroup (Bump the github-actions group with 2 updates #31, Bump actions/checkout from 6.0.2 to 6.0.3 in the github-actions group across 1 directory #27, dependabot):actions/checkout6.0.2 → 6.0.3 andpypa/cibuildwheel3.4.1 → 4.0.0.Internal
dependabot.ymlto keep the docs toolchain pinned.