[cuda.compute][cuda.coop]: Replace all usages of device arrays outside examples with new wrapper hat does not depend on cupy or numba-cuda#9653
Conversation
… that does not depend on cupy or numba-cuda
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (8)
🚧 Files skipped from review as they are similar to previous changes (6)
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis PR removes CuPy as the test dependency by moving CUDA buffer handling to a new ChangesDeviceArray migration and packaging
Assessment against linked issues
Out-of-scope changesNone Suggested reviewers
Comment |
|
@coderabbitai full review |
✅ Action performedFull review finished. |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
python/cuda_cccl/tests/compute/conftest.py (1)
73-77: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick winsuggestion:
cuda_streamfixture never releases the stream.
device.create_stream()returns acuda.core.Streamthat owns the underlying CUDA stream and must be destroyed viaclose(); the fixture just returns it without ayield/cleanup, so the stream handle leaks every test invocation. Use a teardown viayield+stream.close().♻️ Suggested fix
`@pytest.fixture`(scope="function") def cuda_stream() -> Stream: device = Device() device.set_current() - return device.create_stream() + stream = device.create_stream() + yield stream + stream.close()
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 34e471e4-3fc6-48d3-adde-05f213b52bf6
📒 Files selected for processing (47)
ci/test_cuda_cccl_examples_python.shci/test_cuda_compute_minimal_python.shci/windows/test_cuda_cccl_examples_python.ps1docs/python/setup.rstpython/cuda_cccl/pyproject.tomlpython/cuda_cccl/tests/_utils/__init__.pypython/cuda_cccl/tests/_utils/device_array.pypython/cuda_cccl/tests/compute/conftest.pypython/cuda_cccl/tests/compute/test_binary_search.pypython/cuda_cccl/tests/compute/test_deferred_annotations.pypython/cuda_cccl/tests/compute/test_histogram.pypython/cuda_cccl/tests/compute/test_iterators.pypython/cuda_cccl/tests/compute/test_merge_sort.pypython/cuda_cccl/tests/compute/test_nested_struct.pypython/cuda_cccl/tests/compute/test_no_numba.pypython/cuda_cccl/tests/compute/test_permutation_iterator.pypython/cuda_cccl/tests/compute/test_radix_sort.pypython/cuda_cccl/tests/compute/test_raw_op.pypython/cuda_cccl/tests/compute/test_reduce.pypython/cuda_cccl/tests/compute/test_scan.pypython/cuda_cccl/tests/compute/test_segmented_reduce.pypython/cuda_cccl/tests/compute/test_segmented_sort.pypython/cuda_cccl/tests/compute/test_select.pypython/cuda_cccl/tests/compute/test_shuffle_iterator.pypython/cuda_cccl/tests/compute/test_three_way_partition.pypython/cuda_cccl/tests/compute/test_transform.pypython/cuda_cccl/tests/compute/test_unique_by_key.pypython/cuda_cccl/tests/compute/test_zip_iterator.pypython/cuda_cccl/tests/coop/_experimental/test_block_exchange.pypython/cuda_cccl/tests/coop/_experimental/test_block_load.pypython/cuda_cccl/tests/coop/_experimental/test_block_load_store_api.pypython/cuda_cccl/tests/coop/_experimental/test_block_merge_sort.pypython/cuda_cccl/tests/coop/_experimental/test_block_merge_sort_api.pypython/cuda_cccl/tests/coop/_experimental/test_block_radix_sort.pypython/cuda_cccl/tests/coop/_experimental/test_block_radix_sort_api.pypython/cuda_cccl/tests/coop/_experimental/test_block_reduce.pypython/cuda_cccl/tests/coop/_experimental/test_block_reduce_api.pypython/cuda_cccl/tests/coop/_experimental/test_block_scan.pypython/cuda_cccl/tests/coop/_experimental/test_block_scan_api.pypython/cuda_cccl/tests/coop/_experimental/test_block_store.pypython/cuda_cccl/tests/coop/_experimental/test_warp_merge_sort.pypython/cuda_cccl/tests/coop/_experimental/test_warp_merge_sort_api.pypython/cuda_cccl/tests/coop/_experimental/test_warp_reduce.pypython/cuda_cccl/tests/coop/_experimental/test_warp_reduce_api.pypython/cuda_cccl/tests/coop/_experimental/test_warp_scan.pypython/cuda_cccl/tests/coop/_experimental/test_warp_scan_api.pypython/cuda_cccl/tests/test_examples.py
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
python/cuda_cccl/tests/compute/test_unique_by_key.py (1)
385-406: 🎯 Functional Correctness | 🔴 Critical | ⚡ Quick wincritical: redundant
h_in_keysreassignment drops the duplicate-key bound, defeating the test's intent.Line 393 re-generates
h_in_keys = random_array(num_items, np.int32)withoutmax_value=20, overwriting the bounded array created at line 387 (max_value=20) that was specifically meant to produce duplicate keys forunique_by_keyto dedupe. With this overwrite, keys are effectively unbounded random int32 values and will almost never collide, so the test no longer exercises actual deduplication — it still passes, but silently stops testing the intended behavior. This looks like a leftover from the refactor (d_in_keys = numba.cuda.to_device(h_in_keys)likely became this duplicate generation by mistake).🐛 Proposed fix
h_in_keys = random_array(num_items, np.int32, max_value=20) h_in_items = random_array(num_items, np.float32) h_out_keys = np.empty(num_items, dtype=np.int32) h_out_items = np.empty(num_items, dtype=np.float32) h_out_num_selected = np.empty(1, np.int32) - h_in_keys = random_array(num_items, np.int32) d_in_keys = DeviceArray.from_numpy(h_in_keys, stream=cuda_stream)
🧹 Nitpick comments (2)
python/cuda_cccl/tests/compute/conftest.py (1)
73-77: 🩺 Stability & Availability | 🔵 Trivial | 💤 Low valuesuggestion:
cuda_streamfixture never closes the stream it creates.cuda.core.Stream.close()destroys the stream if owned; relying solely on GC across hundreds of test invocations using this fixture is fragile. Consider ayield-based fixture that closes the stream on teardown.♻️ Possible fix
`@pytest.fixture`(scope="function") -def cuda_stream() -> Stream: +def cuda_stream(): device = Device() device.set_current() - return device.create_stream() + stream = device.create_stream() + yield stream + stream.close()As per path instructions for
python/cuda_cccl/**/*: "Focus on ... memory ownership ...".Source: Path instructions
python/cuda_cccl/tests/compute/test_radix_sort.py (1)
49-53: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winsuggestion:
get_compute_capability_major()duplicates the sameDevice(); set_current(); compute_capabilitysnippet inlined intest_zip_iterator.py(and similar patterns elsewhere). Consider moving this helper intotests/_utilsso all test files share one implementation instead of re-deriving CC checks per file.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: edf5ef8f-50ba-41eb-a5c3-f2a88afbb30d
📒 Files selected for processing (47)
ci/test_cuda_cccl_examples_python.shci/test_cuda_compute_minimal_python.shci/windows/test_cuda_cccl_examples_python.ps1docs/python/setup.rstpython/cuda_cccl/pyproject.tomlpython/cuda_cccl/tests/_utils/__init__.pypython/cuda_cccl/tests/_utils/device_array.pypython/cuda_cccl/tests/compute/conftest.pypython/cuda_cccl/tests/compute/test_binary_search.pypython/cuda_cccl/tests/compute/test_deferred_annotations.pypython/cuda_cccl/tests/compute/test_histogram.pypython/cuda_cccl/tests/compute/test_iterators.pypython/cuda_cccl/tests/compute/test_merge_sort.pypython/cuda_cccl/tests/compute/test_nested_struct.pypython/cuda_cccl/tests/compute/test_no_numba.pypython/cuda_cccl/tests/compute/test_permutation_iterator.pypython/cuda_cccl/tests/compute/test_radix_sort.pypython/cuda_cccl/tests/compute/test_raw_op.pypython/cuda_cccl/tests/compute/test_reduce.pypython/cuda_cccl/tests/compute/test_scan.pypython/cuda_cccl/tests/compute/test_segmented_reduce.pypython/cuda_cccl/tests/compute/test_segmented_sort.pypython/cuda_cccl/tests/compute/test_select.pypython/cuda_cccl/tests/compute/test_shuffle_iterator.pypython/cuda_cccl/tests/compute/test_three_way_partition.pypython/cuda_cccl/tests/compute/test_transform.pypython/cuda_cccl/tests/compute/test_unique_by_key.pypython/cuda_cccl/tests/compute/test_zip_iterator.pypython/cuda_cccl/tests/coop/_experimental/test_block_exchange.pypython/cuda_cccl/tests/coop/_experimental/test_block_load.pypython/cuda_cccl/tests/coop/_experimental/test_block_load_store_api.pypython/cuda_cccl/tests/coop/_experimental/test_block_merge_sort.pypython/cuda_cccl/tests/coop/_experimental/test_block_merge_sort_api.pypython/cuda_cccl/tests/coop/_experimental/test_block_radix_sort.pypython/cuda_cccl/tests/coop/_experimental/test_block_radix_sort_api.pypython/cuda_cccl/tests/coop/_experimental/test_block_reduce.pypython/cuda_cccl/tests/coop/_experimental/test_block_reduce_api.pypython/cuda_cccl/tests/coop/_experimental/test_block_scan.pypython/cuda_cccl/tests/coop/_experimental/test_block_scan_api.pypython/cuda_cccl/tests/coop/_experimental/test_block_store.pypython/cuda_cccl/tests/coop/_experimental/test_warp_merge_sort.pypython/cuda_cccl/tests/coop/_experimental/test_warp_merge_sort_api.pypython/cuda_cccl/tests/coop/_experimental/test_warp_reduce.pypython/cuda_cccl/tests/coop/_experimental/test_warp_reduce_api.pypython/cuda_cccl/tests/coop/_experimental/test_warp_scan.pypython/cuda_cccl/tests/coop/_experimental/test_warp_scan_api.pypython/cuda_cccl/tests/test_examples.py
🥳 CI Workflow Results🟩 Finished in 3h 11m: Pass: 100%/506 | Total: 6d 16h | Max: 1h 56m | Hits: 100%/653836See results here. |
Description
closes #9646
Checklist