diskann-benchmark refactor pt2 - shared logic overhaul by JordanMaples · Pull Request #1169 · microsoft/DiskANN

JordanMaples · 2026-06-15T20:25:10Z

Follow-up to #1168 (module reorganization). This PR contains the behavioral changes that were previously combined in #1142.

Summary

Simplifies the streaming benchmark infrastructure, unifies bf-tree input types, and makes Managed ID translation opt-in — preparing for its removal with inmem 2.0.

Changes

Managed ID translation is now opt-in

run_streaming and the streaming stack are now generic over the stream type via a new StreamingOutput trait
bftree: implements Stream<DataArgs> directly on StreamRunner, using tags as slot IDs with no Managed wrapper (no ID translation overhead)
inmem: continues using Managed for tag-to-slot ID translation until inmem 2.0 handles this natively
Removed SlotReclaim::Immediate and recycle_tags (bftree-only dead code)
Added build_direct_streamer for non-managed paths alongside the existing build_managed_streamer

Shared streaming execution layer

Introduce StreamRunner, Managed, build_managed_streamer, and run_streaming — a reusable stack shared across in-mem and bf-tree streaming benchmarks, replacing duplicated per-backend logic
Add InmemMaintainer (DropDeleted + Release) extracted from the old inline implementation

Streaming search overhaul (addresses @magdalendobson feedback on #1142)

New StreamingSearchParams struct with scalar num_threads / search_l — streaming runs a single search config between runbook stages rather than sweeping a parameter matrix
Remove the vestigial groundtruth field that was validated but never read during streaming execution

Input validation

Validate search_l >= search_n in StreamingSearchParams and GraphSearch
Validate recall_k <= search_n in StreamingSearchParams and GraphSearch to fail fast before long benchmark runs

Input unification

Rename Dynamic → Streaming for runbook params and input types (graph-index-dynamic → graph-index-stream-run)
Unify BfTreeBuild and BfTreeStreamingRun to use a QuantConfig enum (none | spherical) instead of separate input structs per quantization mode
Extract quantizer_util for shared spherical quantizer construction
Restore InlineFilterSearchPhase naming (accidentally overwritten)

Housekeeping

Simplify PhantomData<fn() -> T> to PhantomData<T> on StreamRunner
Delete streaming_utils.rs and streaming/full_precision.rs (subsumed by the shared layer)
Add new example: graph-index-bftree-stream-spherical.json
Update README with current command syntax

Breaking changes to JSON inputs

Streaming input files need updating:

"type": "graph-index-dynamic" → "graph-index-stream-run"
"search_phase" → "search" (flat struct with queries, reps, num_threads, search_l, search_n, recall_k)
Bf-tree inputs now require "quantization": { "kind": "none" } or the spherical config

Copilot

Pull request overview

This PR refactors diskann-benchmark streaming benchmarks by introducing a shared streaming execution layer (runner + common helpers), unifying bf-tree streaming/build inputs (including quantization configuration), and updating examples/docs to the new JSON schema.

Changes:

Introduce shared streaming infrastructure (StreamRunner, Maintainer, build_streamer, run_streaming) and extract in-mem maintenance logic into InmemMaintainer.
Unify/rename streaming inputs (Dynamic → Streaming) and introduce scalar StreamingSearchParams (no parameter sweep matrix) for streaming runs.
Consolidate bf-tree quantization inputs behind a single QuantConfig enum and add a shared spherical quantizer helper.

Reviewed changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
diskann-benchmark/src/main.rs	Updates integration test coverage for renamed streaming example and adds bf-tree spherical streaming example test.
diskann-benchmark/src/inputs/graph_index.rs	Renames dynamic→streaming types; introduces `StreamingSearchParams`; privatizes some input fields behind accessors.
diskann-benchmark/src/inputs/exhaustive.rs	Makes `TransformKind` clonable (used by new shared quantization config).
diskann-benchmark/src/inputs/bftree.rs	Unifies bf-tree build/streaming inputs around `QuantConfig` and streaming search/runbook types.
diskann-benchmark/src/index/streaming/runner.rs	Adds generic shared streaming runner with pluggable maintenance.
diskann-benchmark/src/index/streaming/mod.rs	Adds shared streamer construction + runbook execution helpers; re-exports shared pieces.
diskann-benchmark/src/index/streaming/full_precision.rs	Removes legacy full-precision streaming implementation (superseded by shared runner).
diskann-benchmark/src/index/result.rs	Tightens visibility of result structs/enums to the `index` module.
diskann-benchmark/src/index/inmem/streaming.rs	Introduces `InmemMaintainer` (DropDeleted + Release) extracted from old inline logic.
diskann-benchmark/src/index/inmem/spherical.rs	Updates to new accessor-based input APIs.
diskann-benchmark/src/index/inmem/scalar.rs	Updates to new accessor-based input APIs.
diskann-benchmark/src/index/inmem/product.rs	Updates to new accessor-based input APIs.
diskann-benchmark/src/index/inmem/mod.rs	Exposes `InmemMaintainer` and adds inmem streaming module.
diskann-benchmark/src/index/build.rs	Tightens visibility of internal build helpers/types.
diskann-benchmark/src/index/bftree/streaming_utils.rs	Removes duplicated streaming runbook-driving helper (replaced by shared `run_streaming`).
diskann-benchmark/src/index/bftree/spherical.rs	Switches spherical bf-tree build to unified `BfTreeBuild` + shared quantizer helper.
diskann-benchmark/src/index/bftree/spherical_streaming.rs	Switches bf-tree spherical streaming to shared streamer + `StreamRunner`.
diskann-benchmark/src/index/bftree/quantizer_util.rs	Adds shared spherical quantizer construction utility for bf-tree benchmarks.
diskann-benchmark/src/index/bftree/mod.rs	Updates module docs and wires new shared quantizer utility.
diskann-benchmark/src/index/bftree/full_precision.rs	Switches bf-tree full-precision build to unified `BfTreeBuild` (with `QuantConfig::None`).
diskann-benchmark/src/index/bftree/full_precision_streaming.rs	Switches bf-tree full-precision streaming to shared streamer + `StreamRunner`.
diskann-benchmark/src/index/benchmarks.rs	Refactors in-mem streaming benchmark to use shared `run_streaming`/`build_streamer` + `StreamRunner`.
diskann-benchmark/README.md	Updates documented command/example to new streaming input filename and tag.
diskann-benchmark/example/graph-index-stream.json	Updates streaming example JSON to new schema (`type`, `search` shape, no groundtruth).
diskann-benchmark/example/graph-index-bftree.json	Updates bf-tree build example to unified `graph-index-bftree` + `quantization` config.
diskann-benchmark/example/graph-index-bftree-stream.json	Updates bf-tree streaming example to unified tag + new `search` + `quantization` config.
diskann-benchmark/example/graph-index-bftree-stream-spherical.json	Adds new example for bf-tree spherical streaming with unified `quantization` config.
diskann-benchmark/example/graph-index-bftree-spherical.json	Updates spherical bf-tree build example to unified tag + nested `quantization` config.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

codecov-commenter · 2026-06-15T20:40:17Z

Codecov Report

❌ Patch coverage is 71.72775% with 108 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.76%. Comparing base (b5ebac2) to head (ff5cff6).
⚠️ Report is 4 commits behind head on main.

Files with missing lines	Patch %	Lines
diskann-benchmark/src/index/streaming/runner.rs	55.63%	63 Missing ⚠️
diskann-benchmark/src/inputs/graph_index.rs	54.23%	27 Missing ⚠️
diskann-benchmark/src/index/streaming/mod.rs	88.88%	9 Missing ⚠️
diskann-benchmark/src/index/benchmarks.rs	87.03%	7 Missing ⚠️
diskann-benchmark/src/index/build.rs	75.00%	1 Missing ⚠️
diskann-benchmark/src/index/inmem/streaming.rs	97.43%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1169      +/-   ##
==========================================
- Coverage   90.99%   90.76%   -0.24%     
==========================================
  Files         489      490       +1     
  Lines       93130    93360     +230     
==========================================
- Hits        84746    84736      -10     
- Misses       8384     8624     +240

Flag	Coverage Δ
miri	`90.76% <71.72%> (-0.24%)`	⬇️
unittests	`90.72% <71.72%> (-0.24%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
diskann-benchmark/src/index/inmem/product.rs	`100.00% <ø> (ø)`
diskann-benchmark/src/index/inmem/scalar.rs	`100.00% <ø> (ø)`
diskann-benchmark/src/index/inmem/spherical.rs	`100.00% <ø> (ø)`
diskann-benchmark/src/index/result.rs	`49.03% <100.00%> (ø)`
diskann-benchmark/src/index/streaming/managed.rs	`97.56% <ø> (ø)`
diskann-benchmark/src/inputs/exhaustive.rs	`26.83% <ø> (ø)`
diskann-benchmark/src/main.rs	`91.36% <100.00%> (ø)`
diskann-benchmark/src/utils/streaming.rs	`91.38% <ø> (ø)`
diskann-benchmark/src/index/build.rs	`85.92% <75.00%> (ø)`
diskann-benchmark/src/index/inmem/streaming.rs	`97.43% <97.43%> (ø)`
... and 4 more

... and 12 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Behavioral changes on top of the module reorganization: - Introduce shared StreamRunner/Managed/build_streamer abstractions for streaming benchmarks, replacing duplicated per-backend logic - Rename Dynamic -> Streaming for runbook params and input types - Unify bftree build/streaming inputs with QuantConfig enum (none or spherical) instead of separate input structs - Add StreamingSearchParams with scalar num_threads/search_l, removing the vestigial groundtruth file requirement and unnecessary parameter sweep arrays from streaming search - Extract quantizer_util for shared spherical quantizer construction - Add InmemMaintainer for inmem streaming maintenance (DropDeleted + Release) - Privatize input struct fields behind accessor methods Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

magdalendobson

Looks close, just a couple issues in comments.

… search_n validation - Restore InlineFilterSearchPhase naming that was accidentally overwritten - Add recall_k > search_n validation to GraphSearch::validate and StreamingSearchParams::validate to fail fast before running benchmarks Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

hildebrandmw

Two big concerns here:

Moving from pub(crate) members to pub(crate) accessors adds a lot of noise to the diff without meaningfully increasing protection. The real danger with sharing inputs is having backends ignore fields entirely, which per-field accessors doesn't really change.
This PR centers around ManagedStream, but that is a crutch for providers not being able to manage their own ID translation. The upcoming inmem 2.0 (which can do everything except make it to a PR ready state), will not need this management layer. In which case, will a separate streaming implementation be needed anyways? Is there a way to share this infrastructure without requiring ManagedStream?

hildebrandmw · 2026-06-22T22:24:50Z

+    inplace_delete_num_to_replace: usize,
+    inplace_delete_method: InplaceDeleteMethod,
+    maintainer: M,
+    _marker: PhantomData<fn() -> T>,


Why is this invariant on T?

Switched it to PhantomData<T>

Introduce StreamingOutput trait and make run_streaming generic over the stream type. This allows: - bftree: implements Stream<DataArgs> directly on StreamRunner, using tags as slot IDs without translation (no Managed wrapper) - inmem: continues using Managed for tag-to-slot ID translation When inmem 2.0 lands with native ID management, Managed can be deleted entirely with no infrastructure changes needed. Also: - Remove SlotReclaim::Immediate and recycle_tags (bftree-only paths) - Simplify PhantomData<fn() -> T> to PhantomData<T> on StreamRunner Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

hildebrandmw

Something was bothering me about removing the ID translation layer for bf-tree, and I think I managed to figure out what it is. For some reason, bf-tree is pre-configure with a maximum capacity. The BAB streaming runs simulate the runbook to obtain the maximum number of active points, which could be less than the maximum tag. It's this maximum that is provided to the BfTreeProvider's constructor. So, for runbooks with an active set that is indeed less, then the untranslated IDs will trigger the assert linked earlier. The runbook in test_data doesn't hit this because it doesn't have the property of the active points being less than the maximum tag.

Did this latest iteration go through the wikipedia runbook.

A small example runbook that triggers this situation:

sift-small-256:
  max_pts: 128
  1:
    operation: "insert"
    start: 0
    end: 128
  2:
    operation: "search"
  3:
    operation: "delete"
    start: 0
    end: 64
  4:
    operation: "insert"
    start: 128
    end: 192
  5:
    operation: "search"
  6:
    operation: "replace"
    tags_start: 64
    tags_end: 128
    ids_start: 192
    ids_end: 256
  7:
    operation: "search"

JordanMaples requested review from a team and Copilot June 15, 2026 20:25

Copilot started reviewing on behalf of JordanMaples June 15, 2026 20:25 View session

Copilot AI reviewed Jun 15, 2026

View reviewed changes

Comment thread diskann-benchmark/src/inputs/graph_index.rs

Comment thread diskann-benchmark/src/index/streaming/mod.rs Outdated

magdalendobson reviewed Jun 18, 2026

View reviewed changes

Comment thread diskann-benchmark/src/inputs/bftree.rs Outdated

magdalendobson reviewed Jun 18, 2026

View reviewed changes

Comment thread diskann-benchmark/src/inputs/graph_index.rs Outdated

magdalendobson reviewed Jun 18, 2026

View reviewed changes

Comment thread diskann-benchmark/src/inputs/bftree.rs Outdated

JordanMaples and others added 4 commits June 22, 2026 08:32

undo name rename

a0f481c

enum

b37c21f

QuantConfig implements display

634b25c

JordanMaples force-pushed the jordanmaples/benchmarks-refactor-v3 branch from c33c71c to 634b25c Compare June 22, 2026 15:35

Validate search_l >= search_n in StreamingSearchParams

b306e94

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

magdalendobson reviewed Jun 22, 2026

View reviewed changes

Comment thread diskann-benchmark/src/inputs/graph_index.rs

magdalendobson reviewed Jun 22, 2026

View reviewed changes

Comment thread diskann-benchmark/src/inputs/graph_index.rs

magdalendobson requested changes Jun 22, 2026

View reviewed changes

hildebrandmw reviewed Jun 22, 2026

View reviewed changes

hildebrandmw reviewed Jun 25, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

diskann-benchmark refactor pt2 - shared logic overhaul#1169

diskann-benchmark refactor pt2 - shared logic overhaul#1169
JordanMaples wants to merge 7 commits into
mainfrom
jordanmaples/benchmarks-refactor-v3

JordanMaples commented Jun 15, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

codecov-commenter commented Jun 15, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

magdalendobson left a comment

Uh oh!

hildebrandmw left a comment

Uh oh!

hildebrandmw Jun 22, 2026

Uh oh!

JordanMaples Jun 25, 2026

Uh oh!

hildebrandmw left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Uh oh!

Conversation

JordanMaples commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Managed ID translation is now opt-in

Shared streaming execution layer

Streaming search overhaul (addresses @magdalendobson feedback on #1142)

Input validation

Input unification

Housekeeping

Breaking changes to JSON inputs

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

codecov-commenter commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

magdalendobson left a comment

Choose a reason for hiding this comment

Uh oh!

hildebrandmw left a comment

Choose a reason for hiding this comment

Uh oh!

hildebrandmw Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

JordanMaples Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

hildebrandmw left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

JordanMaples commented Jun 15, 2026 •

edited

Loading

codecov-commenter commented Jun 15, 2026 •

edited

Loading