Create benchmarks for the flat index (full-precision in-memory implementation) by arrayka · Pull Request #1170 · microsoft/DiskANN

arrayka · 2026-06-15T22:30:00Z

Establishes a baseline benchmark for FlatIndex::knn_search, enabling evaluation of new capabilities and innovations in DiskANN's flat-scan implementation.

Why Add Flat Index Benchmarks if Exhaustive Exists?

Exhaustive benchmarks are lightweight benchmarks focused on quantizer speed and accuracy: they measure compress + distance performance in isolation, without involving the FlatIndex machinery or any data provider. They operate entirely outside the DiskANN provider/index abstraction.

Flat benchmarks are end-to-end: they drive FlatIndex through the actual provider and search-strategy abstractions under different strategy configurations, so they capture the realistic runtime behavior and integration overhead that exhaustive benchmarks intentionally skip.
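
For orientation, the kind of work a full-precision flat scan performs can be sketched in a few lines. This is a minimal illustration, not the FlatIndex implementation: `Candidate`, `inner_product`, and `flat_knn` are hypothetical names, vectors are plain `Vec<f32>`, and the score is assumed to be an inner product where larger means closer. It only shows why every query costs exactly one comparison per stored vector.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical scored hit. Ordered so that the heap's maximum is the
/// *worst* candidate (lowest inner product), which makes eviction cheap.
struct Candidate {
    score: f32,
    id: usize,
}

impl Ord for Candidate {
    fn cmp(&self, other: &Self) -> Ordering {
        // Reversed score order: a lower score compares as "greater".
        other
            .score
            .total_cmp(&self.score)
            .then_with(|| self.id.cmp(&other.id))
    }
}

impl PartialOrd for Candidate {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for Candidate {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for Candidate {}

fn inner_product(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Scores every stored vector and returns the ids of the k best hits,
/// best first. The scan always touches all `data.len()` vectors.
fn flat_knn(data: &[Vec<f32>], query: &[f32], k: usize) -> Vec<usize> {
    let mut heap = BinaryHeap::with_capacity(k + 1);
    for (id, v) in data.iter().enumerate() {
        heap.push(Candidate { score: inner_product(v, query), id });
        if heap.len() > k {
            heap.pop(); // evict the current worst candidate
        }
    }
    // Ascending by `Ord`, which here means highest score first.
    heap.into_sorted_vec().into_iter().map(|c| c.id).collect()
}
```

Calling `flat_knn(&data, &query, 100)` on n vectors always scores all n of them; k only bounds the size of the heap.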

This PR enables full-precision in-memory flat search. Future work may add:

Quantized flat search
Quantized flat search with reranking (disk-based or in-memory)
Batched SIMD scan
Integration with diversity search

Changes

- New diskann-benchmark/src/flat benchmark module with:
  - InMemProvider wrapping FastMemoryVectorProviderAsync (cache-line-aligned vector storage) with identity DataProvider mapping
  - FlatScanStrategy and FlatVisitor implementing SearchStrategy/DistancesUnordered
  - FlatSearcher implementing the Search trait from benchmark_core
- Input schema (diskann-benchmark/src/inputs/flat.rs) with dataset, distance metric, queries, groundtruth, k, thread counts, and reps
- Integration test (flat_search_integration) and example input (diskann-benchmark/example/flat-index.json)
- Performance test input for wikipedia-100K dataset (diskann-benchmark/perf_test_inputs/wikipedia-100K-flat-index.json)
- Unrelated fix: renamed temp file in run_integration_test from graph-index.json to input.json - the helper is shared by all backends, not just graph-index

Follows established patterns from the graph-index and exhaustive backends.

Output example:

cargo run --package diskann-benchmark --release -- run --input-file diskann-benchmark\perf_test_inputs\wikipedia-100K-flat-index.json --output-file ./target/tmp/flat-index-output.json     
                                                                                                                                                                                
######################
# Running Job 1 of 1 #
######################

              Data: target/tmp\wikipedia_cohere/wikipedia_base.bin.crop_nb_100000
         Data Type: float32
          Distance: inner_product
           Queries: target/tmp\wikipedia_cohere/wikipedia_query.bin
       Groundtruth: target/tmp\wikipedia_cohere/wikipedia-100K
                 K: 100
           Threads: 4, 8
              Reps: 1

Loading dataset...
  Loaded 100000 vectors of dimension 768
  Queries: 5000, Groundtruth: 5000x100


  K,   Avg cmps,   QPS - mean(max),             Avg Latency,           p99 Latency,               Recall,   Threads
===================================================================================================================
100,     100000,     123.0 (123.0),   32430.0us (32430.0us),   35711.0us (35711us),   0.9999899999999999,         4
100,     100000,     250.4 (250.4),   31909.2us (31909.2us),   33323.0us (33323us),   0.9999899999999999,         8
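
The Recall column is easier to interpret with a definition in hand. The sketch below assumes recall@k is the fraction of the first k ground-truth ids that also appear among the first k returned ids, averaged over all queries; the benchmark's actual definition (for example, how it treats ties) is not shown here and may differ. `recall_at_k` is a hypothetical helper, not part of DiskANN.

```rust
use std::collections::HashSet;

/// Hypothetical recall@k: matched ids across all queries divided by
/// (number of queries * k). Assumes one result list per ground-truth list.
fn recall_at_k(results: &[Vec<u32>], groundtruth: &[Vec<u32>], k: usize) -> f64 {
    assert_eq!(results.len(), groundtruth.len(), "one result list per query");
    if results.is_empty() || k == 0 {
        return 0.0;
    }
    let mut hits = 0usize;
    for (res, gt) in results.iter().zip(groundtruth) {
        // Sets make the count insensitive to ordering and duplicate ids.
        let truth: HashSet<u32> = gt.iter().take(k).copied().collect();
        let found: HashSet<u32> = res.iter().take(k).copied().collect();
        hits += truth.intersection(&found).count();
    }
    hits as f64 / (results.len() * k) as f64
}
```

Under this definition, a recall of about 0.99999 over 5000 queries with K = 100 would correspond to roughly 5 missed ids out of 500000.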

codecov-commenter · 2026-06-15T22:48:47Z

Codecov Report

❌ Patch coverage is 95.80247% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.83%. Comparing base (3aa44ac) to head (dbf18a4).
⚠️ Report is 7 commits behind head on main.

Files with missing lines	Patch %	Lines
diskann-benchmark/src/flat/search.rs	95.10%	16 Missing ⚠️
diskann-benchmark/src/inputs/flat.rs	98.52%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1170      +/-   ##
==========================================
+ Coverage   89.46%   89.83%   +0.36%     
==========================================
  Files         487      491       +4     
  Lines       92170    93636    +1466     
==========================================
+ Hits        82460    84115    +1655     
+ Misses       9710     9521     -189

Flag	Coverage Δ
miri	`89.83% <95.80%> (+0.36%)`	⬆️
unittests	`89.49% <95.80%> (+0.37%)`	⬆️

Flags with carried forward coverage won't be shown.

Files with missing lines	Coverage Δ
diskann-benchmark-core/src/lib.rs	`53.33% <ø> (ø)`
diskann-benchmark-core/src/utils.rs	`100.00% <100.00%> (ø)`
diskann-benchmark/src/flat/mod.rs	`100.00% <100.00%> (ø)`
diskann-benchmark/src/inputs/mod.rs	`81.25% <ø> (ø)`
diskann-benchmark/src/main.rs	`91.47% <100.00%> (+0.11%)`	⬆️
diskann-benchmark/src/inputs/flat.rs	`98.52% <98.52%> (ø)`
diskann-benchmark/src/flat/search.rs	`95.10% <95.10%> (ø)`

... and 28 files with indirect coverage changes

Copilot

Pull request overview

This PR adds a new end-to-end benchmark backend for diskann::flat::FlatIndex::knn_search, establishing a baseline for full-precision in-memory brute-force kNN performance (recall + latency) to support future flat-search optimizations.

Changes:

Registers a new flat-index benchmark backend and adds an integration test wired to a new example input.
Introduces a FlatSearch input schema and JSON examples/perf-test inputs for running flat-index benchmarks.
Implements an in-memory DataProvider + SearchStrategy that performs a full sequential scan using FastMemoryVectorProviderAsync.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

File	Description
diskann-benchmark/src/main.rs	Registers the new flat backend; adds a flat-search integration test; renames the temp input file used by the shared integration helper.
diskann-benchmark/src/inputs/mod.rs	Exposes the new flat benchmark input module.
diskann-benchmark/src/inputs/flat.rs	Adds `FlatSearch` / `SearchPhase` input schema, validation hooks, and display formatting.
diskann-benchmark/src/flat/mod.rs	Adds the flat benchmark module entry point and registration function.
diskann-benchmark/src/flat/search.rs	Implements the flat benchmark backend (provider + scan strategy + search runner + aggregation + formatting).
diskann-benchmark/perf_test_inputs/wikipedia-100K-flat-index.json	Adds a perf-test job configuration for wikipedia-100K flat search.
diskann-benchmark/example/flat-index.json	Adds an example job configuration used by the new integration test.

wuw92 · 2026-06-22T07:13:17Z

+            .map(|r| recall.num_queries as f64 / r.end_to_end_latency().as_seconds())
+            .collect();
+
+        let mean_cmps = {


nit: The mean_cmps fold here looks identical to average_all in diskann-benchmark-core.
It's worth considering promoting average_all to pub and reusing it.
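
To make the suggestion concrete, here is a minimal sketch of a shared averaging helper together with the QPS computation from the diff above. The real signature of average_all in diskann-benchmark-core is not shown in this thread, so the one below is an assumption; `qps` is a hypothetical name, and `std::time::Duration` stands in for whatever latency type the benchmark uses.

```rust
use std::time::Duration;

/// Assumed shape of a reusable mean helper: arithmetic mean of the values,
/// or 0.0 when the input is empty.
fn average_all<I>(values: I) -> f64
where
    I: IntoIterator<Item = f64>,
{
    let (sum, count) = values
        .into_iter()
        .fold((0.0_f64, 0_usize), |(s, n), v| (s + v, n + 1));
    if count == 0 {
        0.0
    } else {
        sum / count as f64
    }
}

/// Queries per second for one repetition: query count over wall-clock time.
fn qps(num_queries: usize, end_to_end: Duration) -> f64 {
    num_queries as f64 / end_to_end.as_secs_f64()
}
```

With a helper like this, the per-run comparison counts could be averaged with a call such as `average_all(cmps.iter().map(|&c| c as f64))` instead of a hand-written fold.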

arkrishn94

Thanks for working on this, Alex. I have one non-minor comment: the benchmark should use a provider, visitor, and DistancesUnordered implementation as-is. Maybe this PR can include the implementation of these for the full-precision provider?

arkrishn94 · 2026-06-23T14:18:12Z

+struct InMemProvider<T: VectorRepr> {
+    data: FastMemoryVectorProviderAsync<T>,
+}


Can we use the full precision in-memory provider here instead?

The full precision in memory provider is an instantiation of the DefaultProvider and DataProvider is implemented for it.

You'll need to implement the right traits for it to be usable as a flat index, but I think that's the right move here...

I'd tentatively recommend avoiding using the constructs in diskann-providers given the safety issue associated with it. Is there a reason to not use Matrix for this simple store initially?

@hildebrandmw, this benchmark originally used Matrix<T>.

Aditya and I agreed that, as an end-to-end benchmark, it should compare FlatIndex against a real data provider rather than an in-memory matrix.

The goal is to measure FlatIndex in a production-like setup, similar to how we benchmark disk-based indexes, so that we can predict its performance in production.

Let's use FastMemoryVectorProviderAsync, as proposed in this PR.

It looks like the regular MemoryVectorProviderAsync isn't accessible outside the diskann-providers crate.

We should continue the discussion on where the Flat trait implementations for existing data providers should live and come to an agreement there.

arkrishn94 · 2026-06-23T14:28:14Z

+/// The visitor that iterates over all vectors in the provider.
+struct FlatVisitor<'a, T: VectorRepr> {
+    data: &'a FastMemoryVectorProviderAsync<T>,
+}


Same comment as above: we should have a visitor implementation for FastMemoryVectorProviderAsync, and it should live alongside the provider's definition.
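
As a rough illustration of the shape being asked for, the sketch below keeps a visitor in the same module as the store it walks, so callers do not need to re-implement the scan. None of this is DiskANN's API: `VectorStore`, `ScanVisitor`, `for_each_distance`, and `squared_l2` are hypothetical names, and the real SearchStrategy/DistancesUnordered traits may look quite different.

```rust
/// Hypothetical dense store: `dim` floats per vector, stored back to back.
/// `dim` must be non-zero.
struct VectorStore {
    dim: usize,
    data: Vec<f32>,
}

/// Hypothetical visitor, defined next to the store it borrows from.
struct ScanVisitor<'a> {
    store: &'a VectorStore,
}

impl VectorStore {
    /// Hands out a visitor over all stored vectors.
    fn visitor(&self) -> ScanVisitor<'_> {
        ScanVisitor { store: self }
    }
}

impl ScanVisitor<'_> {
    /// Calls `f(id, distance)` once per stored vector, in storage order.
    /// The caller decides what to do with the unordered distances.
    fn for_each_distance<D, F>(&self, query: &[f32], distance: D, mut f: F)
    where
        D: Fn(&[f32], &[f32]) -> f32,
        F: FnMut(usize, f32),
    {
        for (id, v) in self.store.data.chunks_exact(self.store.dim).enumerate() {
            f(id, distance(query, v));
        }
    }
}

fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}
```

A benchmark would then only call `store.visitor().for_each_distance(...)` and feed the callback into its own top-k bookkeeping, leaving the iteration logic with the provider.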

arkrishn94 · 2026-06-23T14:29:20Z

+//////////////////////////////////////////
+
+/// Wraps a [`FlatIndex`] and queries to implement the [`Search`] trait from benchmark_core.
+struct FlatSearcher<T: VectorRepr> {


nit: do we need the Flat prefix on all the structs here? Can we use the module path to distinguish them?

arkrishn94 · 2026-06-23T14:41:48Z

For my understanding - what is this input provider for? I'm guessing you'll want to create a benchmark for it in CI?

Yes, it's the wikipedia-100K config intended to seed a CI flat-index benchmark later (mirroring the existing disk-index perf inputs). This is an actual benchmark configuration using a real dataset, not a sample configuration.

Alex Razumov (from Dev Box) added 6 commits June 12, 2026 17:26

Added the first version

42f9910

Added benchmark file

8261545

Use NAME

ec5db9a

Add FlatSearchParameters

9f80e12

Removed num_vectors

36d7f0e

Merge branch 'main' into u/arrayka/flat_bench

68725a6

# Conflicts: # diskann-benchmark/src/backend/mod.rs

arrayka linked an issue Jun 15, 2026 that may be closed by this pull request

Create benchmarks for the flat index (in-memory implementation) #1002

Open

Alex Razumov (from Dev Box) added 2 commits June 17, 2026 16:01

Merge branch 'main' into u/arrayka/flat_bench

986a037

Switch to FastMemoryVectorProviderAsync

b6e742a

arrayka changed the title ~~Create benchmarks for the flat index (in-memory implementation)~~ Create benchmarks for the flat index (full-precision in-memory implementation) Jun 19, 2026

Minor fixes

721f83a

arrayka marked this pull request as ready for review June 19, 2026 03:14

arrayka requested review from a team and Copilot June 19, 2026 03:14

Copilot started reviewing on behalf of arrayka June 19, 2026 03:14

Copilot AI reviewed Jun 19, 2026

Comment thread diskann-benchmark/src/flat/search.rs

Comment thread diskann-benchmark/src/flat/search.rs

Comment thread diskann-benchmark/src/flat/search.rs Outdated

Comment thread diskann-benchmark/src/flat/search.rs Outdated

Comment thread diskann-benchmark/src/flat/search.rs

arrayka and others added 2 commits June 18, 2026 21:57

Apply suggestions from code review

4b7fd99

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Increased test coverage

f55446b

wuw92 approved these changes Jun 22, 2026

arkrishn94 reviewed Jun 23, 2026

Alex Razumov (from Dev Box) added 2 commits June 25, 2026 20:37

Promote average_all to pub, reuse it

e960a7d

Removed Flat prefix

dbf18a4

6 participants
