Create benchmarks for the flat index (full-precision in-memory implementation) #1170

Open
arrayka wants to merge 13 commits into main from u/arrayka/flat_bench

arrayka commented Jun 15, 2026

Contributor

Establishes a baseline benchmark for FlatIndex::knn_search, enabling evaluation of new capabilities and innovations in DiskANN's flat-scan implementation.

Why Add Flat Index Benchmarks if Exhaustive Exists?

Exhaustive benchmarks are lightweight benchmarks focused on quantizer speed and accuracy: they measure compress + distance performance in isolation, without involving the FlatIndex machinery or any data provider. They operate entirely outside the DiskANN provider/index abstraction.

Flat benchmarks are end-to-end benchmarks that exercise FlatIndex with different strategy configurations through the actual provider and search strategy abstractions, so they capture the realistic runtime behavior and integration overhead that exhaustive benchmarks intentionally skip.

This PR enables full-precision in-memory flat search. Future work may add:

  • Quantized flat search
  • Quantized flat search with reranking (disk-based or in-memory)
  • Batched SIMD scan
  • Integration with diversity search
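
For orientation, a full-precision flat search amounts to scoring every stored vector against the query and keeping the best k. The sketch below is a minimal, self-contained illustration under stated assumptions, not the code in this PR: `Candidate`, `inner_product`, and `flat_knn` are hypothetical names, the data is a row-major `&[f32]`, and a larger inner product is treated as better. It does not use FlatIndex, the provider traits, or SIMD.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical (score, id) pair. The ordering is reversed on score so that
/// the top of the max-heap is the *worst* candidate currently kept.
#[derive(Debug)]
struct Candidate {
    score: f32,
    id: usize,
}

impl Ord for Candidate {
    fn cmp(&self, other: &Self) -> Ordering {
        // Lower score compares as "greater"; ties are broken by id.
        other
            .score
            .total_cmp(&self.score)
            .then_with(|| self.id.cmp(&other.id))
    }
}

impl PartialOrd for Candidate {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for Candidate {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for Candidate {}

/// Plain scalar inner product of two equal-length slices.
fn inner_product(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Scans every `dim`-sized row of `data` and returns the ids of the `k` rows
/// with the largest inner product (best first) plus the comparison count.
fn flat_knn(data: &[f32], dim: usize, query: &[f32], k: usize) -> (Vec<usize>, usize) {
    let mut heap = BinaryHeap::with_capacity(k + 1);
    let mut cmps = 0;
    for (id, row) in data.chunks_exact(dim).enumerate() {
        cmps += 1;
        heap.push(Candidate { score: inner_product(row, query), id });
        if heap.len() > k {
            heap.pop(); // evict the current worst candidate
        }
    }
    // Ascending order under the reversed `Ord` means highest score first.
    let ids = heap.into_sorted_vec().into_iter().map(|c| c.id).collect();
    (ids, cmps)
}
```

Every row is compared exactly once, which is why a flat scan over N vectors reports N comparisons per query regardless of k.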

Changes

  • New diskann-benchmark/src/flat benchmark module with:
    • InMemProvider wrapping FastMemoryVectorProviderAsync (cache-line-aligned vector storage) with identity DataProvider mapping
    • FlatScanStrategy and FlatVisitor implementing SearchStrategy/DistancesUnordered
    • FlatSearcher implementing the Search trait from benchmark_core
  • Input schema (diskann-benchmark/src/inputs/flat.rs) with dataset, distance metric, queries, groundtruth, k, thread counts, and reps
  • Integration test (flat_search_integration) and example input (diskann-benchmark/example/flat-index.json)
  • Performance test input for wikipedia-100K dataset (diskann-benchmark/perf_test_inputs/wikipedia-100K-flat-index.json)
  • Unrelated fix: renamed the temp file in run_integration_test from graph-index.json to input.json, since the helper is shared by all backends, not just graph-index

Follows established patterns from the graph-index and exhaustive backends.

Output example:

cargo run --package diskann-benchmark --release -- run --input-file diskann-benchmark\perf_test_inputs\wikipedia-100K-flat-index.json --output-file ./target/tmp/flat-index-output.json

######################
# Running Job 1 of 1 #
######################

              Data: target/tmp\wikipedia_cohere/wikipedia_base.bin.crop_nb_100000
         Data Type: float32
          Distance: inner_product
           Queries: target/tmp\wikipedia_cohere/wikipedia_query.bin
       Groundtruth: target/tmp\wikipedia_cohere/wikipedia-100K
                 K: 100
           Threads: 4, 8
              Reps: 1

Loading dataset...
  Loaded 100000 vectors of dimension 768
  Queries: 5000, Groundtruth: 5000x100


  K,   Avg cmps,   QPS - mean(max),             Avg Latency,           p99 Latency,               Recall,   Threads
===================================================================================================================
100,     100000,     123.0 (123.0),   32430.0us (32430.0us),   35711.0us (35711us),   0.9999899999999999,         4
100,     100000,     250.4 (250.4),   31909.2us (31909.2us),   33323.0us (33323us),   0.9999899999999999,         8
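
As a rough sanity check on the table above, the reported figures can be cross-checked with a few lines of arithmetic. The sketch below rests on assumptions that the output itself does not state: that each thread processes one query at a time (so throughput is roughly threads divided by mean latency), that float32 elements occupy 4 bytes, and that recall is computed as hits / (queries × k). The function names are hypothetical.

```rust
/// Throughput implied by the mean latency, assuming each thread runs one
/// query at a time (an assumption, not something the output states).
fn implied_qps(threads: u32, avg_latency_us: f64) -> f64 {
    f64::from(threads) / (avg_latency_us * 1e-6)
}

/// Bytes of vector data touched by one full scan of the dataset.
fn bytes_per_scan(num_vectors: u64, dim: u64, bytes_per_elem: u64) -> u64 {
    num_vectors * dim * bytes_per_elem
}

/// Number of ground-truth neighbors missed, assuming
/// recall = hits / (queries * k).
fn missed_neighbors(recall: f64, queries: u64, k: u64) -> u64 {
    ((1.0 - recall) * (queries * k) as f64).round() as u64
}
```

Plugging in the reported values: 4 threads at 32430.0 us gives about 123.3 QPS and 8 threads at 31909.2 us gives about 250.7 QPS, close to the reported 123.0 and 250.4. One scan reads 100000 × 768 × 4 = 307,200,000 bytes, and under the stated assumption a recall of 0.99999 over 5000 × 100 results corresponds to 5 missed neighbors.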

arrayka linked an issue Jun 15, 2026 that may be closed by this pull request
codecov-commenter commented Jun 15, 2026

Codecov Report

❌ Patch coverage is 95.80247% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.83%. Comparing base (3aa44ac) to head (dbf18a4).
⚠️ Report is 7 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| diskann-benchmark/src/flat/search.rs | 95.10% | 16 Missing ⚠️ |
| diskann-benchmark/src/inputs/flat.rs | 98.52% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1170      +/-   ##
==========================================
+ Coverage   89.46%   89.83%   +0.36%     
==========================================
  Files         487      491       +4     
  Lines       92170    93636    +1466     
==========================================
+ Hits        82460    84115    +1655     
+ Misses       9710     9521     -189     

| Flag | Coverage Δ |
|---|---|
| miri | 89.83% <95.80%> (+0.36%) ⬆️ |
| unittests | 89.49% <95.80%> (+0.37%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
|---|---|
| diskann-benchmark-core/src/lib.rs | 53.33% <ø> (ø) |
| diskann-benchmark-core/src/utils.rs | 100.00% <100.00%> (ø) |
| diskann-benchmark/src/flat/mod.rs | 100.00% <100.00%> (ø) |
| diskann-benchmark/src/inputs/mod.rs | 81.25% <ø> (ø) |
| diskann-benchmark/src/main.rs | 91.47% <100.00%> (+0.11%) ⬆️ |
| diskann-benchmark/src/inputs/flat.rs | 98.52% <98.52%> (ø) |
| diskann-benchmark/src/flat/search.rs | 95.10% <95.10%> (ø) |

... and 28 files with indirect coverage changes

arrayka changed the title from "Create benchmarks for the flat index (in-memory implementation)" to "Create benchmarks for the flat index (full-precision in-memory implementation)" Jun 19, 2026
arrayka marked this pull request as ready for review June 19, 2026 03:14
arrayka requested review from a team and Copilot June 19, 2026 03:14

Copilot AI left a comment

Contributor

Pull request overview

This PR adds a new end-to-end benchmark backend for diskann::flat::FlatIndex::knn_search, establishing a baseline for full-precision in-memory brute-force kNN performance (recall + latency) to support future flat-search optimizations.

Changes:

  • Registers a new flat-index benchmark backend and adds an integration test wired to a new example input.
  • Introduces a FlatSearch input schema and JSON examples/perf-test inputs for running flat-index benchmarks.
  • Implements an in-memory DataProvider + SearchStrategy that performs a full sequential scan using FastMemoryVectorProviderAsync.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| diskann-benchmark/src/main.rs | Registers the new flat backend; adds a flat-search integration test; renames the temp input file used by the shared integration helper. |
| diskann-benchmark/src/inputs/mod.rs | Exposes the new flat benchmark input module. |
| diskann-benchmark/src/inputs/flat.rs | Adds FlatSearch / SearchPhase input schema, validation hooks, and display formatting. |
| diskann-benchmark/src/flat/mod.rs | Adds the flat benchmark module entry point and registration function. |
| diskann-benchmark/src/flat/search.rs | Implements the flat benchmark backend (provider + scan strategy + search runner + aggregation + formatting). |
| diskann-benchmark/perf_test_inputs/wikipedia-100K-flat-index.json | Adds a perf-test job configuration for wikipedia-100K flat search. |
| diskann-benchmark/example/flat-index.json | Adds an example job configuration used by the new integration test. |

arrayka and others added 2 commits June 18, 2026 21:57
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Comment thread diskann-benchmark/src/flat/search.rs Outdated
.map(|r| recall.num_queries as f64 / r.end_to_end_latency().as_seconds())
.collect();

let mean_cmps = {

Contributor

nit: The mean_cmps fold here looks identical to average_all in diskann-benchmark-core. It's worth considering promoting average_all to pub and reusing it.

Contributor Author

Done
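
For readers without the diff at hand, the kind of fold being discussed can be sketched as a small reusable helper. This is an illustration only: the real `average_all` in diskann-benchmark-core is not shown in this thread, so the name `mean_of`, its signature, and the choice to return 0.0 for empty input are all assumptions.

```rust
/// Hypothetical mean helper; not the actual `average_all` signature.
/// Folds the input once, tracking the running sum and element count.
fn mean_of<I>(values: I) -> f64
where
    I: IntoIterator<Item = f64>,
{
    let (sum, count) = values
        .into_iter()
        .fold((0.0_f64, 0_usize), |(s, n), v| (s + v, n + 1));
    // Returning 0.0 for an empty input is an assumption made for this sketch.
    if count == 0 {
        0.0
    } else {
        sum / count as f64
    }
}
```

Sharing one such helper keeps the averaging behavior (including the empty-input case) identical across benchmark backends.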

@arkrishn94 arkrishn94 left a comment

Contributor

Thanks for working on this, Alex. I have just one non-minor comment: we should be using a provider, visitor and DistancesUnordered implementation as is in the benchmark. Maybe this PR can include the implementation of these for the full-precision provider?

Comment on lines +61 to +63
struct InMemProvider<T: VectorRepr> {
    data: FastMemoryVectorProviderAsync<T>,
}

Contributor

Can we use the full precision in-memory provider here instead?

The full-precision in-memory provider is an instantiation of DefaultProvider, and DataProvider is implemented for it.

You'll need to implement the right traits for it to be usable as a flat index, but I think that's the right move here.

Contributor

I'd tentatively recommend avoiding the constructs in diskann-providers, given the safety issue associated with them. Is there a reason not to use Matrix for this simple store initially?

Contributor Author

@hildebrandmw, this benchmark originally used Matrix<T>.

Aditya and I agreed that, as an end-to-end benchmark, it should compare FlatIndex against a real data provider rather than an in-memory matrix.

The goal is to measure FlatIndex in a production-like setup, similar to how we benchmark disk-based indexes, so that the results predict production performance.

Contributor Author

Let's use FastMemoryVectorProviderAsync, as proposed in this PR.

It looks like the regular MemoryVectorProviderAsync isn't accessible outside the diskann-providers crate.

We should continue the discussion on where the Flat trait implementations for existing data providers should live and come to an agreement there.

Comment on lines +241 to +244
/// The visitor that iterates over all vectors in the provider.
struct FlatVisitor<'a, T: VectorRepr> {
    data: &'a FastMemoryVectorProviderAsync<T>,
}

Contributor

Same comment as above: we should have a visitor implementation for FastMemoryVectorProviderAsync, and it should live alongside that type's definition.
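
To make the suggestion concrete, here is a minimal sketch of a visitor implemented next to the storage type it walks. Everything in it is hypothetical: `VisitVectors` and `DenseStore` are stand-ins invented for illustration and are not DiskANN's traits or providers, and the real SearchStrategy/DistancesUnordered interfaces may be shaped quite differently.

```rust
/// Hypothetical stand-in for a provider-side visitor trait
/// (not DiskANN's actual API).
trait VisitVectors {
    type Elem;

    /// Calls `f(id, vector)` once for every stored vector, in id order.
    fn visit_all<F: FnMut(usize, &[Self::Elem])>(&self, f: F);
}

/// Hypothetical dense row-major store standing in for a real provider.
struct DenseStore {
    data: Vec<f32>,
    dim: usize,
}

impl DenseStore {
    fn new(data: Vec<f32>, dim: usize) -> Self {
        assert!(dim > 0 && data.len() % dim == 0, "data must hold whole rows");
        Self { data, dim }
    }
}

// The impl sits beside the type definition, so it can read the private
// fields directly and callers never need to know the storage layout.
impl VisitVectors for DenseStore {
    type Elem = f32;

    fn visit_all<F: FnMut(usize, &[f32])>(&self, mut f: F) {
        for (id, row) in self.data.chunks_exact(self.dim).enumerate() {
            f(id, row);
        }
    }
}
```

With this layout a benchmark only depends on the trait, so swapping the underlying store would not require touching the benchmark's scan logic.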

Comment thread diskann-benchmark/src/flat/search.rs Outdated
//////////////////////////////////////////

/// Wraps a [`FlatIndex`] and queries to implement the [`Search`] trait from benchmark_core.
struct FlatSearcher<T: VectorRepr> {

Contributor

nit: do we need the Flat prefix on all the structs here? Can we use the module path to distinguish them?

Contributor Author

Done

Contributor

For my understanding - what is this input provider for? I'm guessing you'll want to create a benchmark for it in CI?

Contributor Author

Yes, it's the wikipedia-100K config, intended to seed a CI flat-index benchmark later (mirroring the existing disk-index perf inputs). It is an actual benchmark configuration using a real dataset, not a sample configuration.

Development

Successfully merging this pull request may close these issues.

Create benchmarks for the flat index (in-memory implementation)

6 participants