Create benchmarks for the flat index (full-precision in-memory implementation) #1170
arrayka wants to merge 13 commits into
# Conflicts: # diskann-benchmark/src/backend/mod.rs
Codecov Report❌ Patch coverage is
Additional details and impacted files

Coverage Diff

| | main | #1170 | +/- |
|---|---|---|---|
| Coverage | 89.46% | 89.83% | +0.36% |
| Files | 487 | 491 | +4 |
| Lines | 92170 | 93636 | +1466 |
| Hits | 82460 | 84115 | +1655 |
| Misses | 9710 | 9521 | -189 |
Pull request overview
This PR adds a new end-to-end benchmark backend for diskann::flat::FlatIndex::knn_search, establishing a baseline for full-precision in-memory brute-force kNN performance (recall + latency) to support future flat-search optimizations.
Changes:
- Registers a new `flat-index` benchmark backend and adds an integration test wired to a new example input.
- Introduces a `FlatSearch` input schema and JSON examples/perf-test inputs for running flat-index benchmarks.
- Implements an in-memory `DataProvider` + `SearchStrategy` that performs a full sequential scan using `FastMemoryVectorProviderAsync`.
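For readers unfamiliar with what a "full sequential scan" measures, the following is a minimal, standalone sketch of brute-force kNN plus recall@k. It does not use the DiskANN API: `Neighbor`, `squared_l2`, `knn_scan`, and `recall_at_k` are hypothetical names, and squared L2 is assumed as the metric purely for illustration.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Candidate neighbor. Ordered by distance so a max-heap keeps the worst
/// retained candidate at its root. (Hypothetical type, not a DiskANN type.)
#[derive(Debug, Clone, Copy)]
struct Neighbor {
    id: usize,
    dist: f32,
}

impl Ord for Neighbor {
    fn cmp(&self, other: &Self) -> Ordering {
        // total_cmp gives a total order on f32; ties are broken by id.
        self.dist.total_cmp(&other.dist).then(self.id.cmp(&other.id))
    }
}

impl PartialOrd for Neighbor {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for Neighbor {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for Neighbor {}

/// Squared Euclidean distance (assumed metric for this sketch).
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Scans every vector once and keeps the k closest, nearest first.
fn knn_scan(data: &[Vec<f32>], query: &[f32], k: usize) -> Vec<Neighbor> {
    let mut heap: BinaryHeap<Neighbor> = BinaryHeap::new();
    for (id, v) in data.iter().enumerate() {
        heap.push(Neighbor { id, dist: squared_l2(v, query) });
        if heap.len() > k {
            heap.pop(); // evict the current worst candidate
        }
    }
    heap.into_sorted_vec() // ascending by distance
}

/// Fraction of the first k results that appear in the first k groundtruth ids.
fn recall_at_k(found: &[Neighbor], groundtruth: &[usize], k: usize) -> f64 {
    if k == 0 {
        return 0.0;
    }
    let truth = &groundtruth[..k.min(groundtruth.len())];
    let hits = found.iter().take(k).filter(|n| truth.contains(&n.id)).count();
    hits as f64 / k as f64
}
```

Because the scan touches every vector, recall against exact groundtruth should be 1.0 up to distance ties, which makes this kind of benchmark mainly a latency and throughput baseline.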
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| diskann-benchmark/src/main.rs | Registers the new flat backend; adds a flat-search integration test; renames the temp input file used by the shared integration helper. |
| diskann-benchmark/src/inputs/mod.rs | Exposes the new flat benchmark input module. |
| diskann-benchmark/src/inputs/flat.rs | Adds FlatSearch / SearchPhase input schema, validation hooks, and display formatting. |
| diskann-benchmark/src/flat/mod.rs | Adds the flat benchmark module entry point and registration function. |
| diskann-benchmark/src/flat/search.rs | Implements the flat benchmark backend (provider + scan strategy + search runner + aggregation + formatting). |
| diskann-benchmark/perf_test_inputs/wikipedia-100K-flat-index.json | Adds a perf-test job configuration for wikipedia-100K flat search. |
| diskann-benchmark/example/flat-index.json | Adds an example job configuration used by the new integration test. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
    .map(|r| recall.num_queries as f64 / r.end_to_end_latency().as_seconds())
    .collect();

    let mean_cmps = {
nit: The `mean_cmps` fold here looks identical to `average_all` in diskann-benchmark-core.
Worth considering promoting `average_all` to `pub` and reusing it.
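To illustrate the kind of helper being suggested, here is a minimal sketch of a shared mean function alongside the QPS computation from the snippet above. `mean_of` and `queries_per_second` are hypothetical stand-ins: the real `average_all` signature in diskann-benchmark-core is not shown in this thread and may differ, and `std::time::Duration` is assumed in place of the benchmark's own latency type.

```rust
use std::time::Duration;

/// Mean of a sequence of f64 values, or None when the sequence is empty.
/// Hypothetical stand-in for a shared `average_all`-style helper.
fn mean_of<I>(values: I) -> Option<f64>
where
    I: IntoIterator<Item = f64>,
{
    // Accumulate the sum and the element count in a single pass.
    let (sum, count) = values
        .into_iter()
        .fold((0.0_f64, 0_usize), |(s, n), v| (s + v, n + 1));
    if count == 0 {
        None
    } else {
        Some(sum / count as f64)
    }
}

/// Queries per second for one run: query count over end-to-end latency.
fn queries_per_second(num_queries: usize, end_to_end: Duration) -> f64 {
    num_queries as f64 / end_to_end.as_secs_f64()
}
```

Returning `Option` keeps the empty-input case explicit instead of silently producing NaN from a division by zero.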
arkrishn94 left a comment
Thanks for working on this Alex. Just have one non-minor comment - we should be using a provider, visitor and DistancesUnordered implementation as is in the benchmark. Maybe this PR can include the implementation of these for the full precision provider?
    struct InMemProvider<T: VectorRepr> {
        data: FastMemoryVectorProviderAsync<T>,
    }
Can we use the full precision in-memory provider here instead?
The full precision in memory provider is an instantiation of the DefaultProvider and DataProvider is implemented for it.
You'll need to implement the right traits for it to be usable as a flat index, but I think that's the right move here...
I'd tentatively recommend avoiding using the constructs in diskann-providers given the safety issue associated with it. Is there a reason to not use Matrix for this simple store initially?
@hildebrandmw, this benchmark originally used Matrix<T>.
Aditya and I agreed that, as an end-to-end benchmark, it should compare FlatIndex against a real data provider rather than an in-memory matrix.
The goal is to predict FlatIndex performance in a production-ready setup, similar to how we benchmark disk-based indexes.
Let's use FastMemoryVectorProviderAsync, as proposed in this PR.
It looks like the regular MemoryVectorProviderAsync isn't accessible outside the diskann-providers crate.
We should continue the discussion on where the Flat trait implementations for existing data providers should live and come to an agreement there.
    /// The visitor that iterates over all vectors in the provider.
    struct FlatVisitor<'a, T: VectorRepr> {
        data: &'a FastMemoryVectorProviderAsync<T>,
    }
Same comment as above: we should have a visitor implementation for FastMemoryVectorProviderAsync, and it should live alongside the provider's definition.
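As a rough illustration of keeping the visitor next to the store it iterates, here is a standalone sketch. `VectorStore`, `Visitor`, and this `distances_unordered` method are hypothetical and deliberately simplified (synchronous, `f32` only); they are not the diskann-providers types or the actual `DistancesUnordered` trait.

```rust
/// Hypothetical in-memory store: row-major vectors of equal dimension.
pub struct VectorStore {
    dim: usize,
    data: Vec<f32>,
}

impl VectorStore {
    /// Returns None if `dim` is zero or `data` is not a whole number of rows.
    pub fn new(dim: usize, data: Vec<f32>) -> Option<Self> {
        if dim > 0 && data.len() % dim == 0 {
            Some(Self { dim, data })
        } else {
            None
        }
    }

    /// The visitor is defined with the store, so each flat-scan caller
    /// does not have to re-implement iteration over its internals.
    pub fn visitor(&self) -> Visitor<'_> {
        Visitor { store: self }
    }
}

/// Borrowing view that walks every vector in the store.
pub struct Visitor<'a> {
    store: &'a VectorStore,
}

impl<'a> Visitor<'a> {
    /// Calls `sink(id, distance)` once per stored vector, in storage order.
    /// Callers must not rely on any distance ordering.
    pub fn distances_unordered<D, F>(&self, query: &[f32], distance: D, mut sink: F)
    where
        D: Fn(&[f32], &[f32]) -> f32,
        F: FnMut(usize, f32),
    {
        let rows = self.store.data.chunks_exact(self.store.dim);
        for (id, row) in rows.enumerate() {
            sink(id, distance(row, query));
        }
    }
}
```

With this layout a benchmark only supplies the query, the distance function, and a sink such as a top-k heap, while iteration details stay private to the module that owns the storage.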
    //////////////////////////////////////////

    /// Wraps a [`FlatIndex`] and queries to implement the [`Search`] trait from benchmark_core.
    struct FlatSearcher<T: VectorRepr> {
nit: do we need the Flat prefix over all the structs here? Can we use the module path to distinguish?
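A small sketch of what relying on the module path could look like. The module and type names below are illustrative only and do not mirror the actual layout of diskann-benchmark.

```rust
mod flat {
    /// Inside `flat`, the path already says "flat", so no prefix is needed.
    pub struct Searcher {
        pub k: usize,
    }

    impl Searcher {
        pub fn describe(&self) -> String {
            format!("flat::Searcher(k={})", self.k)
        }
    }
}

mod graph {
    /// Same short name, disambiguated at the call site by its module.
    pub struct Searcher {
        pub k: usize,
    }

    impl Searcher {
        pub fn describe(&self) -> String {
            format!("graph::Searcher(k={})", self.k)
        }
    }
}
```

Call sites then read `flat::Searcher` and `graph::Searcher`, which carries the same information as a `FlatSearcher` name without repeating the prefix on every type in the module.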
For my understanding - what is this input provider for? I'm guessing you'll want to create a benchmark for it in CI?
yes, it's the wikipedia-100K config intended to seed a CI flat-index benchmark later (mirroring the existing disk-index perf inputs). This is an actual benchmark configuration using a real dataset - not a sample configuration.
Establishes a baseline benchmark for `FlatIndex::knn_search`, enabling evaluation of new capabilities and innovations in DiskANN's flat-scan implementation.

Why Add Flat Index Benchmarks if Exhaustive Exists?

Exhaustive benchmarks are lightweight benchmarks focused on quantizer speed and accuracy: they measure compress + distance performance in isolation, without involving the `FlatIndex` machinery or any data provider. They operate entirely outside the DiskANN provider/index abstraction.

Flat benchmarks are end-to-end benchmarks that exercise `FlatIndex` with different strategy configurations. They exercise `FlatIndex` through the actual provider and search strategy abstractions, so they capture realistic runtime behavior and integration overhead that exhaustive benchmarks intentionally skip.

This PR enables full-precision in-memory flat search. Future work may add:
Changes
- `diskann-benchmark/src/flat` benchmark module with:
  - `InMemProvider` wrapping `FastMemoryVectorProviderAsync` (cache-line-aligned vector storage) with identity `DataProvider` mapping
  - `FlatScanStrategy` and `FlatVisitor` implementing `SearchStrategy` / `DistancesUnordered`
  - `FlatSearcher` implementing the `Search` trait from `benchmark_core`
- Input schema (`diskann-benchmark/src/inputs/flat.rs`) with dataset, distance metric, queries, groundtruth, k, thread counts, and reps
- Integration test (`flat_search_integration`) and example input (`diskann-benchmark/example/flat-index.json`)
- Perf-test input (`diskann-benchmark/perf_test_inputs/wikipedia-100K-flat-index.json`)
- Renamed the temp input file in `run_integration_test` from `graph-index.json` to `input.json` - the helper is shared by all backends, not just graph-index

Follows established patterns from the graph-index and exhaustive backends.
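To make the listed input fields concrete, here is a minimal sketch of an input struct with a validation step. The struct name, field names, types, and validation rules are all assumptions for illustration; the actual `FlatSearch` schema in `diskann-benchmark/src/inputs/flat.rs` is not reproduced here and may differ.

```rust
/// Hypothetical mirror of the fields listed above (not the real schema).
#[derive(Debug, Clone, PartialEq)]
pub struct FlatSearchInput {
    pub dataset: String,
    pub distance: String,
    pub queries: String,
    pub groundtruth: String,
    pub k: usize,
    pub num_threads: Vec<usize>,
    pub reps: usize,
}

impl FlatSearchInput {
    /// Rejects configurations that could not produce a meaningful run.
    /// The specific rules are assumptions made for this sketch.
    pub fn validate(&self) -> Result<(), String> {
        let paths = [
            ("dataset", &self.dataset),
            ("queries", &self.queries),
            ("groundtruth", &self.groundtruth),
        ];
        for (name, path) in paths {
            if path.is_empty() {
                return Err(format!("{name} path must not be empty"));
            }
        }
        if self.k == 0 {
            return Err("k must be at least 1".to_string());
        }
        if self.reps == 0 {
            return Err("reps must be at least 1".to_string());
        }
        if self.num_threads.is_empty() || self.num_threads.contains(&0) {
            return Err("num_threads must list one or more positive counts".to_string());
        }
        Ok(())
    }
}
```

Validating up front lets a misconfigured job fail before any dataset is loaded, which matters for perf-test inputs that point at large files.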
Output example: