perf(dex_solana.trades test): bound accepted_range scan to a recent window #9779

Open
a-monteiro wants to merge 1 commit into main from andre/cur2-2800-bound-dex-solana-accepted-range

@a-monteiro a-monteiro commented Jun 13, 2026

Member

The dbt_utils.accepted_range test on dex_solana.trades.amount_usd ran with no where config, so every run full-scanned the entire table (17.2B rows / 236.8 GB, ~1.34M cpu_ms, ~62s wall) just to flag trades above $1b. With 4 heavy runs per day, that is ~0.95 TB of IO and ~1.49 CPU-hrs per day spent purely re-validating immutable history.

The test exists to catch outlier amount_usd as data lands, so bounding it to a recent window matches its intent. This adds a config.where of block_time >= now() - interval '7' day. block_time is a regular column with Delta min/max stats and block_month is the partition key, so the bound prunes to recent files (pushdown confirmed live).

Measured on prod data (spellbook-hourly, faithful full-column scan, 3 warm runs, medians):

| axis | full history (prod) | 7-day bound | factor |
|---|---|---|---|
| IO | 236.8 GB | 13.7 GB | 17.2x |
| rows scanned | 17.20B | 100.3M | ~171x |
| cpu_ms | 1,338,645 | 85,556 | 15.6x |
| wall | 62.3s | 4.8s | ~13x |

Per day (4 heavy runs): ~947 → ~55 GB IO and ~1.49 → ~0.10 CPU-hrs.
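For reviewers who want to re-derive the per-day figures, here is a minimal Python sketch of the arithmetic. It uses only the per-run medians quoted above and the stated 4 heavy runs per day; the names `RUNS_PER_DAY` and `per_day` are illustrative and not part of the repo.

```python
# Per-run medians copied from the measurements above.
RUNS_PER_DAY = 4  # "4 heavy runs" per day, as stated in this PR

full = {"io_gb": 236.8, "cpu_ms": 1_338_645}
bounded = {"io_gb": 13.7, "cpu_ms": 85_556}


def per_day(run):
    """Scale one run's cost to a day: (GB of IO, CPU-hours)."""
    io_gb = run["io_gb"] * RUNS_PER_DAY
    cpu_hours = run["cpu_ms"] * RUNS_PER_DAY / 1000 / 3600  # ms -> s -> h
    return io_gb, cpu_hours


full_io, full_cpu = per_day(full)
bounded_io, bounded_cpu = per_day(bounded)

print(f"IO:  {full_io:.0f} GB/day -> {bounded_io:.0f} GB/day")
print(f"CPU: {full_cpu:.2f} h/day -> {bounded_cpu:.2f} h/day")
```

This reproduces the rounded values in the line above (~947 to ~55 GB and ~1.49 to ~0.10 CPU-hrs).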

No equivalence proof here on purpose: this changes the test's coverage from all history to the recent window. An outlier in a row whose block_time is more than 7 days old (e.g. one written by a late backfill) would no longer be flagged. I'd like reviewer sign-off on the window length; happy to bump to 30d for a wider safety net, since a 30d window would still scan far less than full history.

Compiled test confirms the bound wraps the relation:

from (select * from dex_solana.trades where block_time >= now() - interval '7' day) dbt_subquery
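To show where that compiled fragment comes from, here is a small Python sketch of the substitution dbt applies when a test has a where config: the tested relation is swapped for a filtered subquery aliased `dbt_subquery`. The helper `wrap_with_where` is hypothetical and written only for illustration; it is not dbt's own code, which does this in a Jinja macro.

```python
from typing import Optional


def wrap_with_where(relation: str, where: Optional[str]) -> str:
    """Return what the test selects from, mimicking dbt's `where` config.

    Hypothetical helper for illustration only.
    """
    if not where:
        # No where config: the test reads the relation directly (full scan).
        return relation
    # With a where config: the relation becomes a filtered subquery.
    return f"(select * from {relation} where {where}) dbt_subquery"


bounded = wrap_with_where(
    "dex_solana.trades",
    "block_time >= now() - interval '7' day",
)
print(f"from {bounded}")
```

Because the filter sits inside the subquery, the engine can push the block_time predicate down to the scan, which is what makes the file pruning described above possible.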

Fixes CUR2-2800

…indow

The dbt_utils.accepted_range test on amount_usd full-scanned all history
(17.2B rows / 236 GB) on every run. Bound it to the last 7 days via a where
config: the test exists to flag outlier amount_usd as data lands, so a recent
window matches intent. Cuts the test scan ~17x IO and ~16x CPU.
@github-actions github-actions Bot added WIP work in progress dbt: solana covers the Solana dbt subproject labels Jun 13, 2026

@a-monteiro a-monteiro marked this pull request as ready for review June 13, 2026 18:43
@a-monteiro a-monteiro requested a review from a team June 13, 2026 18:44
@github-actions github-actions Bot added ready-for-review this PR development is complete, please review and removed WIP work in progress labels Jun 13, 2026

cursor Bot commented Jun 13, 2026

PR Summary

Low Risk
Test-only change that narrows data-quality coverage to the last seven days; no model SQL or production pipeline logic is modified.

Overview
Limits the dbt_utils.accepted_range check on dex_solana.trades.amount_usd so it only evaluates rows with block_time >= now() - interval '7' day, instead of full-table scans on every run.

The $1b cap is unchanged; only test scope and cost change. Outliers in amount_usd are still caught for recent ingests, but trades older than seven days (e.g. late backfills) would no longer fail this test.

Reviewed by Cursor Bugbot for commit e723d21.
