Skip to content

fix(lore-0059): rollup timestamp-shadowing fix + full-chain version-propagation test#61

Merged
karczuRF merged 3 commits into
developfrom
feat/0059_mv-rollup-fullchain-integration-test
Jun 26, 2026
Merged

fix(lore-0059): rollup timestamp-shadowing fix + full-chain version-propagation test#61
karczuRF merged 3 commits into
developfrom
feat/0059_mv-rollup-fullchain-integration-test

Conversation

@karczuRF

@karczuRF karczuRF commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

Task

lore-0059 — MV rollup-chain version propagation under enriched _1m re-inserts. Unblocked once 0051 landed the production MV DDL.

What this does

Closes 0059's remaining ACs by verifying the real shipped rollup chain against a live ClickHouse, and fixes a correctness bug the verification surfaced.

New: full-chain integration test

packages/prices-clickhouse/tests/rollup_chain_it.rs (#[ignore], runs against the pinned docker CH):

  • Drives the real rollups.sql chain end-to-end _1m → _15m → _1h → _4h → _1d → _1w → _1M via SYSTEM REFRESH VIEW + poll.
  • Proves an enrichment re-INSERT (version+1, volume_quote_usd filled) propagates to every grain with no under-count and no double-count (volume_base stays 30; volume_quote_usd 0 → 300; version 1 → 2 wins at every grain).
  • Adds a preroll.sql full-range pass asserting the same OHLC correctness.

Bug fixed: argMin/argMax timestamp shadowing

The as-shipped rollups.sql and preroll.sql alias the bucket key toStartOfInterval(timestamp, …) AS timestamp, which shadows the source timestamp column. The bare timestamp inside argMin(open) / argMax(close) / argMax(close_usd) then resolved to the constant bucket-start, so open/close/close_usd tie-broke to an arbitrary row instead of the true first/last by time (volumes, high, low were unaffected). Fixed by qualifying the source column via FROM … AS t + t.timestamp.

  • current.sql was already correct (AS c + c.timestamp).
  • Schema-overview §3.2 reference DDL updated to match.

Review follow-ups (post-review)

Two issues raised in code review, both addressed in this branch:

  • Pin local/CI ClickHouse to the EXACT prod version. The test was being validated against clickhouse-server:25.6 (25.6.13.41) while ch-prod-01 runs 26.3.10.60 — a 25.x → 26.x major gap, so "green locally" did not cover the engine the shipped DDL actually runs on. docker-compose.yml is now pinned to clickhouse/clickhouse-server:26.3.10.60; the full prices-clickhouse suite (rollup_chain, current_mv, views, lib) was re-verified green against it. Policy going forward: local/CI matches the live ch-prod-01 version — re-check SELECT version() on prod before bumping the pin. This complements, does not replace the live re-verify owed by 0071.
  • Deterministic test anchor. insert_bucket re-evaluated toStartOfInterval(now(), …) per INSERT, so a wall-clock 15-minute boundary crossing between the un-enriched and enriched batches would anchor them to different buckets → broken ReplacingMergeTree dedup and a flaky single-bucket assertion. bucket_anchor() now freezes the boundary once as an integer epoch (toUInt64(toUnixTimestamp(…))toDateTime(<n>)), reused across every INSERT and robust across CH versions/timezones (avoids formatDateTime specifier portability).

⚠️ Live follow-up — 0071

The buggy DDL is already applied on ch-prod-01 (under 0051, 2026-06-22). Spawned lore-0071 to re-create the six live MVs from the corrected DDL + recompute mis-rolled buckets. Not done here (prepare-not-deploy).

Testing

docker compose up -d clickhouse                                       # CH 26.3.10.60 (matches ch-prod-01)
cargo test -p prices-clickhouse --test rollup_chain_it -- --ignored   # 2 passed
cargo test -p prices-clickhouse --test current_mv_it --test views_it -- --ignored   # 3 passed
cargo test -p prices-clickhouse --lib                                  # 5 passed (incl. statement-count guards)
cargo clippy -p prices-clickhouse --tests                             # clean

karczuRF added 3 commits June 26, 2026 13:10
…n test

The shipped rollups.sql and preroll.sql alias the bucket key
`toStartOfInterval(timestamp, …) AS timestamp`, which shadows the source
`timestamp` column. The bare `timestamp` inside argMin(open)/argMax(close)/
argMax(close_usd) then resolves to the constant bucket-start, so open/close/
close_usd tie-break to an arbitrary row instead of the true first/last by time
(volumes, high, low were unaffected). Qualify the source column via FROM … AS t.

Add tests/rollup_chain_it.rs: a full _1m → … → _1M integration test that drives
the real refreshable-MV chain end-to-end and proves enrichment re-inserts
propagate with no under/double-count and a winning version at every grain, plus
a preroll.sql full-range pass. The fix is what makes the OHLC assertions green.

current.sql was already correct (FROM … AS c + c.timestamp). Schema-overview
§3.2 reference DDL updated to match.
Mark 0059's remaining ACs proven against the real 0051 DDL, document the
argMin/argMax timestamp-shadowing finding (Emerged), and spawn 0071 to re-apply
the corrected MV/preroll DDL to the live ch-prod-01 cluster (the buggy DDL was
applied under 0051).
…nchor

The full-chain rollup test ran against docker CH 25.6 while ch-prod-01
runs 26.3.10.60 (25.x -> 26.x gap), so "green locally" did not cover the
engine the shipped DDL actually runs on. Pin docker-compose to the exact
prod version 26.3.10.60; full prices-clickhouse suite re-verified green.

Also make insert_bucket deterministic: it re-evaluated
toStartOfInterval(now()) per INSERT, so a wall-clock 15-minute boundary
crossing between the un-enriched and enriched batches anchored them to
different buckets, breaking the ReplacingMergeTree dedup and the
single-bucket assertions. bucket_anchor() now freezes the boundary once
as an integer epoch (toUInt64(toUnixTimestamp(...)) -> toDateTime(<n>)),
reused across every INSERT and robust across CH versions/timezones.
@karczuRF

Copy link
Copy Markdown
Collaborator Author

✅ Full suite verified against the prod ClickHouse version

Re-ran the prices-clickhouse suite against a local CH pinned to 26.3.10.60 — the exact version running on ch-prod-01 (confirmed SELECT version()26.3.10.60). Working tree was clean, so this is the committed branch state.

Check Result
--lib unit tests (statement-count guards) 5 passed
rollup_chain_it (--ignored) 2 passed
current_mv_it (--ignored) 1 passed
views_it (--ignored) 2 passed
clippy --tests clean
rollup_chain_it repeated back-to-back 6/6 passed

Notes:

  • No regression — every test green on 25.6 is green on 26.3.10.60.
  • Determinism fix validated on prod version: the anchor SQL (toUInt64(toUnixTimestamp(toStartOfInterval(now(), INTERVAL 15 MINUTE))) frozen once → rebuilt as toDateTime(<epoch>)) round-trips correctly and yields 3 distinct per-minute rows (arrayUniq = 3) — the premise the ReplacingMergeTree dedup / single-bucket assertions rely on. The 6× repeat specifically stresses the old 15-minute-boundary flakiness.

Caveat: this validates the test + its SQL on the prod engine version, not the shipped rollup DDL applied to the live ch-prod-01 cluster — that live re-apply + verify remains 0071's job.

@karczuRF karczuRF merged commit e216ac4 into develop Jun 26, 2026
3 checks passed
@karczuRF karczuRF deleted the feat/0059_mv-rollup-fullchain-integration-test branch June 26, 2026 13:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant