Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion docker-compose.yml
Original file line number Diff line number Diff line change
@@ -1,6 +1,10 @@
services:
clickhouse:
image: clickhouse/clickhouse-server:25.6
# Pinned to the EXACT version running on ch-prod-01 (Hetzner) so local/CI
# tests exercise the same engine as production — refreshable-MV semantics,
# alias resolution and date/time functions can differ across CH releases.
# Before changing this, re-check the live `SELECT version()` on ch-prod-01.
image: clickhouse/clickhouse-server:26.3.10.60
ports:
- '8123:8123'
- '9000:9000'
Expand Down
21 changes: 15 additions & 6 deletions docs/database-schema/database-schema-overview.md
Original file line number Diff line number Diff line change
Expand Up @@ -491,22 +491,22 @@ CREATE MATERIALIZED VIEW prices.mv_ohlcv_1m_to_15m
REFRESH EVERY 1 MINUTE -- coarser grains refresh less often
TO prices.price_ohlcv_15m AS
SELECT
toStartOfInterval(timestamp, INTERVAL 15 MINUTE) AS timestamp,
toStartOfInterval(t.timestamp, INTERVAL 15 MINUTE) AS timestamp,
asset_id,
quote_asset_id,
source,
argMin(open, timestamp) AS open,
max(high) AS high,
argMin(open, t.timestamp) AS open, -- qualified: the AS-timestamp
max(high) AS high, -- alias shadows the source column
min(low) AS low,
argMax(close, timestamp) AS close,
argMax(close, t.timestamp) AS close,
sum(volume_base) AS volume_base,
sum(volume_quote) AS volume_quote,
sum(volume_quote_usd) AS volume_quote_usd,
volume_quote_usd / nullIf(volume_base, 0) AS vwap, -- ref aliases, never re-sum(…)
sum(trade_count) AS trade_count,
max(version) AS version
FROM prices.price_ohlcv_1m FINAL -- post-dedup, post-enrichment
WHERE timestamp >= now() - INTERVAL 2 HOUR -- bounded re-scan; widen for coarse grains
FROM prices.price_ohlcv_1m AS t FINAL -- post-dedup, post-enrichment
WHERE t.timestamp >= now() - INTERVAL 2 HOUR -- bounded re-scan; widen for coarse grains
GROUP BY timestamp, asset_id, quote_asset_id, source;

-- Repeat for 15m → 1h, 1h → 4h, 4h → 1d, 1d → 1w, 1w → 1M — each FROM the
Expand All @@ -525,6 +525,15 @@ volume_base, 0)`), never `sum(…)/sum(…)` — re-summing an aliased column ne
an early minute leaves the bucket max unchanged, tying the stale and corrected
rollup rows; project a strictly-increasing version (`sum(version)` or a
refresh epoch) there.
- **Qualify the bucket-time argument.** The bucket key is aliased `AS timestamp`
to land in the target's `timestamp` column, but that alias **shadows** the
source `timestamp` column. `argMin(open, …)` / `argMax(close, …)` /
`argMax(close_usd, …)` must therefore reference the **qualified** source column
`t.timestamp` (`FROM … AS t`). With the bare `timestamp` they read the
constant bucket-start, so open/close/close_usd tie-break to an arbitrary row in
the bucket instead of the true first/last by time. The 0059 full-chain
integration test (`prices-clickhouse/tests/rollup_chain_it.rs`) caught this in
the as-shipped `rollups.sql` / `preroll.sql`; both are fixed.

Refreshable MVs require ClickHouse ≥ 23.12; the exact mechanism (refreshable MV
vs. scheduled re-aggregate) is finalised in task **0051** against the Hetzner
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -93,6 +93,25 @@ history:
this task verifies now exists. Moving back to active to land the
remaining ACs: the full _15m…_1M chain integration test +
extending the proof harness against the real rollups.sql DDL.
- date: 2026-06-26
status: active
who: oski
note: >
**Remaining ACs delivered + a real bug fixed.** Authored
packages/prices-clickhouse/tests/rollup_chain_it.rs — a full-chain
(_1m → _15m → … → _1M) integration test driving the REAL shipped
rollups.sql, plus a preroll.sql full-range pass; both green vs docker
CH 25.6. The test surfaced a correctness bug in the as-shipped
rollups.sql AND preroll.sql: `toStartOfInterval(timestamp,…) AS
timestamp` shadows the source column, so argMin(open)/argMax(close)/
argMax(close_usd) tie-break to an arbitrary row (volumes/high/low were
fine). Fixed both (FROM … AS t + qualified t.timestamp) and the
schema-overview §3.2 reference DDL; current.sql was already correct.
The buggy DDL is live on ch-prod-01 (applied under 0051) → spawned
0071 to re-apply. All four ACs now [x]. Verified the shipped
max(version) projection is correct for the true-refreshable replace-mode
chain that 0051 actually shipped (differs from the G-note's APPEND/
sum(version) lock-in).
---

# MV rollup-chain version propagation under enriched `_1m` re-inserts
Expand Down Expand Up @@ -172,14 +191,96 @@ contract; also implies `_1m` retention ≥ widest rollup refresh window.

## Acceptance Criteria

- [~] MV chain projects a `version` that lets an enriched `_1m`
re-insert win at every rolled-up granularity — **semantics decided +
proven** (`sum(version)`/refresh-epoch, not `max(version)`); production
DDL lands in 0051
- [~] No double-count / under-count of `volume_quote_usd` in `_15m … _1M`
after an enrichment pass — **proven on `_1m → _15m`** (re-aggregate from
`_1m FINAL`); full chain `_15m … _1M` verified once 0051 lands the DDL
- [~] Integration test covering write → roll up → enrich → assert across
all granularities (`FINAL`) — **proof harness exists** (`proof/`, 1 hop);
extend to all grains against the real 0051 DDL
- [x] MV chain projects a `version` that lets an enriched `_1m`
re-insert win at every rolled-up granularity — **proven against the real
0051 DDL** across the full `_1m → _15m → … → _1M` chain
(`tests/rollup_chain_it.rs`): the shipped chain is a *true* refreshable MV
in replace mode (atomic target swap), so `max(version)` is sufficient and
the enriched row wins at every grain (version advances 1 → 2).
- [x] No double-count / under-count of `volume_quote_usd` in `_15m … _1M`
after an enrichment pass — **proven on the full chain**: `volume_base`
stays 30 (no double-count) and `volume_quote_usd` propagates 0 → 300 at
every grain after the enrichment re-insert + refresh.
- [x] Integration test covering write → roll up → enrich → assert across
all granularities (`FINAL`) — **`tests/rollup_chain_it.rs`** drives the
real `rollups.sql` chain end-to-end (+ a `preroll.sql` full-range pass),
green against docker CH 25.6.
- [x] 0026 G-note dependency note resolved / cross-linked

## Implementation Notes

Full-chain integration test landed at
`packages/prices-clickhouse/tests/rollup_chain_it.rs` (two `#[ignore]` tests,
scratch-DB isolated like `views_it.rs`, driven deterministically via
`SYSTEM REFRESH VIEW` + poll like `current_mv_it.rs`). It applies the **real**
shipped `INIT_SQL` + `ROLLUPS_SQL`, rolls a 3-minute bucket up all six grains,
then enrichment-re-inserts (`version+1`, `volume_quote_usd` filled) and
re-drives the chain, asserting OHLCV + version at every grain `FINAL`.

**Bug found and fixed (see Design Decisions → Emerged).** The as-shipped
`rollups.sql` / `preroll.sql` mis-computed `open`/`close`/`close_usd`. Fixed in
both schema files + the schema-overview §3.2 reference DDL. The buggy DDL is
live on `ch-prod-01` (applied under 0051) → re-apply spawned as **0071**.

## Design Decisions

### Emerged

1. **`AS timestamp` bucket alias shadowed the source column (correctness bug).**
`toStartOfInterval(timestamp, …) AS timestamp` makes the bare `timestamp`
inside `argMin(open, …)` / `argMax(close, …)` / `argMax(close_usd, …)`
resolve to the *bucket-start* (constant within a bucket), so O/C/close_usd
tie-break to an arbitrary row instead of the true first/last by time. Volumes
(`sum`), `high`/`low` are unaffected. The 0059 desk proof only checked
volumes, so it never surfaced this. **Fix:** `FROM … AS t` + qualified
`t.timestamp` in `rollups.sql` and `preroll.sql`. `current.sql` was already
correct (it uses `AS c` + `c.timestamp`).

2. **`max(version)` accepted (not `sum(version)`).** The G-note locked in
"APPEND + `sum(version)`", but 0051 actually shipped a *true* refreshable MV
in **replace** mode (atomic target swap) + bounded window, with `preroll.sql`
as the separate full-range historical path. Under atomic replace there is no
`ReplacingMergeTree` version tie to lose, so the shipped `max(version)` is
correct — the integration test confirms the enriched row wins at every grain.
The test asserts the shipped semantics rather than re-litigating the G-note.

3. **Pin the local/CI ClickHouse to the EXACT production version.** Code review
flagged that the new integration test was being validated against
`docker-compose.yml`'s `clickhouse-server:25.6` (`25.6.13.41`), while
`ch-prod-01` runs **`26.3.10.60`** (per the 0051 G-live-schema-state note and
0063 provisioning verification) — a 25.x → 26.x major gap. Refreshable-MV
semantics, SQL alias resolution and date/time functions can all differ across
CH releases, so "green locally" was not evidence for the engine the DDL
actually runs on. **Decision:** pin `docker-compose.yml` to the exact prod
version `clickhouse/clickhouse-server:26.3.10.60` and re-run the full
`prices-clickhouse` suite against it (rollup_chain, current_mv, views — all
green). Policy going forward: local/CI must match the live `ch-prod-01`
version; re-check `SELECT version()` on prod before bumping the pin. (This
strengthens — but does not replace — the live re-verify owed by **0071** on
the production cluster itself.)

4. **Flaky test anchor made deterministic + version-robust.** The first cut of
`insert_bucket` embedded `toStartOfInterval(now(), …)`, re-evaluated
server-side at *each* INSERT; the un-enriched and enriched batches are
separate INSERTs with the whole chain-drive between them, so a wall-clock
15-minute boundary crossing would anchor them to different buckets → no
`ReplacingMergeTree` dedup → `_1m FINAL` keeps all 6 rows and the
single-bucket / no-double-count assertions fail. **Fix:** `bucket_anchor()`
fetches the boundary **once** as a `toUInt64(toUnixTimestamp(…))` integer
epoch and rebuilds it as `toDateTime(<n>)`, reused for every INSERT. The epoch
round-trip (not `formatDateTime`, whose `%i`/`%M` specifiers are a portability
liability across CH versions) keeps it correct regardless of server version or
timezone.

## Issues Encountered

- **MV → target column mapping is positional, not by-name** (verified with a
swapped-alias `INSERT … SELECT` probe), so qualifying the source column while
keeping the output column aliased `timestamp` is safe.
- `prefer_column_name_to_alias=1` is **not** a viable fix — it makes `GROUP BY
timestamp` bind to the raw column, shattering the bucket into per-minute rows.

## Future Work

- **0071** (spawned) — re-apply the corrected `rollups.sql` / `preroll.sql` to
the live `ch-prod-01` cluster (the buggy DDL is already deployed under 0051).
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
---
id: "0071"
title: "Re-apply corrected rollup/preroll DDL to live ch-prod-01 (argMin/argMax timestamp-shadowing fix)"
type: BUG
status: backlog
related_adr: ["0007"]
related_tasks: ["0059", "0051"]
tags: [layer-database, priority-high, effort-small, clickhouse, materialized-views, rollups, operations]
links:
- "../../../packages/prices-clickhouse/schema/rollups.sql"
- "../../../packages/prices-clickhouse/schema/preroll.sql"
history:
- date: 2026-06-26
status: backlog
who: oski
note: >
Spawned from 0059. The 0059 full-chain integration test surfaced an
OHLC-correctness bug in the as-shipped rollups.sql / preroll.sql: the
`toStartOfInterval(timestamp, …) AS timestamp` bucket alias shadows the
source `timestamp` column, so argMin(open)/argMax(close)/argMax(close_usd)
tie-break to an arbitrary row instead of the true first/last by time. The
schema files are fixed (FROM … AS t + qualified t.timestamp), but the
BUGGY DDL was already applied LIVE on ch-prod-01 under 0051 (2026-06-22).
The six refreshable MVs in prices.* must be re-created from the corrected
rollups.sql. Deferred to a deploy-capable session (prepare-not-deploy).
---

# Re-apply corrected rollup/preroll DDL to live ch-prod-01

## Summary

The live `prices.*` refreshable-MV rollup chain on `ch-prod-01` was created
from a version of `schema/rollups.sql` that mis-computes `open` / `close` /
`close_usd` (the argMin/argMax-by-time aggregates). Task 0059 fixed the schema
files; this task re-applies the fix to the running cluster.

## Context

The bug (task 0059, full-chain integration test `rollup_chain_it.rs`): the
bucket key `toStartOfInterval(timestamp, …) AS timestamp` **shadows** the source
`timestamp` column, so `argMin(open, timestamp)` / `argMax(close, timestamp)` /
`argMax(close_usd, timestamp)` evaluate against the constant bucket-start and
tie-break to an arbitrary row. Volumes (`sum`), `high` (`max`), `low` (`min`)
are unaffected — only O/C/close_usd are wrong. Fixed by `FROM … AS t` +
qualified `t.timestamp`. `current.sql` was already correct (it uses `AS c` +
`c.timestamp`); only `rollups.sql` and `preroll.sql` were affected.

Live apply provenance: task 0051 `notes/G-live-schema-state.md` (Route A,
`ssh … docker exec app-clickhouse-1 clickhouse-client --multiquery`, CH 26.3.10,
loopback `default` admin).

## Implementation

- `DROP VIEW` the six `prices.mv_ohlcv_*` MVs on `ch-prod-01`, then re-create
them from the corrected `schema/rollups.sql` (loopback `default` admin, same
Route A path as 0051). Dropping an MV does **not** touch its target table, so
no rollup rows are lost.
- Recompute any already-mis-rolled buckets: TRUNCATE + re-run the corrected
`preroll.sql` over the backfilled range, or let the bounded-window MVs
overwrite the recent window on their next refresh (replace mode). Decide based
on how much coarse data has accrued since 0051's live apply.
- Verify post-apply: spot-check that `open`/`close` at `_15m`+ match the true
first/last `_1m` close in a bucket (the assertion the integration test makes).

## Acceptance Criteria

- [ ] Six `prices.mv_ohlcv_*` MVs on ch-prod-01 re-created from corrected DDL
- [ ] Mis-rolled historical buckets recomputed (preroll or window refresh)
- [ ] Live spot-check confirms correct argMin-open / argMax-close at ≥ `_15m`
- [ ] Provenance appended to 0051 `notes/G-live-schema-state.md` (or a 0071 note)
Loading
Loading