
explorer: heatmap overlay spike — phase 1 (#233) #240

Merged
rdhyee merged 1 commit into isamplesorg:main from rdhyee:feat/heatmap-overlay-spike
May 28, 2026

Conversation

Contributor

@rdhyee rdhyee commented May 27, 2026

Summary

Phase 1 of the #233 progressive heatmap spike. Adds a toggleable heatmap overlay in the explorer that renders a filter-aware density layer from in-viewport sample coordinates via DuckDB-WASM → heatmap.js → Cesium SingleTileImageryProvider. Toggle off restores the existing cluster/point view unchanged.

Scope is deliberately narrow: answer the spike's viability question (does heatmap.js + Cesium imagery + DuckDB-WASM compose?) with minimal architectural change. Progressive refinement, cache, kernel/color tuning, and third-mode promotion are all deferred to phase 2+.

What you'll see

A new Heatmap (filtered density) checkbox under the source legend. When checked:

  • On moveEnd (debounced 250 ms) and on every filter change, queries samples_map_lite.parquet for sample coordinates in the current viewport bbox, applying the source and material/feature/specimen filters (LIMIT 100,000; see the caveat below).
  • Pre-bins the result per pixel into a 512×512 grid before passing it to heatmap.js (a performance optimization added by Codex).
  • Renders to an offscreen canvas, then swaps in a Cesium SingleTileImageryProvider.
  • Status text (italic) reports the point count per refresh, with an explicit cap warning when the query hits the 100k LIMIT.

| State | Sample count | Notes |
|---|---|---|
| Cyprus alt=500 km, no filter | 96,694 | Below LIMIT, so density is honest |
| Cyprus alt=500 km, +material=organicmaterial | 13 | Filter regenerates heatmap, dramatically smaller density |
| World view, no filter | LIMIT-capped | Status warns "first 100,000 samples (capped — zoom or filter for full density)" |

Cyprus numbers (96,694 / 13) match raw DuckDB bbox queries against the same predicates exactly.
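
The per-pixel pre-binning step described above can be sketched roughly as follows. This is an illustration only, not the code in explorer.qmd: the function name `binPoints`, the `{ west, south, east, north }` bbox shape, and the linear longitude/latitude-to-pixel mapping are all assumptions, and the sketch ignores bboxes that wrap the antimeridian.

```javascript
// Hypothetical sketch: collapse raw sample coordinates into per-pixel counts
// on a size x size grid, so the array handed to heatmap.js never exceeds
// size * size entries no matter how many samples match.
function binPoints(points, bbox, size = 512) {
  const { west, south, east, north } = bbox; // assumes west < east (no dateline wrap)
  const counts = new Map(); // key: y * size + x, value: samples in that pixel
  for (const { longitude, latitude } of points) {
    if (longitude < west || longitude > east) continue;
    if (latitude < south || latitude > north) continue;
    // Linear map into pixel space; clamp so points on the far edge stay in range.
    const x = Math.min(size - 1, Math.floor(((longitude - west) / (east - west)) * size));
    const y = Math.min(size - 1, Math.floor(((north - latitude) / (north - south)) * size));
    const key = y * size + x;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // heatmap.js data points use the shape { x, y, value }.
  return Array.from(counts, ([key, value]) => ({
    x: key % size,
    y: Math.floor(key / size),
    value,
  }));
}
```

With this shape, the number of objects passed to heatmap.js depends on how many pixels are occupied, not on how many rows the query returned.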

Architectural decisions

| # | Decision | Rationale |
|---|---|---|
| D1 | Additive overlay, not exclusive third mode | Minimizes change to the altitude-driven getMode() state machine. Phase 3 can promote to third-mode if the overlay proves the concept. |
| D2 | heatmap.js v2.0.5 via jsDelivr CDN | Matches how Cesium is loaded; no build-tool changes. |
| D3 | SingleTileImageryProvider | Simplest Cesium integration; matches #233's spec recommendation. |
| D4 | Query samples_map_lite.parquet exact viewport | The point of the heatmap is "what's in the rectangle I'm looking at." |
| D5 | heatmapReqId cancellation | Matches existing facetCountsReqId / requestId patterns. |
| D6 | Triggers: moveEnd, source-filter, material/feature/specimen | Same triggers that fire refreshFacetCounts and loadViewportSamples. |
| D7 (Codex r1) | heatmapLastKey set ONLY after successful layer swap; cleared on error and on moveStart cancellation | Original implementation set it before the render fired; a toggle+camera-gesture race could wedge the overlay. |
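
To make D5 and D7 concrete, here is a minimal sketch of a request-id guard combined with a dedupe key that is only recorded after a successful swap. The names `createHeatmapRefresher`, `runQuery`, and `swapLayer` are hypothetical stand-ins for the DuckDB query and the Cesium layer swap; the real `refreshHeatmap()` may be structured differently.

```javascript
// Hypothetical sketch of the request-id cancellation (D5) and
// dedupe-key rules (D7). Not the actual explorer.qmd implementation.
function createHeatmapRefresher(runQuery, swapLayer) {
  let reqId = 0; // bumped on every refresh and on every cancellation
  let lastKey = null; // set only after a successful layer swap

  async function refresh(key) {
    if (key === lastKey) return "skipped"; // same (viewport, filter) already rendered
    const myId = ++reqId;
    try {
      const rows = await runQuery(key);
      if (myId !== reqId) return "stale"; // superseded by a newer request or a cancel
      await swapLayer(rows);
      lastKey = key; // mark done only now, never before the render
      return "rendered";
    } catch (err) {
      lastKey = null; // the error path must not leave a key that blocks retries
      return "error";
    }
  }

  // Intended to be called on camera moveStart.
  function cancel() {
    reqId++;
    lastKey = null;
  }

  return { refresh, cancel };
}
```

Because the key is written after the swap, a cancelled or failed refresh leaves nothing behind that would make the next moveEnd return early.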

Caveats / known limitations

  • LIMIT 100,000: at global views with no filter, the lite parquet has ~6M rows. The 100k cap shows an arbitrary first 100k, not honest density. Status text explicitly warns when capped. Phase 2 progressive refinement (TABLESAMPLE 1% → 10% → 100% passes) removes the cap properly.
  • Antimeridian rectangle: the wrapped-bbox path uses east + 360. Cesium normally expects west > east for wrapped rectangles. Codex flagged this for a dateline visual/spec test; deferred until someone reports a problem at the dateline.
  • No cache yet: every (viewport, filter) combo re-queries. Phase 2 adds LRU cache keyed on (viewport-hash, filter-hash).

Test plan

  • tests/playwright/heatmap-overlay.spec.js 4/4 pass on localhost in 45.1s
    • heatmap toggle exists
    • toggle on → visible layer + lastPointCount > 0
    • toggle off → layer removed
    • source + material filter changes → lastImageHash changes (asserts on image-hash change, not just timestamp — so the error path doesn't satisfy)
  • tests/playwright/facet-viewport.spec.js 4/4 still pass (no regression in the work from PR #237, "explorer: B1 viewport-aware facet counts (#234 step 3)")
  • Patient visual probe confirms Cyprus numbers (96,694 / 13) match raw bbox queries. Image hashes change per refresh.
  • Verify on rdhyee fork staging (will mirror after this round of fixes lands)

Out of scope for this PR

What success of phase 1 unlocks (from the plan)

Implementation provenance

Bulk of the explorer.qmd diff (~252 LOC) was authored by OpenAI Codex CLI from a Claude-authored phase-1 plan. Codex jumped past "review the plan" straight to "execute the plan." Claude reviewed the resulting implementation, verified it works (spec passes + visual probe matches raw queries), then asked Codex for a round-1 PR review.

Codex's round-1 review caught real bugs:

  1. Stale dedupe key (toggle+camera race wedges overlay) — fixed in this PR
  2. Silent LIMIT 100k cap — fixed: status now explicitly warns
  3. PR diff included the commits from PR #238 ("tests: extract URL helper for sub-path-safe page.goto across the suite"), a branch-base issue; fixed by rebasing onto upstream/main
  4. Spec only asserted lastRefreshAt, error path could satisfy — fixed: asserts lastImageHash
  5. Antimeridian convention questioned — deferred (no easy repro)
  6. PR text overclaim — addressed in this revision

Commit credits Codex as co-author per repo conventions.

Cross-refs

@rdhyee rdhyee force-pushed the feat/heatmap-overlay-spike branch 2 times, most recently from 0a2bc05 to 8998455 May 27, 2026 23:41
rdhyee added a commit to rdhyee/isamplesorg.github.io that referenced this pull request May 27, 2026
@rdhyee rdhyee force-pushed the feat/heatmap-overlay-spike branch from 8998455 to 6b63944 May 28, 2026 00:10
rdhyee added a commit to rdhyee/isamplesorg.github.io that referenced this pull request May 28, 2026
Adds a toggleable heatmap overlay as a third visualization
alongside cluster (H3 dots) and point (individual samples) mode.
Phase 1 of the isamplesorg#233 spike: answer the viability question of
heatmap.js + Cesium SingleTileImageryProvider + DuckDB-WASM
composing into a filter-aware density layer.

In scope this commit:
- Loads heatmap.js v2.0.5 via jsDelivr CDN (alongside Cesium).
- New `#heatmapToggle` checkbox in the source legend.
- `refreshHeatmap()` queries `samples_map_lite.parquet` for in-
  viewport sample coords (applying source + material/feature/
  specimen filters), bins them per pixel into a 512x512 grid
  before passing to heatmap.js (smart perf optimization: keeps
  the data array under 262k regardless of how many samples
  match — though see cap warning below), renders to an offscreen
  canvas, and swaps a `SingleTileImageryProvider` into the
  Cesium imagery layers.
- Cancellation via `heatmapReqId` (matches the existing
  `facetCountsReqId` / `requestId` patterns).
- Refresh triggers: `camera.moveEnd` (debounced 250 ms),
  source-filter change, material-filter change. moveStart
  bumps the reqId and shows "waiting for camera" status.
- Toggle off removes the imagery layer.
- Skip-if-same-key optimization on identical (viewport, filter)
  combos — only marked "done" after a successful render.
- Status text (italic) reports point count per refresh, with
  explicit cap warning when at LIMIT.
- New `tests/playwright/heatmap-overlay.spec.js` (4 tests):
  toggle exists, toggle on renders, toggle off clears, filter
  change regenerates (asserts on lastImageHash, not just
  lastRefreshAt, so the error path doesn't satisfy the test).

Codex round-1 fixes baked in:
- Stale dedupe key bug: `heatmapLastKey` is now set ONLY after
  a successful layer swap, and cleared on (a) error path, and
  (b) moveStart cancellation. Previously a toggle+camera-gesture
  race could leave the key set without a render having happened,
  wedging the overlay (next moveEnd would early-return).
- Silent LIMIT 100k cap: status text now explicitly says
  "(capped — zoom or filter for full density)" when at LIMIT.
  Lite parquet has ~6M rows; the cap shows an arbitrary first
  100k, not honest density. Phase 2 progressive refinement
  removes the cap.
- `_heatmapOverlay.capped` field exposed for tests.

Verified:
- 4/4 spec tests pass on localhost in 45.1s (post-fixes)
- Patient probe at Cyprus alt=500km confirms numbers match raw
  bbox query: no filter = 96,694 samples; +organicmaterial
  filter = 13 samples; image hash changes per refresh
- No regression in facet-viewport.spec.js (4/4 still pass)

Out of scope for this PR (deferred to phase 2+):
- Progressive refinement (TABLESAMPLE 1% → 10% → 100%) —
  removes the LIMIT cap properly
- Cache by (viewport-hash, filter-hash)
- Kernel / color-ramp tuning, alpha tuning
- Third-mode promotion (currently overlay, not exclusive mode)
- Interaction with `#facetNote` apology copy
- Antimeridian rectangle convention — current path uses
  `east + 360` for wrapped bboxes; Cesium normally expects
  `west > east`. Codex flagged for a dateline test, deferred.

Implementation provenance: Bulk of explorer.qmd diff (~252
LOC) authored by OpenAI Codex CLI from a Claude-authored phase-1
plan that was sent for "review" but Codex jumped straight to
implementation. Claude reviewed the implementation against the
plan, verified it works (spec + visual probe), and asked Codex
for a round-1 PR review. Codex caught real bugs (stale dedupe
key, silent cap) — those are addressed in this amended commit.

Co-Authored-By: OpenAI Codex CLI <noreply@openai.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rdhyee rdhyee force-pushed the feat/heatmap-overlay-spike branch from 6b63944 to 2755b1f May 28, 2026 00:31
rdhyee added a commit to rdhyee/isamplesorg.github.io that referenced this pull request May 28, 2026
@rdhyee rdhyee merged commit 60ac865 into isamplesorg:main May 28, 2026
1 check passed
rdhyee added a commit to rdhyee/isamplesorg.github.io that referenced this pull request May 28, 2026
Replaces the LIMIT 100000 raw-row scan + JS per-pixel binning with
a single DuckDB GROUP BY query that does the binning in SQL.
Removes the arbitrary cap honestly: every sample in the bbox is
counted into its true pixel cell, regardless of total sample count.

Why the LIMIT was bad: `LIMIT 100000` returned the first 100k rows
in parquet storage order — not random, not geographic. At world
view, the heatmap silently showed whichever source happened to be
physically first in the file (likely SESAR, the largest source by
row count). The "(capped)" status warning disclosed the problem but
didn't fix it. This change responds to RY's 2026-05-27 feedback on
PR isamplesorg#240 ("wondering whether we can do better geographic
random sampling").

How the SQL pushdown works: compute `(x_bin, y_bin)` pixel
coordinates from `latitude`/`longitude` inside the DuckDB query
using FLOOR / LEAST / GREATEST, then GROUP BY (x_bin, y_bin),
returning one row per non-empty pixel with COUNT(*) as the sample
count. Result cardinality is bounded by canvas pixels (≤ 512² =
262k), independent of bbox sample count. JS just iterates the
aggregated rows and applies the same log(1+n) scaling for
heatmap.js.
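
A rough sketch of what such a pushdown query and its JS consumer could look like. The function names, the table name `lite`, the `filterSql` parameter, and the exact arithmetic are assumptions for illustration; the query actually shipped in explorer.qmd may differ, and this version does not handle antimeridian-wrapped bboxes.

```javascript
// Hypothetical sketch: build a DuckDB query that bins samples into pixel
// cells and returns one row per non-empty cell (no LIMIT needed).
function buildBinnedQuery({ west, south, east, north }, size = 512, filterSql = "TRUE") {
  const maxBin = size - 1;
  return `
    SELECT
      LEAST(${maxBin}, GREATEST(0,
        CAST(FLOOR((longitude - ${west}) / ${east - west} * ${size}) AS INTEGER))) AS x_bin,
      LEAST(${maxBin}, GREATEST(0,
        CAST(FLOOR((${north} - latitude) / ${north - south} * ${size}) AS INTEGER))) AS y_bin,
      COUNT(*) AS n
    FROM lite
    WHERE longitude BETWEEN ${west} AND ${east}
      AND latitude BETWEEN ${south} AND ${north}
      AND ${filterSql}
    GROUP BY 1, 2`;
}

// JS side: one heatmap.js point per aggregated row, with log(1+n) scaling.
// Number() is used because COUNT(*) may arrive as a BigInt.
function toHeatmapData(rows) {
  return rows.map(({ x_bin, y_bin, n }) => ({
    x: x_bin,
    y: y_bin,
    value: Math.log1p(Number(n)),
  }));
}
```

The result set can never exceed `size * size` rows, which is why the row cap becomes unnecessary.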

Verified counts vs `samples table` summary line (= true sample
count for the current view):

  view              | heatmap | table   | match
  ------------------|---------|---------|------
  PKAP (100km)      |  77,840 |  77,840 |  ✅
  Cyprus medium     | 100,970 | 100,970 |  ✅  (was capped at 100k)
  Cyprus regional   | 682,029 | 682,029 |  ✅  (was capped at 100k)
  World view        | 5.98M   | 5.98M   |  ✅  (was capped at 100k)

Render time at world view (~6M samples → 35k cells): ~7s on
localhost, similar to or faster than the LIMIT 100k version at
smaller zooms.

Removes the "(capped)" status branch. The `HEATMAP_LIMIT`
constant is now unused; it is left in place for now in case Phase 2
progressive refinement reintroduces a safety cap on cell count.

Side effect of removing the cap: the per-pixel max-bias is now
even more extreme at high-density views, but the log(1+n) scaling
from PR isamplesorg#240 handles it.

Verified: 5/5 heatmap-overlay.spec.js still pass on localhost.
(The spec asserts `lastPointCount > 0`, which is still true. No
test currently asserts the old capped behavior for large views,
so no spec changes are needed here.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rdhyee added a commit to rdhyee/isamplesorg.github.io that referenced this pull request May 28, 2026

Two related changes that follow up PR isamplesorg#240 (heatmap phase 1):

1. SQL pre-aggregation removes the LIMIT 100000 cap honestly.
2. Adaptive per-point radius + maxOpacity caps avoid blur-overlap
   saturation at high-cell-count views (world view "everything
   red" symptom RY surfaced after isamplesorg#240 shipped).

## (1) SQL pre-aggregation

Previously: `SELECT latitude, longitude FROM lite WHERE bbox AND
filters LIMIT 100000`, then bin per pixel in JS. Two problems:

  - LIMIT 100000 returned the first 100k rows in parquet storage
    order — NOT random, NOT geographic. At world view, the
    heatmap silently showed whichever source happened to be
    physically first in the file (likely SESAR, the largest by
    row count). The "(capped)" status warning disclosed the
    problem but didn't fix it.
  - For sample sets above the cap, the density was unfaithful.

Now: SQL computes pixel cell coords inside the DuckDB query using
FLOOR / LEAST / GREATEST, then GROUP BY (x, y), returning one row
per non-empty pixel with COUNT(*) as the count. Result cardinality
is bounded by canvas pixels (≤ 512² = 262k), independent of
how many samples the bbox contains. No LIMIT needed: every
sample is counted into its true pixel bucket.

Antimeridian handled: when bbox wraps (west > east), SQL shifts
longitudes < west by +360 so pixel arithmetic works in a
continuous coordinate space.
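
The longitude shift can be mirrored in JS to see how the bin arithmetic behaves across the dateline. Both helper names below are hypothetical, and the clamping is an assumption modeled on the LEAST / GREATEST usage described above.

```javascript
// Hypothetical helper: when the bbox wraps (west > east), move longitudes
// that lie west of `west` into a continuous range by adding 360.
function unwrapLongitude(lon, west, east) {
  const wraps = west > east;
  return wraps && lon < west ? lon + 360 : lon;
}

// Map a longitude to an x bin on a grid `size` pixels wide.
function lonToBin(lon, west, east, size = 512) {
  const span = west > east ? east + 360 - west : east - west;
  const x = Math.floor(((unwrapLongitude(lon, west, east) - west) / span) * size);
  return Math.min(size - 1, Math.max(0, x)); // clamp like LEAST / GREATEST
}
```

For a bbox from 170°E to 170°W the span is 20°, so a sample at 175°W is treated as 185° and lands three quarters of the way across the canvas.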

Verified counts vs `samples table` summary line (= true sample
count for the current view):

  view              | heatmap  | table    | match
  ------------------|----------|----------|------
  PKAP (100km)      |  77,840  |  77,840  | ✅
  Cyprus medium     | 100,970  | 100,970  | ✅  (was capped at 100k)
  Cyprus regional   | 682,029  | 682,029  | ✅  (was capped at 100k)
  Africa (1.9Mkm)   |  12,875  |  12,875  | ✅
  World view        | 5.98M    | 5.98M    | ✅  (was capped at 100k)

Render time at world view (~6M samples → 35k cells): ~7s on
localhost, similar to or faster than the LIMIT 100k version.

`HEATMAP_LIMIT` constant left in place but no longer used (kept
for back-compat in case phase 2 reintroduces a safety cell-count
cap).

## (2) Adaptive radius + maxOpacity

After (1), RY tested staging and reported world view "everything
is red." Cause: with 35k+ pixel cells on a 512² canvas, heatmap.js's
default 25-pixel blur radius made each cell's Gaussian blur cover
~1% of canvas. 35k cells × ~1% each is several hundred times the
canvas area, so linear-additive blending saturated everything.

Two complementary fixes:

  - `maxOpacity: 0.6` on the heatmap.js instance config. Caps the
    rendered alpha so dense areas don't fully wash out the
    satellite imagery underneath.
  - Per-point radius computed from `sqrt(canvas_pixels /
    cell_count) * 2`, clamped to [6, 30]. World view (35k cells)
    → radius ≈ 6px (tight pixel dots, no overlap). Cyprus medium
    (~400 cells) → radius = 30px (cap, smooth blobs as before).

Together: world view shows geographic structure instead of
solid red. Tight zooms unchanged visually.
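
The radius rule can be written out directly from the formula above. The function name and default arguments are assumptions; only the formula and the [6, 30] clamp come from the commit message.

```javascript
// Hypothetical sketch of the adaptive radius:
// sqrt(canvas_pixels / cell_count) * 2, clamped to [min, max].
function adaptiveRadius(cellCount, size = 512, min = 6, max = 30) {
  const canvasPixels = size * size;
  const raw = Math.sqrt(canvasPixels / Math.max(1, cellCount)) * 2;
  return Math.min(max, Math.max(min, raw));
}

// World view, 35k cells: raw value is about 5.5, so the lower clamp gives 6.
const worldRadius = adaptiveRadius(35000);
// Cyprus medium, about 400 cells: raw value is about 51, so the upper clamp gives 30.
const cyprusRadius = adaptiveRadius(400);
```

Each heatmap.js data point could then carry `radius: adaptiveRadius(rows.length)`, with `maxOpacity: 0.6` set once in the `h337.create` config.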

## Test plan

- `tests/playwright/heatmap-overlay.spec.js` 5/5 still pass
  on localhost.
- Visual verified on rdhyee staging at the URLs RY surfaced
  (Africa-wide, Atlantic alt=15Mkm). World view now shows
  structure; tight zooms unchanged.

## Provenance

Authored by Claude, prompted by RY ("wondering whether we can
do better geographic random sampling"). Approach (Option C from
Claude's menu: SQL pre-aggregation by pixel cell) recommended
over TABLESAMPLE because it removes the cap entirely rather
than just making the sampling random.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rdhyee added a commit that referenced this pull request May 28, 2026