Skip to content

fix: tx prices linking, dashboard load, initial sync errors#1132

Open
chedieck wants to merge 22 commits into
masterfrom
fix/tx-prices-linking
Open

fix: tx prices linking, dashboard load, initial sync errors#1132
chedieck wants to merge 22 commits into
masterfrom
fix/tx-prices-linking

Conversation

@chedieck
Copy link
Copy Markdown
Collaborator

@chedieck chedieck commented Jun 2, 2026

Depends On

Description

  • Transactions with missing prices no longer crash the entire sync batch or cache rebuild — they're skipped with a warning, and should be fixed by the jobs thread.
  • Chronik WebSocket disconnections no longer kill the initial sync — reconnects with exponential backoff
  • TX_CONFIRMED handler throttled to max 6 concurrent — prevents DB connection exhaustion on blocks with many tracked txs
  • Cache rebuild no longer blocks page loads — uses cursor pagination, excludes unused relations, and doesn't starve the connection pool
  • Payments/wallets endpoints use direct DB queries instead of slow subqueries

Performance

  • /api/payments uses addressId IN (...) instead of EXISTS subquery (120s → 4s for large users)
  • /api/payments/count uses direct DB count instead of dashboard cache recomputation
  • Wallet balance uses SQL aggregate instead of loading all transactions
  • Cache rebuild uses cursor-based pagination and skips inputs relation

Test Plan

This is running on prod rn.
To test locally, preferrably use prod's DB locally according to the workflow detailed in #1131

  1. Spin up the local mirror of prod
  2. Verify addresses with missing prices log warnings but don't halt the batch
  3. Check that, while the jobs are syncing the address and the dashboard is being created, we still don't get endless loading pages and get the partial data served
  4. Wait for a block with multiple tracked txs — confirm no connection exhaustion errors
  5. Navigate to payments page as a large user — should load in <5s

Summary by CodeRabbit

  • New Features

    • Added cache rebuilding status banner on dashboard that displays when data is being calculated
    • Dashboard automatically polls for updates during cache rebuilds
  • Improvements

    • Enhanced WebSocket reconnection with automatic retries and exponential backoff strategy
    • Better handling of missing price data with improved error resilience
    • Optimized concurrent request deduplication for improved performance
    • Refined background cache rebuilding strategy

chedieck added 18 commits June 2, 2026 08:03
- executeCall coalesces concurrent calls (waits) instead of throwing
- cacheAddressesInBackground deduplicates per user (skip if already running)
- Dashboard not cached when partial (background rebuild active)
- Dashboard API returns cacheRebuilding flag; frontend shows banner + auto-refreshes
- Payments count falls back to DB count during cache rebuild
- payments/count: always use direct DB count instead of dashboard cache path
- cache rebuild: use cursor-based pagination instead of offset (O(1) vs O(n))
- cache rebuild: exclude inputs JOIN (unused by cache)
- dashboard: cache partial data with 30s TTL during rebuild
- log dedup: only log 'starting background rebuild' when actually starting
Root causes:
- initPaymentCache triggered full cache build (200k+ tx scan) on every
  websocket message for uncached addresses, saturating MariaDB I/O
- payments/wallets queries used EXISTS subquery pattern, forcing MySQL
  to scan all rows before sorting
- generateAddressPaymentInfo loaded ALL transactions into memory just
  to sum amounts (replaced with SQL aggregate)
- background cache rebuild shared connection pool with user queries

Fixes:
- SKIP_CACHE_REBUILD env var to disable cache rebuild in dev-from-dump
- Separate backgroundPrisma client with 3-connection pool for cache rebuild
- Replace EXISTS subquery with addressId IN (...) for payments/count queries
- Replace full-table-load with prisma.transaction.aggregate for balance
- 200ms delay between background cache batches to reduce I/O pressure
- Reduce cache rebuild page size from 5000 to 1000

Results (dev-from-dump, 223k payments user):
- /api/payments: 120s -> 4s
- /api/payments/count: hanging -> 89ms
- /api/wallets: hanging -> 23ms
The separate 3-connection pool caused timeouts for txCreation calls
from websocket handlers (initPaymentCache → addressCreation path).
The real fix for dev-from-dump I/O saturation is SKIP_CACHE_REBUILD,
not a separate pool.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 2, 2026

Review Change Stack

Warning

Review limit reached

@chedieck, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 32 minutes and 59 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f5eefc7a-4020-4570-80c0-30a9f33dc500

📥 Commits

Reviewing files that changed from the base of the PR and between f83885d and e5eafb3.

📒 Files selected for processing (2)
  • constants/index.ts
  • services/chronikService.ts
📝 Walkthrough

Walkthrough

This PR implements background cache rebuilding with WebSocket resilience, concurrent transaction throttling, and query optimization. Chronik WebSocket reconnection uses exponential backoff, concurrent cache requests are deduplicated per user, dashboard state tracks rebuild progress via polling UI, and transaction/address queries are optimized using Prisma aggregation and cursor-based pagination.

Changes

Background Cache Rebuild with WebSocket Resilience and Query Optimization

Layer / File(s) Summary
Configuration and constants setup
.env.from-dump, .gitignore, constants/index.ts
New environment variables FROM_DUMP and SKIP_CACHE_REBUILD, .forever ignore pattern, and constants for Chronik WebSocket retry/backoff and confirmed-transaction concurrency limits.
Concurrent cache request deduplication
redis/index.ts
CacheGet replaces duplicate-detection throwing with promise reuse: concurrent calls with identical userId and methodName return the same in-flight promise instead of rejecting.
Chronik WebSocket reconnection and retry logic
services/chronikService.ts
ChronikBlockchainClient adds reconnection guarding (wsReconnecting), exponential-backoff retry loop (connectWsWithRetry()), endpoint recreation with re-subscription (reconnectWs()), and manual reconnection wiring in WebSocket config.
Confirmed transaction concurrency throttling
services/chronikService.ts
processWsMessage waits when confirmed-transaction count reaches the configured maximum, increments the counter before processing, and decrements in a finally block.
Background cache rebuild infrastructure and state
redis/types.ts, redis/paymentCache.ts, redis/dashboardCache.ts
DashboardData adds optional cacheRebuilding flag; payment cache exports isBackgroundRebuildActive and cacheAddressesInBackground for non-blocking per-user rebuilds with error handling and dashboard-cache clearing; getPaymentList and getPaymentStream trigger async rebuilds; generateAndCacheGroupedPaymentsAndInfoForAddress validates transaction prices and adds batch delays; getUserDashboardData conditionally sets rebuild flag or caches based on rebuild state.
Dashboard UI polling and rebuild status banner
pages/api/payments/count/index.ts, pages/dashboard/index.tsx, pages/dashboard/dashboard.module.css
Payments-count API always computes via getFilteredTransactionCount (removes Redis fallback); Dashboard component polls api/dashboard every 15 seconds when cacheRebuilding is true; new .cache_banner CSS class and conditional banner render.
Query optimization for address and transaction services
services/addressService.ts, services/transactionService.ts
Address payment info uses Prisma aggregation (_sum and _count) instead of fetching all transactions; transaction queries adopt cursor-based pagination with reduced page size, change price validation to warnings/skipping, and replace nested relation filtering with prefetched address-ID matching for user and paybutton constraints.
Unit and integration test updates
tests/integration-tests/api.test.ts, tests/unittests/addressService.test.ts, tests/unittests/transactionService.test.ts
Integration test expects cacheRebuilding field; address-service test mocks Prisma aggregation/count instead of findMany; transaction-service test adds beforeEach mock for address.findMany.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • PayButton/paybutton-server#1128: Overlaps in services/chronikService.ts processWsMessage confirmed-transaction handling (deduplication vs. concurrency throttling on the same code path).
  • PayButton/paybutton-server#1070: Both add Chronik WebSocket retry/reconnection and confirmed-transaction bounded concurrency to address freeze/race conditions.
  • PayButton/paybutton-server#1124: Both modify getFilteredTransactionCount and transaction filtering logic in services/transactionService.ts.

Suggested labels

bug

Suggested reviewers

  • Fabcien
  • Klakurka

Poem

🐰 A rabbit's tale of cache rebuilt with care,
WebSockets retry through the air,
Deduplicate the calls, let backgrounds run free,
Cursor pagination dancing gracefully,
No more froze transactions here—just speedy cheer!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title concisely summarizes the three main fixes: transaction price linking, dashboard load performance, and initial sync resilience—all core objectives of the changeset.
Description check ✅ Passed The description is comprehensive and well-structured, covering dependency, implementation details, performance metrics, and a concrete test plan with steps for local validation.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/tx-prices-linking

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
tests/integration-tests/api.test.ts (1)

1348-1366: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid hard-coding a transient rebuild state in the dashboard integration test.

Line 1366 asserts cacheRebuilding: true, but this test never sets up an active background rebuild. Since the flag reflects runtime state, this makes the suite environment-dependent and it will fail whenever rebuilds are already complete or skipped. Assert the field shape here, or explicitly mock rebuild-in-progress state in a dedicated test.

Proposed fix
     expect(responseData).toEqual({
       thirtyDays: expectedPeriodData,
       sevenDays: expectedPeriodData,
       year: expectedPeriodData,
       all: expectedPeriodData,
       total: {
         revenue: {
           usd: expect.any(String),
           cad: expect.any(String)
         },
         payments: expect.any(Number),
         buttons: expect.any(Number)
       },
       filtered: expect.any(Boolean),
-      cacheRebuilding: true
+      cacheRebuilding: expect.any(Boolean)
     }
     )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration-tests/api.test.ts` around lines 1348 - 1366, The test
currently asserts a runtime-specific boolean by expecting cacheRebuilding: true
in the 'Should return HTTP 200' case; change the assertion to validate the field
shape instead (e.g., expect.any(Boolean)) or remove the explicit value check so
the test does not depend on a transient rebuild state; locate the block in the
it('Should return HTTP 200', ...) test where responseData is compared to the
expected object and replace the cacheRebuilding: true entry with a shape
assertion (using expect.any(Boolean)) or omit it and create a separate test that
mocks an active rebuild if you need to assert true.
redis/index.ts (1)

80-131: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Include all result-shaping arguments in the dedup key.

executeCall() now coalesces only by userId + methodName, but dashboardData() also varies by timezone and buttonIds, and paymentsCount() varies by timezone. Concurrent requests for the same user with different filters will reuse the first in-flight promise and return the wrong payload to one caller.

Proposed fix
-  private static readonly pendingPromises = new Map<string, Map<MethodName, Promise<any>>>()
+  private static readonly pendingPromises = new Map<string, Promise<unknown>>()

   private static async executeCall<T>(
-    userId: string,
-    methodName: MethodName,
+    dedupeKey: string,
     fn: () => Promise<T>
   ): Promise<T> {
-    let userMap = this.pendingPromises.get(userId)
-    if (userMap === undefined) {
-      userMap = new Map()
-      this.pendingPromises.set(userId, userMap)
-    }
-
-    const existing = userMap.get(methodName)
+    const existing = this.pendingPromises.get(dedupeKey)
     if (existing !== undefined) {
       return await existing as T
     }

     const promise = fn()
-    userMap.set(methodName, promise)
+    this.pendingPromises.set(dedupeKey, promise)

     try {
       return await promise
     } finally {
-      userMap.delete(methodName)
-      if (userMap.size === 0) {
-        this.pendingPromises.delete(userId)
-      }
+      this.pendingPromises.delete(dedupeKey)
     }
   }

   static async dashboardData (userId: string, timezone: string, buttonIds?: string[]): Promise<DashboardData> {
-    return await this.executeCall(userId, 'dashboardData', async () => {
+    const key = `dashboardData:${userId}:${timezone}:${[...(buttonIds ?? [])].sort().join(',')}`
+    return await this.executeCall(key, async () => {
       return await getUserDashboardData(userId, timezone, buttonIds)
     })
   }

   static async paymentsCount (userId: string, timezone: string): Promise<number> {
-    return await this.executeCall(userId, 'paymentsCount', async () => {
+    return await this.executeCall(`paymentsCount:${userId}:${timezone}`, async () => {
       return await getCachedPaymentsCountForUser(userId, timezone)
     })
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@redis/index.ts` around lines 80 - 131, The dedup key currently used by
executeCall (userId + methodName) is missing result-shaping arguments, causing
dashboardData and paymentsCount callers with different timezone/buttonIds to
receive the wrong in-flight result; fix by changing executeCall to accept an
explicit dedupKey (string) or an array of key parts and use that composite key
in pendingPromises instead of just userId, then update callers dashboardData to
include timezone and buttonIds in the dedup key and paymentsCount to include
timezone (e.g., pass
`${userId}|dashboardData|${timezone}|${buttonIds?.join(',')}` or equivalent via
an array-of-params approach) so in-flight promises are coalesced only when all
shaping args match.
redis/dashboardCache.ts (1)

365-391: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Surface rebuild state on cached dashboard responses too.

Right now cacheRebuilding is only attached on a cache miss. If a stale dashboard summary is already cached when cacheAddressesInBackground() starts, this path returns it unchanged, so the client never sees the rebuild banner or polling state until the rebuild finishes and clears the cache. Please check isBackgroundRebuildActive(userId) before the early cached return and overlay the flag there as well.

Suggested fix
 export const getUserDashboardData = async function (userId: string, timezone: string, paybuttonIds?: string[]): Promise<DashboardData> {
   let dashboardData = await getCachedDashboardData(userId)
+  const rebuilding = isBackgroundRebuildActive(userId)
   if ((paybuttonIds !== undefined && paybuttonIds.length > 0) ||
     dashboardData?.filtered === true) {
     dashboardData = null
   }
+  if (dashboardData !== null) {
+    return rebuilding ? { ...dashboardData, cacheRebuilding: true } : dashboardData
+  }
-  if (dashboardData === null) {
-    console.log('[CACHE]: Recreating dashboard for user', userId)
-    const nMonthsTotal = await getNumberOfMonths(userId)
-    const paymentStream = getPaymentStream(userId)
+  console.log('[CACHE]: Recreating dashboard for user', userId)
+  const nMonthsTotal = await getNumberOfMonths(userId)
+  const paymentStream = getPaymentStream(userId)
 
-    const dashboardData = await generateDashboardDataFromStream(
-      paymentStream,
-      nMonthsTotal,
-      { revenue: '`#66fe91`', payments: '`#669cfe`' },
-      timezone,
-      paybuttonIds
-    )
+  const rebuiltDashboardData = await generateDashboardDataFromStream(
+    paymentStream,
+    nMonthsTotal,
+    { revenue: '`#66fe91`', payments: '`#669cfe`' },
+    timezone,
+    paybuttonIds
+  )
 
-    const rebuilding = isBackgroundRebuildActive(userId)
-    if (rebuilding) {
-      dashboardData.cacheRebuilding = true
-    } else {
-      await cacheDashboardData(userId, dashboardData)
-    }
-    return dashboardData
+  if (rebuilding) {
+    rebuiltDashboardData.cacheRebuilding = true
+  } else {
+    await cacheDashboardData(userId, rebuiltDashboardData)
   }
-  return dashboardData
+  return rebuiltDashboardData
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@redis/dashboardCache.ts` around lines 365 - 391, The cached-return path
should surface background rebuild state: after retrieving dashboardData via
getCachedDashboardData(userId) and before returning it early, call
isBackgroundRebuildActive(userId) and if true set dashboardData.cacheRebuilding
= true so the client sees the rebuild banner; ensure you modify the existing
dashboardData object (don't shadow the variable name) so the flag is preserved,
and leave the existing cache-miss path (generateDashboardDataFromStream /
cacheDashboardData) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pages/dashboard/index.tsx`:
- Around line 143-156: The current useEffect uses a single setTimeout based on
dashboardData?.cacheRebuilding and selectedButtonIds, but if the fetched
response still has cacheRebuilding: true the effect’s dependency value doesn’t
change and no further poll is scheduled; change to an explicit polling loop:
implement a local async function (e.g., pollDashboard) inside the effect that
fetches via the same URL logic (using selectedButtonIds and moment.tz.guess()),
calls setDashboardData(json), and if json.cacheRebuilding === true schedules
itself with setTimeout to run again (or immediately re-invoke after await) so
polling continues until cacheRebuilding becomes false; ensure you clear the
pending timeout on cleanup and reference the same symbols (useEffect,
selectedButtonIds, pollDashboard, setDashboardData,
dashboardData?.cacheRebuilding, timer) so the loop respects selection changes
and is cancelled properly.

In `@services/chronikService.ts`:
- Around line 204-220: The reconnect logic reopens Chronik during an intentional
teardown; add a wsShuttingDown boolean flag on ChronikBlockchainClient, set it
to true in a new shutdown() method (call shutdown() from
MultiBlockchainClient.destroy() before closing chronikWSEndpoint), and guard the
onEnd handler and reconnectWs() (which currently uses wsReconnecting and
reconnectWs) to return immediately if wsShuttingDown is true so the socket isn't
reopened during shutdown; apply the same guard for the other onEnd/reconnect
path around the code at the 705-714 area.
- Around line 186-202: connectWsWithRetry currently stops trying after
CHRONIK_WS_MAX_RETRIES and leaves chronikWSEndpoint offline since autoReconnect
is disabled; change it so that after the final failed attempt it schedules a
future reconnect (using setTimeout) to call this.connectWsWithRetry again with
exponential backoff rather than just logging and returning, and guard scheduling
with a flag/property (e.g., this.reconnectTimerId or this.isReconnectScheduled)
to avoid multiple concurrent timers; ensure the scheduled attempt still uses
chronikWSEndpoint.waitForOpen and honors baseDelay/backoff so outages recover
automatically without a process restart.

---

Outside diff comments:
In `@redis/dashboardCache.ts`:
- Around line 365-391: The cached-return path should surface background rebuild
state: after retrieving dashboardData via getCachedDashboardData(userId) and
before returning it early, call isBackgroundRebuildActive(userId) and if true
set dashboardData.cacheRebuilding = true so the client sees the rebuild banner;
ensure you modify the existing dashboardData object (don't shadow the variable
name) so the flag is preserved, and leave the existing cache-miss path
(generateDashboardDataFromStream / cacheDashboardData) unchanged.

In `@redis/index.ts`:
- Around line 80-131: The dedup key currently used by executeCall (userId +
methodName) is missing result-shaping arguments, causing dashboardData and
paymentsCount callers with different timezone/buttonIds to receive the wrong
in-flight result; fix by changing executeCall to accept an explicit dedupKey
(string) or an array of key parts and use that composite key in pendingPromises
instead of just userId, then update callers dashboardData to include timezone
and buttonIds in the dedup key and paymentsCount to include timezone (e.g., pass
`${userId}|dashboardData|${timezone}|${buttonIds?.join(',')}` or equivalent via
an array-of-params approach) so in-flight promises are coalesced only when all
shaping args match.

In `@tests/integration-tests/api.test.ts`:
- Around line 1348-1366: The test currently asserts a runtime-specific boolean
by expecting cacheRebuilding: true in the 'Should return HTTP 200' case; change
the assertion to validate the field shape instead (e.g., expect.any(Boolean)) or
remove the explicit value check so the test does not depend on a transient
rebuild state; locate the block in the it('Should return HTTP 200', ...) test
where responseData is compared to the expected object and replace the
cacheRebuilding: true entry with a shape assertion (using expect.any(Boolean))
or omit it and create a separate test that mocks an active rebuild if you need
to assert true.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0036b6ff-4960-41ad-b7a6-ab5c16477bed

📥 Commits

Reviewing files that changed from the base of the PR and between 7d4e91d and f83885d.

📒 Files selected for processing (16)
  • .env.from-dump
  • .gitignore
  • constants/index.ts
  • pages/api/payments/count/index.ts
  • pages/dashboard/dashboard.module.css
  • pages/dashboard/index.tsx
  • redis/dashboardCache.ts
  • redis/index.ts
  • redis/paymentCache.ts
  • redis/types.ts
  • services/addressService.ts
  • services/chronikService.ts
  • services/transactionService.ts
  • tests/integration-tests/api.test.ts
  • tests/unittests/addressService.test.ts
  • tests/unittests/transactionService.test.ts

Comment thread pages/dashboard/index.tsx
Comment on lines +143 to +156
useEffect(() => {
if (dashboardData?.cacheRebuilding !== true) return
const timer = setTimeout(() => {
let url = 'api/dashboard'
if (selectedButtonIds.length > 0) {
url += `?buttonIds=${selectedButtonIds.join(',')}`
}
fetch(url, { headers: { Timezone: moment.tz.guess() } })
.then(async res => await res.json())
.then(json => { setDashboardData(json) })
.catch(console.error)
}, 15000)
return () => clearTimeout(timer)
}, [dashboardData?.cacheRebuilding, selectedButtonIds])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Polling stops after the first refresh.

This effect only depends on the boolean dashboardData?.cacheRebuilding. If the first refresh still returns cacheRebuilding: true, the dependencies do not change, so no second timer is scheduled. Long-running rebuilds will therefore update once and then stall.

Suggested fix
   useEffect(() => {
     if (dashboardData?.cacheRebuilding !== true) return
-    const timer = setTimeout(() => {
-      let url = 'api/dashboard'
-      if (selectedButtonIds.length > 0) {
-        url += `?buttonIds=${selectedButtonIds.join(',')}`
-      }
-      fetch(url, { headers: { Timezone: moment.tz.guess() } })
-        .then(async res => await res.json())
-        .then(json => { setDashboardData(json) })
-        .catch(console.error)
-    }, 15000)
-    return () => clearTimeout(timer)
-  }, [dashboardData?.cacheRebuilding, selectedButtonIds])
+    const interval = setInterval(() => {
+      let url = 'api/dashboard'
+      if (selectedButtonIds.length > 0) {
+        url += `?buttonIds=${selectedButtonIds.join(',')}`
+      }
+      fetch(url, { headers: { Timezone: moment.tz.guess() } })
+        .then(async res => await res.json())
+        .then(json => { setDashboardData(json) })
+        .catch(console.error)
+    }, 15000)
+    return () => clearInterval(interval)
+  }, [dashboardData, selectedButtonIds])
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pages/dashboard/index.tsx` around lines 143 - 156, The current useEffect uses
a single setTimeout based on dashboardData?.cacheRebuilding and
selectedButtonIds, but if the fetched response still has cacheRebuilding: true
the effect’s dependency value doesn’t change and no further poll is scheduled;
change to an explicit polling loop: implement a local async function (e.g.,
pollDashboard) inside the effect that fetches via the same URL logic (using
selectedButtonIds and moment.tz.guess()), calls setDashboardData(json), and if
json.cacheRebuilding === true schedules itself with setTimeout to run again (or
immediately re-invoke after await) so polling continues until cacheRebuilding
becomes false; ensure you clear the pending timeout on cleanup and reference the
same symbols (useEffect, selectedButtonIds, pollDashboard, setDashboardData,
dashboardData?.cacheRebuilding, timer) so the loop respects selection changes
and is cancelled properly.

Comment on lines +186 to +202
private async connectWsWithRetry (maxRetries = CHRONIK_WS_MAX_RETRIES, baseDelay = CHRONIK_WS_BASE_DELAY_MS): Promise<void> {
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
await this.chronikWSEndpoint.waitForOpen()
console.log(`${this.CHRONIK_MSG_PREFIX}: WebSocket connected.`)
return
} catch (err: any) {
console.error(`${this.CHRONIK_MSG_PREFIX}: WebSocket connection attempt ${attempt}/${maxRetries} failed: ${err.message as string}`)
if (attempt < maxRetries) {
const delay = Math.min(baseDelay * Math.pow(2, attempt - 1), 60000)
console.log(`${this.CHRONIK_MSG_PREFIX}: Retrying WebSocket in ${delay / 1000}s...`)
await new Promise(resolve => setTimeout(resolve, delay))
}
}
}
console.error(`${this.CHRONIK_MSG_PREFIX}: WebSocket failed after ${maxRetries} attempts. Continuing without real-time updates.`)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Retry exhaustion leaves the client permanently offline.

When connectWsWithRetry() hits CHRONIK_WS_MAX_RETRIES, it only logs and returns. Since autoReconnect is disabled, nothing schedules another reconnect cycle after that point, so a long Chronik outage becomes a permanent loss of real-time updates until the process restarts.

Proposed fix
   private async connectWsWithRetry (maxRetries = CHRONIK_WS_MAX_RETRIES, baseDelay = CHRONIK_WS_BASE_DELAY_MS): Promise<void> {
     for (let attempt = 1; attempt <= maxRetries; attempt++) {
@@
     }
-    console.error(`${this.CHRONIK_MSG_PREFIX}: WebSocket failed after ${maxRetries} attempts. Continuing without real-time updates.`)
+    console.error(`${this.CHRONIK_MSG_PREFIX}: WebSocket failed after ${maxRetries} attempts. Scheduling another reconnect cycle.`)
+    if (!this.wsShuttingDown) {
+      await new Promise(resolve => setTimeout(resolve, 60000))
+      await this.reconnectWs()
+    }
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/chronikService.ts` around lines 186 - 202, connectWsWithRetry
currently stops trying after CHRONIK_WS_MAX_RETRIES and leaves chronikWSEndpoint
offline since autoReconnect is disabled; change it so that after the final
failed attempt it schedules a future reconnect (using setTimeout) to call
this.connectWsWithRetry again with exponential backoff rather than just logging
and returning, and guard scheduling with a flag/property (e.g.,
this.reconnectTimerId or this.isReconnectScheduled) to avoid multiple concurrent
timers; ensure the scheduled attempt still uses chronikWSEndpoint.waitForOpen
and honors baseDelay/backoff so outages recover automatically without a process
restart.

Comment on lines +204 to +220
private async reconnectWs (): Promise<void> {
if (this.wsReconnecting) return
this.wsReconnecting = true
try {
const addresses = this.getSubscribedAddresses()
this.chronikWSEndpoint = this.chronik.ws(this.getWsConfig())
this.chronikWSEndpoint.subscribeToBlocks()
for (const addr of addresses) {
this.chronikWSEndpoint.subscribeToAddress(addr)
}
await this.connectWsWithRetry()
} catch (err: any) {
console.error(`${this.CHRONIK_MSG_PREFIX}: WebSocket reconnection error: ${err.message as string}`)
} finally {
this.wsReconnecting = false
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Prevent reconnects during intentional shutdown.

destroy() closes chronikWSEndpoint at Lines 1460-1465, and this new onEnd handler immediately calls reconnectWs(). Without a shutdown flag, teardown now reopens the Chronik socket you are trying to close, which leaks background connections during process shutdown and test cleanup.

Proposed fix
 export class ChronikBlockchainClient {
   chronik!: ChronikClient
@@
   private latencyTestFinished: boolean
   private wsReconnecting = false
   private confirmedTxsBeingProcessed = 0
+  private wsShuttingDown = false
@@
   private async reconnectWs (): Promise<void> {
-    if (this.wsReconnecting) return
+    if (this.wsReconnecting || this.wsShuttingDown) return
     this.wsReconnecting = true
@@
       onReconnect: (_: ws.Event) => {
+        if (this.wsShuttingDown) return
         console.log(`${this.CHRONIK_MSG_PREFIX}: Chronik webSocket unexpectedly closed. Attempting reconnection...`)
         void this.reconnectWs()
       },
@@
       onEnd: (e: ws.Event) => {
+        if (this.wsShuttingDown) return
         console.log(`${this.CHRONIK_MSG_PREFIX}: Chronik WebSocket ended, type: ${e.type}. Attempting reconnection...`)
         void this.reconnectWs()
       },

Add a small shutdown() method on ChronikBlockchainClient that sets wsShuttingDown = true before closing sockets, and call that from MultiBlockchainClient.destroy().

Also applies to: 705-714

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/chronikService.ts` around lines 204 - 220, The reconnect logic
reopens Chronik during an intentional teardown; add a wsShuttingDown boolean
flag on ChronikBlockchainClient, set it to true in a new shutdown() method (call
shutdown() from MultiBlockchainClient.destroy() before closing
chronikWSEndpoint), and guard the onEnd handler and reconnectWs() (which
currently uses wsReconnecting and reconnectWs) to return immediately if
wsShuttingDown is true so the socket isn't reopened during shutdown; apply the
same guard for the other onEnd/reconnect path around the code at the 705-714
area.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant