Skip to content

fix(feed): Improve NVD feed download resilience (x2)#3344

Merged
dcaravel merged 1 commit into
masterfrom
dc/nvd-resilience-x2
Jun 30, 2026
Merged

fix(feed): Improve NVD feed download resilience (x2)#3344
dcaravel merged 1 commit into
masterfrom
dc/nvd-resilience-x2

Conversation

@dcaravel

@dcaravel dcaravel commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Expands on #3343 to increase the retires / timeouts - based on observed status in https://github.com/stackrox/scanner/actions/runs/28467785361/job/84372363491 riding the threshold (update: and failing)

@dcaravel dcaravel requested a review from a team as a code owner June 30, 2026 20:14

@jvdm jvdm left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerns with Konflux timeout, do we need to look at that @BradLugo?

@coderabbitai

coderabbitai Bot commented Jun 30, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

This PR makes two small numeric configuration changes in the NVD vulnerability loader package: the HTTP client timeout is increased from 6 to 10 minutes, and the feed download retry limit is increased from 5 to 10 attempts. No other logic is modified.

Changes

NVD Loader Resilience Tuning

Layer / File(s) Summary
Increase HTTP timeout and retry limit
pkg/vulnloader/nvdloader/loader_api.go, pkg/vulnloader/nvdloader/loader_feed.go
The NVD API client Timeout is increased from 6 to 10 minutes, and maxRetries in downloadFeedForYear is increased from 5 to 10 attempts.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly matches the change: improving NVD feed download resilience by increasing retries and timeouts.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The description matches the diff, describing increased retries and timeouts for NVD feed downloads.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dc/nvd-resilience-x2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
pkg/vulnloader/nvdloader/loader_feed.go (1)

53-68: 🚀 Performance & Scalability | 🔵 Trivial | 🏗️ Heavy lift

Worst-case retry duration grows substantially with unbounded exponential backoff.

With maxRetries = 10 and backoff doubling from 10s each attempt, the worst-case cumulative sleep time before giving up on a single year is 10*(2^9-1) ≈ 5110s (~85 minutes), on top of per-attempt HTTP timeouts (now 10 minutes each, per loader_api.go). Since this loop runs for every year from 2002 to present sequentially, a few consecutive failing years could push the overall job well past typical CI timeout budgets, defeating the goal of improving resilience. Consider capping the backoff (e.g., via min(backoff, maxBackoff)) so retries don't grow unbounded.

♻️ Cap exponential backoff
+	const maxBackoff = 2 * time.Minute
 	const maxRetries = 10
 	backoff := 10 * time.Second
 	var apiFeed *apischema.CVEAPIJSON20
 	for attempt := 1; ; attempt++ {
 		var err error
 		apiFeed, err = fetchFeed(url, year)
 		if err == nil {
 			break
 		}
 		if attempt >= maxRetries {
 			return errors.Wrapf(err, "failed to download feed for year %d after %d attempts", year, attempt)
 		}
 		log.Warnf("Feed year %d: attempt %d failed: %v; retrying in %s", year, attempt, err, backoff)
 		time.Sleep(backoff)
-		backoff *= 2
+		backoff *= 2
+		if backoff > maxBackoff {
+			backoff = maxBackoff
+		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/vulnloader/nvdloader/loader_feed.go` around lines 53 - 68, The retry loop
in fetchFeed currently uses uncapped exponential backoff, which can make
loader_feed.go spend far too long retrying a failing year. Update the retry
logic around maxRetries/backoff in the feed download loop to cap the sleep
duration with a maximum backoff value, while still doubling between attempts
until that cap is reached. Keep the existing retry logging and error handling
intact, but ensure the backoff growth in this loop cannot become unbounded.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@pkg/vulnloader/nvdloader/loader_feed.go`:
- Around line 53-68: The retry loop in fetchFeed currently uses uncapped
exponential backoff, which can make loader_feed.go spend far too long retrying a
failing year. Update the retry logic around maxRetries/backoff in the feed
download loop to cap the sleep duration with a maximum backoff value, while
still doubling between attempts until that cap is reached. Keep the existing
retry logging and error handling intact, but ensure the backoff growth in this
loop cannot become unbounded.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 8200de36-faa1-4359-bd95-b6519545d632

📥 Commits

Reviewing files that changed from the base of the PR and between 14ea07f and cbefb87.

📒 Files selected for processing (2)
  • pkg/vulnloader/nvdloader/loader_api.go
  • pkg/vulnloader/nvdloader/loader_feed.go

@BradLugo

Copy link
Copy Markdown
Contributor

Concerns with Konflux timeout, do we need to look at that @BradLugo?

I think doing retries as needed will be sufficient. If yall are curious about what I was referring to, here's an example from when I bumped the konflux timeout last time:

@dcaravel dcaravel merged commit c2626ef into master Jun 30, 2026
36 checks passed
@dcaravel dcaravel deleted the dc/nvd-resilience-x2 branch June 30, 2026 21:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants