
fix(feed): Improve NVD feed download resilience#3343

Merged
dcaravel merged 1 commit into master from dc/nvd-resilience
Jun 30, 2026

Conversation

@dcaravel

@dcaravel dcaravel commented Jun 30, 2026

Contributor

Summary

  • Add retry logic with exponential backoff (5 attempts, starting at 10s) to NVD feed downloads; they previously had no retry handling, unlike the API loader path, which already had queryWithBackoff
  • Add detailed logging (year, HTTP status, content-length, protocol, timing, bytes read) to diagnose feed download failures
  • Separate network I/O from JSON parsing by reading the full response body before unmarshalling, so errors clearly indicate whether the failure was a network issue or malformed data
  • Increase HTTP client timeout from 5m to 6m to accommodate slow transfers from NVD
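
For illustration only, the retry policy described in the first bullet can be sketched as below. This is not the code from the PR: the names backoffSchedule, retry and errExhausted are hypothetical, and only the numbers (5 attempts, first wait of 10s, doubling) come from the summary above.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Values taken from the PR summary: 5 attempts, first wait of 10s.
const (
	maxAttempts    = 5
	initialBackoff = 10 * time.Second
)

// errExhausted marks an error returned after the final attempt failed.
// (Hypothetical sentinel, not part of the PR.)
var errExhausted = errors.New("retries exhausted")

// backoffSchedule lists the waits between attempts. There is one wait
// fewer than there are attempts, and each wait doubles the previous one.
func backoffSchedule(initial time.Duration, attempts int) []time.Duration {
	var waits []time.Duration
	for i, d := 1, initial; i < attempts; i, d = i+1, d*2 {
		waits = append(waits, d)
	}
	return waits
}

// retry calls fn until it returns nil or the attempts are used up,
// sleeping according to backoffSchedule after each failure.
func retry(attempts int, initial time.Duration, fn func(attempt int) error) error {
	waits := backoffSchedule(initial, attempts)
	for attempt := 1; ; attempt++ {
		err := fn(attempt)
		if err == nil {
			return nil
		}
		if attempt >= attempts {
			return fmt.Errorf("%w after %d attempts: %v", errExhausted, attempt, err)
		}
		time.Sleep(waits[attempt-1])
	}
}
```

With the values from the summary, backoffSchedule(initialBackoff, maxAttempts) yields waits of 10s, 20s, 40s and 80s, so a year that fails every attempt spends 150s sleeping in addition to the time taken by the five requests themselves.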

Context

CI was failing with stream error: stream ID 1; INTERNAL_ERROR; received from peer when downloading NVD feeds. The NVD server intermittently kills HTTP/2 streams mid-transfer. The previous code had no retries and no logging beyond the opaque error, making failures hard to diagnose and impossible to recover from.

Testing

Verified in CI and tested locally:

$ ROX_NVD_FEED_LOADER=true go run ./cmd/updater generate-dump --out-file dignore/dump.zip
INFO[0000] Using temp dir "/var/folders/5n/r7z59gz16v98qlf3mb_d3_gh0000gn/T/vuln-updater1459286801" for scratch space 
INFO[0000] Downloading istio...                         
INFO[0000] Downloading k8s...                           
INFO[0000] Downloading nvd...                           
INFO[0000] Downloading NVD data using 2.0 Data Feed     
INFO[0001] Downloading NVD feed for year 2002 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2002.json.gz 
INFO[0001] Feed year 2002: HTTP 200, Content-Length: 2176171, Proto: HTTP/2.0 (connect took 227.062584ms) 
INFO[0024] Feed year 2002: read 27075193 decompressed bytes (elapsed: 23.067583792s) 
INFO[0024] Feed year 2002: completed with 2371 vulnerabilities 
INFO[0024] Downloading NVD feed for year 2003 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2003.json.gz 
INFO[0024] Feed year 2003: HTTP 200, Content-Length: 728839, Proto: HTTP/2.0 (connect took 335.20175ms) 
INFO[0038] Feed year 2003: read 7908123 decompressed bytes (elapsed: 14.576578792s) 
INFO[0039] Feed year 2003: completed with 1518 vulnerabilities 
INFO[0039] Downloading NVD feed for year 2004 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2004.json.gz 
INFO[0039] Feed year 2004: HTTP 200, Content-Length: 1489929, Proto: HTTP/2.0 (connect took 168.428334ms) 
INFO[0052] Feed year 2004: read 16380792 decompressed bytes (elapsed: 13.98577125s) 
INFO[0053] Feed year 2004: completed with 2658 vulnerabilities 
INFO[0053] Downloading NVD feed for year 2005 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2005.json.gz 
INFO[0053] Feed year 2005: HTTP 200, Content-Length: 2172295, Proto: HTTP/2.0 (connect took 125.782ms) 
INFO[0080] Feed year 2005: read 25788598 decompressed bytes (elapsed: 27.719835125s) 
INFO[0081] Feed year 2005: completed with 4641 vulnerabilities 
INFO[0081] Downloading NVD feed for year 2006 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2006.json.gz 
INFO[0081] Feed year 2006: HTTP 200, Content-Length: 3499086, Proto: HTTP/2.0 (connect took 197.345459ms) 
INFO[0113] Feed year 2006: read 41214931 decompressed bytes (elapsed: 32.4874215s) 
INFO[0113] Feed year 2006: completed with 7009 vulnerabilities 
INFO[0113] Downloading NVD feed for year 2007 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2007.json.gz 
INFO[0114] Feed year 2007: HTTP 200, Content-Length: 3605890, Proto: HTTP/2.0 (connect took 224.533042ms) 
INFO[0158] Feed year 2007: read 39412640 decompressed bytes (elapsed: 44.793401792s) 
INFO[0159] Feed year 2007: completed with 6472 vulnerabilities 
INFO[0159] Downloading NVD feed for year 2008 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2008.json.gz 
INFO[0159] Feed year 2008: HTTP 200, Content-Length: 4184728, Proto: HTTP/2.0 (connect took 781.288166ms) 
WARN[0232] Feed year 2008: attempt 1 failed: reading feed body (read 21255684 bytes, elapsed: 1m12.976061458s): stream error: stream ID 13; INTERNAL_ERROR; received from peer; retrying in 10s 
INFO[0242] Downloading NVD feed for year 2008 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2008.json.gz 
INFO[0242] Feed year 2008: HTTP 200, Content-Length: 4184728, Proto: HTTP/2.0 (connect took 113.769833ms) 
INFO[0277] Feed year 2008: read 45538353 decompressed bytes (elapsed: 35.7586865s) 
INFO[0278] Feed year 2008: completed with 7019 vulnerabilities 
INFO[0278] Downloading NVD feed for year 2009 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2009.json.gz 
INFO[0278] Feed year 2009: HTTP 200, Content-Length: 4527844, Proto: HTTP/2.0 (connect took 148.260417ms) 
INFO[0332] Feed year 2009: read 43658887 decompressed bytes (elapsed: 53.983005167s) 
INFO[0332] Feed year 2009: completed with 4935 vulnerabilities 
INFO[0332] Downloading NVD feed for year 2010 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2010.json.gz 
INFO[0332] Feed year 2010: HTTP 200, Content-Length: 4359722, Proto: HTTP/2.0 (connect took 212.959667ms) 
INFO[0381] Feed year 2010: read 47114329 decompressed bytes (elapsed: 48.90411675s) 
INFO[0381] Feed year 2010: completed with 5088 vulnerabilities 
INFO[0381] Downloading NVD feed for year 2011 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2011.json.gz 
INFO[0382] Feed year 2011: HTTP 200, Content-Length: 4274446, Proto: HTTP/2.0 (connect took 135.668959ms) 
INFO[0430] Feed year 2011: read 46637075 decompressed bytes (elapsed: 48.99382225s) 
INFO[0431] Feed year 2011: completed with 4659 vulnerabilities 
INFO[0431] Downloading NVD feed for year 2012 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2012.json.gz 
INFO[0431] Feed year 2012: HTTP 200, Content-Length: 4864066, Proto: HTTP/2.0 (connect took 262.515ms) 
INFO[0514] Feed year 2012: read 54356777 decompressed bytes (elapsed: 1m23.475092917s) 
INFO[0515] Feed year 2012: completed with 5502 vulnerabilities 
INFO[0515] Downloading NVD feed for year 2013 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2013.json.gz 
INFO[0515] Feed year 2013: HTTP 200, Content-Length: 5797516, Proto: HTTP/2.0 (connect took 172.976416ms) 
INFO[0588] Feed year 2013: read 59649684 decompressed bytes (elapsed: 1m13.562761958s) 
INFO[0589] Feed year 2013: completed with 6235 vulnerabilities 
INFO[0589] Downloading NVD feed for year 2014 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2014.json.gz 
INFO[0592] Feed year 2014: HTTP 200, Content-Length: 4546320, Proto: HTTP/2.0 (connect took 3.525931541s) 
INFO[0650] Feed year 2014: read 55250771 decompressed bytes (elapsed: 1m1.571582541s) 
INFO[0651] Feed year 2014: completed with 8441 vulnerabilities 
INFO[0651] Downloading NVD feed for year 2015 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2015.json.gz 
INFO[0651] Feed year 2015: HTTP 200, Content-Length: 4189750, Proto: HTTP/2.0 (connect took 137.389333ms) 
INFO[0698] Feed year 2015: read 55160881 decompressed bytes (elapsed: 47.524761167s) 
INFO[0699] Feed year 2015: completed with 8125 vulnerabilities 
INFO[0699] Downloading NVD feed for year 2016 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2016.json.gz 
INFO[0699] Feed year 2016: HTTP 200, Content-Length: 5156473, Proto: HTTP/2.0 (connect took 418.343916ms) 
INFO[0769] Feed year 2016: read 70164601 decompressed bytes (elapsed: 1m9.825750916s) 
INFO[0769] Feed year 2016: completed with 9379 vulnerabilities 
INFO[0769] Downloading NVD feed for year 2017 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2017.json.gz 
INFO[0769] Feed year 2017: HTTP 200, Content-Length: 7703862, Proto: HTTP/2.0 (connect took 134.323542ms) 
INFO[0864] Feed year 2017: read 101735258 decompressed bytes (elapsed: 1m34.455919709s) 
WARN[0865] Skipping vuln CVE-2017-5638 because it is being manually enriched 
INFO[0865] Feed year 2017: completed with 14773 vulnerabilities 
INFO[0865] Downloading NVD feed for year 2018 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2018.json.gz 
INFO[0865] Feed year 2018: HTTP 200, Content-Length: 8252744, Proto: HTTP/2.0 (connect took 177.03025ms) 
INFO[0960] Feed year 2018: read 108616775 decompressed bytes (elapsed: 1m35.263610458s) 
INFO[0961] Feed year 2018: completed with 16202 vulnerabilities 
INFO[0961] Downloading NVD feed for year 2019 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2019.json.gz 
INFO[0961] Feed year 2019: HTTP 200, Content-Length: 10161812, Proto: HTTP/2.0 (connect took 109.398083ms) 
INFO[1113] Feed year 2019: read 125767191 decompressed bytes (elapsed: 2m32.204399083s) 
INFO[1114] Feed year 2019: completed with 16107 vulnerabilities 
INFO[1114] Downloading NVD feed for year 2020 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2020.json.gz 
INFO[1114] Feed year 2020: HTTP 200, Content-Length: 13961420, Proto: HTTP/2.0 (connect took 82.263791ms) 
WARN[1254] Feed year 2020: attempt 1 failed: reading feed body (read 64231732 bytes, elapsed: 2m19.741525166s): stream error: stream ID 39; INTERNAL_ERROR; received from peer; retrying in 10s 
INFO[1264] Downloading NVD feed for year 2020 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2020.json.gz 
INFO[1264] Feed year 2020: HTTP 200, Content-Length: 13961420, Proto: HTTP/2.0 (connect took 112.792583ms) 
INFO[1417] Feed year 2020: read 165100966 decompressed bytes (elapsed: 2m33.035853875s) 
INFO[1418] Feed year 2020: completed with 19400 vulnerabilities 
INFO[1418] Downloading NVD feed for year 2021 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2021.json.gz 
INFO[1418] Feed year 2021: HTTP 200, Content-Length: 16834124, Proto: HTTP/2.0 (connect took 124.389667ms) 
INFO[1576] Feed year 2021: read 197840274 decompressed bytes (elapsed: 2m37.878328292s) 
WARN[1578] Skipping vuln CVE-2021-44228 because it is being manually enriched 
WARN[1578] Skipping vuln CVE-2021-45046 because it is being manually enriched 
WARN[1578] Skipping vuln CVE-2021-45105 because it is being manually enriched 
WARN[1578] Skipping vuln CVE-2021-41411 because it is being manually enriched 
INFO[1578] Feed year 2021: completed with 22596 vulnerabilities 
INFO[1578] Downloading NVD feed for year 2022 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2022.json.gz 
INFO[1578] Feed year 2022: HTTP 200, Content-Length: 18303606, Proto: HTTP/2.0 (connect took 392.876167ms) 
WARN[1647] Feed year 2022: attempt 1 failed: reading feed body (read 95139825 bytes, elapsed: 1m9.274086042s): stream error: stream ID 45; INTERNAL_ERROR; received from peer; retrying in 10s 
INFO[1657] Downloading NVD feed for year 2022 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2022.json.gz 
INFO[1657] Feed year 2022: HTTP 200, Content-Length: 18303606, Proto: HTTP/2.0 (connect took 203.939791ms) 
WARN[1779] Feed year 2022: attempt 2 failed: reading feed body (read 70929008 bytes, elapsed: 2m1.587821958s): stream error: stream ID 47; INTERNAL_ERROR; received from peer; retrying in 20s 
INFO[1799] Downloading NVD feed for year 2022 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2022.json.gz 
INFO[1799] Feed year 2022: HTTP 200, Content-Length: 18303606, Proto: HTTP/2.0 (connect took 75.535792ms) 
INFO[1881] Feed year 2022: read 216154381 decompressed bytes (elapsed: 1m22.069938834s) 
WARN[1882] Skipping vuln CVE-2022-0811 because it is being manually enriched 
WARN[1882] Skipping vuln CVE-2022-22963 because it is being manually enriched 
WARN[1882] Skipping vuln CVE-2022-22965 because it is being manually enriched 
WARN[1882] Skipping vuln CVE-2022-22978 because it is being manually enriched 
WARN[1882] Filtering out CPE: wfn:[part="h",vendor="intel",product=ANY,version=ANY,update=ANY,edition=ANY,language=ANY] 
INFO[1882] Feed year 2022: completed with 26432 vulnerabilities 
INFO[1882] Downloading NVD feed for year 2023 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2023.json.gz 
INFO[1883] Feed year 2023: HTTP 200, Content-Length: 20120152, Proto: HTTP/2.0 (connect took 104.581584ms) 
INFO[1998] Feed year 2023: read 246815971 decompressed bytes (elapsed: 1m55.154203375s) 
WARN[1999] Skipping vuln CVE-2023-32697 because it is being manually enriched 
WARN[1999] Filtering out CPE: wfn:[part="h",vendor="amd",product=ANY,version=ANY,update=ANY,edition=ANY,language=ANY] 
WARN[1999] Skipping vuln CVE-2023-44487 because it is being manually enriched 
WARN[1999] Skipping vuln CVE-2023-39325 because it is being manually enriched 
WARN[1999] Skipping vuln CVE-2023-38545 because it is being manually enriched 
WARN[1999] Skipping vuln CVE-2023-38546 because it is being manually enriched 
INFO[2000] Feed year 2023: completed with 30606 vulnerabilities 
INFO[2000] Downloading NVD feed for year 2024 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2024.json.gz 
INFO[2000] Feed year 2024: HTTP 200, Content-Length: 24859217, Proto: HTTP/2.0 (connect took 135.574583ms) 
WARN[2250] Feed year 2024: attempt 1 failed: reading feed body (read 245466749 bytes, elapsed: 4m10.804004458s): stream error: stream ID 53; INTERNAL_ERROR; received from peer; retrying in 10s 
INFO[2260] Downloading NVD feed for year 2024 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2024.json.gz 
INFO[2261] Feed year 2024: HTTP 200, Content-Length: 24859217, Proto: HTTP/2.0 (connect took 163.036083ms) 
INFO[2539] Feed year 2024: read 289607825 decompressed bytes (elapsed: 4m38.132844958s) 
INFO[2541] Feed year 2024: completed with 38398 vulnerabilities 
INFO[2541] Downloading NVD feed for year 2025 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2025.json.gz 
INFO[2541] Feed year 2025: HTTP 200, Content-Length: 23368655, Proto: HTTP/2.0 (connect took 118.935667ms) 
WARN[2841] Feed year 2025: attempt 1 failed: reading feed body (read 271167181 bytes, elapsed: 5m0.067260125s): context deadline exceeded (Client.Timeout or context cancellation while reading body); retrying in 10s 
INFO[2851] Downloading NVD feed for year 2025 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2025.json.gz 
INFO[2851] Feed year 2025: HTTP 200, Content-Length: 23368655, Proto: HTTP/2.0 (connect took 300.471458ms) 
INFO[3100] Feed year 2025: read 277201851 decompressed bytes (elapsed: 4m9.248846875s) 
INFO[3102] Feed year 2025: completed with 43102 vulnerabilities 
INFO[3102] Downloading NVD feed for year 2026 from https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2026.json.gz 
INFO[3102] Feed year 2026: HTTP 200, Content-Length: 15088134, Proto: HTTP/2.0 (connect took 273.277208ms) 
INFO[3275] Feed year 2026: read 174476419 decompressed bytes (elapsed: 2m53.018629125s) 
INFO[3276] Feed year 2026: completed with 28497 vulnerabilities 
INFO[3276] Downloading redhat...             

@dcaravel dcaravel requested a review from a team as a code owner June 30, 2026 17:26
@dcaravel dcaravel changed the title fix(ci): Improve NVD feed download resilience fix(feed): Improve NVD feed download resilience Jun 30, 2026
@coderabbitai

coderabbitai Bot commented Jun 30, 2026

Walkthrough

The NVD feed loader gains a 5-attempt exponential-backoff retry loop in downloadFeedForYear and a new fetchFeed helper that performs HTTP GET, validates status 200, gunzips, reads the full body into memory via io.ReadAll, and unmarshals JSON, with detailed elapsed-time logging. The package-level HTTP client timeout is bumped from 5 to 6 minutes.
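
A rough sketch of the fetch path summarized above, written to show why reading the whole body before unmarshalling separates network failures from malformed data. The function and type names here (fetchFeedSketch, readFeedBody, parseFeed, feedSketch) are hypothetical, and the "vulnerabilities" JSON key is an assumption about the feed layout; the real fetchFeed in loader_feed.go may differ.

```go
package main

import (
	"compress/gzip"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// feedSketch is a hypothetical stand-in for the real feed type.
// The "vulnerabilities" key is assumed, not taken from the PR.
type feedSketch struct {
	Vulnerabilities []json.RawMessage `json:"vulnerabilities"`
}

// readFeedBody decompresses and fully reads the response body.
// Any error returned here is an I/O or gzip problem, not a JSON problem.
func readFeedBody(body io.Reader) ([]byte, error) {
	gz, err := gzip.NewReader(body)
	if err != nil {
		return nil, fmt.Errorf("opening gzip stream: %w", err)
	}
	defer gz.Close()

	start := time.Now()
	data, err := io.ReadAll(gz)
	if err != nil {
		return nil, fmt.Errorf("reading feed body (read %d bytes, elapsed: %s): %w",
			len(data), time.Since(start), err)
	}
	return data, nil
}

// parseFeed only runs once the bytes are in memory, so a failure here
// points at malformed data rather than at the network.
func parseFeed(data []byte) (*feedSketch, error) {
	var f feedSketch
	if err := json.Unmarshal(data, &f); err != nil {
		return nil, fmt.Errorf("unmarshalling feed JSON (%d bytes): %w", len(data), err)
	}
	return &f, nil
}

// fetchFeedSketch chains the steps: GET, status check, read, parse.
func fetchFeedSketch(client *http.Client, url string) (*feedSketch, error) {
	resp, err := client.Get(url)
	if err != nil {
		return nil, fmt.Errorf("requesting feed: %w", err)
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected HTTP status %d", resp.StatusCode)
	}
	data, err := readFeedBody(resp.Body)
	if err != nil {
		return nil, err
	}
	return parseFeed(data)
}
```

The trade-off is memory: the decompressed body is held in full before parsing, which for the larger years in the log above is a few hundred megabytes. A client built as &http.Client{Timeout: 6 * time.Minute} would apply the timeout to the body read as well as to the connection.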

Changes

NVD Feed Loader Retry & Timeout

| Layer / File(s) | Summary |
|---|---|
| Retry loop, fetchFeed helper, and timeout bump (pkg/vulnloader/nvdloader/loader_api.go, pkg/vulnloader/nvdloader/loader_feed.go) | HTTP client timeout increased to 6 minutes; downloadFeedForYear now retries up to 5 times with exponential backoff starting at 10 s; new fetchFeed helper handles HTTP request, status validation, gzip decompression, full-body io.ReadAll, and json.Unmarshal with elapsed-time logs; year context added to error messages. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly summarizes the main change: improving NVD feed download resilience. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The description accurately matches the retry, logging, unmarshalling, and timeout changes in the NVD feed loader. |

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/vulnloader/nvdloader/loader_feed.go`:
- Around line 56-67: downloadFeedForYear is retrying all failures from
fetchFeed, including permanent 4xx responses, so classify errors in fetchFeed
and only retry transport errors and retryable HTTP statuses like 429/5xx. Update
the retry loop in downloadFeedForYear to inspect the error type/status before
sleeping, and preserve the existing maxRetries/backoff behavior for transient
failures while failing fast on non-retryable client errors.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 91cd5d80-4950-43fd-8954-1a3ee79d9319

📥 Commits

Reviewing files that changed from the base of the PR and between 7940811 and 2e3485a.

📒 Files selected for processing (2)
  • pkg/vulnloader/nvdloader/loader_api.go
  • pkg/vulnloader/nvdloader/loader_feed.go

Comment on lines +56 to +67
for attempt := 1; ; attempt++ {
	var err error
	apiFeed, err = fetchFeed(url, year)
	if err == nil {
		break
	}
	if attempt >= maxRetries {
		return errors.Wrapf(err, "failed to download feed for year %d after %d attempts", year, attempt)
	}
	log.Warnf("Feed year %d: attempt %d failed: %v; retrying in %s", year, attempt, err, backoff)
	time.Sleep(backoff)
	backoff *= 2


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Stop retrying permanent HTTP client errors.

fetchFeed turns every non-200 into the same generic error, so downloadFeedForYear also retries 400/401/403/404 responses. With a 6-minute client timeout plus 10→80s backoff, one permanent 4xx can stall a single year for 30+ minutes before failing. Please classify errors so only transport failures and retryable statuses (for example 429/5xx) loop.

Also applies to: 101-103
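
One possible shape for the classification requested above, as a sketch only. The statusError type and isRetryable function are hypothetical and do not appear in the PR; the set of retryable statuses (429 and 5xx) follows the review comment.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// statusError is a hypothetical error type that keeps the HTTP status
// code, so callers can tell a 404 from a 503.
type statusError struct{ Code int }

func (e *statusError) Error() string {
	return fmt.Sprintf("unexpected HTTP status %d", e.Code)
}

// isRetryable reports whether another attempt could plausibly succeed.
func isRetryable(err error) bool {
	if err == nil {
		return false
	}
	var se *statusError
	if errors.As(err, &se) {
		// 429 and 5xx are treated as transient; other statuses fail fast.
		return se.Code == http.StatusTooManyRequests || se.Code >= 500
	}
	// Everything else (timeouts, reset streams, truncated bodies) is
	// treated as a transport failure in this sketch. A real version
	// would probably also exclude JSON parse errors.
	return true
}
```

In the retry loop, a check such as if !isRetryable(err) { return err } placed before the sleep would make a 404 fail on the first attempt while leaving the backoff behavior for stream errors unchanged. Because errors.As unwraps, the check still works if fetchFeed wraps the status error with year context.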


Source: Path instructions

@dcaravel dcaravel merged commit 14ea07f into master Jun 30, 2026
36 checks passed
@dcaravel dcaravel deleted the dc/nvd-resilience branch June 30, 2026 18:41

2 participants