perf: amortize per-sample startup and post comments on fork PRs#1506
Merged
JacksonWeber merged 2 commits intoJun 16, 2026
Merged
Conversation
Run all --samples of a given scenario inside a single child Node process. OpenTelemetry global-state isolation is preserved because each scenario still gets a fresh process, but Node startup, importing applicationinsights' large dependency tree, and one-time SDK init (useAzureMonitor()) are now paid once per scenario instead of once per sample. On the perf CI runner this removes several minutes from each "Run X benchmarks" step. Split the perf workflow into producer/consumer to support fork PRs: - performance.yml now runs without pull-requests:write and uploads baseline.json, candidate.json, perf-comparison.md, and pr-number.txt as the perf-results artifact. The inline sticky-comment step (which was gated on head.repo.full_name == github.repository, so forks never got comments) is removed. - performance-comment.yml is new. It triggers on workflow_run after Performance completes, runs in the trusted base-repo context with pull-requests:write, downloads the artifact, validates the PR number is a positive integer, regenerates report.md from the JSON using the base branch's own trusted comparePerf.mjs (so attacker- supplied markdown is never posted under the writable token), and posts the sticky comment. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR improves performance CI by reducing benchmark runtime overhead and enabling performance comparison comments to be posted on fork-based pull requests in a safer, split-workflow model.
Changes:
- Refactors the perf harness to take multiple samples per scenario within a single child process (
bench.mjs+runBenchmarks.mjs) to amortize Node/module initialization costs. - Splits PR commenting into a new
workflow_runworkflow that runs in the base-repo context, downloads perf artifacts, regenerates markdown from JSON using trusted code, and posts a sticky PR comment. - Updates perf-test documentation to describe the workflow split and security model.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
test/performanceTests/runBenchmarks.mjs |
Passes --samples through to the bench child and aggregates per-sample ops/s into summary stats. |
test/performanceTests/bench.mjs |
Collects N back-to-back timed samples in one process and outputs a samples[] JSON shape. |
test/performanceTests/README.md |
Documents the CI workflow split and rationale/security model. |
.github/workflows/performance.yml |
Removes PR write permissions and uploads perf outputs + PR number as an artifact for follow-on commenting. |
.github/workflows/performance-comment.yml |
New workflow to download artifact in base context, regenerate report from JSON, and post sticky PR comment. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- bench.mjs: require Number.isInteger for --samples so 1.5 is rejected. - performance-comment.yml: resolve PR number from workflow_run event payload (or listPullRequestsAssociatedWithCommit keyed off the untamperable head_sha) instead of the untrusted pr-number.txt artifact, so a malicious fork PR cannot trick the writable token into commenting on an unrelated PR. - performance-comment.yml: drop the duplicated PERF_REGRESSION_THRESHOLD env var; comparePerf.mjs's built-in default of 15 now applies, removing the third source of truth. - performance.yml: drop the now-unused pr-number.txt writer/upload.
rads-1996
approved these changes
Jun 16, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Two perf-CI improvements, both motivated by the recent
perf-regressionjob taking ~14 minutes per PR.1. Make perf tests faster
Diagnosis from a recent run: 6 scenarios × 3 samples = 18 child Node processes, each paying ~14.7s of pure overhead (Node startup + importing
applicationinsights' large dependency tree +useAzureMonitor()init) for only 6s of actual measurement.Fix: refactor
bench.mjsto accept--samples Nand take all N samples back-to-back inside one child process (warmup runs once, then N timed durations).runBenchmarks.mjsnow spawns 1 child per scenario instead ofsamples × scenarios. OpenTelemetry global-state isolation is preserved because each scenario still gets a fresh process.Expected savings: ~3 min off each of the two
Run X benchmarkssteps.2. PR comments on fork PRs
The previous inline
sticky-pull-request-commentstep was gated onhead.repo.full_name == github.repositoryso fork PRs never received a comment.Modeled on microsoft/opentelemetry-distro-python's split:
performance.ymlnow runs withoutpull-requests: writeand uploadsbaseline.json,candidate.json,perf-comparison.md,pr-number.txtas theperf-resultsartifact.performance-comment.yml(new) triggers viaworkflow_runafterPerformancecompletes, runs in the trusted base-repo context withpull-requests: write, downloads the artifact, validates the PR number is a positive integer, regeneratesreport.mdfrom the JSON using the base branch's own trustedcomparePerf.mjs(so attacker-supplied markdown is never posted under the writable token), and posts the sticky comment.3. Docs
test/performanceTests/README.mdupdated to describe the workflow split (with security model) and the single-child-per-scenario speedup.Verification
bench.mjs+runBenchmarks.mjsend-to-end with--samples 2 --duration 1 --warmup 1 --scenarios OtelLogTest,OtelSpanTest: produces the expected JSON shape andcomparePerf.mjsconsumes it unchanged.