Skip to content

improve obvious performance issues#186

Merged
bennyz merged 1 commit into
centos-automotive-suite:mainfrom
bennyz:perf-bottlenecks
Mar 29, 2026
Merged

improve obvious performance issues#186
bennyz merged 1 commit into
centos-automotive-suite:mainfrom
bennyz:perf-bottlenecks

Conversation

@bennyz
Copy link
Copy Markdown
Contributor

@bennyz bennyz commented Mar 29, 2026

  • pagination
  • remove buffer
  • reduce updates

Summary by CodeRabbit

Release Notes

  • New Features

    • Pagination support for container builds, builds, sealed items, and workspaces listings
    • SHA-256 spec hashing to prevent unnecessary resource updates
  • Refactor

    • Optimized status update logic in image controller operations to reduce redundant API calls
    • Simplified error handling in secret deletion operations
    • Enhanced tarball upload handling with temporary file storage instead of in-memory buffering

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 29, 2026

Warning

Rate limit exceeded

@bennyz has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 22 minutes and 41 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 22 minutes and 41 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 77183f6e-ce0f-4589-b84f-70f6fa35503e

📥 Commits

Reviewing files that changed from the base of the PR and between c68272a and ce3c874.

📒 Files selected for processing (7)
  • internal/buildapi/container_builds.go
  • internal/buildapi/server.go
  • internal/buildapi/workspace.go
  • internal/controller/image/controller.go
  • internal/controller/imagebuild/controller.go
  • internal/controller/imagereseal/controller.go
  • internal/controller/operatorconfig/controller.go
📝 Walkthrough

Walkthrough

This PR adds query-driven pagination to API list handlers (listContainerBuilds, listBuilds, listFlash, listSealed, listWorkspaces), switches tarball upload storage from in-memory to temporary files, optimizes Kubernetes reconciliation via spec hashing and reduced API calls, and simplifies retry logic by delegating to controller reconcile cycles.

Changes

Cohort / File(s) Summary
API Pagination
internal/buildapi/container_builds.go, internal/buildapi/server.go, internal/buildapi/workspace.go
Added shared pagination helpers (parsePagination, applyPagination, default/max limits) and applied limit/offset query parameters to list endpoints, returning paginated subsets instead of full result sets.
Workspace Upload Optimization
internal/buildapi/workspace.go
Changed syncWorkspace to spool request body to temporary file via os.CreateTemp + io.Copy instead of in-memory buffering, with explicit cleanup and rewind handling.
Controller Retry Simplification
internal/controller/imagebuild/controller.go, internal/controller/imagereseal/controller.go
Removed exponential backoff retry loops from deleteSecretWithRetry, delegating retries to the controller's reconcile cycle instead of in-function sleeping.
Image Controller Optimization
internal/controller/image/controller.go
Optimized LastVerified timestamp updates to only patch when stale (>30 min) or unset, and consolidated status patching to avoid redundant GET+Patch round-trips.
Spec Hashing & Resource Management
internal/controller/operatorconfig/controller.go
Introduced SHA-256 spec hashing to skip unnecessary Kubernetes updates; refactored createOrUpdateTask and createOrUpdatePipeline to use a shared createOrUpdate helper with hash-based short-circuiting.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • internal registry fixes #94: Modifies listBuilds handler in internal/buildapi/server.go for external image URL translation, overlapping with pagination changes.
  • caib: improve build flow #111: Updates sorting and log-stream behavior in the same internal/buildapi/server.go build API handlers.
  • initial workspace #147: Introduces workspace API handlers (listWorkspaces, syncWorkspace) that are being paginated and optimized in this PR.

Suggested reviewers

  • bkhizgiy

Poem

🐰 Pagination, temp files, and hashes so neat,
Controller retry loops now face defeat,
Status stamps bundled, no more round-trip dance,
Spec hashing keeps updates in their right stance,
Kubernetes reconciliation gets a boost!

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'improve obvious performance issues' is vague and generic, using non-descriptive terms that don't convey meaningful information about the specific changes made. Consider a more specific title that highlights the primary performance improvement, such as 'add pagination to build and workspace list endpoints' or 'reduce memory usage by spooling large uploads to disk'.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (2)
internal/controller/operatorconfig/controller.go (1)

545-552: Consider logging when hash computation fails.

When json.Marshal fails, the function silently returns an empty string, causing all updates to proceed (safe fallback). However, this makes debugging difficult if serialization issues occur. Adding a log statement would improve observability without changing behavior.

🔧 Suggested improvement
 func computeSpecHash(obj client.Object) string {
 	data, err := json.Marshal(obj)
 	if err != nil {
+		// Log but don't fail - falling back to always-update behavior
 		return ""
 	}
 	h := sha256.Sum256(data)
 	return hex.EncodeToString(h[:])
 }

Note: You'd need to pass a logger or use a package-level logger to actually log. Alternatively, consider returning (string, error) and letting the caller decide how to handle failures.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/operatorconfig/controller.go` around lines 545 - 552,
computeSpecHash currently swallows json.Marshal errors and returns an empty
string; update it to log the serialization error so failures are observable by
either (a) adding a package-level logger or accepting a logger parameter and
calling it when json.Marshal fails, or (b) change the signature to return
(string, error) so callers can log/handle the error; locate the computeSpecHash
function and implement one of these options, ensuring the error message includes
context (e.g., "computeSpecHash: failed to marshal object") and the original
error.
internal/controller/image/controller.go (1)

115-120: This still leaves one self-triggered verify per timestamp refresh.

Every successful updateLastVerified writes the Image status, so the controller reconciles the same object again and runs verifyImageLocation a second time. The 30-minute gate reduces churn, but it does not remove the remaining duplicate check. Since this controller already schedules its own requeues, consider keeping pure LastVerified refreshes off the watched update path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/image/controller.go` around lines 115 - 120, The
controller currently writes status.LastVerified via updateLastVerified which
triggers a watched update and causes an immediate duplicate reconcile/verify;
change the flow so pure LastVerified refreshes don't hit the watched path:
modify updateLastVerified to patch only the status (use client.Status().Patch
with client.MergeFrom(original)) and mark those patches with a short-lived
marker (e.g. annotation "internal.example.com/last-verified-only=true"); then
update the controller's watch/predicate in Reconcile setup (or the existing
predicate function) to ignore events whose only change is that marker or only
status.LastVerified changed (i.e. compare old/new status and annotations and
return false if nothing but LastVerified/marker changed), leaving
verifyImageLocation and the main Reconcile path unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/buildapi/server.go`:
- Around line 2676-2699: parsePagination currently applies defaultPageLimit=50
even when callers don't supply a limit, causing silent truncation; change it to
preserve legacy "full list unless requested" behavior by only enforcing limits
when the client provides a "limit" query: in parsePagination (and related
constants defaultPageLimit/maxPageLimit) return a sentinel (e.g., 0 or -1) or a
boolean flag to indicate "no limit provided" so callers can fetch the full list,
and when a limit is provided validate and clamp it to maxPageLimit as before;
update any callers that consume parsePagination to treat the sentinel/flag as
“no limiting” rather than a 50-item default.

In `@internal/buildapi/workspace.go`:
- Around line 328-339: The current code builds the owned slice from wsList.Items
and calls applyPagination without a deterministic sort, causing unstable pages;
fix by sorting the owned slice deterministically before pagination (e.g., using
sort.Slice on the owned []WorkspaceResponse returned by workspaceResponseFromCR)
— compare a stable field such as workspace name or UID (fields available on the
WorkspaceResponse produced by workspaceResponseFromCR) and then call
applyPagination(owned, limit, offset); ensure you import sort if needed and keep
the sort key consistent (e.g., Name or ID) so pages are stable across requests.

In `@internal/controller/imagebuild/controller.go`:
- Around line 1487-1510: The deleteSecretWithRetry helper currently swallows
transient delete errors causing leaked secrets; change deleteSecretWithRetry on
ImageBuildReconciler to return an error (or a result/boolean) instead of only
logging, treat IsNotFound as success, and return the delete error for any other
failure; update callers in the reconcile path to check the returned error and
explicitly requeue the reconciliation (e.g., using ctrl.Result{Requeue:true} or
RequeueAfter) so cleanup is retried deterministically until the secret is
removed.

In `@internal/controller/imagereseal/controller.go`:
- Around line 724-743: The deleteSecretWithRetry function currently swallows
transient Delete errors (in Reconciler.deleteSecretWithRetry), relying on the
reconcile loop to re-run which doesn't happen once status is terminal; change
the pattern to return the error instead of only logging it so callers can handle
retries. Update deleteSecretWithRetry to return error (preserving NotFound as
nil), remove the silent log-only failure path, and modify its callers in the
Reconciler cleanup code to check the returned error and call
ctrl.Result{RequeueAfter: <shortDuration>} (or return the error to be converted
into a RequeueAfter) so transient failures trigger explicit requeueing rather
than being left behind in terminal phases.

---

Nitpick comments:
In `@internal/controller/image/controller.go`:
- Around line 115-120: The controller currently writes status.LastVerified via
updateLastVerified which triggers a watched update and causes an immediate
duplicate reconcile/verify; change the flow so pure LastVerified refreshes don't
hit the watched path: modify updateLastVerified to patch only the status (use
client.Status().Patch with client.MergeFrom(original)) and mark those patches
with a short-lived marker (e.g. annotation
"internal.example.com/last-verified-only=true"); then update the controller's
watch/predicate in Reconcile setup (or the existing predicate function) to
ignore events whose only change is that marker or only status.LastVerified
changed (i.e. compare old/new status and annotations and return false if nothing
but LastVerified/marker changed), leaving verifyImageLocation and the main
Reconcile path unchanged.

In `@internal/controller/operatorconfig/controller.go`:
- Around line 545-552: computeSpecHash currently swallows json.Marshal errors
and returns an empty string; update it to log the serialization error so
failures are observable by either (a) adding a package-level logger or accepting
a logger parameter and calling it when json.Marshal fails, or (b) change the
signature to return (string, error) so callers can log/handle the error; locate
the computeSpecHash function and implement one of these options, ensuring the
error message includes context (e.g., "computeSpecHash: failed to marshal
object") and the original error.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0d2aedf1-c971-4259-959c-2e5dbfecdaa1

📥 Commits

Reviewing files that changed from the base of the PR and between a8beb42 and c68272a.

📒 Files selected for processing (7)
  • internal/buildapi/container_builds.go
  • internal/buildapi/server.go
  • internal/buildapi/workspace.go
  • internal/controller/image/controller.go
  • internal/controller/imagebuild/controller.go
  • internal/controller/imagereseal/controller.go
  • internal/controller/operatorconfig/controller.go

Comment thread internal/buildapi/server.go Outdated
Comment thread internal/buildapi/workspace.go Outdated
Comment thread internal/controller/imagebuild/controller.go Outdated
Comment thread internal/controller/imagereseal/controller.go Outdated
@bennyz bennyz force-pushed the perf-bottlenecks branch from c68272a to cc7ae98 Compare March 29, 2026 12:39
Signed-off-by: Benny Zlotnik <bzlotnik@redhat.com>
Assisted-by: claude-opus-4.6
@bennyz bennyz force-pushed the perf-bottlenecks branch from cc7ae98 to ce3c874 Compare March 29, 2026 12:43
@bennyz bennyz requested a review from bkhizgiy March 29, 2026 12:48
@bennyz bennyz merged commit b657436 into centos-automotive-suite:main Mar 29, 2026
4 checks passed
@bennyz bennyz deleted the perf-bottlenecks branch March 29, 2026 13:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants