test: fix flaky integration tests across multiple packages #1417
Open
pranav-new-relic wants to merge 6 commits into
…nup to servicelevel
- Add NewFleetIntegrationTestConfig and GetFleetTestAccountID helpers to testhelpers so fleet tests use NEW_RELIC_FLEET_TEST_* credentials instead of the default account
- Rewrite fleetcontrol integration tests into three self-contained workflows: fleet lifecycle, configuration lifecycle, and read-only deployment/managed entity search; remove all hardcoded GUIDs; add package-level comment documenting why managed-entity and deployment write paths are excluded
- Add defer-based best-effort cleanup to servicelevel integration tests so SLIs are deleted even when a test fails mid-run (mirrors nrqldroprules pattern)
- Wire NEW_RELIC_FLEET_TEST_* secrets into the GitHub Actions integration test job
…esource leaks
Register the defer before the create call (with an empty-ID guard) so that an assertion failure between creation and the original defer site cannot leave a dangling fleet or configuration entity. Mirrors the pattern used in the servicelevel and nrqldroprules integration tests.
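The ordering described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: fakeClient, Create, Delete, and runLifecycle are hypothetical stand-ins for the real fleetcontrol client and test body, and an early return stands in for a failed assertion.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeClient is a hypothetical stand-in for the real API client.
// It only records which IDs were deleted.
type fakeClient struct {
	deleted []string
}

func (c *fakeClient) Create(name string) (string, error) {
	if name == "" {
		return "", errors.New("name is required")
	}
	return "id-" + name, nil
}

func (c *fakeClient) Delete(id string) error {
	c.deleted = append(c.deleted, id)
	return nil
}

// runLifecycle registers cleanup before the create call. The empty-ID guard
// turns the deferred function into a no-op when creation itself failed.
func runLifecycle(c *fakeClient, name string, failAfterCreate bool) (err error) {
	var createdID string
	defer func() {
		if createdID == "" {
			return // nothing was created, nothing to clean up
		}
		if delErr := c.Delete(createdID); delErr != nil {
			fmt.Printf("best-effort cleanup failed for %s: %v\n", createdID, delErr)
		}
	}()

	createdID, err = c.Create(name)
	if err != nil {
		return err
	}
	if failAfterCreate {
		// Simulates an assertion failing between create and the end of the test.
		return errors.New("assertion failed after create")
	}
	return nil
}

func main() {
	c := &fakeClient{}
	_ = runLifecycle(c, "fleet-a", true) // fails mid-run, still cleaned up
	_ = runLifecycle(c, "", false)       // create fails, guard skips delete
	fmt.Println(c.deleted)               // prints [id-fleet-a]
}
```

Because the defer is registered first, there is no window between the create call and the cleanup registration in which an early exit could leak the resource.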
…efore update
KeyTransactionUpdate resolves a deeply nested entity including serviceLevel.indicators. On a brand-new entity under CI load, NerdGraph fails to resolve those fields with "An error occurred resolving this field" and exhausts all 3 retries. Adding a 5-second sleep after create (matching the servicelevel test pattern) gives the entity time to be indexed.
Also fix three secondary issues:
- require.NoError in the defer would panic/obscure failures; replaced with best-effort cleanup using the standard createdGUID + deleted guard pattern
- defer was registered after several require checks on the create response, leaving a window where a created entity would not be cleaned up
- cleanup block logged the wrong variable (err instead of deleteErr)
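A minimal sketch of the createdGUID + deleted guard named in this commit, assuming the test deletes the entity itself on the happy path. The cleanup function and the deleteFunc type are hypothetical; the real tests call the keytransaction client directly and log through the test logger.

```go
package main

import (
	"errors"
	"fmt"
)

// deleteFunc is a hypothetical signature standing in for the real delete call.
type deleteFunc func(guid string) error

// cleanup is best-effort: it skips when nothing was created or when the test
// already deleted the entity, and it reports a delete error instead of failing.
func cleanup(createdGUID string, deleted bool, del deleteFunc) string {
	if createdGUID == "" || deleted {
		return "skipped"
	}
	if deleteErr := del(createdGUID); deleteErr != nil {
		// Report deleteErr itself, not an unrelated outer err variable.
		return fmt.Sprintf("cleanup failed for %s: %v", createdGUID, deleteErr)
	}
	return "deleted"
}

func main() {
	failing := func(guid string) error { return errors.New("backend unavailable") }
	fmt.Println(cleanup("", false, failing))       // create never succeeded
	fmt.Println(cleanup("guid-1", true, failing))  // test already deleted it
	fmt.Println(cleanup("guid-1", false, failing)) // delete fails, only reported
}
```

In a test, a closure like this would be deferred before the create call and would log its result rather than call require.NoError.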
…leanup
servicelevel: GetIndicators must receive the owning application entity GUID, not the SLI's own GUID; using the SLI GUID returns no results.
usermanagement: swallow "Could not find the target or you are unauthorized" errors in UserManagementGroupCleanupForIntegrationTests so parallel tests racing to delete the same leftover groups do not fail in CI.
… GUIDs
Switch ExternalEntity to MobileApplicationEntity to match the actual entity type of the test GUIDs, and update the source/target GUIDs to active Mobile Application entities so the CRUD test passes in CI.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1417      +/-   ##
==========================================
+ Coverage   32.70%   32.92%   +0.22%
==========================================
  Files         141      141
  Lines        6691     6700       +9
==========================================
+ Hits         2188     2206      +18
+ Misses       4292     4281      -11
- Partials      211      213       +2
…o fix flaky CI
AccountManagementUpdateAccount fails with "Unable to look up organization by account ID" when called too soon after create: the organization-lookup service propagates new accounts asynchronously, and 2 seconds is not enough under CI load, so all 3 retries are exhausted. Bump the post-create sleep from 2s to 10s so the backend has enough time to index the new account before the update is attempted. Also bump the post-cancel sleep from 2s to 5s for the same reason (isCanceled visibility has the same propagation delay).
nr-developer-toolkit approved these changes on May 22, 2026
Summary
Fixes several integration tests that pass locally but fail in CI under concurrent load or shortly after entity creation:
- fleetcontrol: use NEW_RELIC_FLEET_TEST_* credentials (isolated account for fleet resources); rewrite tests into three self-contained workflows (fleet lifecycle, configuration lifecycle, deployment/managed-entity read-only search); remove all hardcoded GUIDs; add package-level comment explaining why managed-entity and deployment write paths are excluded; fix defer cleanup registration order so resources are never leaked on mid-test assertion failures
- keytransaction: add a 5-second sleep after create; KeyTransactionUpdate resolves serviceLevel.indicators on the newly created entity, which NerdGraph cannot resolve until the entity is indexed; without the sleep the call exhausts all 3 retries under CI load
- servicelevel: GetIndicators was called with the SLI's own GUID instead of the parent application entity GUID; serviceLevel { indicators } only resolves on the owning entity, so the SLI GUID always returns nothing; also add defer-based best-effort cleanup so SLIs are deleted even on mid-test failures
- usermanagement: UserManagementGroupCleanupForIntegrationTests propagated "Could not find the target or you are unauthorized" from UserManagementDeleteGroup, causing TestIntegrationGroupManagementWithUsers to fail when parallel tests race to delete the same leftover group
- entityrelationship: switch the entity type (ExternalEntity → MobileApplicationEntity) and refresh stale test entity GUIDs to active Mobile Application entities
- CI: wire NEW_RELIC_FLEET_TEST_* secrets into the integration test job
Test plan
- fleetcontrol, keytransaction, servicelevel, usermanagement, entityrelationship
- NEW_RELIC_FLEET_TEST_API_KEY is not set
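The test plan mentions the case where NEW_RELIC_FLEET_TEST_API_KEY is not set. One plausible way to handle it, assuming the fleet tests are meant to be skipped rather than failed in that case (the text here does not confirm this), is sketched below. fleetAPIKey is a hypothetical helper, not the NewFleetIntegrationTestConfig helper from this PR.

```go
package main

import (
	"fmt"
	"os"
)

// fleetAPIKeyEnv is the environment variable named in the test plan.
const fleetAPIKeyEnv = "NEW_RELIC_FLEET_TEST_API_KEY"

// fleetAPIKey returns the key and whether fleet tests can run.
// getenv is injected so the lookup can be exercised without touching
// the real environment.
func fleetAPIKey(getenv func(string) string) (string, bool) {
	key := getenv(fleetAPIKeyEnv)
	return key, key != ""
}

func main() {
	if _, ok := fleetAPIKey(os.Getenv); !ok {
		fmt.Println("skipping fleet integration tests:", fleetAPIKeyEnv, "is not set")
		return
	}
	fmt.Println("fleet credentials found; tests would run")
}
```

Inside a Go test, the same check would typically end in t.Skip so that missing credentials show up as a skipped test rather than a failure.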