Parallelize PR system tests via GitHub Actions matrix #829

Open
PranjalManhgaye wants to merge 7 commits into precice:develop from PranjalManhgaye:issue-789-parallel-system-tests-matrix

Conversation

@PranjalManhgaye

Collaborator

Summary

This PR splits the PR system tests into two GitHub Actions matrix jobs, release_test_shard_1 and release_test_shard_2. Together they cover the same 48 cases as release_test. I left release_test itself unchanged, so manual runs and other workflows still work as before. I also set fail-fast: false so that if one shard fails, the other keeps running, which makes failures easier to read and cheaper to retry.

Why

Right now, when one test fails you often have to re-run the whole suite and dig through one huge log. With two shards, you get smaller logs per job and can re-run only the failed matrix job.

If we have two precice-tests-vm runners, the shards can run in parallel. On a single runner they may still queue, but we still get clearer CI output, which matches what we discussed for this issue.

Test plan

  • I ran python3 validate_release_test_shards.py: 48 cases = 24 + 24.
  • Local: system-tests-dev with --rundir on the precice-data partition (pass).
  • Please trigger this PR with trigger-system-tests from your side so we can verify on precice-tests-vm (I don't think I have access to trigger that myself).
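
The shard check from the first item can be sketched as follows. This is only an illustration of the idea, not the contents of validate_release_test_shards.py: the function name `check_shards` and the case names are hypothetical placeholders. It checks that no case appears twice across the shards and that the shards together cover exactly the full suite.

```python
from collections import Counter


def check_shards(full_suite, shards):
    """Return a list of problems; an empty list means the shards partition the suite.

    `full_suite` is a list of case names and `shards` is a list of lists.
    All names used here are hypothetical placeholders, not real tutorial cases.
    """
    problems = []
    counts = Counter(case for shard in shards for case in shard)

    # A case listed more than once across the shards would run twice.
    duplicated = sorted(case for case, n in counts.items() if n > 1)
    if duplicated:
        problems.append(f"duplicated cases: {duplicated}")

    # A case in the full suite but in no shard would silently stop being tested.
    missing = sorted(set(full_suite) - set(counts))
    if missing:
        problems.append(f"missing cases: {missing}")

    # A case in a shard but not in the full suite changes what the PR run covers.
    unexpected = sorted(set(counts) - set(full_suite))
    if unexpected:
        problems.append(f"unexpected cases: {unexpected}")

    return problems


if __name__ == "__main__":
    full = [f"case-{i}" for i in range(48)]
    print(check_shards(full, [full[:24], full[24:]]))  # prints []
```

Returning a list of problems rather than raising on the first one keeps the CI log readable when several cases are misplaced at once.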

Notes

  • Manual and latest-components workflows still use release_test; we can extend the matrix there in a follow-up if you want.
  • If two shards run at once on different runners, Docker may build the same images twice. I kept v1 simple; we can tighten that later if CI shows problems.

Closes #789

PranjalManhgaye added a commit to PranjalManhgaye/tutorials that referenced this pull request Jun 5, 2026
@PranjalManhgaye

Collaborator Author

@MakisH Possible follow-ups (later, I think, not in this PR):

  • the same matrix for the manual / latest-components workflows
  • more shards if we get more runners
  • Docker build sharing, only if CI shows we need it
  • option 2 (parallelism inside systemtests.py): skipped for now

@PranjalManhgaye PranjalManhgaye requested a review from MakisH June 5, 2026 14:27

@MakisH MakisH left a comment

Member

Nice to see a first prototype towards parallelization!

While this is a valid and easy-to-implement approach for parallelization, I think it mainly adds a layer to the current approach.

Ideally, we should end up with the individual test suites (the ones per tutorial) as shards, so that the jobs also get meaningful names. Issues there will be:

  • race conditions when building the same Docker layers, if a runner picks up more than one shard at the same time,
  • some tests are currently excluded from release_test because they take too long (see extra). We could then define these directly in extra, instead of referring to the ones defined at the tutorial level with anchors.

Nevertheless, I could create another runner and use this PR to test if the parallelism with multiple custom runners makes sense.

Comment thread heat-exchanger/download-meshes.sh Outdated
Comment thread .github/workflows/system-tests-pr.yml
Comment thread tools/tests/README.md Outdated
Comment thread tools/tests/validate_release_test_shards.py Outdated
PranjalManhgaye added a commit to PranjalManhgaye/tutorials that referenced this pull request Jun 6, 2026
@PranjalManhgaye PranjalManhgaye force-pushed the issue-789-parallel-system-tests-matrix branch from 7e76ab0 to 9d75c1f Compare June 6, 2026 12:46
@PranjalManhgaye PranjalManhgaye requested a review from MakisH June 6, 2026 15:06
Comment thread .github/workflows/system-tests-latest-components.yml
Comment thread tools/tests/systemtests/TestSuite.py
@MakisH MakisH moved this from Planned next to Needs review in GSoC 2026: System tests improvements Jun 8, 2026
PranjalManhgaye added a commit to PranjalManhgaye/tutorials that referenced this pull request Jun 12, 2026
@PranjalManhgaye PranjalManhgaye force-pushed the issue-789-parallel-system-tests-matrix branch from 38b4813 to a266f23 Compare June 12, 2026 11:05
@PranjalManhgaye PranjalManhgaye requested a review from MakisH June 12, 2026 11:35
@MakisH

MakisH commented Jun 15, 2026

Member

@PranjalManhgaye #842 introduced some conflicts in system-tests-latest-components.yml that should be easy to resolve.

Add release_test_shard_1/2 covering the same cases as release_test,
and run them as separate matrix jobs for clearer logs and cheaper reruns.

Move the release_test shard matrix to system-tests-latest-components,
restore system-tests-pr to a single release_test job, and clarify README
wording on concurrent Docker builds.

Define release_test_shard_1/2 tutorial lists once with YAML anchors
and build release_test as their union. Flatten nested list aliases in
TestSuite parsing and remove validate_release_test_shards.py.

Include channel-transport-particles, flow-over-heated-plate fenicsx,
and partitioned-heat-conduction fenicsx cases so release_shard_1/2
cover the same 53 cases as develop release.
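
The flattening step mentioned in the commit messages above can be illustrated as follows. When a YAML list is built from aliases to other lists, a YAML parser returns a list of lists rather than one flat list, so the parsing code has to flatten it. The helper name `flatten` and the case names below are hypothetical; this is a sketch of the technique, not the code in TestSuite.py.

```python
def flatten(items):
    """Recursively flatten nested lists into one flat list, preserving order."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


# What a parser would return for a list built from two aliased lists.
shard_1 = ["tutorial-a/case-1", "tutorial-a/case-2"]
shard_2 = ["tutorial-b/case-1"]
release_test = [shard_1, shard_2]

print(flatten(release_test))
```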
@PranjalManhgaye PranjalManhgaye force-pushed the issue-789-parallel-system-tests-matrix branch from a266f23 to 47e2084 Compare June 16, 2026 04:35
@MakisH

MakisH commented Jun 16, 2026

Member

I triggered a test run: https://github.com/precice/tutorials/actions/runs/27600082238

@MakisH

MakisH commented Jun 16, 2026

Member

While the matrix is a good approach, I think we should drop the shards and instead reuse the per-tutorial test suites (elastic-tube-1d, elastic-tube-3d, ...). This should also give clearer and faster output, and these test suites are by definition not overlapping with each other.

What I am not sure about at the moment is the right way to get these inputs without duplication.

@PranjalManhgaye

Collaborator Author

@MakisH What do you think: should we add a small prepare-matrix job in the workflow that reads tests.yaml, or extend the systemtests.py tooling to expose this list somehow?

@MakisH

MakisH commented Jun 16, 2026

Member

Maybe related: I always have to count how many tutorials and tutorial cases we have, both for reporting and now for ensuring that everything is tested. We could collect this information from all the metadata.yaml files (assuming that they are complete, or checking with another script that they are complete by comparing against all available directories).

We could then use that information as input for this matrix.
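
Collecting this from the repository could look roughly like the sketch below. It assumes that each tutorial is a top-level directory containing a metadata.yaml; the function name `find_tutorials` and the list of skipped directories are assumptions. Directories without the file are reported separately, so that incomplete metadata becomes visible instead of silently shrinking the count.

```python
from pathlib import Path


def find_tutorials(root, skip=(".git", ".github", "tools")):
    """Split the top-level directories of `root` by whether they contain metadata.yaml.

    The `skip` tuple is an assumption about which directories are not tutorials.
    """
    with_metadata, without_metadata = [], []
    for entry in sorted(Path(root).iterdir()):
        if not entry.is_dir() or entry.name in skip:
            continue
        if (entry / "metadata.yaml").is_file():
            with_metadata.append(entry.name)
        else:
            without_metadata.append(entry.name)
    return with_metadata, without_metadata


if __name__ == "__main__":
    tutorials, incomplete = find_tutorials(".")
    print(f"{len(tutorials)} tutorials with metadata.yaml")
    print(f"directories without metadata.yaml: {incomplete}")
```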

But in general, a prepare-matrix job would work. Note, however, that tests.yaml includes some duplicate information. We could define the release as all tutorials and read that. The question is then what we do with the extra tests, which take longer to run.
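
A prepare-matrix step along these lines might look like the sketch below. It assumes the suite names have already been loaded from tests.yaml into a dictionary; the loading, the dictionary layout, the function names and the excluded suite names are all assumptions. It relies on two GitHub Actions features: step outputs are written to the file named by the `GITHUB_OUTPUT` environment variable, and the consuming job can turn a JSON string into a matrix with `fromJSON`.

```python
import json
import os


def build_matrix(test_suites, exclude=("release_test", "extra")):
    """Build a GitHub Actions matrix with one entry per tutorial-level suite.

    `test_suites` maps suite names to case lists. The aggregate suite names in
    `exclude` are assumptions; they are skipped so that no case runs twice.
    """
    names = sorted(name for name in test_suites if name not in exclude)
    return {"suite": names}


def write_output(matrix, key="matrix"):
    """Append `key=<json>` to the file GitHub Actions reads step outputs from."""
    line = f"{key}={json.dumps(matrix)}"
    output_path = os.environ.get("GITHUB_OUTPUT")
    if output_path:
        with open(output_path, "a", encoding="utf-8") as handle:
            handle.write(line + "\n")
    else:
        # Outside GitHub Actions, print instead so the script can be tried locally.
        print(line)


if __name__ == "__main__":
    # Placeholder data; in the workflow this would come from tests.yaml.
    suites = {
        "elastic-tube-1d": ["case-a", "case-b"],
        "elastic-tube-3d": ["case-c"],
        "release_test": ["case-a", "case-b", "case-c"],
    }
    write_output(build_matrix(suites))
```

The consuming job would then set its `strategy.matrix` from this output via `fromJSON`, so the suite list lives in one place and the workflow file does not repeat it.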

@PranjalManhgaye

Collaborator Author

@MakisH I am happy to follow this and drop the artificial shards. I can implement it by adding a small prepare-matrix step that reads tests.yaml, so the workflow doesn’t duplicate the suite lists. To confirm before I start: do you want the matrix at tutorial-suite level (not per-case), keep release as the source of truth there for now, and keep extra out of the default latest-components run (manual or less frequent)?

@MakisH

MakisH commented Jun 16, 2026

Member

Overall, I would say that the whole feature is still in the exploration phase and would need some research / brainstorming.

> do you want the matrix at tutorial-suite level (not per-case)

We could also split per-case, as long as we don't run a case multiple times. That would give even more refined output. But I think it would be simpler to split per-tutorial, which would also give simpler names. It would also give fewer jobs, which might be easier to manage for the runners, and will share some of the setup overhead.

> keep release as the source of truth there for now, and keep extra out of the default latest-components run (manual or less frequent) ??

Yes, I would keep this as-is for now, unless it makes sense to change it.

@MakisH MakisH changed the title from "Parallelize PR system tests via GitHub Actions matrix." to "Parallelize PR system tests via GitHub Actions matrix" Jun 16, 2026
Projects

Status: Needs review

Development

Successfully merging this pull request may close these issues.

Parallel execution of the system tests

2 participants