[WIP][CI][Accuracy] Add HunyuanImage3 pixel accuracy test and nightly CI #3657
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ce7719d521
    model_name: str,
    vllm_image: Image.Image,
    diffusers_image: Image.Image,
    baseline_image: Image.Image,
Preserve existing helper keyword names
The existing accuracy tests call `assert_similarity(..., diffusers_image=...)` and `assert_image_sequence_similarity(..., diffusers_images=...)` (for example `tests/e2e/accuracy/test_qwen_image.py:114` and `test_qwen_image_layered.py:142`). With this signature change, pytest fails immediately with `TypeError: got an unexpected keyword argument` before any image comparison runs. Please keep backward-compatible keyword aliases or update all existing call sites in the same change.
Branch force-pushed from 751540a to fbcf39d.
    _REPO_ROOT = Path(__file__).resolve().parent.parent.parent.parent
    BASELINE_PATH = _REPO_ROOT / "tests" / "assets" / "hunyuan" / "hunyuan_baseline.png"
    _DEFAULT_DEPLOY_CONFIG = _REPO_ROOT / "vllm_omni" / "deploy" / "hunyuan_image3.yaml"
Maybe you can use `get_deploy_config_path` from `stage_config.py` here.

        "--stage-init-timeout", "300",
        "--init-timeout", "900",
    ]
    with OmniServer(model, server_args, use_omni=True) as omni_server:
Maybe you can use the `omni_server` fixtures here.
    )

    # online vs baseline_image
    # assert_images_pixel_close(
Does this PR need to be closed?

New PR: #3790
Summary
- `assert_images_pixel_close` helper for full-image pixel-level comparison with mean/p99 absolute channel difference metrics and detailed diagnostics
- `test_hunyuan_image3_pixel_accuracy` that generates images via offline `end2end.py` and compares output against a pre-saved baseline image
- … accuracy regressions
- `diffusers_image` → `baseline_image` across accuracy helper APIs (`assert_similarity`, `assert_image_sequence_similarity`)

Files changed

| File | Change |
| --- | --- |
| `.buildkite/test-nightly.yml` | `vllm-omni · HunyuanImage3 · Accuracy Test` |
| `tests/assets/hunyuan/hunyuan_baseline.png` | |
| `tests/e2e/accuracy/helpers.py` | `assert_images_pixel_close`; rename params |
| `tests/e2e/accuracy/test_hunyuan_image3_pixel_accuracy.py` | |

Test plan
HUNYUAN_IMAGE3_DEPLOY_CONFIG=../hunyuan_image3_dit_copy.yaml pytest -s -v tests/e2e/accuracy/test_hunyuan_image3_pixel_accuracy.py --run-level
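The command above overrides the deploy config through the `HUNYUAN_IMAGE3_DEPLOY_CONFIG` environment variable. A minimal sketch of how a test could resolve that override, assuming it falls back to a default path when the variable is unset or empty (the `resolve_deploy_config` name is hypothetical and not taken from this PR):

```python
import os
from pathlib import Path
from typing import Mapping

ENV_VAR = "HUNYUAN_IMAGE3_DEPLOY_CONFIG"


def resolve_deploy_config(default: Path, env: Mapping[str, str] = os.environ) -> Path:
    """Return the env override if set and non-empty, else the default path."""
    override = env.get(ENV_VAR, "").strip()
    if override:
        # Relative overrides stay relative to the current working directory.
        return Path(override).expanduser()
    return default
```

Passing the environment as a parameter keeps the lookup easy to exercise in unit tests without mutating `os.environ`.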
Test Result
1 passed, 18 warnings in 101.49s (0:01:41)
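For readers unfamiliar with the metrics reported by this test, here is a rough stdlib-only sketch of how mean/p99 absolute channel difference and the two mismatch ratios could be computed. It is not the `assert_images_pixel_close` implementation: the function name, the nearest-rank percentile, and the exact ratio definitions (a pixel mismatches if any channel exceeds the tolerance) are assumptions made for illustration.

```python
import math


def pixel_diff_metrics(a: bytes, b: bytes, channels: int = 3, tolerance: int = 0) -> dict:
    """Compare two raw 8-bit image buffers with identical size and layout."""
    if len(a) != len(b):
        raise ValueError("buffers differ in length; image sizes must match")
    if not a or len(a) % channels:
        raise ValueError("buffer must be non-empty and a multiple of the channel count")

    # Per-channel absolute difference over the whole image.
    diffs = [abs(x - y) for x, y in zip(a, b)]
    ordered = sorted(diffs)
    # Nearest-rank 99th percentile (assumption; NumPy would interpolate instead).
    p99 = ordered[math.ceil(0.99 * len(ordered)) - 1]

    bad_channels = sum(d > tolerance for d in diffs)
    # A pixel counts as mismatched if any of its channels exceeds the tolerance.
    bad_pixels = sum(
        any(d > tolerance for d in diffs[i:i + channels])
        for i in range(0, len(diffs), channels)
    )
    return {
        "mean_abs_diff": sum(diffs) / len(diffs),
        "p99_abs_diff": p99,
        "channel_ratio": bad_channels / len(diffs),
        "pixel_ratio": bad_pixels / (len(diffs) // channels),
    }
```

With Pillow, the buffers can be obtained via `image.convert("RGB").tobytes()` once both images are confirmed to have the same size.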
Pixel Metrics
tencent/HunyuanImage-3.0-Instruct — (offline vs baseline)
Mismatch ratios (pixel_ratio / channel_ratio)