Skip to content

add: unified_reward and qwen_image_bench#1471

Open
yjy415 wants to merge 2 commits into
modelscope:mainfrom
yjy415:metric2
Open

add: unified_reward and qwen_image_bench#1471
yjy415 wants to merge 2 commits into
modelscope:mainfrom
yjy415:metric2

Conversation

@yjy415
Copy link
Copy Markdown
Collaborator

@yjy415 yjy415 commented May 31, 2026

No description provided.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds support for three new image quality metrics: Qwen-Image-Bench, UnifiedReward 2.0, and UnifiedReward Edit, including their respective models, metrics, state dict converters, documentation, and example scripts. The code review identified several critical issues: an accidental deletion of hidream_o1_image_series from the model configurations, a potential ValueError crash when parsing pairwise rank winners in UnifiedRewardEditModel, a potential AttributeError in fix_score_json if the input is not a dictionary, a greedy regex pattern that could break JSON extraction, and a redundant regex character class [::]. Additionally, the input normalization logic in UnifiedRewardEditModel can be simplified.

Comment thread diffsynth/configs/model_configs.py Outdated
Comment thread diffsynth/models/unified_reward_edit.py
Comment thread diffsynth/models/unified_reward_edit.py Outdated
Comment on lines +365 to +373
if len(prompts) == 1 and len(images) > 1 and not isinstance(images[0], (list, tuple)):
prompts = prompts * len(images)
if len(prompts) != len(images):
if len(prompts) == 1:
prompts = prompts * len(images)
elif len(images) == 1:
images = images * len(prompts)
if len(prompts) != len(images):
raise ValueError(f"Expected the same number of prompts and images, got {len(prompts)} and {len(images)}.")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The input normalization logic here is redundant and can be simplified significantly. The check not isinstance(images[0], (list, tuple)) prevents duplication when images is a list of lists (which is the case for pairwise tasks), but then the fallback if len(prompts) == 1: duplicates it anyway. We can simplify this to match the clean pattern used in QwenImageBenchModel._normalize_inputs.

        if len(prompts) == 1 and len(images) > 1:
            prompts = prompts * len(images)
        if len(images) == 1 and len(prompts) > 1:
            images = images * len(prompts)
        if len(prompts) != len(images):
            raise ValueError(f"Expected the same number of prompts and images, got {len(prompts)} and {len(images)}.")

Comment on lines +339 to +342
def fix_score_json(score_json, l1_dim):
"""Fix flat structure, L3 misplacement, and L3 typos based on checklists.py hierarchy."""
if not score_json:
return score_json
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To prevent potential AttributeError crashes if the model output fails to parse as a dictionary (e.g., if it parses as a list or a string), we should add a defensive check to ensure score_json is a dictionary before calling .values() or .items().

Suggested change
def fix_score_json(score_json, l1_dim):
"""Fix flat structure, L3 misplacement, and L3 typos based on checklists.py hierarchy."""
if not score_json:
return score_json
def fix_score_json(score_json, l1_dim):
"""Fix flat structure, L3 misplacement, and L3 typos based on checklists.py hierarchy."""
if not isinstance(score_json, dict):
return {}

Comment on lines +261 to +263
def _extract_first_json(text: str):
match = re.search(r"\{.*\}", text, flags=re.S)
if not match:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The regex pattern r"\{.*\}" is greedy and will match from the first opening brace { to the very last closing brace } in the entire text. If the model output contains additional explanations or text with braces after the JSON block, this greedy match will capture them and fail to parse as valid JSON. Using a non-greedy match r"\{.*?\}" is much more robust.

    @staticmethod
    def _extract_first_json(text: str):
        match = re.search(r"\{.*?\}", text, flags=re.S)

Comment thread diffsynth/models/unified_reward_2.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant