[Bug]: DeepSeek V4 load_weights UnboundLocalError: 'name_mapped' when expert mapping has no match #42769

### Summary

`DeepseekV4ForCausalLM.load_weights()` in `vllm/model_executor/models/deepseek_v4.py` can raise `UnboundLocalError` in its expert branch. The variable `name_mapped` is assigned **inside** the `for mapping in expert_mapping` loop but referenced **after** it. When no entry in `expert_mapping` matches the incoming tensor name (every iteration hits the early `continue`), `name_mapped` is never bound, yet the subsequent `loaded_params.add(name_mapped)` still runs:

```
File "vllm/model_executor/models/deepseek_v4.py", line 1557, in load_weights
    loaded_params.add(name_mapped)
UnboundLocalError: cannot access local variable 'name_mapped' where it is not associated with a value
```

### Environment

- vLLM: 0.21.0 (also reproducible on `main` as of 2026-05-15)
- Transformers: 4.57.1
- Hardware: 2× A100 80GB, TP=2
- Trigger: any DSV4 checkpoint containing an expert weight tensor whose name matches no entry in `get_expert_mapping()` (e.g., custom quantization-aware suffixes like `.tq_packed`).

### Affected code

`vllm/model_executor/models/deepseek_v4.py`, around lines 1532-1558 (line numbers from `main` as of 2026-05-15; they may drift slightly):

```python
                if ".experts." in name:
                    # E8M0 scales special handling...
                    if (
                        "weight_scale" in name
                        and loaded_weight.dtype == torch.float8_e8m0fnu
                    ):
                        loaded_weight = loaded_weight.view(torch.uint8)
                    for mapping in expert_mapping:
                        param_name, weight_name, expert_id, shard_id = mapping
                        if weight_name not in name:
                            continue                 # <-- early continue
                        name_mapped = name.replace(weight_name, param_name)
                        # ... weight loader call ...
                        if success:
                            name = name_mapped
                            break
                    loaded_params.add(name_mapped)   # <-- unbound if loop never set it
                    continue
```

If every iteration `continue`s on the `if weight_name not in name` guard, the `for` loop exits without `break` AND without ever assigning `name_mapped`. `loaded_params.add(name_mapped)` then raises `UnboundLocalError`.
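
The failure mode can be reproduced without vLLM. The sketch below copies only the loop shape from the excerpt; the function name, the mapping entry, and the tensor names are hypothetical stand-ins chosen for illustration, not values taken from `get_expert_mapping()`.

```python
def add_mapped_name(name, expert_mapping):
    """Mimic the loop shape from the excerpt (hypothetical stand-in, not vLLM code)."""
    loaded_params = set()
    for param_name, weight_name, expert_id, shard_id in expert_mapping:
        if weight_name not in name:
            continue  # every iteration may take this path
        name_mapped = name.replace(weight_name, param_name)
        break
    # If no iteration got past the guard, name_mapped was never assigned.
    loaded_params.add(name_mapped)
    return loaded_params


# Hypothetical mapping entry; the real mapping may use different substrings.
mapping = [("experts.w13_weight", "experts.0.gate_proj.weight", 0, "w1")]

# A name that matches the entry is mapped and recorded.
print(add_mapped_name("layers.0.mlp.experts.0.gate_proj.weight", mapping))

# A name that matches no entry triggers the error from the traceback.
try:
    add_mapped_name("layers.0.mlp.experts.0.gate_proj.tq_packed", mapping)
except UnboundLocalError as exc:
    print(type(exc).__name__, exc)
```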

### Repro

Easiest path: load any DSV4 checkpoint that includes an expert tensor name not present in `get_expert_mapping()`'s output. We hit it loading a TQ3-native variant where weight names have a `.tq_packed` suffix, but it would also fire on:
- Custom quantization schemes that decorate weight names
- Any naming mismatch where `.experts.` is in the tensor name but the suffix doesn't match the expected `gate_proj`/`up_proj`/`down_proj`/`w1`/`w2`/`w3` patterns
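
To check ahead of time whether a checkpoint would hit this path, one option is to scan its tensor names against the mapping. This is a rough sketch: `unmatched_expert_names` is a hypothetical helper, the mapping entries are invented for illustration, and it assumes the four-field tuple shape shown in the excerpt plus the same substring test the loader uses.

```python
from typing import Iterable, List, Sequence, Tuple

# (param_name, weight_name, expert_id, shard_id), as unpacked in the excerpt.
ExpertMapping = Sequence[Tuple[str, str, int, str]]


def unmatched_expert_names(names: Iterable[str], expert_mapping: ExpertMapping) -> List[str]:
    """Return expert tensor names that match no weight_name in the mapping."""
    unmatched = []
    for name in names:
        if ".experts." not in name:
            continue  # only the expert branch is affected
        if not any(weight_name in name for _, weight_name, _, _ in expert_mapping):
            unmatched.append(name)
    return unmatched


# Hypothetical mapping entries and tensor names, for illustration only.
example_mapping = [
    ("experts.w13_weight", "experts.0.gate_proj.weight", 0, "w1"),
    ("experts.w2_weight", "experts.0.down_proj.weight", 0, "w2"),
]
example_names = [
    "model.layers.3.mlp.experts.0.gate_proj.weight",
    "model.layers.3.mlp.experts.0.gate_proj.tq_packed",
    "model.layers.3.self_attn.q_proj.weight",
]

print(unmatched_expert_names(example_names, example_mapping))
```

Any name this returns would leave `name_mapped` unbound in the current loader.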

### Proposed fix

Initialize `name_mapped = None` before the loop, then short-circuit if it stayed unset:

```python
                if ".experts." in name:
                    if (
                        "weight_scale" in name
                        and loaded_weight.dtype == torch.float8_e8m0fnu
                    ):
                        loaded_weight = loaded_weight.view(torch.uint8)
                    name_mapped = None               # <-- ADDED
                    for mapping in expert_mapping:
                        param_name, weight_name, expert_id, shard_id = mapping
                        if weight_name not in name:
                            continue
                        name_mapped = name.replace(weight_name, param_name)
                        # ... weight loader call ...
                        if success:
                            name = name_mapped
                            break
                    if name_mapped is None:          # <-- ADDED
                        continue                     # <-- ADDED
                    loaded_params.add(name_mapped)
                    continue
```

This is a three-line change that skips unrecognized expert tensors cleanly instead of raising. If a maintainer prefers to fail explicitly on unrecognized expert names, the `continue` could instead be `raise ValueError(f"unrecognized expert tensor name: {name}")`; which of the two to use is the maintainers' call.
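
The two behaviors (skip versus raise) can be compared side by side in a standalone sketch. Everything here is hypothetical scaffolding: `map_expert_name`, the `strict` flag, and the `try_load` callable (standing in for the elided weight loader call) are not part of vLLM.

```python
from typing import Callable, Optional, Sequence, Tuple

# (param_name, weight_name, expert_id, shard_id), as unpacked in the excerpt.
Mapping = Tuple[str, str, int, str]


def map_expert_name(
    name: str,
    expert_mapping: Sequence[Mapping],
    try_load: Callable[[str, int, str], bool],
    strict: bool = False,
) -> Optional[str]:
    """Return the mapped name, or None if no mapping entry matched."""
    name_mapped = None  # sentinel: stays None if every iteration is skipped
    for param_name, weight_name, expert_id, shard_id in expert_mapping:
        if weight_name not in name:
            continue
        name_mapped = name.replace(weight_name, param_name)
        if try_load(name_mapped, expert_id, shard_id):
            break
    if name_mapped is None and strict:
        raise ValueError(f"unrecognized expert tensor name: {name}")
    return name_mapped


def always_ok(mapped_name: str, expert_id: int, shard_id: str) -> bool:
    """Hypothetical loader stub that always reports success."""
    return True


# Hypothetical mapping entry, for illustration only.
demo_mapping = [("experts.w13_weight", "experts.0.gate_proj.weight", 0, "w1")]

loaded_params = set()
for tensor_name in (
    "layers.0.mlp.experts.0.gate_proj.weight",
    "layers.0.mlp.experts.0.gate_proj.tq_packed",
):
    mapped = map_expert_name(tensor_name, demo_mapping, always_ok)
    if mapped is not None:  # same guard as the proposed fix
        loaded_params.add(mapped)

print(loaded_params)
```

With `strict=True` the second name raises `ValueError` instead of being skipped, which surfaces naming mismatches at load time rather than leaving weights silently unloaded.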

### Workaround we're using

We patch the installed `deepseek_v4.py` at runtime from a setup script (a `sed`-style replace that applies the three-line diff above).
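
For anyone who needs the same stopgap, a patch of this kind could look roughly like the following. This is not our actual setup script: `patch_load_weights` is a hypothetical helper that works on the source text only, and it assumes the loop header and the `loaded_params.add(name_mapped)` call each appear exactly once, as in the excerpt. It refuses to patch if that assumption does not hold.

```python
import re

GUARD_MARKER = "if name_mapped is None:"

LOOP_RE = re.compile(r"^([ \t]*)for mapping in expert_mapping:[ \t]*$", re.MULTILINE)
ADD_RE = re.compile(r"^([ \t]*)loaded_params\.add\(name_mapped\)[ \t]*$", re.MULTILINE)


def patch_load_weights(source: str) -> str:
    """Insert the sentinel and the guard from the proposed fix into `source`."""
    if GUARD_MARKER in source:
        return source  # already patched; keep the operation idempotent

    if len(LOOP_RE.findall(source)) != 1 or len(ADD_RE.findall(source)) != 1:
        raise RuntimeError("unexpected source layout; refusing to patch")

    # Add `name_mapped = None` just above the loop, at the loop's indentation.
    source = LOOP_RE.sub(
        lambda m: f"{m.group(1)}name_mapped = None\n{m.group(0)}", source
    )
    # Add the guard just above the add() call, at that call's indentation.
    source = ADD_RE.sub(
        lambda m: (
            f"{m.group(1)}if name_mapped is None:\n"
            f"{m.group(1)}    continue\n"
            f"{m.group(0)}"
        ),
        source,
    )
    return source


# Tiny stand-in for the affected region, not the real file contents.
sample = (
    "    for mapping in expert_mapping:\n"
    "        pass\n"
    "    loaded_params.add(name_mapped)\n"
)
patched = patch_load_weights(sample)
print(patched)
```

Reading and writing the installed file is left out on purpose; the text-in, text-out form makes the patch easy to test and safe to rerun.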

### Context

We've been validating a TQ3-native (3-bit weight quantization) version of DeepSeek-V4-Flash on vLLM. This is the second `deepseek_v4.py` bug we've hit during that work; the first was reported as [#42741](https://github.com/vllm-project/vllm/issues/42741) (`compress_ratios` attribute access vs the new transformers normalization). Happy to submit a PR with both fixes together if a maintainer wants.
