[Bugfix] Fix Llama4 weight loading by tlrmchlsmth · Pull Request #45047 · vllm-project/vllm

tlrmchlsmth · 2026-06-09T17:33:24Z

The MoE refactor (#41184) moved expert weights under a routed_experts submodule, but the remap function only handled .mlp.experts. paths. Models that use .feed_forward.experts. paths (Llama4, mllama4, lfm2_moe) therefore hit a KeyError on scale params such as w2_input_scale during weight loading.

Generalize maybe_remap_moe_expert_param_name to match any .experts. pattern and add remap calls in the affected models' weight loaders.
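A minimal sketch of the idea, not the code from this PR: the helper below is a hypothetical stand-in for maybe_remap_moe_expert_param_name, and its signature, the assumed `.experts.routed_experts.` target layout, and the toy `load_weights` loop are all illustrative assumptions. It only shows why matching on any `.experts.` segment covers both the `.mlp.` and `.feed_forward.` prefixes.

```python
# Illustrative sketch only; names and layout are assumptions, not vLLM's API.
EXPERTS_MARKER = ".experts."
ROUTED_MARKER = ".experts.routed_experts."  # assumed layout after the refactor


def remap_expert_param_name(name: str, params: dict) -> str:
    """Hypothetical stand-in for maybe_remap_moe_expert_param_name."""
    # Leave the name alone if it already resolves, is not an expert param,
    # or already points at the routed_experts submodule.
    if name in params or EXPERTS_MARKER not in name or ROUTED_MARKER in name:
        return name
    candidate = name.replace(EXPERTS_MARKER, ROUTED_MARKER, 1)
    # Only remap when the remapped name actually exists in the model.
    return candidate if candidate in params else name


def load_weights(weights, params: dict) -> list:
    """Toy loader loop: remap each checkpoint name before the lookup."""
    loaded = []
    for name, value in weights:
        name = remap_expert_param_name(name, params)
        _ = params[name]  # raises KeyError if the name does not resolve
        params[name] = value  # a real loader would invoke a weight loader here
        loaded.append(name)
    return loaded


params = {
    "layers.0.feed_forward.experts.routed_experts.w2_input_scale": None,
    "layers.0.mlp.experts.routed_experts.w2_input_scale": None,
    "layers.0.self_attn.o_proj.weight": None,
}
checkpoint = [
    ("layers.0.feed_forward.experts.w2_input_scale", 1.0),
    ("layers.0.mlp.experts.w2_input_scale", 2.0),
    ("layers.0.self_attn.o_proj.weight", 3.0),
]
loaded = load_weights(checkpoint, params)
```

Without the remap call in the loop, the first checkpoint entry would fail the `params[name]` lookup, which mirrors the KeyError described above. Matching on the `.experts.` segment rather than on `.mlp.experts.` keeps the helper independent of what each model calls its MoE block.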

Fixes Nightly CI build #70731:

MoE Refactor Integration Test (B200 - TEMPORARY): Llama-4-Scout-Fp8-ModelOpt-fi-trtllm server crashed with KeyError: 'layers.0.feed_forward.experts.w2_input_scale' in llama4.py weight loading

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>

bnellnm

Thanks for the fix. I was just looking into this.

Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: divineearthly <divineearthly@gmail.com>

tlrmchlsmth requested a review from 22quinn as a code owner June 9, 2026 17:33

mergify Bot added the llama (Related to Llama models) and bug (Something isn't working) labels Jun 9, 2026

tlrmchlsmth mentioned this pull request Jun 9, 2026

Revert "[WideEP] Integrate DeepEP v2" (#41183) #45008

Closed

bnellnm approved these changes Jun 9, 2026

tlrmchlsmth added the ready (ONLY add when PR is ready to merge/full CI is needed) label Jun 9, 2026

hmellor approved these changes Jun 9, 2026

Merge branch 'main' into fix/llama4-moe-weight-remap

689d9de

tlrmchlsmth enabled auto-merge (squash) June 10, 2026 14:29

robertgshaw2-redhat disabled auto-merge June 10, 2026 17:40

robertgshaw2-redhat merged commit fa8c868 into vllm-project:main Jun 10, 2026
67 of 70 checks passed

wcynb1023 pushed a commit to wcynb1023/vllm that referenced this pull request Jun 11, 2026

[Bugfix] Fix Llama4 weight loading (vllm-project#45047)

91814a8

Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Saddss pushed a commit to Saddss/vllm that referenced this pull request Jun 14, 2026

[Bugfix] Fix Llama4 weight loading (vllm-project#45047)

050930b

Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

vivek8123 pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Jun 18, 2026

[Bugfix] Fix Llama4 weight loading (vllm-project#45047)

5a13997

Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Jun 22, 2026

[Bugfix] Fix Llama4 weight loading (vllm-project#45047)

ba0ece6

Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

nkzhenhua pushed a commit to nkzhenhua/vllm that referenced this pull request Jun 24, 2026

[Bugfix] Fix Llama4 weight loading (vllm-project#45047)

b4fb5b8

Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

[Bugfix] Fix Llama4 weight loading #45047: robertgshaw2-redhat merged 2 commits into vllm-project:main from tlrmchlsmth:fix/llama4-moe-weight-remap

4 participants
