[Bugfix] Fix Llama4 weight loading #45047

Merged: robertgshaw2-redhat merged 2 commits into vllm-project:main from tlrmchlsmth:fix/llama4-moe-weight-remap on Jun 10, 2026
@tlrmchlsmth

The MoE refactor (#41184) moved expert weights under a `routed_experts` submodule, but the remap function only handled `.mlp.experts.` paths. Models that use `.feed_forward.experts.` (Llama4, mllama4, lfm2_moe) hit a `KeyError` on scale params such as `w2_input_scale` during weight loading.

Generalize `maybe_remap_moe_expert_param_name` to match any `.experts.` pattern and add remap calls in the affected models' weight loaders.

Fixes Nightly CI build #70731:

MoE Refactor Integration Test (B200 - TEMPORARY): the Llama-4-Scout-Fp8-ModelOpt-fi-trtllm server crashed with `KeyError: 'layers.0.feed_forward.experts.w2_input_scale'` during weight loading in `llama4.py`.
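
For readers unfamiliar with the failure mode, here is a minimal standalone sketch of the idea, not the actual vLLM code. The helper name `remap_expert_param_name`, its signature, and the assumed post-refactor layout (`.experts.routed_experts.`) are hypothetical and chosen only for illustration; the real `maybe_remap_moe_expert_param_name` may differ in all three.

```python
import re

# Assumption: after the refactor, expert params live one level deeper,
# under "<prefix>.experts.routed_experts.<param>". The pattern matches any
# ".experts." segment (not only ".mlp.experts.") that is not already remapped.
_EXPERTS_RE = re.compile(r"\.experts\.(?!routed_experts\.)")


def remap_expert_param_name(name: str, params: dict) -> str:
    """Hypothetical helper: return the remapped name only if it exists."""
    if name in params:
        return name
    candidate = _EXPERTS_RE.sub(".experts.routed_experts.", name, count=1)
    return candidate if candidate in params else name


def load_weights(checkpoint: dict, params: dict, remap: bool = True) -> set:
    """Toy weight loader: copy checkpoint values into known params."""
    loaded = set()
    for name, value in checkpoint.items():
        if remap:
            name = remap_expert_param_name(name, params)
        _ = params[name]  # raises KeyError when the name is unknown
        params[name] = value
        loaded.add(name)
    return loaded


# Toy stand-ins for the model's parameters and a checkpoint.
params = {
    "layers.0.feed_forward.experts.routed_experts.w2_input_scale": None,
    "layers.0.mlp.experts.routed_experts.w13_weight": None,
    "layers.0.self_attn.qkv_proj.weight": None,
}
checkpoint = {
    "layers.0.feed_forward.experts.w2_input_scale": 1.0,
    "layers.0.mlp.experts.w13_weight": 2.0,
    "layers.0.self_attn.qkv_proj.weight": 3.0,
}

try:
    load_weights(checkpoint, dict(params), remap=False)
    failed_without_remap = False
except KeyError:
    failed_without_remap = True

loaded = load_weights(checkpoint, params, remap=True)
```

Without the remap call, the toy loader fails on the first `.feed_forward.experts.` key, mirroring the CI crash; with it, both the `.mlp.experts.` and `.feed_forward.experts.` names resolve, and non-expert names pass through unchanged.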

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
@tlrmchlsmth tlrmchlsmth requested a review from 22quinn as a code owner June 9, 2026 17:33
@mergify mergify Bot added the llama and bug labels Jun 9, 2026

@bnellnm bnellnm left a comment

Thanks for the fix. I was just looking into this.

@tlrmchlsmth tlrmchlsmth added the ready label Jun 9, 2026
@tlrmchlsmth tlrmchlsmth enabled auto-merge (squash) June 10, 2026 14:29
@robertgshaw2-redhat robertgshaw2-redhat merged commit fa8c868 into vllm-project:main Jun 10, 2026
67 of 70 checks passed
Labels

bug (Something isn't working), llama (Related to Llama models), ready (ONLY add when PR is ready to merge/full CI is needed)

4 participants