[Cohere] Fix Cohere2MoE weight loading when using Transformers ≥5.10 by Terrencezzj · Pull Request #44747 · vllm-project/vllm

Terrencezzj · 2026-06-06T20:36:24Z

Problem

Cohere2MoeDecoderLayer and Cohere2MoeAttention relied on getattr(config, "first_k_dense_replace", 0) to decide whether a layer uses a dense MLP or MoE.

Under Transformers 5.10+, Cohere2MoeConfig consumes first_k_dense_replace and sets mlp_layer_types = ["dense", "sparse", ...], leaving first_k_dense_replace as None. vLLM then defaulted to 0, built layer 0 as MoE, and failed to load checkpoints with dense layer-0 weights:

KeyError: 'layers.0.mlp.down_proj.weight'
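
The failure mode described above can be illustrated with a minimal sketch. It does not reproduce the vLLM source: `legacy_is_moe_layer` is a hypothetical stand-in for the old per-layer check, the configs are plain `SimpleNamespace` objects rather than real `Cohere2MoeConfig` instances, and treating a `None` value as 0 is an assumption based on the description.

```python
from types import SimpleNamespace


def legacy_is_moe_layer(config, layer_idx: int) -> bool:
    """Hypothetical stand-in for the old check (not the actual vLLM code)."""
    # Assumption: a missing or None first_k_dense_replace is treated as 0,
    # meaning "no dense prefix layers".
    first_k_dense = getattr(config, "first_k_dense_replace", 0) or 0
    return layer_idx >= first_k_dense


# Config shape before Transformers 5.10: the legacy field is populated.
old_config = SimpleNamespace(first_k_dense_replace=1)

# Config shape described for Transformers 5.10+: the legacy field is None and
# the per-layer layout lives in mlp_layer_types instead.
new_config = SimpleNamespace(
    first_k_dense_replace=None,
    mlp_layer_types=["dense", "sparse", "sparse"],
)

print(legacy_is_moe_layer(old_config, 0))  # False: layer 0 is built as dense
print(legacy_is_moe_layer(new_config, 0))  # True: layer 0 is wrongly built as MoE
```

With the second config, layer 0 is constructed as MoE even though `mlp_layer_types[0]` is `"dense"`, so the checkpoint's dense layer-0 weights have no matching parameter.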

Fix

Normalize the config in Cohere2MoeModel (before make_layers): if mlp_layer_types is absent, derive it from the legacy first_k_dense_replace; if that is also unset, default to all "sparse".
Add is_prefix_dense_layer().
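
The normalization step could look roughly like the sketch below. The helper names `normalize_mlp_layer_types` and `is_dense_layer` are hypothetical (the signature of the PR's `is_prefix_dense_layer()` is not shown here), the config is a `SimpleNamespace` stand-in, and `num_hidden_layers` is assumed to hold the layer count.

```python
from types import SimpleNamespace


def normalize_mlp_layer_types(config) -> list[str]:
    """Ensure config.mlp_layer_types exists (hypothetical helper)."""
    layer_types = getattr(config, "mlp_layer_types", None)
    if layer_types is None:
        # Fall back to the legacy field; None or missing means no dense prefix,
        # which yields an all-"sparse" layout.
        first_k_dense = getattr(config, "first_k_dense_replace", None) or 0
        layer_types = [
            "dense" if i < first_k_dense else "sparse"
            for i in range(config.num_hidden_layers)
        ]
        config.mlp_layer_types = layer_types
    return list(layer_types)


def is_dense_layer(config, layer_idx: int) -> bool:
    """Per-layer lookup against the normalized list (hypothetical helper)."""
    return config.mlp_layer_types[layer_idx] == "dense"


# Legacy-style config: only first_k_dense_replace is set.
legacy = SimpleNamespace(num_hidden_layers=3, first_k_dense_replace=1)
# New-style config: mlp_layer_types is already populated.
modern = SimpleNamespace(
    num_hidden_layers=3,
    first_k_dense_replace=None,
    mlp_layer_types=["dense", "sparse", "sparse"],
)

print(normalize_mlp_layer_types(legacy))
print(normalize_mlp_layer_types(modern))
print(is_dense_layer(modern, 0))
```

Running the normalization once, before any layer is constructed, means every later per-layer decision reads from a single field regardless of which Transformers version produced the config.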

Test Plan

Test Result

Deterministic generation produces identical output under Transformers 5.9.0 and 5.10.2.

Signed-off-by: Terrencezzj <terrence@cohere.ai>

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

mergify · 2026-06-08T17:46:01Z

Hi @Terrencezzj, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Terrencezzj · 2026-06-08T20:54:28Z

It failed because of deepseekv4.

fix weight loading issue in transformers 5.10

f164d81

Signed-off-by: Terrencezzj <terrence@cohere.ai>

claude Bot reviewed Jun 6, 2026


mgoin added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Jun 6, 2026

Merge branch 'main' into cohere2_moe_fix_loading

9581ac4

Merge branch 'main' into cohere2_moe_fix_loading

f57c548

mgoin approved these changes Jun 9, 2026


mgoin added the "bug" label (Something isn't working) Jun 9, 2026

vllm-bot merged commit 3e8afdf into vllm-project:main Jun 9, 2026
53 of 57 checks passed

Terrencezzj mentioned this pull request Jun 11, 2026

fix(cohere2_moe): fix weight loader KeyErrors for FP8 and BF16 checkpoints #45314

Closed

Saddss pushed a commit to Saddss/vllm that referenced this pull request Jun 14, 2026

[Cohere] Fix Cohere2MoE weight loading when using Transformers ≥5.10 (v…

b01f278

…llm-project#44747) Signed-off-by: Terrencezzj <terrence@cohere.ai>

vivek8123 pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Jun 18, 2026

[Cohere] Fix Cohere2MoE weight loading when using Transformers ≥5.10 (v…

d77071a

…llm-project#44747) Signed-off-by: Terrencezzj <terrence@cohere.ai>

tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Jun 22, 2026

[Cohere] Fix Cohere2MoE weight loading when using Transformers ≥5.10 (v…

00a261c

…llm-project#44747) Signed-off-by: Terrencezzj <terrence@cohere.ai>

ArsalanShakil mentioned this pull request Jun 22, 2026

[Bug]: cohere2_moe failing with vLLM>=v0.22.1 #46366

Closed

1 task

nkzhenhua pushed a commit to nkzhenhua/vllm that referenced this pull request Jun 24, 2026

[Cohere] Fix Cohere2MoE weight loading when using Transformers ≥5.10 (v…

57522a7

…llm-project#44747) Signed-off-by: Terrencezzj <terrence@cohere.ai>

ohsono pushed a commit to ohsono/vllm that referenced this pull request Jul 3, 2026

[Cohere] Fix Cohere2MoE weight loading when using Transformers ≥5.10 (v…

af118ee

…llm-project#44747) Signed-off-by: Terrencezzj <terrence@cohere.ai>

vllm-bot merged 3 commits into vllm-project:main from Terrencezzj:cohere2_moe_fix_loading
