
[Model Runner V2] Fix v2 AttributeError: 'CohereASRDecoder' object has no attribute 'embed_input_ids' #44568

Merged
WoosukKwon merged 3 commits into main from wentao-fix-v2-CohereASRDecoder on Jun 10, 2026

@yewentao256

Member

Purpose

Fix for #44443.

VLLM_USE_V2_MODEL_RUNNER=1 pytest tests/models/test_initialization.py::test_can_initialize_large_subset[CohereAsrForConditionalGeneration]

Before the fix:

(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/executor/uniproc_executor.py", line 92, in collective_rpc
(EngineCore pid=1833568)     result = run_method(self.driver_worker, method, args, kwargs)
(EngineCore pid=1833568)              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/serial_utils.py", line 510, in run_method
(EngineCore pid=1833568)     return func(*args, **kwargs)
(EngineCore pid=1833568)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(EngineCore pid=1833568)     return func(*args, **kwargs)
(EngineCore pid=1833568)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/worker/gpu_worker.py", line 400, in determine_available_memory
(EngineCore pid=1833568)     self.model_runner.profile_run()
(EngineCore pid=1833568)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(EngineCore pid=1833568)     return func(*args, **kwargs)
(EngineCore pid=1833568)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/worker/gpu/model_runner.py", line 629, in profile_run
(EngineCore pid=1833568)     hidden_states, sample_hidden_states = self._dummy_run(
(EngineCore pid=1833568)                                           ^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(EngineCore pid=1833568)     return func(*args, **kwargs)
(EngineCore pid=1833568)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/worker/gpu/eplb_utils.py", line 37, in wrapper
(EngineCore pid=1833568)     result = fn(self, *args, **kwargs)
(EngineCore pid=1833568)              ^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/worker/gpu/model_runner.py", line 539, in _dummy_run
(EngineCore pid=1833568)     self.execute_model(
(EngineCore pid=1833568)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(EngineCore pid=1833568)     return func(*args, **kwargs)
(EngineCore pid=1833568)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/worker/gpu/model_runner.py", line 1210, in execute_model
(EngineCore pid=1833568)     inputs_embeds = self.model_state.get_mm_embeddings(
(EngineCore pid=1833568)                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/worker/gpu/model_states/default.py", line 126, in get_mm_embeddings
(EngineCore pid=1833568)     inputs_embeds = self.encoder_runner.get_inputs_embeds(
(EngineCore pid=1833568)                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(EngineCore pid=1833568)     return func(*args, **kwargs)
(EngineCore pid=1833568)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/worker/gpu/mm/encoder_runner.py", line 143, in get_inputs_embeds
(EngineCore pid=1833568)     x = self.model.embed_input_ids(
(EngineCore pid=1833568)         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/model_executor/models/interfaces.py", line 398, in embed_input_ids
(EngineCore pid=1833568)     self.get_language_model().embed_input_ids,
(EngineCore pid=1833568)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1968, in __getattr__
(EngineCore pid=1833568)     raise AttributeError(
(EngineCore pid=1833568) AttributeError: 'CohereASRDecoder' object has no attribute 'embed_input_ids'
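
The last frames of the traceback suggest that the wrapper's default embed_input_ids looks up the method of the same name on the object returned by get_language_model(), and that this lookup fails for CohereASRDecoder. The sketch below reproduces that failure mode with plain Python classes. All class names and the try_embed helper are hypothetical stand-ins, not vLLM code; only the method names embed_input_ids and get_language_model come from the traceback.

```python
class DecoderWithoutEmbed:
    """Hypothetical stand-in for a decoder that defines no embed_input_ids."""


class DecoderWithEmbed:
    """Hypothetical stand-in for a decoder that does define embed_input_ids."""

    def embed_input_ids(self, input_ids):
        # Toy "embedding": map each token id to a one-element vector.
        return [[float(i)] for i in input_ids]


class ToyMultiModalModel:
    """Toy wrapper whose embed_input_ids delegates to its language model."""

    def __init__(self, language_model):
        self._language_model = language_model

    def get_language_model(self):
        return self._language_model

    def embed_input_ids(self, input_ids):
        # The attribute lookup happens here and fails if the inner
        # object does not define embed_input_ids.
        embed_fn = self.get_language_model().embed_input_ids
        return embed_fn(input_ids)


def try_embed(model, input_ids):
    """Return (embeddings, None) on success or (None, message) on failure."""
    try:
        return model.embed_input_ids(input_ids), None
    except AttributeError as exc:
        return None, str(exc)


ok_result, ok_error = try_embed(ToyMultiModalModel(DecoderWithEmbed()), [1, 2])
bad_result, bad_error = try_embed(ToyMultiModalModel(DecoderWithoutEmbed()), [1, 2])
```

In this toy setup the first call succeeds and the second returns an AttributeError message naming embed_input_ids, which mirrors the shape of the error above under the stated assumptions.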

After the fix:

====================== 1 passed, 17 warnings in 72.84s (0:01:12) ======================
Signed-off-by: yewentao256 <zhyanwentao@126.com>


yewentao256 added the ready label (ONLY add when PR is ready to merge/full CI is needed) on Jun 4, 2026
mergify Bot added the v1 label on Jun 4, 2026

njhill left a comment

Member

Thanks @yewentao256

Comment on lines +17 to +18
"WhisperForConditionalGeneration" in vllm_config.model_config.architectures
or "CohereAsrForConditionalGeneration" in vllm_config.model_config.architectures

Member

Are there any other models in this category that we should add?

Do we have any other way to determine this from the model and/or config?

yewentao256 replied on Jun 4, 2026

Member Author

It seems not. I took a look at vllm/model_executor/models/funasr.py, vllm/model_executor/models/fireredasr2.py, and vllm/model_executor/models/fireredlid.py; they all define embed_input_ids(), and I didn't find any other models missing it.

Member Author

If we checked supports_transcription_only instead, it might affect other models, so I think the current fix is minimal and precise.
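
The trade-off discussed in this thread can be sketched as follows: an explicit architecture allowlist matches only the named models, while a capability flag matches every model that sets it. This is a minimal illustration, not the merged change. The two architecture strings come from the reviewed diff lines; the constant name, both helper functions, and the toy classes are hypothetical, and treating supports_transcription_only as a boolean class attribute is an assumption.

```python
# Architecture names quoted in the reviewed diff lines.
# The constant name itself is hypothetical.
SPECIAL_CASED_ARCHITECTURES = frozenset(
    {
        "WhisperForConditionalGeneration",
        "CohereAsrForConditionalGeneration",
    }
)


def matches_by_architecture(architectures):
    """Explicit allowlist: true only for the architectures named above."""
    return any(arch in SPECIAL_CASED_ARCHITECTURES for arch in architectures)


def matches_by_capability(model_cls):
    """Capability flag: true for every class that sets the flag.

    Assumes the flag is a boolean class attribute; how it is actually
    defined is not shown in this thread.
    """
    return bool(getattr(model_cls, "supports_transcription_only", False))


# Hypothetical model classes, used only to show how the two checks differ.
class ToyCohereAsr:
    supports_transcription_only = True


class ToyOtherAsr:
    # Sets the same flag but already defines embed_input_ids().
    supports_transcription_only = True

    def embed_input_ids(self, input_ids):
        return input_ids
```

Under these assumptions, matches_by_capability returns True for both toy classes, while matches_by_architecture(["CohereAsrForConditionalGeneration"]) is True and matches_by_architecture(["ToyOtherAsr"]) is False. That is the sense in which the allowlist is the narrower change.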

njhill enabled auto-merge (squash) June 9, 2026 17:03
WoosukKwon disabled auto-merge June 10, 2026 18:05
WoosukKwon merged commit ffce72c into main Jun 10, 2026
75 of 78 checks passed
WoosukKwon deleted the wentao-fix-v2-CohereASRDecoder branch June 10, 2026 18:06
wcynb1023 pushed a commit to wcynb1023/vllm that referenced this pull request Jun 11, 2026
…as no attribute 'embed_input_ids'` (vllm-project#44568)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Saddss pushed a commit to Saddss/vllm that referenced this pull request Jun 14, 2026
…as no attribute 'embed_input_ids'` (vllm-project#44568)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
vivek8123 pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Jun 18, 2026
…as no attribute 'embed_input_ids'` (vllm-project#44568)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
divineearthly pushed a commit to divineearthly/vllm that referenced this pull request Jun 19, 2026
…as no attribute 'embed_input_ids'` (vllm-project#44568)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Signed-off-by: divineearthly <divineearthly@gmail.com>
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Jun 22, 2026
…as no attribute 'embed_input_ids'` (vllm-project#44568)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
nkzhenhua pushed a commit to nkzhenhua/vllm that referenced this pull request Jun 24, 2026
…as no attribute 'embed_input_ids'` (vllm-project#44568)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
MingqiWang-coder added a commit to vLLM-HUST/vllm-hust that referenced this pull request Jun 30, 2026
Cherry-pick 62 bugfix/security PRs from upstream vllm-project/vllm main
(2026-05-03 to 2026-06-17), covering scheduler, engine core, model runner,
worker, attention, KV cache, compilation, and structured output fixes.

Security (4): vllm-project#43286 vllm-project#44744 vllm-project#45118 vllm-project#45252
Bugfix (56): vllm-project#35536 vllm-project#36616 vllm-project#38895 vllm-project#39155 vllm-project#39324 vllm-project#39562 vllm-project#39805 vllm-project#40398 vllm-project#40726
vllm-project#40727 vllm-project#40737 vllm-project#40749 vllm-project#40961 vllm-project#41119 vllm-project#41133 vllm-project#41233 vllm-project#41237 vllm-project#41411 vllm-project#41496 vllm-project#41549
vllm-project#41674 vllm-project#41873 vllm-project#41895 vllm-project#42040 vllm-project#42112 vllm-project#42289 vllm-project#42479 vllm-project#42585 vllm-project#42692 vllm-project#42706 vllm-project#42709
vllm-project#42739 vllm-project#42967 vllm-project#43001 vllm-project#43079 vllm-project#43125 vllm-project#43160 vllm-project#43616 vllm-project#43669 vllm-project#43719 vllm-project#43768 vllm-project#43808
vllm-project#43961 vllm-project#43982 vllm-project#43988 vllm-project#43998 vllm-project#44057 vllm-project#44560 vllm-project#44574 vllm-project#44568 vllm-project#44603 vllm-project#44744 vllm-project#45195
vllm-project#45345 vllm-project#45383 vllm-project#45487 vllm-project#45564 vllm-project#45673
Runner fix (2): vllm-project#44568 vllm-project#44603

Skipped: vllm-project#43781 (ROCm-specific, not applicable to Ascend NPU)

Conflict resolutions:
- Manual merge: vllm-project#43286 vllm-project#45118 vllm-project#42112 vllm-project#43160 vllm-project#43719 vllm-project#44560
- Upstream-preferred (-X theirs): vllm-project#43808 vllm-project#43988 vllm-project#42967 vllm-project#35536 vllm-project#45195
- Test files (--theirs): vllm-project#44744 vllm-project#41895 vllm-project#42040 vllm-project#41233 vllm-project#45345 vllm-project#43982

Co-authored-by: GitHub Copilot
Signed-off-by: MingqiWang-coder <mingqiwang@hust.edu.cn>
MingqiWang-coder added a commit to vLLM-HUST/vllm-hust that referenced this pull request Jul 2, 2026
Cherry-pick 62 bugfix/security PRs from upstream vllm-project/vllm main
(2026-05-03 to 2026-06-17), covering scheduler, engine core, model runner,
worker, attention, KV cache, compilation, and structured output fixes.

Security (4): vllm-project#43286 vllm-project#44744 vllm-project#45118 vllm-project#45252
Bugfix (56): vllm-project#35536 vllm-project#36616 vllm-project#38895 vllm-project#39155 vllm-project#39324 vllm-project#39562 vllm-project#39805 vllm-project#40398 vllm-project#40726
Runner fix (2): vllm-project#44568 vllm-project#44603

Skipped: vllm-project#43781 (ROCm-specific, not applicable to Ascend NPU)

Conflict resolutions:
- Manual merge: vllm-project#43286 vllm-project#45118 vllm-project#42112 vllm-project#43160 vllm-project#43719 vllm-project#44560
- Upstream-preferred (-X theirs): vllm-project#43808 vllm-project#43988 vllm-project#42967 vllm-project#35536 vllm-project#45195
- Test files (--theirs): vllm-project#44744 vllm-project#41895 vllm-project#42040 vllm-project#41233 vllm-project#45345 vllm-project#43982

Co-authored-by: GitHub Copilot
Signed-off-by: MingqiWang-coder <mingqiwang@hust.edu.cn>
