[Model] Remove BambaForCausalLM #45990
Merged
BambaForCausalLM (IBM Bamba-9B, hybrid Mamba-2 + attention model added in PR vllm-project#12538) is a removal candidate. The Bamba architecture served as the basis for nemotron_h.py (NemotronHForCausalLM), which remains fully supported and is the actively maintained successor. Adds BambaForCausalLM to _PREVIOUSLY_SUPPORTED_MODELS pointing to vLLM v0.23.0.
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Xianbao QIAN <xianbao.qian@gmail.com>
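A minimal sketch of what moving an architecture into _PREVIOUSLY_SUPPORTED_MODELS could mean at lookup time. The table contents, the value format, and the resolve_architecture helper are simplified assumptions for illustration, not vLLM's actual registry code; it also assumes the version string names the last release that shipped the model.

```python
# Hypothetical, simplified stand-ins for the registry tables. The real tables
# live in vllm/model_executor/models/registry.py and use a different layout.
_TEXT_GENERATION_MODELS = {
    "NemotronHForCausalLM": "nemotron_h",
}

# Architecture name -> vLLM version the entry points to (assumed here to be
# the last version that supported the architecture).
_PREVIOUSLY_SUPPORTED_MODELS = {
    "BambaForCausalLM": "0.23.0",
}


def resolve_architecture(arch: str) -> str:
    """Return the module name for a supported architecture, or raise."""
    if arch in _TEXT_GENERATION_MODELS:
        return _TEXT_GENERATION_MODELS[arch]
    if arch in _PREVIOUSLY_SUPPORTED_MODELS:
        last = _PREVIOUSLY_SUPPORTED_MODELS[arch]
        # Removed models get a pointer to an older release instead of a
        # generic "unknown architecture" error.
        raise ValueError(
            f"{arch} was last supported in vLLM v{last}; "
            "use that version to run this model."
        )
    raise ValueError(f"Unknown architecture: {arch}")
```

With these tables, resolving NemotronHForCausalLM succeeds, while resolving BambaForCausalLM raises an error that names v0.23.0.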
[Contributor] Documentation preview: https://vllm--45990.org.readthedocs.build/en/45990/
[Contributor, Author] cc @tdoublep
[Member] Please fix the failing tests.
After removing Bamba from HYBRID_MODELS, the APC tests (which use HYBRID_MODELS[3]) shifted to granite-4.0-tiny-preview, causing OOM at gpu_memory_utilization=0.4. Move tiny-random/qwen3-next-moe to index [3] so the APC tests get a small model.
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Xianbao QIAN <xianbao.qian@gmail.com>
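The index shift described in this commit can be reproduced with a plain list. The model names below are placeholders, not the real contents of HYBRID_MODELS, and move_to_index is a hypothetical helper used only to illustrate the reordering fix.

```python
# Placeholder entries; the real HYBRID_MODELS list in test_hybrid.py differs.
HYBRID_MODELS = ["model-a", "model-b", "model-c", "removed-small", "large-model", "tiny-model"]

apc_model_before = HYBRID_MODELS[3]  # the small model the APC tests relied on

# Deleting an entry at or before index 3 shifts every later entry down by one,
# so the same positional lookup now lands on a larger model.
after_removal = [m for m in HYBRID_MODELS if m != "removed-small"]
apc_model_after = after_removal[3]


def move_to_index(models, name, index):
    """Return a copy of `models` with `name` relocated to position `index`."""
    reordered = [m for m in models if m != name]
    reordered.insert(index, name)
    return reordered


# The fix in this commit: put a small model back at index 3.
fixed = move_to_index(after_removal, "tiny-model", 3)
```

Positional lookups such as HYBRID_MODELS[3] stay correct only as long as nothing earlier in the list is added or removed.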
auto-merge was automatically disabled on June 18, 2026 10:25: head branch was pushed to by a user without write access
After removing BambaForCausalLM, the APC tests referenced HYBRID_MODELS[3], which no longer points to a tiny model that fits at gpu_memory_utilization=0.4. Rather than substituting another model, simplify the APC parametrize to only use HYBRID_MODELS[0] (Jamba-tiny-dev), which is also a Mamba hybrid and provides sufficient APC coverage.
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Xianbao QIAN <xianbao.qian@gmail.com>
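A small sketch of why selecting index 0 is less fragile than index 3: removing any later entry leaves the first entry in place. The list contents are placeholders and select_apc_models is a hypothetical helper, not code from the test file.

```python
# Placeholder entries; only the position of the first one matters here.
HYBRID_MODELS = ["tiny-hybrid", "model-b", "removed-model", "large-model"]


def select_apc_models(models):
    """Pick the models used for the APC tests: only the first entry."""
    return models[:1]


apc_before = select_apc_models(HYBRID_MODELS)

# Dropping a later entry does not change which model the APC tests receive.
after_removal = [m for m in HYBRID_MODELS if m != "removed-model"]
apc_after = select_apc_models(after_removal)
```

The selection still depends on list order, so it would change if a new model were ever inserted at the front of the list.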
auto-merge was automatically disabled on June 18, 2026 11:45: head branch was pushed to by a user without write access
Force-pushed: 20d4fd2 to 9901e2f
AgenticSpark added a commit to AgenticSpark/vllm that referenced this pull request on Jun 18, 2026:
PR vllm-project#45990 removed BambaForCausalLM (bamba.py is deleted; the model now lives in _PREVIOUSLY_SUPPORTED_MODELS), but two docs still referenced it:
- docs/contributing/model/basic.md linked the deleted bamba.py as the case-(2) reference implementation. Repoint it to NemotronHForCausalLM (nemotron_h.py), which is also a Mamba-2 + attention hybrid (inherits IsHybrid, uses MambaMixer2).
- docs/usage/v1_guide.md still listed BambaForCausalLM among supported hybrid models. Drop it.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: liejiang <jianglie2023@gmail.com>
divineearthly pushed a commit to divineearthly/vllm that referenced this pull request on Jun 19, 2026:
Signed-off-by: Xianbao QIAN <xianbao.qian@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: divineearthly <divineearthly@gmail.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request on Jun 21, 2026:
Signed-off-by: Xianbao QIAN <xianbao.qian@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request on Jun 22, 2026:
Signed-off-by: Xianbao QIAN <xianbao.qian@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
nkzhenhua pushed a commit to nkzhenhua/vllm that referenced this pull request on Jun 24, 2026:
Signed-off-by: Xianbao QIAN <xianbao.qian@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
AgenticSpark added a commit to AgenticSpark/vllm that referenced this pull request on Jun 25, 2026:
PR vllm-project#45990 removed BambaForCausalLM (bamba.py is deleted; the model now lives in _PREVIOUSLY_SUPPORTED_MODELS), but docs/usage/v1_guide.md still lists it among the supported hybrid models. Drop it. The basic.md reference was fixed separately by vllm-project#46181.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: liejiang <jianglie2023@gmail.com>
Summary
- Remove BambaForCausalLM (IBM Bamba-9B, hybrid Mamba-2 + attention model).
- The Bamba architecture served as the basis for nemotron_h.py (NemotronHForCausalLM), which remains fully supported and is the actively maintained successor.
- Add BambaForCausalLM to _PREVIOUSLY_SUPPORTED_MODELS, pointing to vLLM v0.23.0.

Files changed
- vllm/model_executor/models/bamba.py: deleted
- vllm/model_executor/models/registry.py: removed from _TEXT_GENERATION_MODELS, added to _PREVIOUSLY_SUPPORTED_MODELS
- tests/models/registry.py: removed test entry
- tests/models/language/generation/test_hybrid.py: removed from parametrized hybrid model list
- docs/models/supported_models.md: removed table row

Test plan
- No BambaForCausalLM references remain in vllm/, tests/, or docs/models/supported_models.md (only in _PREVIOUSLY_SUPPORTED_MODELS).
- NemotronHForCausalLM (nemotron_h.py, adapted from bamba.py) is unaffected.

AI assistance was used (Claude). This is not duplicating an existing PR.
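The reference check in the test plan could be scripted roughly as follows. This is a sketch, not the procedure the author ran: find_references and its parameters are hypothetical, and the allowed file is assumed to be the registry module that holds _PREVIOUSLY_SUPPORTED_MODELS.

```python
from pathlib import Path


def find_references(repo_root, needle, targets, allowed_files=()):
    """Return (relative_path, line_number) pairs where `needle` occurs.

    `targets` are files or directories relative to `repo_root`;
    `allowed_files` are relative paths whose matches are ignored.
    """
    root = Path(repo_root)
    allowed = {Path(p) for p in allowed_files}
    hits = []
    for target in targets:
        base = root / target
        if base.is_file():
            files = [base]
        elif base.is_dir():
            files = sorted(p for p in base.rglob("*") if p.is_file())
        else:
            files = []  # a missing target contributes no hits
        for path in files:
            rel = path.relative_to(root)
            if rel in allowed:
                continue
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            for lineno, line in enumerate(text.splitlines(), start=1):
                if needle in line:
                    hits.append((rel.as_posix(), lineno))
    return hits
```

Run from the repository root with targets ["vllm", "tests", "docs/models/supported_models.md"] and allowed_files ["vllm/model_executor/models/registry.py"]; an empty result matches the expectation stated in the test plan.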
Co-authored-by: Claude <noreply@anthropic.com>