[Model] Remove BambaForCausalLM #45990

Merged
vllm-bot merged 3 commits into vllm-project:main from xianbaoqian:remove-bamba on Jun 18, 2026

@xianbaoqian

Contributor

Summary

  • Remove BambaForCausalLM (IBM Bamba-9B, hybrid Mamba-2 + attention model)
  • The Bamba architecture served as the basis for nemotron_h.py (NemotronHForCausalLM), which remains fully supported and is the actively maintained successor
  • Add BambaForCausalLM to _PREVIOUSLY_SUPPORTED_MODELS pointing to vLLM v0.23.0

Files changed

  • vllm/model_executor/models/bamba.py — deleted
  • vllm/model_executor/models/registry.py — removed from _TEXT_GENERATION_MODELS, added to _PREVIOUSLY_SUPPORTED_MODELS
  • tests/models/registry.py — removed test entry
  • tests/models/language/generation/test_hybrid.py — removed from parametrized hybrid model list
  • docs/models/supported_models.md — removed table row
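
The registry change above can be illustrated with a minimal sketch. It assumes that `_PREVIOUSLY_SUPPORTED_MODELS` maps an architecture name to the last vLLM version that supported it; the table contents, the `resolve_architecture` helper, and the error wording are hypothetical stand-ins and not vLLM's actual implementation.

```python
# Hypothetical stand-ins for the two registry tables named in this PR.
# The real tables in vllm/model_executor/models/registry.py hold many more entries.
_TEXT_GENERATION_MODELS = {
    "NemotronHForCausalLM": ("nemotron_h", "NemotronHForCausalLM"),
}

# Assumed shape: architecture name -> last vLLM version that supported it.
_PREVIOUSLY_SUPPORTED_MODELS = {
    "BambaForCausalLM": "0.23.0",
}


def resolve_architecture(arch: str) -> tuple[str, str]:
    """Return (module name, class name) for a supported architecture.

    Hypothetical helper: raises ValueError with a version hint when the
    architecture was removed, and a generic error when it is unknown.
    """
    if arch in _TEXT_GENERATION_MODELS:
        return _TEXT_GENERATION_MODELS[arch]
    if arch in _PREVIOUSLY_SUPPORTED_MODELS:
        last_version = _PREVIOUSLY_SUPPORTED_MODELS[arch]
        raise ValueError(
            f"Model architecture {arch} was supported until "
            f"vLLM v{last_version} and has since been removed."
        )
    raise ValueError(f"Unknown model architecture: {arch}")
```

Under these assumptions, a user who loads a Bamba checkpoint gets a message naming the last release that still works, rather than a generic "unknown architecture" error.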

Test plan

  • Pre-commit hooks pass
  • No residual BambaForCausalLM references in vllm/, tests/, or docs/models/supported_models.md (only in _PREVIOUSLY_SUPPORTED_MODELS)
  • Sibling model NemotronHForCausalLM (nemotron_h.py, adapted from bamba.py) is unaffected
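
The residual-reference check in the test plan could be scripted roughly as follows. This is a sketch only: the `find_references` helper, the file suffixes it scans, and the way results are reported are assumptions, not the procedure actually used for this PR.

```python
from pathlib import Path


def find_references(root, needle, suffixes=(".py", ".md")):
    """Return (path, line number, line text) for every line containing `needle`.

    Hypothetical helper: walks `root` recursively and only reads files whose
    suffix is listed in `suffixes`. A missing directory yields no hits.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    hits = []
    for path in sorted(root_path.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((path, lineno, line.strip()))
    return hits
```

Calling `find_references("vllm", "BambaForCausalLM")` from the repository root would then be expected to report only the `_PREVIOUSLY_SUPPORTED_MODELS` entry in registry.py, if the removal is complete.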

AI assistance was used (Claude). This is not duplicating an existing PR.

Co-authored-by: Claude <noreply@anthropic.com>

BambaForCausalLM (IBM Bamba-9B, hybrid Mamba-2 + attention model added
in PR vllm-project#12538) is a removal candidate. The Bamba architecture served as
the basis for nemotron_h.py (NemotronHForCausalLM), which remains fully
supported and is the actively maintained successor.

Adds BambaForCausalLM to _PREVIOUSLY_SUPPORTED_MODELS pointing to
vLLM v0.23.0.

Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Xianbao QIAN <xianbao.qian@gmail.com>
@mergify

mergify Bot commented Jun 18, 2026

Contributor
@mergify mergify Bot added the documentation (Improvements or additions to documentation) and new-model (Requests to new models) labels Jun 18, 2026
@xianbaoqian

Contributor Author
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) June 18, 2026 04:08
@github-actions github-actions Bot added the ready (ONLY add when PR is ready to merge/full CI is needed) label Jun 18, 2026
@DarkLight1337

Member

Please fix the failing tests

@tdoublep tdoublep left a comment

Member

lgtm

After removing Bamba from HYBRID_MODELS, the APC tests (which use
HYBRID_MODELS[3]) shifted to granite-4.0-tiny-preview, causing OOM
at gpu_memory_utilization=0.4. Move tiny-random/qwen3-next-moe to
index [3] so the APC tests get a small model.

Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Xianbao QIAN <xianbao.qian@gmail.com>
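
The index shift described in this commit message can be reproduced with a plain list. The entries below are illustrative placeholders, and the position of the removed Bamba entry is an assumption; the real HYBRID_MODELS list in test_hybrid.py differs.

```python
# Illustrative list only; names other than those quoted in this PR are placeholders.
HYBRID_MODELS_BEFORE = [
    "jamba-tiny-dev",
    "placeholder-model-a",
    "placeholder-model-b",
    "tiny-bamba",                # assumed position of the removed Bamba entry
    "granite-4.0-tiny-preview",
]


def remove_entry(models, name):
    """Return a copy of `models` without `name`, preserving order."""
    return [m for m in models if m != name]


HYBRID_MODELS_AFTER = remove_entry(HYBRID_MODELS_BEFORE, "tiny-bamba")

# A test that hard-codes index 3 silently picks up a different model
# once an earlier (or the same) slot is removed.
apc_model_before = HYBRID_MODELS_BEFORE[3]
apc_model_after = HYBRID_MODELS_AFTER[3]

# Index 0 is unaffected because nothing ahead of it was removed.
first_model_unchanged = HYBRID_MODELS_BEFORE[0] == HYBRID_MODELS_AFTER[0]
```

In this sketch the positional lookup moves from the small placeholder entry to granite-4.0-tiny-preview, which mirrors the OOM reported above, while index 0 stays stable.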
auto-merge was automatically disabled June 18, 2026 10:25

Head branch was pushed to by a user without write access

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) June 18, 2026 10:26
After removing BambaForCausalLM, the APC tests referenced
HYBRID_MODELS[3] which no longer points to a tiny model that fits
at gpu_memory_utilization=0.4. Rather than substituting another
model, simplify APC parametrize to only use HYBRID_MODELS[0]
(Jamba-tiny-dev), which is also a Mamba hybrid and provides
sufficient APC coverage.

Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Xianbao QIAN <xianbao.qian@gmail.com>
auto-merge was automatically disabled June 18, 2026 11:45

Head branch was pushed to by a user without write access

@vllm-bot vllm-bot merged commit d682968 into vllm-project:main Jun 18, 2026
58 of 60 checks passed
AgenticSpark added a commit to AgenticSpark/vllm that referenced this pull request Jun 18, 2026
PR vllm-project#45990 removed BambaForCausalLM (bamba.py is deleted; the model now lives in
_PREVIOUSLY_SUPPORTED_MODELS), but two docs still referenced it:

- docs/contributing/model/basic.md linked the deleted bamba.py as the case-(2)
  reference implementation. Repoint it to NemotronHForCausalLM (nemotron_h.py),
  which is also a Mamba-2 + attention hybrid (inherits IsHybrid, uses MambaMixer2).
- docs/usage/v1_guide.md still listed BambaForCausalLM among supported hybrid
  models. Drop it.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: liejiang <jianglie2023@gmail.com>
divineearthly pushed a commit to divineearthly/vllm that referenced this pull request Jun 19, 2026
Signed-off-by: Xianbao QIAN <xianbao.qian@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: divineearthly <divineearthly@gmail.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Jun 21, 2026
Signed-off-by: Xianbao QIAN <xianbao.qian@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Jun 22, 2026
Signed-off-by: Xianbao QIAN <xianbao.qian@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
nkzhenhua pushed a commit to nkzhenhua/vllm that referenced this pull request Jun 24, 2026
Signed-off-by: Xianbao QIAN <xianbao.qian@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
AgenticSpark added a commit to AgenticSpark/vllm that referenced this pull request Jun 25, 2026
PR vllm-project#45990 removed BambaForCausalLM (bamba.py is deleted; the model now lives in
_PREVIOUSLY_SUPPORTED_MODELS), but docs/usage/v1_guide.md still lists it among
the supported hybrid models. Drop it.

The basic.md reference was fixed separately by vllm-project#46181.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: liejiang <jianglie2023@gmail.com>
@xianbaoqian xianbaoqian deleted the remove-bamba branch June 30, 2026 15:41

Labels

  • documentation: Improvements or additions to documentation
  • new-model: Requests to new models
  • ready: ONLY add when PR is ready to merge/full CI is needed

4 participants