Deprecations for v0.23 and v0.24 #44992

Merged
DarkLight1337 merged 11 commits into vllm-project:main from hmellor:deprecations on Jun 11, 2026
@hmellor hmellor commented Jun 9, 2026

Member

Perform deletions for deprecations scheduled for:

  • v0.23 - these should technically have already been deleted as v0.23 has already been cut
  • v0.24* - this will be the next minor release to be cut

*This PR does not include the deletion of the Transformers v4 code path. That is somewhat more complicated and will be done in a follow-up PR.

hmellor added 4 commits June 9, 2026 10:04
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
mergify Bot commented Jun 9, 2026

Contributor
@mergify mergify Bot added documentation Improvements or additions to documentation ci/build frontend performance Performance-related issues gpt-oss Related to GPT-OSS models labels Jun 9, 2026
@mergify mergify Bot added the nvidia label Jun 9, 2026
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

@DarkLight1337 DarkLight1337 left a comment

Member

Thanks for the cleanup

@github-project-automation github-project-automation Bot moved this to Ready in NVIDIA Jun 9, 2026
@github-project-automation github-project-automation Bot moved this from To Triage to Ready in gpt-oss Issues & Enhancements Jun 9, 2026
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) June 9, 2026 09:44
@github-actions github-actions Bot added the ready ONLY add when PR is ready to merge/full CI is needed label Jun 9, 2026
@DarkLight1337 DarkLight1337 added ready-run-all-tests Trigger CI with all tests for wide-ranging PRs and removed ready ONLY add when PR is ready to merge/full CI is needed labels Jun 9, 2026
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

@yewentao256 yewentao256 left a comment

Member

LGTM, thanks for the work!

hmellor and others added 4 commits June 9, 2026 20:42
The `VLLM_USE_FLASHINFER_MOE_MXFP4_BF16=1` env var was device-aware:
it selected `FLASHINFER_CUTLASS_MXFP4_BF16` on SM90 (H100) but
`FLASHINFER_TRTLLM_MXFP4_BF16` on SM100 (B200). The migration to
`--moe-backend flashinfer_cutlass` lost that distinction, so on the
B200 GPQA eval the CUTLASS BF16 kernel is selected and rejects the
deployment (the kernel is unsupported on SM100), crashing the engine
with "does not support the deployment configuration".

Use `flashinfer_trtllm` to match the backend the env var selected on
B200, which is the hardware this eval step runs on.

Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
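
The device-aware behaviour described in the commit message above can be illustrated with a minimal sketch. This is not vLLM's selection code: the function name `moe_backend_for` and the use of the compute-capability major version as input are assumptions made for illustration. Only the two `--moe-backend` values and the SM90/SM100 pairing come from the commit message.

```python
# Illustrative sketch only; not vLLM's actual backend selection logic.
# Assumption: the device is identified by its compute-capability major version.
SM90_MAJOR = 9    # Hopper, e.g. H100
SM100_MAJOR = 10  # Blackwell, e.g. B200


def moe_backend_for(compute_capability_major: int) -> str:
    """Return the --moe-backend value matching what the removed
    VLLM_USE_FLASHINFER_MOE_MXFP4_BF16=1 env var selected on that device."""
    if compute_capability_major == SM90_MAJOR:
        # The env var picked FLASHINFER_CUTLASS_MXFP4_BF16 on SM90.
        return "flashinfer_cutlass"
    if compute_capability_major == SM100_MAJOR:
        # The env var picked FLASHINFER_TRTLLM_MXFP4_BF16 on SM100.
        return "flashinfer_trtllm"
    # The commit message only covers SM90 and SM100; anything else is unknown here.
    raise ValueError(
        f"no mapping assumed for compute capability major {compute_capability_major}"
    )


if __name__ == "__main__":
    print(moe_backend_for(SM100_MAJOR))  # flashinfer_trtllm
```

A static `--moe-backend` flag has no such branch, which is why a step that runs on B200 needs `flashinfer_trtllm` spelled out explicitly.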
@AndreasKaratzas

Member

We have updated AMD CI to use only MI325 for now because the MI300 cluster is a bit flaky. So it is probably worth merging main, at least so that the failing AMD tests run on MI325.

@DarkLight1337 DarkLight1337 merged commit 03878d1 into vllm-project:main Jun 11, 2026
247 checks passed
@github-project-automation github-project-automation Bot moved this from Ready to Done in NVIDIA Jun 11, 2026
@hmellor hmellor deleted the deprecations branch June 11, 2026 14:36
Saddss pushed a commit to Saddss/vllm that referenced this pull request Jun 14, 2026
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
afierka-intel added a commit to afierka-intel/vllm that referenced this pull request Jun 16, 2026
Rename VLLM_TRITON_ATTN_USE_TD to the general VLLM_TRITON_USE_TD per the
TD-adoption RFC (vllm-project#42545). The old name stays registered but is now ignored
and warns on use (removed in v0.25), mirroring the existing HOST_IP
deprecation in network_utils.py; vllm-project#44992 removed the generic deprecated_env
helper, so the warning is emitted inline. Tri-state semantics of the new
variable are unchanged.

Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Artur Fierka <artur.fierka@intel.com>
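
The inline deprecation pattern mentioned in the commit message above might look roughly like the sketch below. It is an assumption-laden illustration, not the code from that commit: the function name `read_triton_use_td`, the `environ` parameter, and the reading of "tri-state" as unset/on/off parsed from `"1"`/`"0"` are all hypothetical. Only the two variable names, the "ignored, warns on use" behaviour, and the v0.25 removal target come from the commit message.

```python
import os
import warnings
from typing import Mapping, Optional

OLD_NAME = "VLLM_TRITON_ATTN_USE_TD"  # deprecated; ignored when set
NEW_NAME = "VLLM_TRITON_USE_TD"       # replacement variable


def read_triton_use_td(environ: Optional[Mapping[str, str]] = None) -> Optional[bool]:
    """Read the new variable, warning inline if the old name is still set.

    Assumed tri-state: None when unset, True for "1", False for anything else.
    """
    env = os.environ if environ is None else environ

    if OLD_NAME in env:
        # The old value is deliberately not read; it only triggers a warning.
        warnings.warn(
            f"{OLD_NAME} is deprecated and ignored; use {NEW_NAME} instead. "
            "It will be removed in v0.25.",
            DeprecationWarning,
            stacklevel=2,
        )

    raw = env.get(NEW_NAME)
    if raw is None:
        return None
    return raw.strip() == "1"
```

Emitting the warning at the read site keeps the deprecation self-contained, which is the trade-off the commit message describes once the generic `deprecated_env` helper is gone.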
divineearthly pushed a commit to divineearthly/vllm that referenced this pull request Jun 19, 2026
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: divineearthly <divineearthly@gmail.com>
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Jun 22, 2026
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
nkzhenhua pushed a commit to nkzhenhua/vllm that referenced this pull request Jun 24, 2026
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
oonyshch pushed a commit to afierka-intel/vllm that referenced this pull request Jul 1, 2026