Deprecations for v0.23 and v0.24 by hmellor · Pull Request #44992 · vllm-project/vllm

hmellor · 2026-06-09T09:33:25Z

Perform deletions for deprecations scheduled for:

v0.23 - these should technically have already been deleted as v0.23 has already been cut
v0.24* - this will be the next minor release to be cut

*This PR does not include the deletion of the Transformers v4 code path, which is somewhat more complicated and will be done in a follow-up PR.

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

mergify · 2026-06-09T09:34:05Z

Documentation preview: https://vllm--44992.org.readthedocs.build/en/44992/

DarkLight1337

Thanks for the cleanup

yewentao256

LGTM, thanks for the work!

The `VLLM_USE_FLASHINFER_MOE_MXFP4_BF16=1` env var was device-aware: it selected `FLASHINFER_CUTLASS_MXFP4_BF16` on SM90 (H100) but `FLASHINFER_TRTLLM_MXFP4_BF16` on SM100 (B200). The migration to `--moe-backend flashinfer_cutlass` lost that distinction, so on the B200 GPQA eval the CUTLASS BF16 kernel is selected and rejects the deployment (the kernel is unsupported on SM100), crashing the engine with "does not support the deployment configuration". Use `flashinfer_trtllm` to match the backend the env var selected on B200, which is the hardware this eval step runs on.

Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
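The device-aware choice described in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions: `moe_backend_for_capability` is a hypothetical helper, not a vLLM function, and the mapping covers only the two architectures named above (SM90 and SM100).

```python
# Hypothetical helper, not part of vLLM: returns the --moe-backend value that
# matches what the removed env var used to select for a given compute
# capability, per the commit message (SM90 -> CUTLASS, SM100 -> TRTLLM).
def moe_backend_for_capability(major: int, minor: int = 0) -> str:
    if major == 9:
        # SM90 (e.g. H100): the env var selected the CUTLASS MXFP4 BF16 kernel.
        return "flashinfer_cutlass"
    if major == 10:
        # SM100 (e.g. B200): the CUTLASS BF16 kernel is unsupported here,
        # so the env var selected the TRTLLM kernel instead.
        return "flashinfer_trtllm"
    # Other architectures are outside what the commit message describes.
    raise ValueError(f"no known mapping for SM{major}{minor}")


if __name__ == "__main__":
    for capability in [(9, 0), (10, 0)]:
        print(capability, "->", moe_backend_for_capability(*capability))
```

A test config that runs on more than one GPU type would need to make this choice per device, since a single hard-coded `--moe-backend` value no longer adapts to the hardware.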

AndreasKaratzas · 2026-06-10T21:33:44Z

We have updated AMD CI to use only MI325 for now because the MI300 cluster is a bit flaky. So it is probably worth merging main, at least for the AMD test failures, so that they run on MI325.

Rename VLLM_TRITON_ATTN_USE_TD to the general VLLM_TRITON_USE_TD per the TD-adoption RFC (vllm-project#42545). The old name stays registered but is now ignored and warns on use (removed in v0.25), mirroring the existing HOST_IP deprecation in network_utils.py; vllm-project#44992 removed the generic deprecated_env helper, so the warning is emitted inline. Tri-state semantics of the new variable are unchanged.

Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Artur Fierka <artur.fierka@intel.com>
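The pattern in this commit message (the old name is ignored but warns, the new name keeps tri-state semantics) might look something like the sketch below. It is not the code from the referenced PR: `read_triton_use_td` is a hypothetical function, and the tri-state encoding (unset means "auto", `1`/`true` means on, anything else means off) is an assumption made for illustration.

```python
import os
import warnings
from typing import Mapping, Optional

NEW_VAR = "VLLM_TRITON_USE_TD"
OLD_VAR = "VLLM_TRITON_ATTN_USE_TD"


def read_triton_use_td(environ: Optional[Mapping[str, str]] = None) -> Optional[bool]:
    """Hypothetical reader: returns None (unset), True, or False."""
    env = os.environ if environ is None else environ

    # The deprecated name is still recognised, but its value is never used;
    # the warning is emitted inline rather than through a shared helper.
    if OLD_VAR in env:
        warnings.warn(
            f"{OLD_VAR} is deprecated and ignored; use {NEW_VAR} instead. "
            "It is scheduled for removal in v0.25.",
            DeprecationWarning,
            stacklevel=2,
        )

    raw = env.get(NEW_VAR)
    if raw is None:
        return None  # assumed "auto" state: let the caller decide
    # Assumed encoding: "1" or "true" enables, any other value disables.
    return raw.strip().lower() in ("1", "true")
```

Passing the environment as a parameter keeps the sketch easy to exercise without mutating `os.environ`; with only the old variable set, the function warns and returns `None`.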

hmellor added 4 commits June 9, 2026 10:04

Remove llm.reward scheduled for v0.23

ffed4fe

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

Remove fuse_minimax_qk_norm scheduled for v0.23

5160b67

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

Remove env vars scheduled for v0.23

0490785

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

Remove custom_mm scheduled for v0.24

937beb5

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

hmellor requested review from AndreasKaratzas, DarkLight1337, ProExpertProg, WoosukKwon, houseroad, jeejeelee, mgoin, noooop, pavanimajety, robertgshaw2-redhat, tlrmchlsmth, vadiklyutiy, yewentao256, youkaichao, ywang96 and zyongye as code owners June 9, 2026 09:33

mergify Bot added the documentation, ci/build, frontend, performance, and gpt-oss labels Jun 9, 2026

github-project-automation Bot added this to gpt-oss Issues & Enhancements Jun 9, 2026

mergify Bot added the nvidia label Jun 9, 2026

github-project-automation Bot moved this to To Triage in gpt-oss Issues & Enhancements Jun 9, 2026

github-project-automation Bot added this to NVIDIA Jun 9, 2026

Document the correct removal version

c05f759

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

DarkLight1337 approved these changes Jun 9, 2026

github-project-automation Bot moved this to Ready in NVIDIA Jun 9, 2026

github-project-automation Bot moved this from To Triage to Ready in gpt-oss Issues & Enhancements Jun 9, 2026

DarkLight1337 enabled auto-merge (squash) June 9, 2026 09:44

github-actions Bot added the ready label Jun 9, 2026

DarkLight1337 added the ready-run-all-tests label and removed the ready label Jun 9, 2026

Cutlass for H100 test

d1ea93e

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

yewentao256 approved these changes Jun 9, 2026

hmellor and others added 4 commits June 9, 2026 20:42

Merge branch 'main' into deprecations

c7a0c67

Fix same config used on different hardware

f88efce

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

Merge branch 'main' into deprecations

aff1f9e

Merge branch 'main' into deprecations

645cbb7

DarkLight1337 merged commit 03878d1 into vllm-project:main Jun 11, 2026
247 checks passed

github-project-automation Bot moved this from Ready to Done in gpt-oss Issues & Enhancements Jun 11, 2026

github-project-automation Bot moved this from Ready to Done in NVIDIA Jun 11, 2026

hmellor deleted the deprecations branch June 11, 2026 14:36

Saddss pushed a commit to Saddss/vllm that referenced this pull request Jun 14, 2026

Deprecations for v0.23 and v0.24 (vllm-project#44992)

03a5760

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

itayalroy mentioned this pull request Jun 14, 2026

nixl_ep: Skip post-receive quantization for NVFP4 #45606

Merged

afierka-intel mentioned this pull request Jun 16, 2026

[Misc] Rename VLLM_TRITON_ATTN_USE_TD to VLLM_TRITON_USE_TD #45781

Open

divineearthly pushed a commit to divineearthly/vllm that referenced this pull request Jun 19, 2026

Deprecations for v0.23 and v0.24 (vllm-project#44992)

f9f579e

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: divineearthly <divineearthly@gmail.com>

tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Jun 22, 2026

Deprecations for v0.23 and v0.24 (vllm-project#44992)

e92e5d5

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

nkzhenhua pushed a commit to nkzhenhua/vllm that referenced this pull request Jun 24, 2026

Deprecations for v0.23 and v0.24 (vllm-project#44992)

c0c9d49

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

Deprecations for v0.23 and v0.24 #44992
DarkLight1337 merged 11 commits into vllm-project:main from hmellor:deprecations
