
Pull requests: vllm-project/tpu-inference

Pull requests list

update docs for v0.24.0 release (labels: documentation "Improvements or additions to documentation"; ready "ONLY add when PR is ready to merge/full CI is needed")
#3071 opened Jul 2, 2026 by CienetStingLin Collaborator Draft
skip multimodal if configuration disables it.
#3070 opened Jul 1, 2026 by lc5211 Collaborator
Add sync scheduler test case for e2e continue decode (labels: ready)
#3069 opened Jul 1, 2026 by pv97 Collaborator
Add flag continue decode to not exit on EOS (labels: ready)
#3068 opened Jul 1, 2026 by pv97 Collaborator
[Torchax] Add compressed_tensor w4a4_nvfp4 support (labels: ready)
#3067 opened Jul 1, 2026 by lxhfirenking Collaborator
[DeepSeekv4 bingup] Have a dedicate model.py override for DSv4 in TPU (labels: ready)
#3065 opened Jul 1, 2026 by gxd3 Collaborator
Fix the dependsOn for the notify_test_result step (labels: ready)
#3064 opened Jul 1, 2026 by CienetStingLin Collaborator
A few improvements for jax/xla cache (labels: ready)
#3063 opened Jul 1, 2026 by theminghuang Collaborator
[not ready for review] kernels/sparse_core: support arbitrary hidden sizes in dense_gather_reduce (labels: ready)
#3062 opened Jul 1, 2026 by QiliangCui2023 Collaborator
Cse
#3061 opened Jul 1, 2026 by pritha90 Contributor
Test Disable Cache (labels: ready)
#3060 opened Jul 1, 2026 by theminghuang Collaborator Draft
Migrate Qwen3 benchmark cases from bm-infra (labels: ready)
#3058 opened Jul 1, 2026 by ylangtsou Collaborator Draft
[MTP] Implement mamba state rollback
#3057 opened Jul 1, 2026 by Lumosis Collaborator Draft
Support 2D DP attention sharding
#3056 opened Jun 30, 2026 by BirdsOfAFthr Collaborator Draft
Remove explicit PAT from pipeline now that we have Github App (labels: ready)
#3054 opened Jun 30, 2026 by theminghuang Collaborator
[Gemma4] Add cudagraph_mm_encoder to gemma4 recipes (labels: ready)
#3047 opened Jun 29, 2026 by kwang3939 Collaborator
[kernels][fused_moe] Add another fused EP MoE kernels
#3040 opened Jun 29, 2026 by rupengliu-meta Collaborator
Profile for limited step count to avoid xProf 2GB hard limitation (labels: ready)
#3039 opened Jun 29, 2026 by hosseinsarshar Collaborator
[DSv3] Read model config from HF config and add load test (labels: ready)
#3035 opened Jun 28, 2026 by jerviscz
4 of 5 tasks
Support GCS URIs for profiler output directories
#3033 opened Jun 28, 2026 by Sam-Si
2 of 5 tasks
Add q_split logic for mixed mode for Kimi 8k/1k prefill. (labels: ready)
#3028 opened Jun 26, 2026 by Xhark Collaborator