Pull requests: vllm-project/tpu-inference
update docs for v0.24.0 release
documentation
Improvements or additions to documentation
ready
ONLY add when PR is ready to merge/full CI is needed
#3071
opened Jul 2, 2026 by
CienetStingLin
Collaborator
Draft
skip multimodal if configuration disables it.
#3070
opened Jul 1, 2026 by
lc5211
Collaborator
Add sync scheduler test case for e2e continue decode
ready
#3069
opened Jul 1, 2026 by
pv97
Collaborator
Add flag continue decode to not exit on EOS
ready
#3068
opened Jul 1, 2026 by
pv97
Collaborator
[Torchax] Add compressed_tensor w4a4_nvfp4 support
ready
#3067
opened Jul 1, 2026 by
lxhfirenking
Collaborator
[DeepSeekv4 bringup] Have a dedicated model.py override for DSv4 in TPU
ready
#3065
opened Jul 1, 2026 by
gxd3
Collaborator
Fix the dependsOn for the notify_test_result step
ready
#3064
opened Jul 1, 2026 by
CienetStingLin
Collaborator
A few improvements for jax/xla cache
ready
#3063
opened Jul 1, 2026 by
theminghuang
Collaborator
[not ready for review] kernels/sparse_core: support arbitrary hidden sizes in dense_gather_reduce
ready
#3062
opened Jul 1, 2026 by
QiliangCui2023
Collaborator
Test Disable Cache
ready
#3060
opened Jul 1, 2026 by
theminghuang
Collaborator
Draft
Migrate Qwen3 benchmark cases from bm-infra
ready
Remove explicit PAT from pipeline now that we have Github App
ready
#3054
opened Jun 30, 2026 by
theminghuang
Collaborator
[Gemma4] Add cudagraph_mm_encoder to gemma4 recipes
ready
#3047
opened Jun 29, 2026 by
kwang3939
Collaborator
[Fix] Pass num_tokens to set_forward_context in model wrapper
#3042
opened Jun 29, 2026 by
gangchen03
[kernels][fused_moe] Add another fused EP MoE kernels
#3040
opened Jun 29, 2026 by
rupengliu-meta
Collaborator
Profile for limited step count to avoid xProf 2GB hard limitation
ready
#3039
opened Jun 29, 2026 by
hosseinsarshar
Collaborator
[DSv3] Read model config from HF config and add load test
ready
#3035
opened Jun 28, 2026 by
jerviscz
4 of 5 tasks
Support GCS URIs for profiler output directories
#3033
opened Jun 28, 2026 by
Sam-Si
2 of 5 tasks
Add q_split logic for mixed mode for Kimi 8k/1k prefill.
ready
#3028
opened Jun 26, 2026 by
Xhark
Collaborator
[Feature] Support for non-causal (encoder-only) attention
#3024
opened Jun 26, 2026 by
ArvendraChhonkar