[XPU] Cap topk/topp Triton BLOCK_SIZE to 4096 to fix Top-p mask difference failures by chaojun-zhang · Pull Request #44470 · vllm-project/vllm

chaojun-zhang · 2026-06-04T00:59:24Z

Summary

Cap the Triton BLOCK_SIZE at 4096 in topk_topp_triton.py on XPU to fix the "Top-p mask difference too large" test failures.
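For readers unfamiliar with the failing check, here is a minimal pure-Python sketch of what a top-p keep-mask is and how two masks could be compared. It is an illustration only: `top_p_keep_mask` and `mask_difference` are hypothetical names, and the actual comparison and tolerance used in test_topk_topp_sampler.py are not shown in this PR.

```python
import math


def top_p_keep_mask(logits, p):
    """Return booleans: True where the token survives top-p filtering."""
    # Softmax with max-subtraction for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Visit tokens from most to least probable and keep each one until the
    # cumulative probability reaches p (the token that crosses p is kept).
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = [False] * len(probs)
    cumulative = 0.0
    for i in order:
        keep[i] = True
        cumulative += probs[i]
        if cumulative >= p:
            break
    return keep


def mask_difference(mask_a, mask_b):
    """Count positions where two keep-masks disagree (hypothetical metric)."""
    return sum(a != b for a, b in zip(mask_a, mask_b))
```

A test along these lines would compare the kernel's mask against a reference mask and fail when the number of disagreeing positions exceeds some tolerance.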

Changes

vllm/v1/sample/ops/topk_topp_triton.py: limit XPU block size to 4096 for deterministic sampling
.buildkite/intel_jobs/misc_intel.yaml: enable test_topk_topp_sampler.py on XPU CI
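The block-size cap described above can be sketched as follows. This is not the code from topk_topp_triton.py: it assumes the block size is otherwise derived as the next power of two of the vocabulary size, and `XPU_MAX_BLOCK_SIZE`, `next_power_of_2`, and `choose_block_size` are hypothetical names used for illustration.

```python
XPU_MAX_BLOCK_SIZE = 4096  # cap value from this PR; the constant name is hypothetical


def next_power_of_2(n: int) -> int:
    """Smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 1 << (n - 1).bit_length()


def choose_block_size(vocab_size: int, is_xpu: bool) -> int:
    """Pick a block size, capping it on XPU.

    Assumption: without the cap, the block size is the next power of two
    of the vocabulary size. The real selection logic may differ.
    """
    block_size = next_power_of_2(vocab_size)
    if is_xpu:
        block_size = min(block_size, XPU_MAX_BLOCK_SIZE)
    return block_size
```

Under these assumptions, a 32000-entry vocabulary would get a block size of 32768 on other platforms and 4096 on XPU, while vocabularies that already fit in 4096 are unaffected.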

mergify · 2026-06-07T02:19:29Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @chaojun-zhang.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

chaojun-zhang requested review from 22quinn, Harry-Chen, houseroad, khluu and njhill as code owners June 4, 2026 00:59

mergify Bot added the ci/build, intel-gpu, and v1 labels Jun 4, 2026

chaojun-zhang closed this Jun 4, 2026

chaojun-zhang reopened this Jun 4, 2026

chaojun-zhang mentioned this pull request Jun 4, 2026

[XPU] Enable v1/sample tests on XPU CI #44472 (Draft)

mergify Bot added the needs-rebase label Jun 7, 2026

[XPU] Cap topk/topp Triton BLOCK_SIZE to 4096 for deterministic sampling

b829201

Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>

chaojun-zhang force-pushed the sample_topk_topp_triton_fix branch from 7ea0bd7 to b829201 Compare June 8, 2026 05:44

mergify Bot removed the needs-rebase label Jun 8, 2026

jikunshang approved these changes Jun 8, 2026

jikunshang added the ready label Jun 8, 2026

jikunshang enabled auto-merge (squash) June 8, 2026 08:56

jikunshang merged commit fa662b1 into vllm-project:main Jun 8, 2026
54 checks passed

Saddss pushed a commit to Saddss/vllm that referenced this pull request Jun 14, 2026

[XPU] Cap topk/topp Triton BLOCK_SIZE to 4096 to fix Top-p mask difference failures (vllm-project#44470)

d2cc041

Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>

vivek8123 pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Jun 18, 2026

[XPU] Cap topk/topp Triton BLOCK_SIZE to 4096 to fix Top-p mask difference failures (vllm-project#44470)

9351e7b

Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>

tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Jun 22, 2026

[XPU] Cap topk/topp Triton BLOCK_SIZE to 4096 to fix Top-p mask difference failures (vllm-project#44470)

7be3bc0

Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>

nkzhenhua pushed a commit to nkzhenhua/vllm that referenced this pull request Jun 24, 2026

[XPU] Cap topk/topp Triton BLOCK_SIZE to 4096 to fix Top-p mask difference failures (vllm-project#44470)

199ca42

Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>

ohsono pushed a commit to ohsono/vllm that referenced this pull request Jul 3, 2026

[XPU] Cap topk/topp Triton BLOCK_SIZE to 4096 to fix Top-p mask difference failures (vllm-project#44470)

c91db5c

Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>
