[XPU] Cap topk/topp Triton BLOCK_SIZE to 4096 to fix Top-p mask difference failures#44470

Merged
jikunshang merged 1 commit into vllm-project:main from chaojun-zhang:sample_topk_topp_triton_fix
Jun 8, 2026

@chaojun-zhang

Contributor

Summary

Cap the Triton BLOCK_SIZE at 4096 in topk_topp_triton.py on XPU to fix the "Top-p mask difference too large" test failures.

Changes

  • vllm/v1/sample/ops/topk_topp_triton.py: limit XPU block size to 4096 for deterministic sampling
  • .buildkite/intel_jobs/misc_intel.yaml: enable test_topk_topp_sampler.py on XPU CI
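
The cap described above can be sketched in plain Python. This is a minimal illustration, not the code from the PR: the helper names `next_power_of_2` and `choose_block_size`, the `is_xpu` flag, and the idea that the uncapped block size is derived from the vocabulary size are all assumptions made for the example. Only the value 4096 and the fact that the cap applies on XPU come from the PR description. Triton block sizes used with `tl.arange` must be powers of two, which is why the sketch rounds up before capping.

```python
XPU_MAX_BLOCK_SIZE = 4096  # cap value stated in the PR description


def next_power_of_2(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be >= 1)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 1 << (n - 1).bit_length()


def choose_block_size(vocab_size: int, is_xpu: bool) -> int:
    """Hypothetical helper: pick a power-of-two block size, capped on XPU.

    Assumption: the uncapped size is the vocabulary size rounded up to a
    power of two. The real kernel may choose its block size differently.
    """
    block_size = next_power_of_2(vocab_size)
    if is_xpu:
        # On XPU, never exceed the cap; other platforms are left unchanged.
        block_size = min(block_size, XPU_MAX_BLOCK_SIZE)
    return block_size


if __name__ == "__main__":
    for vocab in (1000, 4096, 32000):
        print(vocab, choose_block_size(vocab, is_xpu=True),
              choose_block_size(vocab, is_xpu=False))
```

With a cap like this, a kernel that previously handled a whole row in one block has to iterate over the row in chunks of at most 4096 elements on XPU. Block sizes already at or below the cap are unaffected.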
@mergify

mergify Bot commented Jun 7, 2026

Contributor

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @chaojun-zhang.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify Bot added the needs-rebase label Jun 7, 2026
Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>
@chaojun-zhang chaojun-zhang force-pushed the sample_topk_topp_triton_fix branch from 7ea0bd7 to b829201 Compare June 8, 2026 05:44
@mergify mergify Bot removed the needs-rebase label Jun 8, 2026
@jikunshang jikunshang added the ready label (ONLY add when PR is ready to merge/full CI is needed) Jun 8, 2026
@jikunshang jikunshang enabled auto-merge (squash) June 8, 2026 08:56
@jikunshang jikunshang merged commit fa662b1 into vllm-project:main Jun 8, 2026
54 checks passed
ekagra-ranjan pushed a commit to ekagra-ranjan/vllm that referenced this pull request Jun 9, 2026
…rence failures (vllm-project#44470)

Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
waqahmed-amd-fi pushed a commit to waqahmed-amd-fi/vllm that referenced this pull request Jun 10, 2026
…rence failures (vllm-project#44470)

Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>
Signed-off-by: Waqar Ahmed <waqar.ahmed@amd.com>
Saddss pushed a commit to Saddss/vllm that referenced this pull request Jun 14, 2026
…rence failures (vllm-project#44470)

Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>
vivek8123 pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Jun 18, 2026
…rence failures (vllm-project#44470)

Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>
divineearthly pushed a commit to divineearthly/vllm that referenced this pull request Jun 19, 2026
…rence failures (vllm-project#44470)

Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>
Signed-off-by: divineearthly <divineearthly@gmail.com>
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Jun 22, 2026
…rence failures (vllm-project#44470)

Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>
nkzhenhua pushed a commit to nkzhenhua/vllm that referenced this pull request Jun 24, 2026
…rence failures (vllm-project#44470)

Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>
ohsono pushed a commit to ohsono/vllm that referenced this pull request Jul 3, 2026
…rence failures (vllm-project#44470)

Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>
Labels

  • ci/build
  • intel-gpu (Related to Intel GPU)
  • ready (ONLY add when PR is ready to merge/full CI is needed)
  • v1

2 participants