[DSV4][XPU] Add MHC fused_post_pre support#44144

Merged: jikunshang merged 1 commit into vllm-project:main from majian4work:dsv4-pr5-mhc-fused-post-pre on Jun 9, 2026

@majian4work

Contributor

Summary

Add Intel XPU support for MHCFusedPostPreOp in DeepSeek-V4, enabling the fused MHC post+pre path in the decoder loop (matching the AMD/CUDA pattern).

Changes

  • vllm/model_executor/layers/mhc.py: Implement forward_native for MHCFusedPostPreOp (decomposes into mhc_post_torch + mhc_pre_torch); add forward_xpu delegating to forward_native.
  • vllm/models/deepseek_v4/xpu/model.py: Update decoder loop to use fused MHC path (first layer → standalone hc_pre, middle layers → mhc_fused_post_pre, explicit hc_post after loop). Add weight loading guards for truncated model testing.
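
The loop restructuring described above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the names hc_pre, hc_post, fused_post_pre, run_unfused, and run_fused, the scalar "streams", and the fixed mixing weights are hypothetical stand-ins, not the signatures used in vllm/model_executor/layers/mhc.py. The sketch only shows that a fused op built as "post, then pre" lets the loop call a standalone pre for the first layer, the fused op between layers, and an explicit post after the loop, while producing the same result as the unfused loop.

```python
from typing import Callable, List, Sequence, Tuple

Streams = List[float]            # toy residual streams (real code uses tensors)
Layer = Callable[[float], float]

# Hypothetical fixed mixing weights; the real ops use learned per-layer parameters.
PRE_W = [0.5, 0.25, 0.25]
POST_W = [1.0, 0.5, 2.0]


def hc_pre(streams: Streams) -> float:
    """Mix the residual streams into a single layer input."""
    return sum(w * s for w, s in zip(PRE_W, streams))


def hc_post(streams: Streams, layer_out: float) -> Streams:
    """Write a layer output back into every residual stream."""
    return [s + w * layer_out for w, s in zip(POST_W, streams)]


def fused_post_pre(streams: Streams, layer_out: float) -> Tuple[Streams, float]:
    """Native-style fallback: post of the previous layer, then pre of the next."""
    new_streams = hc_post(streams, layer_out)
    return new_streams, hc_pre(new_streams)


def run_unfused(layers: Sequence[Layer], streams: Streams) -> Streams:
    """Reference loop: every layer does its own pre and post."""
    for layer in layers:
        streams = hc_post(streams, layer(hc_pre(streams)))
    return streams


def run_fused(layers: Sequence[Layer], streams: Streams) -> Streams:
    """Fused loop: standalone pre first, fused op between layers, post at the end."""
    if not layers:
        return list(streams)
    out = 0.0
    for i, layer in enumerate(layers):
        if i == 0:
            x = hc_pre(streams)                          # first layer: standalone pre
        else:
            streams, x = fused_post_pre(streams, out)    # later layers: fused op
        out = layer(x)
    return hc_post(streams, out)                         # explicit post after the loop


layers = [lambda x: 2 * x + 1, lambda x: x - 3, lambda x: 0.5 * x]
start = [1.0, 2.0, 4.0]
assert run_fused(layers, start) == run_unfused(layers, start)
```

Because the fused op in this sketch is a plain composition of the two steps, the two loops perform the same arithmetic in the same order; a backend-specific kernel could replace fused_post_pre without changing the loop.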

Dependencies

⚠️ This PR depends on #42953 being merged first.

PR #42953 introduces the XPU attention decode path (dsv4-pr4-attention-decode) which this PR builds upon.

@mergify mergify Bot added the intel-gpu (Related to Intel GPU) and v1 labels Jun 1, 2026
@mergify

mergify Bot commented Jun 4, 2026

Contributor

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @majian4work.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@mergify mergify Bot removed the needs-rebase label Jun 8, 2026
@majian4work

Contributor Author

@jikunshang @xinyu-intel @wuxun-zhang Please take a look and review, thanks.

@jikunshang jikunshang added the verified label (Run pre-commit for new contributors without triggering other tests) Jun 9, 2026

@wuxun-zhang wuxun-zhang left a comment

Contributor

LGTM

@mergify

mergify Bot commented Jun 9, 2026

Contributor

Hi @majian4work, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

- Add forward_xpu to MHCFusedPostPreOp (decomposes into mhc_post_torch + mhc_pre_torch)
- Update XPU model forward to use fused MHC path (matching AMD pattern):
  first layer uses standalone hc_pre, middle layers use mhc_fused_post_pre
- Add explicit hc_post after decoder loop

Signed-off-by: Ma Jian <jian1.ma@intel.com>
@majian4work majian4work force-pushed the dsv4-pr5-mhc-fused-post-pre branch from 4d2a1c7 to 1fab1f0 Compare June 9, 2026 06:13
@jikunshang jikunshang added the ready label (ONLY add when PR is ready to merge/full CI is needed) Jun 9, 2026
@jikunshang jikunshang merged commit 70db148 into vllm-project:main Jun 9, 2026
69 of 70 checks passed
Labels

  • intel-gpu (Related to Intel GPU)
  • ready (ONLY add when PR is ready to merge/full CI is needed)
  • v1
  • verified (Run pre-commit for new contributors without triggering other tests)

3 participants