
[Bugfix] Defer offload reads while transfers are pending #46231

Merged
orozery merged 2 commits into vllm-project:main from Palaiologos1453:fix-offloading-preempt-load-46014
Jun 21, 2026
@Palaiologos1453

Contributor

Fixes #46014.

This makes the offloading scheduler defer prefix-cache lookup for a request while that request still has in-flight transfer jobs. In the preemption/re-admission race described in the issue, this prevents the scheduler from issuing a load while a previously flushed store is still tracked in transfer_jobs.

The request is retried on a later scheduling step after the worker completion is consumed and the transfer set drains.
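For illustration, the deferral described above can be sketched with a toy model. This is not the vLLM implementation: `ToyOffloadingScheduler`, `RequestStatus`, `complete_job`, and the `lookup` callable are hypothetical stand-ins, and the `(hits, hits > 0)` return on the non-deferred path is an assumption. Only `get_num_new_matched_tokens`, `transfer_jobs`, and the `(None, False)` deferral result come from the PR text.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class RequestStatus:
    """Hypothetical per-request state; only the fields this sketch needs."""
    transfer_jobs: set[int] = field(default_factory=set)   # in-flight job ids
    block_ids: list[int] = field(default_factory=list)     # ids from the attempted admission


class ToyOffloadingScheduler:
    """Toy model of the deferral; not the vLLM implementation."""

    def __init__(self, lookup: Callable[[str], int]):
        self._lookup = lookup  # hypothetical prefix-cache lookup
        self._status: dict[str, RequestStatus] = {}

    def status(self, req_id: str) -> RequestStatus:
        return self._status.setdefault(req_id, RequestStatus())

    def complete_job(self, req_id: str, job_id: int) -> None:
        # Models consuming a worker completion: the job leaves the transfer set.
        self.status(req_id).transfer_jobs.discard(job_id)

    def get_num_new_matched_tokens(self, req_id: str) -> tuple[Optional[int], bool]:
        st = self.status(req_id)
        if st.transfer_jobs:
            # A transfer is still pending: drop stale block ids, skip the
            # lookup, and ask the caller to retry on a later step.
            st.block_ids.clear()
            return None, False
        hits = self._lookup(req_id)
        # Assumption: the second element signals that a load is needed.
        return hits, hits > 0
```

In this model, a request with a non-empty `transfer_jobs` set never reaches `lookup`. Once `complete_job` drains the set, the next call performs the lookup as usual.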

Test coverage:

  • Added a focused regression test for get_num_new_matched_tokens() to verify that a pending transfer returns (None, False), does not call lookup, and clears stale block ids for the attempted admission.

Local verification:

  • python -m pytest --confcutdir=tests/v1/kv_connector/unit/offloading_connector tests/v1/kv_connector/unit/offloading_connector/test_scheduler.py -k pending_transfer_defers_prefix_lookup -q
    • Run on Windows with a temporary uvloop stub because uvloop does not support Windows; the tested path does not use the event loop implementation.
  • python -m compileall -q vllm/distributed/kv_transfer/kv_connector/v1/offloading/scheduler.py tests/v1/kv_connector/unit/offloading_connector/test_scheduler.py
  • git diff --check
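As a rough illustration of the uvloop workaround mentioned above, one way to stub a missing module is to register an empty placeholder in `sys.modules` before the import happens. The helper name `install_uvloop_stub` is hypothetical, and this is only a guess at the general approach, not the stub actually used for this PR. An empty module only satisfies the `import` statement itself; any attribute read from `uvloop` would have to be added to the stub by hand.

```python
import importlib.util
import sys
import types


def install_uvloop_stub() -> bool:
    """Register an empty 'uvloop' module if the real one is unavailable.

    Returns True when a stub was installed, False when uvloop (or an
    earlier stub) is already importable.
    """
    if "uvloop" in sys.modules:
        return False
    if importlib.util.find_spec("uvloop") is not None:
        return False
    # Placeholder module: enough for `import uvloop` to succeed.
    sys.modules["uvloop"] = types.ModuleType("uvloop")
    return True
```

Calling the helper at the top of a local `conftest.py` would be one place to apply it, since it has to run before any module that imports `uvloop`.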
@Palaiologos1453

Contributor Author

Thanks for reviewing. The pre-run-check is blocked because this account has fewer than 4 merged PRs and the PR does not yet have a ready/verified label. Could a maintainer please add the appropriate label if this fix is ready for CI?

@mergify mergify Bot added the v1, bug (Something isn't working), and kv-connector labels Jun 20, 2026

@orozery orozery left a comment

Collaborator

Thanks @Palaiologos1453.
Please see my comment on the issue:
#46014 (comment)

@Palaiologos1453 Palaiologos1453 force-pushed the fix-offloading-preempt-load-46014 branch from 2114400 to ec2d969 on June 21, 2026 08:20
@Palaiologos1453

Contributor Author

I pushed an update with a scheduler-level regression test for this exact async batch-queue ordering.

The new test first creates pending store jobs, then calls schedule() to produce a preemption batch whose metadata contains jobs_to_flush. It intentionally does not feed that batch's ModelRunnerOutput back through update_from_output() before calling schedule() again. This simulates step_with_batch_queue() scheduling another batch before the queued preemption output is consumed. In that window the scheduler still has the store job in req_status.transfer_jobs, so the re-admission path should defer and avoid issuing a load.
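The ordering exercised by that test can be sketched with a toy scheduler. Everything here is a simplified, hypothetical model (`ToyScheduler`, `Batch`, `preempt_next`, `loads`, and `deferred` are invented for illustration). Only the names `schedule()`, `update_from_output()`, `jobs_to_flush`, and `transfer_jobs` are taken from the description above, and their real signatures differ.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Batch:
    """Hypothetical stand-in for one scheduling step's metadata."""
    jobs_to_flush: set[int] = field(default_factory=set)
    loads: list[str] = field(default_factory=list)      # requests issued a load
    deferred: list[str] = field(default_factory=list)   # requests told to retry


class ToyScheduler:
    def __init__(self) -> None:
        self.transfer_jobs: dict[str, set[int]] = {}  # req_id -> pending job ids
        self.preempt_next: list[str] = []             # requests to preempt next step
        self.waiting: deque[str] = deque()            # preempted, awaiting re-admission

    def schedule(self) -> Batch:
        batch = Batch()
        # Re-admission: defer any request whose transfers are still pending.
        still_waiting: deque[str] = deque()
        while self.waiting:
            req_id = self.waiting.popleft()
            if self.transfer_jobs.get(req_id):
                batch.deferred.append(req_id)
                still_waiting.append(req_id)
            else:
                batch.loads.append(req_id)
        self.waiting = still_waiting
        # Preemption: flush the request's store jobs and queue it for later.
        for req_id in self.preempt_next:
            batch.jobs_to_flush |= self.transfer_jobs.get(req_id, set())
            self.waiting.append(req_id)
        self.preempt_next.clear()
        return batch

    def update_from_output(self, batch: Batch) -> None:
        # Consuming the step's output marks its flushed jobs as complete.
        for jobs in self.transfer_jobs.values():
            jobs -= batch.jobs_to_flush
```

In this model, calling `schedule()` twice in a row without an intervening `update_from_output()` leaves the flushed job in `transfer_jobs`, so the second call defers the request instead of loading it. After the first batch's output is consumed, the next `schedule()` issues the load.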

Local checks I could run in this environment:

  • git diff --check
  • python -m compileall -q tests/v1/kv_connector/unit/offloading_connector/test_scheduler.py

The targeted pytest still cannot run on this Windows checkout because importing vLLM tries to load the unbuilt vllm._C_stable_libtorch extension.

@orozery orozery added the ready (ONLY add when PR is ready to merge/full CI is needed) label Jun 21, 2026
@mergify

mergify Bot commented Jun 21, 2026

Contributor

Hi @Palaiologos1453, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Signed-off-by: test test <2260891073@qq.com>
@Palaiologos1453 Palaiologos1453 force-pushed the fix-offloading-preempt-load-46014 branch from ec2d969 to 9dcb4a8 on June 21, 2026 08:53
@Palaiologos1453 Palaiologos1453 requested a review from orozery June 21, 2026 11:07

@orozery orozery left a comment

Collaborator

@Palaiologos1453 Thanks for this fix!

@orozery orozery merged commit d3ad8e8 into vllm-project:main Jun 21, 2026
76 checks passed
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Jun 22, 2026
nkzhenhua pushed a commit to nkzhenhua/vllm that referenced this pull request Jun 24, 2026
qli88 pushed a commit to qli88/vllm that referenced this pull request Jun 26, 2026
…t#46231)

Signed-off-by: test test <2260891073@qq.com>
Signed-off-by: Qiang Li <qiang.li2@amd.com>
Labels

bug (Something isn't working), kv-connector, ready (ONLY add when PR is ready to merge/full CI is needed), v1

2 participants