[KV-Offloading] : Expose CPU cache usage metric #45737

Merged
orozery merged 4 commits into vllm-project:main from neuralmagic:varun/cpu-kv-metric
Jun 20, 2026

@varun-sundar-rabindranath varun-sundar-rabindranath commented Jun 15, 2026

Contributor

Purpose

Add the vllm:kv_offload_cpu_cache_usage_perc metric to export CPU cache usage as a percentage. Its naming and semantics are designed to match the GPU counterpart, vllm:kv_cache_usage_perc.

This PR also refactors the build_metric_definitions function, moving it into the OffloadingManager.
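
To make the intent of the metric concrete, here is a minimal sketch of a manager that declares its own metric definitions and reports a usage value. It is not the code from this PR: `MetricDefinition`, `SketchCPUOffloadingManager`, `num_total_blocks`, `num_used_blocks`, and `cache_usage` are hypothetical names, and the signature of `build_metric_definitions` is an assumption. The sketch also assumes the value is a fraction in [0, 1], where 1.0 means the cache is full.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    """Hypothetical description of one exported metric (not vLLM's real type)."""

    name: str
    documentation: str


class SketchCPUOffloadingManager:
    """Illustrative stand-in for a CPU offloading manager; all names are assumptions."""

    def __init__(self, num_total_blocks: int) -> None:
        if num_total_blocks <= 0:
            raise ValueError("num_total_blocks must be positive")
        self.num_total_blocks = num_total_blocks
        self.num_used_blocks = 0

    def build_metric_definitions(self) -> list[MetricDefinition]:
        # The manager itself declares which metrics it can report.
        return [
            MetricDefinition(
                name="vllm:kv_offload_cpu_cache_usage_perc",
                documentation="CPU KV-cache usage (assumed: 1.0 means 100%).",
            )
        ]

    def cache_usage(self) -> float:
        # Fraction of CPU blocks currently in use, clamped to [0, 1].
        fraction = self.num_used_blocks / self.num_total_blocks
        return min(1.0, max(0.0, fraction))
```

Keeping the definitions next to the code that computes the values means a different offloading backend could expose its own metrics without touching the connector, which appears to be the motivation for the refactor.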

Test Plan

Add unit tests

Test Result

Unit tests pass

@varun-sundar-rabindranath varun-sundar-rabindranath marked this pull request as draft June 15, 2026 21:19
@varun-sundar-rabindranath varun-sundar-rabindranath marked this pull request as ready for review June 15, 2026 21:53
@varun-sundar-rabindranath varun-sundar-rabindranath marked this pull request as draft June 15, 2026 22:01
@varun-sundar-rabindranath varun-sundar-rabindranath marked this pull request as ready for review June 16, 2026 05:34
Comment thread vllm/v1/kv_offload/cpu/manager.py Outdated
Comment thread vllm/v1/kv_offload/base.py Outdated
Comment thread vllm/v1/kv_offload/cpu/manager.py Outdated
Comment thread vllm/v1/kv_offload/cpu/manager.py Outdated
Comment thread vllm/v1/kv_offload/cpu/manager.py Outdated
Comment thread tests/v1/kv_connector/unit/offloading_connector/test_metrics.py Outdated
Comment thread tests/v1/kv_offload/cpu/test_manager.py Outdated
Comment thread vllm/v1/kv_offload/cpu/common.py Outdated
Comment thread vllm/v1/kv_offload/cpu/spec.py Outdated
Varun Sundar Rabindranath added 2 commits June 19, 2026 09:33
Signed-off-by: Varun Sundar Rabindranath <varun-sundar-rabindranath@h100-01.nemg-001.lab.rdu2.dc.redhat.com>
Signed-off-by: Varun Sundar Rabindranath <varun-sundar-rabindranath@h100-01.nemg-001.lab.rdu2.dc.redhat.com>

Signed-off-by:  <>
Signed-off-by: Varun Sundar Rabindranath <varun-sundar-rabindranath@h100-01.nemg-001.lab.rdu2.dc.redhat.com>

Signed-off-by:  <>
@orozery orozery added the ready label Jun 19, 2026
@orozery orozery enabled auto-merge (squash) June 19, 2026 16:18
@orozery orozery merged commit 3b4a76b into vllm-project:main Jun 20, 2026
68 checks passed
@github-project-automation github-project-automation Bot moved this from Backlog to Done in Prometheus Metrics Jun 20, 2026
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Jun 21, 2026
Signed-off-by: Varun Sundar Rabindranath <varun-sundar-rabindranath@h100-01.nemg-001.lab.rdu2.dc.redhat.com>
Signed-off-by: <>
Co-authored-by: Varun Sundar Rabindranath <varun-sundar-rabindranath@h100-01.nemg-001.lab.rdu2.dc.redhat.com>
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Jun 22, 2026
Signed-off-by: Varun Sundar Rabindranath <varun-sundar-rabindranath@h100-01.nemg-001.lab.rdu2.dc.redhat.com>
Signed-off-by: <>
Co-authored-by: Varun Sundar Rabindranath <varun-sundar-rabindranath@h100-01.nemg-001.lab.rdu2.dc.redhat.com>
nkzhenhua pushed a commit to nkzhenhua/vllm that referenced this pull request Jun 24, 2026
Signed-off-by: Varun Sundar Rabindranath <varun-sundar-rabindranath@h100-01.nemg-001.lab.rdu2.dc.redhat.com>
Signed-off-by: <>
Co-authored-by: Varun Sundar Rabindranath <varun-sundar-rabindranath@h100-01.nemg-001.lab.rdu2.dc.redhat.com>
qli88 pushed a commit to qli88/vllm that referenced this pull request Jun 26, 2026
Signed-off-by: Varun Sundar Rabindranath <varun-sundar-rabindranath@h100-01.nemg-001.lab.rdu2.dc.redhat.com>
Signed-off-by: <>
Co-authored-by: Varun Sundar Rabindranath <varun-sundar-rabindranath@h100-01.nemg-001.lab.rdu2.dc.redhat.com>
Signed-off-by: Qiang Li <qiang.li2@amd.com>

Labels

kv-connector, ready, v1

6 participants