
[KV Events] Switch event structs from array to map encoding#42892

Merged
NickLucche merged 14 commits into
vllm-project:mainfrom
sagearc:kv-events-map-encoding
Jun 9, 2026

@sagearc

@sagearc sagearc commented May 17, 2026

Contributor

Context

KV cache events are published over ZMQ as msgpack-encoded batches consumed by external subscribers (routing daemons, cache managers, etc.). The encoding used a positional array format in which every field is identified purely by its index, so every schema addition is a potential break for subscribers that haven't been updated at the same time. This was first flagged by @njhill in #27577, and the schema has been extended five times since the initial set of fields:

  • block_hashes, parent_block_hash, token_ids, block_size, lora_id (#16750)
  • medium (#19737)
  • lora_name (#27577)
  • extra_keys (#33304)
  • group_idx (#37688)
  • kv_cache_spec_kind, kv_cache_spec_sliding_window (#40984)

This PR removes array_like=True from the event structs, switching to named-key (map) encoding.

Example

Before — the entire message is a positional array. Inserting a field anywhere but the end shifts every subsequent position and breaks deserialization at an unrelated field.

[
  1.0,                             // ts
  [
    [
      "BlockStored",               // tag      (pos 0)
      [4291203, 1837291, 9104857], // block_hashes   (pos 1)
      null,                        // parent_block_hash (pos 2)
      [0, 1, 2, 3],                // token_ids  (pos 3)
      64,                          // block_size (pos 4)
      null,                        // lora_id    (pos 5)
      "GPU",                       // medium     (pos 6)
      null,                        // lora_name  (pos 7)
      [["mm_hash1", 50]]           // extra_keys (pos 8)
    ]
  ]
]

After — fields are matched by name. A subscriber that doesn't know about extra_keys ignores it. An older subscriber still decodes correctly as long as the fields it knows about are present.

{
  "ts": 1.0,
  "events": [
    {
      "type": "BlockStored",
      "block_hashes": [4291203, 1837291, 9104857],
      "parent_block_hash": null,
      "token_ids": [0, 1, 2, 3],
      "block_size": 64,
      "lora_id": null,
      "medium": "GPU",
      "lora_name": null,
      "extra_keys": [["mm_hash1", 50]]
    }
  ]
}
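
The difference between the two shapes can be sketched without msgpack at all. The snippet below is only an illustration: `decode_positional`, `decode_named`, and the inserted `new_field` are hypothetical names, and the field order is a shortened copy of the "Before" example. It shows a stale subscriber misreading a positional event once a field is inserted mid-struct, while the named form keeps decoding.

```python
# Fields a stale subscriber knows about, in the order of the "Before" example.
OLD_FIELDS = ["type", "block_hashes", "parent_block_hash", "token_ids", "block_size"]


def decode_positional(event, fields=OLD_FIELDS):
    """Pair each position with the field name the subscriber expects there."""
    return dict(zip(fields, event))


def decode_named(event, fields=OLD_FIELDS):
    """Pick known fields by name; unknown keys are ignored, absent ones become None."""
    return {name: event.get(name) for name in fields}


# Hypothetical publisher update: "new_field" is inserted before token_ids.
array_event = ["BlockStored", [4291203], None, "new_value", [0, 1, 2, 3], 64]
map_event = {
    "type": "BlockStored",
    "block_hashes": [4291203],
    "parent_block_hash": None,
    "new_field": "new_value",
    "token_ids": [0, 1, 2, 3],
    "block_size": 64,
}

stale = decode_positional(array_event)
fresh = decode_named(map_event)

print(stale["token_ids"])  # the inserted value lands in the wrong field
print(fresh["token_ids"])  # still the real token ids
```

Note that the positional decoder does not raise here; it silently assigns shifted values, which is arguably worse than a hard failure.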

This is a one-time breaking change. Existing subscribers must drop array_like=True from their local struct definitions (the updated kv_events_subscriber.py example reflects this). After this, the schema can grow without further breaks.

Performance

Measured with 2048 tokens per request, i.e. a 32-block event with 3 multimodal features (block_size=64):

| Metric | Array | Map | Delta |
|---|---|---|---|
| Payload size | 6196 B | 6294 B | +1.6% |
| Encode (200×) | 1.4 ms | 1.4 ms | within noise |
| Decode (200×) | 5.4 ms | 5.5 ms | within noise |
| ZMQ throughput | ~25k msg/s | ~27k msg/s | within noise |

The overhead is a fixed ~98 bytes per message — immeasurable at realistic KV event rates.
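
As a quick consistency check on the figures above, the payload sizes from the table reproduce both the ~98-byte overhead and the +1.6% delta. The variable names are illustrative only.

```python
# Payload sizes reported in the Performance table, in bytes.
array_bytes = 6196
map_bytes = 6294

overhead = map_bytes - array_bytes   # extra bytes spent on field names
relative = overhead / array_bytes    # overhead as a fraction of the array payload

print(overhead)           # 98
print(f"{relative:.1%}")  # 1.6%
```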

CC

@njhill @NickLucche @vMaroon @orozery

sagearc added 2 commits May 17, 2026 20:15
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
@sagearc sagearc requested a review from markmc as a code owner May 17, 2026 17:44
@mergify

mergify Bot commented May 17, 2026

Contributor
@mergify mergify Bot added the documentation Improvements or additions to documentation label May 17, 2026
@sagearc sagearc changed the title [KV Events] Switch BlockStored wire format from positional to named-key encoding May 17, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor


Code Review

This pull request removes the "array_like=True" parameter from the msgspec.Struct definitions for EventBatch and KVCacheEvent in both the core implementation and the example subscriber. I have no feedback to provide as there were no review comments.

@vMaroon

vMaroon commented May 17, 2026

Contributor

Thanks @sagearc - I think backwards compatibility will have to be implemented either here or inside consumers. I think the first makes more sense, following standard deprecation protocols.

Looping @PeaBrane for Dynamo input on the change and the above point.

@PeaBrane

Contributor

Thanks for pushing this. Moving BlockStored / BlockRemoved off array_like=True makes sense; those structs keep growing and positional fields are a fragile ABI for external consumers.

One question: do we also need to change EventBatch in this PR? The batch envelope seems much more stable (ts, events, data_parallel_rank), and changing it means consumers need to handle a new outer shape too:

[ts, events, data_parallel_rank]

to:

{"ts": ts, "events": events, "data_parallel_rank": data_parallel_rank}

On the Dynamo side, we already tolerate array/map forms for individual events, but would still need a top-level batch parser update if EventBatch changes. If there’s a strong reason to switch the batch level too, we’re open to patching Dynamo to support both formats during the transition.
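
A consumer that wants to accept both outer shapes during a transition could do something along these lines. This is a minimal sketch, not Dynamo's parser: `parse_batch` is a hypothetical helper, and it assumes `data_parallel_rank` may be absent (as in the PR description's example, where the batch carries only `ts` and `events`).

```python
def parse_batch(batch):
    """Return (ts, events, data_parallel_rank) from either envelope shape."""
    if isinstance(batch, dict):
        # Map form: {"ts": ..., "events": ..., "data_parallel_rank": ...}
        return batch["ts"], batch["events"], batch.get("data_parallel_rank")
    if isinstance(batch, (list, tuple)):
        # Array form: [ts, events, data_parallel_rank], rank assumed optional
        if len(batch) < 2:
            raise ValueError("array batch needs at least ts and events")
        rank = batch[2] if len(batch) > 2 else None
        return batch[0], batch[1], rank
    raise TypeError(f"unsupported batch type: {type(batch).__name__}")


print(parse_batch([1.0, [], 0]))
print(parse_batch({"ts": 1.0, "events": []}))
```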

Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
@sagearc

sagearc commented May 17, 2026

Contributor Author

> I think backwards compatibility will have to be implemented either here or inside consumers. I think the first makes more sense, following standard deprecation protocols.

@vMaroon We could add a flag to keep the old format during a transition. Worth noting that the schema has grown 5 times without a compat protocol so far, and the consumer update is one line. Happy to go either way, just want to flag the tradeoff of keeping two formats alive.

> One question: do we also need to change EventBatch in this PR? The batch envelope seems much more stable (ts, events, data_parallel_rank), and changing it means consumers need to handle a new outer shape too

@PeaBrane You're right, scoped it out — EventBatch reverted to array_like=True. Thanks!

@sagearc sagearc changed the title [KV Events] Switch BlockStored wire format from array to map May 17, 2026

@NickLucche NickLucche left a comment

Member


Thanks for following up with this @sagearc.
I would personally list this as a breaking change and just change the KVCacheEvent format. Clients/consumers would have to handle backward compatibility if aiming to support multiple vllm versions. From this change onward, retro-compat is ensured by this format.
Otherwise, with the current code, we break compatibility on each new addition anyway.

Do you have a different opinion on this @njhill @markmc ?

@sagearc sagearc requested a review from NickLucche June 2, 2026 13:28

@NickLucche NickLucche left a comment

Member


LGTM but would appreciate another stamp here

@NickLucche NickLucche enabled auto-merge (squash) June 4, 2026 17:10
@github-actions github-actions Bot added the ready ONLY add when PR is ready to merge/full CI is needed label Jun 4, 2026
@NickLucche NickLucche merged commit 5b3807e into vllm-project:main Jun 9, 2026
72 checks passed
@sagearc sagearc deleted the kv-events-map-encoding branch June 9, 2026 11:52
vMaroon pushed a commit to llm-d/llm-d-kv-cache that referenced this pull request Jun 16, 2026
…ibility) (#661)

* fix(kvevents): support map-encoded vLLM KV events

vLLM dropped msgspec array_like=True from its KV cache event structs
(vllm-project/vllm#42892, merged 2026-06-09), so newer vLLM versions
publish each event as a field-name map with the tag under the "type"
key instead of a positional array. The VLLMAdapter only decoded
positional arrays, so every KV event from a new vLLM fails to parse
and the whole index goes dark.

Normalize map-encoded events to the existing positional layout in
decodeVLLMEvent before dispatch, keeping the converters and their
forward/backward-compatibility guards encoding-agnostic. Positional
arrays from older vLLM versions keep working unchanged; absent map
fields become nil exactly like omitted trailing array fields.

Verified against a captured event stream from a live vLLM serve run
(map encoding, multi-turn traffic with CPU offload store and eviction
events): all 313 frames parse, and replaying them through the
kvevents.Pool -> kvblock.InMemoryIndex path indexes and evicts every
block correctly. Unit tests cover map-encoded BlockStored /
BlockRemoved / AllBlocksCleared, a mixed-encoding batch, and malformed
map events.

Assisted-by: Claude (AI assistance for implementation and tests)
Signed-off-by: Change72 <changg@nvidia.com>

* fix(kvevents): distinct errors for malformed map-encoded events

Address review: report a dedicated error for a missing "type" tag
(instead of conflating it with the non-string case), drop the stale
"tagged union" wording from the encoding-agnostic unmarshal error, and
pin each malformed-map failure mode to its distinct error message in
the test.

Assisted-by: Claude (AI assistance for implementation and tests)
Signed-off-by: Change72 <changg@nvidia.com>

* fix(kvevents): tighten unmarshal error, drop unreachable map branch

Address review: the event-level unmarshal failure now wraps as
"unmarshal event payload" so ParseMessage's "failed to decode vLLM
event" wrap no longer doubles the same prefix. Drop the map[any]any
normalization branch: msgpack v5 decodes untyped maps via DecodeMap(),
which only produces map[string]any and rejects non-string keys inside
Unmarshal itself, so the branch was unreachable. Condense the encoding
doc comments; the PR description carries the full background.

Assisted-by: Claude (AI assistance for implementation and tests)
Signed-off-by: Change72 <changg@nvidia.com>

---------

Signed-off-by: Change72 <changg@nvidia.com>
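
The normalization described in the commit message above can be sketched as follows. This is a Python illustration of the idea, not the llm-d Go code: `normalize_event` and `POSITIONS` are hypothetical names, only `BlockStored` is covered, and its position order is copied from the "Before" example in the PR description.

```python
# Position order copied from the "Before" example; other event types are omitted.
POSITIONS = {
    "BlockStored": [
        "block_hashes", "parent_block_hash", "token_ids", "block_size",
        "lora_id", "medium", "lora_name", "extra_keys",
    ],
}


def normalize_event(event):
    """Return the positional form of an event, whichever encoding it arrived in."""
    if isinstance(event, list):
        return event  # already positional (older publisher)
    if not isinstance(event, dict):
        raise TypeError(f"unsupported event type: {type(event).__name__}")
    if "type" not in event:
        raise ValueError('map event is missing the "type" tag')
    tag = event["type"]
    if not isinstance(tag, str):
        raise ValueError('map event "type" tag is not a string')
    if tag not in POSITIONS:
        raise ValueError(f"unknown event tag: {tag}")
    # Absent map fields become None, like omitted trailing array fields.
    return [tag] + [event.get(name) for name in POSITIONS[tag]]


event = {"type": "BlockStored", "block_hashes": [1], "token_ids": [0], "block_size": 64}
print(normalize_event(event))
```

Keeping the missing-tag and non-string-tag cases as separate errors mirrors the second commit in the message; downstream converters then only ever see the positional layout.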

Labels

documentation Improvements or additions to documentation ready ONLY add when PR is ready to merge/full CI is needed

4 participants