[llm][serve] Materialize chat completion message content in sanitizer #63119

Merged
kouroshHakha merged 2 commits into ray-project:master from kouroshHakha:fix-chat-content-validator-iter on May 5, 2026
Conversation

@kouroshHakha

Contributor

Summary

_sanitize_chat_completion_request (in ray/llm/_internal/serve/core/ingress/ingress.py) only materialized tool_calls on assistant messages. It left content as a Pydantic ValidatorIterator whenever the request matched the Iterable[ContentPart] arm of the Union[str, Iterable[...], None] field type that OpenAI's TypedDicts use on every message variant. Cloudpickle then failed when the ingress forwarded the request to the LLM replica via model_handle.chat.remote(...), so any payload that sent content as a list of content parts (e.g. [{"text": "...", "type": "text"}]) returned a 500.

Plain-string content requests were unaffected, which is why this slipped through earlier.

This PR extends the sanitizer to also materialize content to a list on every message variant (system, user, assistant, tool). The existing tool_calls handling is unchanged.
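
A minimal sketch of the materialization idea, not the actual Ray implementation: it works on plain dicts, and the helper names `materialize_message_fields` and `lazy_parts` are hypothetical. A generator stands in for Pydantic's ValidatorIterator, since both are lazy iterables that `pickle` refuses to serialize.

```python
import pickle
from typing import Any, Dict, Iterable, Iterator, List


def materialize_message_fields(messages: List[Dict[str, Any]]) -> None:
    """Replace lazy iterables in `content` and `tool_calls` with lists, in place."""
    for message in messages:
        content = message.get("content")
        # Strings and None already pickle fine; only lazy iterables need work.
        if content is not None and not isinstance(content, str):
            message["content"] = list(content)
        tool_calls = message.get("tool_calls")
        if tool_calls is not None:
            message["tool_calls"] = list(tool_calls)


def lazy_parts(parts: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    # Hypothetical stand-in for ValidatorIterator: generators are unpicklable too.
    yield from parts


messages = [
    {"role": "system", "content": "be brief"},
    {"role": "user", "content": lazy_parts([{"text": "hi", "type": "text"}])},
    {
        "role": "assistant",
        "content": lazy_parts([{"text": "ok", "type": "text"}]),
        "tool_calls": lazy_parts([]),
    },
]

materialize_message_fields(messages)
# After materialization the whole structure round-trips through pickle.
restored = pickle.loads(pickle.dumps(messages))
```

The string check matters: calling `list()` on a plain-string `content` would split it into characters, so string and `None` values are left untouched.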

It also drops the optimistic TODO(seiji): Remove when we update to Pydantic v2.11+ — the Iterable[...] Union-arm bug is still reproducible on Pydantic 2.12, so this workaround is required regardless of Pydantic version.

Repro (before)

from ray.llm._internal.serve.core.configs.openai_api_models import ChatCompletionRequest
import pickle

body = ChatCompletionRequest.model_validate({
    "model": "qwen35-9b",
    "messages": [
        {"role": "user", "content": [{"text": "hi", "type": "text"}]},
        {"role": "assistant", "content": [{"text": "ok", "type": "text"}]},
    ],
})
pickle.dumps(body)
# TypeError: cannot pickle 'pydantic_core._pydantic_core.ValidatorIterator' object

After this PR (running through _sanitize_chat_completion_request), the request pickles successfully.

Test plan

  • New unit test test_serializes_content_iterator exercises a payload with list-of-content-parts on every role and asserts the result pickles.
  • Existing tests test_serializes_tool_calls_iterator and test_handles_no_tool_calls still pass.
  • bash ci/lint/lint.sh pre_commit clean.
  • CI green.
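
The shape of the per-role check described in the test plan could look roughly like the sketch below. This is not the actual `test_serializes_content_iterator`: it uses only the standard library, the helpers `make_lazy_message`, `sanitize`, and `pickles` are hypothetical, and a generator is assumed as a stand-in for the unpicklable ValidatorIterator.

```python
import pickle
from typing import Any, Dict

ROLES = ("system", "user", "assistant", "tool")


def make_lazy_message(role: str) -> Dict[str, Any]:
    # A generator stands in for ValidatorIterator (assumption, not Pydantic itself).
    parts = [{"type": "text", "text": role}]
    return {"role": role, "content": (part for part in parts)}


def sanitize(message: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical minimal sanitizer: materialize non-string content to a list.
    content = message.get("content")
    if content is not None and not isinstance(content, str):
        message["content"] = list(content)
    return message


def pickles(obj: Any) -> bool:
    """Return True if `obj` can be serialized with pickle."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


# For each role, record (pickles before sanitizing, pickles after sanitizing).
results = {
    role: (pickles(make_lazy_message(role)), pickles(sanitize(make_lazy_message(role))))
    for role in ROLES
}
```

Checking the unsanitized message as well guards against a test that passes trivially because the input was never lazy in the first place.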

🤖 Generated with Claude Code

@kouroshHakha kouroshHakha requested a review from a team as a code owner May 5, 2026 04:14

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request updates the _sanitize_chat_completion_request function to address a Pydantic serialization bug where content fields in chat messages are stored as non-picklable ValidatorIterator objects. The changes ensure these fields are materialized into lists across all message roles, and a new test case has been added to verify picklability. Feedback suggests a more idiomatic approach to updating the message dictionary by utilizing the existing reference.

Comment thread python/ray/llm/_internal/serve/core/ingress/ingress.py Outdated
@eicherseiji eicherseiji added the "go" label (add ONLY when ready to merge, run all tests) May 5, 2026
@ray-gardener ray-gardener Bot added the "serve" label (Ray Serve Related Issue) May 5, 2026
kouroshHakha and others added 2 commits May 5, 2026 08:09
`_sanitize_chat_completion_request` only materialized `tool_calls` on
assistant messages, leaving `content` as a Pydantic ValidatorIterator
whenever the request matched the `Iterable[ContentPart]` arm of the
`Union[str, Iterable[...], None]` field type. Cloudpickle then failed
when the ingress forwarded the request to the LLM replica, returning
500s for any payload that sends `content` as a list of content parts.

Extend the sanitizer to also materialize `content` to a list on every
message variant (system, user, assistant, tool). The existing
`tool_calls` handling is unchanged.

Also drops the misleading "Remove when we update to Pydantic v2.11+"
TODO -- the `Iterable[...]` Union arm is still affected on Pydantic
2.12, so this workaround is required regardless of Pydantic version.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
…ssages[i]`

Mutate the message dict via the local `message` reference rather than
re-indexing through `request.messages[i]`. Both refer to the same dict
after the model_dump line above, so this is a pure readability change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
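
The readability change in the second commit relies on the loop variable and the indexed element being the same dict object. A small illustration on plain dicts follows; it is a sketch, not the code from ingress.py, and it assumes the list already holds plain dicts at this point, as the commit message states.

```python
from typing import Any, Dict, List

messages: List[Dict[str, Any]] = [
    {"role": "user", "content": iter([{"type": "text", "text": "hi"}])},
]

for i, message in enumerate(messages):
    # `message` and `messages[i]` name the same dict, so mutating one
    # is visible through the other; no re-indexing is needed.
    same_object = message is messages[i]
    message["content"] = list(message["content"])

materialized = messages[0]["content"]
```

This equivalence would not hold if the loop rebound `message` to a fresh copy without storing that copy back into the list.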
@kouroshHakha kouroshHakha force-pushed the fix-chat-content-validator-iter branch from e44c52d to c283d26 Compare May 5, 2026 15:10
@kouroshHakha kouroshHakha merged commit 3c3bfc5 into ray-project:master May 5, 2026
5 of 6 checks passed
Lucas61000 pushed a commit to Lucas61000/ray that referenced this pull request May 15, 2026
…er (ray-project#63119)

Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
3 participants