[llm][serve] Materialize chat completion message content in sanitizer#63119
Merged
kouroshHakha merged 2 commits intoMay 5, 2026
Merged
Conversation
Contributor
There was a problem hiding this comment.
Code Review
This pull request updates the _sanitize_chat_completion_request function to address a Pydantic serialization bug where content fields in chat messages are stored as non-picklable ValidatorIterator objects. The changes ensure these fields are materialized into lists across all message roles, and a new test case has been added to verify picklability. Feedback suggests a more idiomatic approach to updating the message dictionary by utilizing the existing reference.
eicherseiji
approved these changes
May 5, 2026
`_sanitize_chat_completion_request` only materialized `tool_calls` on assistant messages, leaving `content` as a Pydantic ValidatorIterator whenever the request matched the `Iterable[ContentPart]` arm of the `Union[str, Iterable[...], None]` field type. Cloudpickle then failed when the ingress forwarded the request to the LLM replica, returning 500s for any payload that sends `content` as a list of content parts. Extend the sanitizer to also materialize `content` to a list on every message variant (system, user, assistant, tool). The existing `tool_calls` handling is unchanged. Also drops the misleading "Remove when we update to Pydantic v2.11+" TODO -- the `Iterable[...]` Union arm is still affected on Pydantic 2.12, so this workaround is required regardless of Pydantic version. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
…ssages[i]` Mutate the message dict via the local `message` reference rather than re-indexing through `request.messages[i]`. Both refer to the same dict after the model_dump line above, so this is a pure readability change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
e44c52d to
c283d26
Compare
jeffreywang88
approved these changes
May 5, 2026
Lucas61000
pushed a commit
to Lucas61000/ray
that referenced
this pull request
May 15, 2026
…er (ray-project#63119) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
_sanitize_chat_completion_request(inray/llm/_internal/serve/core/ingress/ingress.py) only materializedtool_callson assistant messages, leavingcontentas a PydanticValidatorIteratorwhenever the request matched theIterable[ContentPart]arm of theUnion[str, Iterable[...], None]field type that OpenAI's TypedDicts use on every message variant. Cloudpickle then failed when the ingress forwarded the request to the LLM replica viamodel_handle.chat.remote(...), returning 500s for any payload that sendscontentas a list of content parts (e.g.[{"text": "...", "type": "text"}]).Plain-string
contentrequests were unaffected, which is why this slipped through earlier.This PR extends the sanitizer to also materialize
contentto a list on every message variant (system, user, assistant, tool). The existingtool_callshandling is unchanged.It also drops the optimistic
TODO(seiji): Remove when we update to Pydantic v2.11+— theIterable[...]Union-arm bug is still reproducible on Pydantic 2.12, so this workaround is required regardless of Pydantic version.Repro (before)
After this PR (running through
_sanitize_chat_completion_request), the request pickles successfully.Related
tool_callsiterator before processing by mistral-common when MistralTokenizer is used vllm-project/vllm#9951TypedDictfield hasValidationIteratorinstead of the original value pydantic/pydantic#9467pydantic-coreValidatorIteratorinstance pydantic/pydantic#9541tool_callsonly): [serve.llm] Serialize ChatCompletionRequest tool_calls ValidatorIterator object #55538 / [llm] fails to serialize assistant message withtool_calls#55294Test plan
test_serializes_content_iteratorexercises a payload with list-of-content-parts on every role and asserts the result pickles.test_serializes_tool_calls_iteratorandtest_handles_no_tool_callsstill pass.bash ci/lint/lint.sh pre_commitclean.🤖 Generated with Claude Code