[BugFix] Omit empty tool_calls from OpenAI chat responses by QwertyJack · Pull Request #44105 · vllm-project/vllm

QwertyJack · 2026-05-31T09:57:47Z

What this PR does / why we need it

This PR omits empty tool_calls arrays from serialized OpenAI chat completion responses:

non-stream message.tool_calls == []
stream delta.tool_calls == []

Non-empty tool calls are preserved unchanged.

This fixes an OpenAI API compatibility issue where final assistant responses after a tool result can contain normal text with finish_reason="stop", but still serialize tool_calls: []. OpenAI SDK clients then see message.tool_calls is not None, enter the tool-call path, and fail when indexing the empty list.

Does this PR introduce any user-facing change?

Yes. OpenAI-compatible chat completion responses no longer include empty tool_calls arrays. This aligns the response shape with the absence of tool calls while preserving non-empty tool-call payloads.

Duplicate-work check

gh issue view 44104 --repo vllm-project/vllm --comments
gh pr list --repo vllm-project/vllm --state open --search "44104 in:body"
gh pr list --repo vllm-project/vllm --state open --search "empty tool_calls OpenAI chat responses"

Result: #44105 is the only open PR directly addressing #44104 / empty tool_calls response serialization. Broader keyword hits are unrelated parser/model changes.

How was this patch tested?

PYTHONPATH=. .venv/bin/python -m pytest -q tests/entrypoints/openai/test_tool_choice_content_none.py
uv run --with ruff --no-project ruff check vllm/entrypoints/openai/chat_completion/protocol.py vllm/entrypoints/openai/engine/protocol.py tests/entrypoints/openai/test_tool_choice_content_none.py tests/entrypoints/openai/chat_completion/test_completion_with_function_calling.py tests/entrypoints/openai/chat_completion/test_serving_chat.py

Results:

pytest: 6 passed
ruff: passed

AI assistance

AI assistance was used to investigate the CI failure, add the serializer guard for unset default tool_calls, and align affected OpenAI SDK integration test expectations with omitted empty tool_calls. The human submitter should review every changed line and the final CI result.

hclsys · 2026-05-31T10:32:07Z

nice — mirrors the existing ChatCompletionToolsParam._serialize shape at protocol.py:166. one note: this changes the dict shape for any in-process consumer that previously read message["tool_calls"] expecting [] (callbacks, loggers, etc.) — they'd now hit KeyError. probably fine since the public surface is JSON over the wire, but worth a quick git grep for \.tool_calls\b consumers inside vllm just to be safe.

QwertyJack · 2026-05-31T11:23:39Z

Thanks, checked this.

Commands run:

git grep -n '\.tool_calls\b' -- vllm
git grep -n '\["tool_calls"\]' -- vllm
rg -n 'model_dump\(|model_dump_json\(' vllm/entrypoints/openai vllm/entrypoints/anthropic

Findings:

The .tool_calls production hits are object/attribute consumers (choice.message.tool_calls, delta_message.tool_calls, parser results, etc.) before serialization. Those still see the model field as []; the serializer only changes the dumped payload.
Direct message["tool_calls"] production hits are request/conversation message transforms, mostly guarded by message.get("tool_calls") or a prior presence check. I did not find a production path that calls model_dump() on ChatMessage/DeltaMessage and then directly indexes payload["tool_calls"] expecting an empty list.
The OpenAI chat response model_dump() / model_dump_json() call sites are the outbound API paths. The Harmony conversion paths that dump Pydantic messages use .get("tool_calls", []), so omitted empty fields are handled.

So I think the current shape is safe for in-process vLLM consumers while fixing the over-the-wire OpenAI-compatible payload.

…9792) ## What this PR does / why we need it? Backport of #9791 for #9790 to `releases/v0.20.2rc`. This adds the same local monkey patch for the upstream vLLM OpenAI API compatibility bug where final assistant responses after a tool result can serialize `tool_calls: []` even though `finish_reason="stop"` and the response contains normal assistant text. The patch removes empty `tool_calls` arrays from: - non-stream chat completion `message` - stream chat completion `delta` Non-empty tool calls are preserved unchanged. Upstream vLLM issue: vllm-project/vllm#44104 Upstream vLLM PR: vllm-project/vllm#44105 ## Does this PR introduce _any_ user-facing change? Yes. OpenAI-compatible chat completion responses no longer include empty `tool_calls` arrays on the v0.20.2rc release branch. ## How was this patch tested? On `fix/omit-empty-tool-calls-v0.20.2rc`: - `ruff check vllm_ascend/patch/platform/patch_tool_choice_none_content.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` - `git diff --check` - `pytest -q tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` The pytest run passed: `13 passed, 16 warnings`. Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>

## What this PR does / why we need it? Fixes #9790. This adds a local monkey patch for an upstream vLLM OpenAI API compatibility bug where final assistant responses after a tool result can serialize `tool_calls: []` even though `finish_reason="stop"` and the response contains normal assistant text. OpenAI SDK clients treat `message.tool_calls is not None` as an active tool-call path, so an empty list can make client loops fail with `IndexError`. The patch removes empty `tool_calls` arrays from: - non-stream chat completion `message` - stream chat completion `delta` Non-empty tool calls are preserved unchanged. Upstream vLLM issue: vllm-project/vllm#44104 Upstream vLLM PR: vllm-project/vllm#44105 ## Does this PR introduce _any_ user-facing change? Yes. OpenAI-compatible chat completion responses no longer include empty `tool_calls` arrays. This aligns the payload with the absence of tool calls and avoids OpenAI SDK clients entering a false tool-call loop. ## How was this patch tested? - `ruff check vllm_ascend/patch/platform/patch_tool_choice_none_content.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` - `git diff --check` - `pytest -q tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` The pytest run passed: `13 passed, 16 warnings`. - vLLM version: v0.20.2 - vLLM main: vllm-project/vllm@39910f2 Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>

…ct#9791) ## What this PR does / why we need it? Fixes vllm-project#9790. This adds a local monkey patch for an upstream vLLM OpenAI API compatibility bug where final assistant responses after a tool result can serialize `tool_calls: []` even though `finish_reason="stop"` and the response contains normal assistant text. OpenAI SDK clients treat `message.tool_calls is not None` as an active tool-call path, so an empty list can make client loops fail with `IndexError`. The patch removes empty `tool_calls` arrays from: - non-stream chat completion `message` - stream chat completion `delta` Non-empty tool calls are preserved unchanged. Upstream vLLM issue: vllm-project/vllm#44104 Upstream vLLM PR: vllm-project/vllm#44105 ## Does this PR introduce _any_ user-facing change? Yes. OpenAI-compatible chat completion responses no longer include empty `tool_calls` arrays. This aligns the payload with the absence of tool calls and avoids OpenAI SDK clients entering a false tool-call loop. ## How was this patch tested? - `ruff check vllm_ascend/patch/platform/patch_tool_choice_none_content.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` - `git diff --check` - `pytest -q tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` The pytest run passed: `13 passed, 16 warnings`. - vLLM version: v0.20.2 - vLLM main: vllm-project/vllm@39910f2 Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Signed-off-by: yilunh <hanyilun1@huawei.com>

…ct#9791) ## What this PR does / why we need it? Fixes vllm-project#9790. This adds a local monkey patch for an upstream vLLM OpenAI API compatibility bug where final assistant responses after a tool result can serialize `tool_calls: []` even though `finish_reason="stop"` and the response contains normal assistant text. OpenAI SDK clients treat `message.tool_calls is not None` as an active tool-call path, so an empty list can make client loops fail with `IndexError`. The patch removes empty `tool_calls` arrays from: - non-stream chat completion `message` - stream chat completion `delta` Non-empty tool calls are preserved unchanged. Upstream vLLM issue: vllm-project/vllm#44104 Upstream vLLM PR: vllm-project/vllm#44105 ## Does this PR introduce _any_ user-facing change? Yes. OpenAI-compatible chat completion responses no longer include empty `tool_calls` arrays. This aligns the payload with the absence of tool calls and avoids OpenAI SDK clients entering a false tool-call loop. ## How was this patch tested? - `ruff check vllm_ascend/patch/platform/patch_tool_choice_none_content.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` - `git diff --check` - `pytest -q tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` The pytest run passed: `13 passed, 16 warnings`. - vLLM version: v0.20.2 - vLLM main: vllm-project/vllm@39910f2 Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>

mergify · 2026-06-03T15:44:27Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @QwertyJack.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

…ct#9791) ## What this PR does / why we need it? Fixes vllm-project#9790. This adds a local monkey patch for an upstream vLLM OpenAI API compatibility bug where final assistant responses after a tool result can serialize `tool_calls: []` even though `finish_reason="stop"` and the response contains normal assistant text. OpenAI SDK clients treat `message.tool_calls is not None` as an active tool-call path, so an empty list can make client loops fail with `IndexError`. The patch removes empty `tool_calls` arrays from: - non-stream chat completion `message` - stream chat completion `delta` Non-empty tool calls are preserved unchanged. Upstream vLLM issue: vllm-project/vllm#44104 Upstream vLLM PR: vllm-project/vllm#44105 ## Does this PR introduce _any_ user-facing change? Yes. OpenAI-compatible chat completion responses no longer include empty `tool_calls` arrays. This aligns the payload with the absence of tool calls and avoids OpenAI SDK clients entering a false tool-call loop. ## How was this patch tested? - `ruff check vllm_ascend/patch/platform/patch_tool_choice_none_content.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` - `git diff --check` - `pytest -q tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` The pytest run passed: `13 passed, 16 warnings`. - vLLM version: v0.20.2 - vLLM main: vllm-project/vllm@39910f2 Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Signed-off-by: shenqiangqiang <2416602906@qq.com>

…ct#9791) ## What this PR does / why we need it? Fixes vllm-project#9790. This adds a local monkey patch for an upstream vLLM OpenAI API compatibility bug where final assistant responses after a tool result can serialize `tool_calls: []` even though `finish_reason="stop"` and the response contains normal assistant text. OpenAI SDK clients treat `message.tool_calls is not None` as an active tool-call path, so an empty list can make client loops fail with `IndexError`. The patch removes empty `tool_calls` arrays from: - non-stream chat completion `message` - stream chat completion `delta` Non-empty tool calls are preserved unchanged. Upstream vLLM issue: vllm-project/vllm#44104 Upstream vLLM PR: vllm-project/vllm#44105 ## Does this PR introduce _any_ user-facing change? Yes. OpenAI-compatible chat completion responses no longer include empty `tool_calls` arrays. This aligns the payload with the absence of tool calls and avoids OpenAI SDK clients entering a false tool-call loop. ## How was this patch tested? - `ruff check vllm_ascend/patch/platform/patch_tool_choice_none_content.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` - `git diff --check` - `pytest -q tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` The pytest run passed: `13 passed, 16 warnings`. - vLLM version: v0.20.2 - vLLM main: vllm-project/vllm@39910f2 Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>

chaunceyjiang

LGTM

Remove empty tool_calls arrays from serialized chat completion messages and streaming deltas while preserving non-empty tool calls. This keeps the response compatible with OpenAI clients that treat tool_calls=[] as an active tool-call path. Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>

Signed-off-by: Chauncey <chaunceyjiang@gmail.com>

Co-authored-by: OpenAI Codex <codex@openai.com> Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>

…ct#44105) Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Signed-off-by: Chauncey <chaunceyjiang@gmail.com> Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Co-authored-by: Chauncey <chaunceyjiang@gmail.com>

…ct#44105) Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Signed-off-by: Chauncey <chaunceyjiang@gmail.com> Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Co-authored-by: Chauncey <chaunceyjiang@gmail.com> Signed-off-by: Qiang Li <qiang.li2@amd.com>

### What this PR does / why we need it? Narrows `patch_tool_choice_none_content.py` after the main2main update to vLLM `v0.23.0`. vLLM `v0.23.0` already includes vllm-project/vllm#40148, which tolerates forced tool-choice parsing when reasoning extraction leaves `content=None`, so this PR removes the local `DelegatingParser._parse_tool_calls` monkey patch. The patch still keeps the empty `tool_calls` serializer shim because vllm-project/vllm#44105 is not included in vLLM `v0.23.0`. The shim now covers both non-streaming and streaming JSON payloads, and `model_dump_json()` defaults to `mode="json"` before calling `json.dumps()`. - vLLM version: v0.23.0 - vLLM main: vllm-project/vllm@967c5c3 Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>

…ct#9791) ## What this PR does / why we need it? Fixes vllm-project#9790. This adds a local monkey patch for an upstream vLLM OpenAI API compatibility bug where final assistant responses after a tool result can serialize `tool_calls: []` even though `finish_reason="stop"` and the response contains normal assistant text. OpenAI SDK clients treat `message.tool_calls is not None` as an active tool-call path, so an empty list can make client loops fail with `IndexError`. The patch removes empty `tool_calls` arrays from: - non-stream chat completion `message` - stream chat completion `delta` Non-empty tool calls are preserved unchanged. Upstream vLLM issue: vllm-project/vllm#44104 Upstream vLLM PR: vllm-project/vllm#44105 ## Does this PR introduce _any_ user-facing change? Yes. OpenAI-compatible chat completion responses no longer include empty `tool_calls` arrays. This aligns the payload with the absence of tool calls and avoids OpenAI SDK clients entering a false tool-call loop. ## How was this patch tested? - `ruff check vllm_ascend/patch/platform/patch_tool_choice_none_content.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` - `git diff --check` - `pytest -q tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` The pytest run passed: `13 passed, 16 warnings`. - vLLM version: v0.20.2 - vLLM main: vllm-project/vllm@39910f2 Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>

…-project#10903) ### What this PR does / why we need it? Narrows `patch_tool_choice_none_content.py` after the main2main update to vLLM `v0.23.0`. vLLM `v0.23.0` already includes vllm-project/vllm#40148, which tolerates forced tool-choice parsing when reasoning extraction leaves `content=None`, so this PR removes the local `DelegatingParser._parse_tool_calls` monkey patch. The patch still keeps the empty `tool_calls` serializer shim because vllm-project/vllm#44105 is not included in vLLM `v0.23.0`. The shim now covers both non-streaming and streaming JSON payloads, and `model_dump_json()` defaults to `mode="json"` before calling `json.dumps()`. - vLLM version: v0.23.0 - vLLM main: vllm-project/vllm@967c5c3 Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>

…ct#9791) ## What this PR does / why we need it? Fixes vllm-project#9790. This adds a local monkey patch for an upstream vLLM OpenAI API compatibility bug where final assistant responses after a tool result can serialize `tool_calls: []` even though `finish_reason="stop"` and the response contains normal assistant text. OpenAI SDK clients treat `message.tool_calls is not None` as an active tool-call path, so an empty list can make client loops fail with `IndexError`. The patch removes empty `tool_calls` arrays from: - non-stream chat completion `message` - stream chat completion `delta` Non-empty tool calls are preserved unchanged. Upstream vLLM issue: vllm-project/vllm#44104 Upstream vLLM PR: vllm-project/vllm#44105 ## Does this PR introduce _any_ user-facing change? Yes. OpenAI-compatible chat completion responses no longer include empty `tool_calls` arrays. This aligns the payload with the absence of tool calls and avoids OpenAI SDK clients entering a false tool-call loop. ## How was this patch tested? - `ruff check vllm_ascend/patch/platform/patch_tool_choice_none_content.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` - `git diff --check` - `pytest -q tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py tests/ut/patch/platform/test_patch_tool_choice_none_content.py` The pytest run passed: `13 passed, 16 warnings`. - vLLM version: v0.20.2 - vLLM main: vllm-project/vllm@39910f2 Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>

…ct#44105) Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Signed-off-by: Chauncey <chaunceyjiang@gmail.com> Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Co-authored-by: Chauncey <chaunceyjiang@gmail.com>

QwertyJack requested review from AndreasKaratzas, DarkLight1337, NickLucche, aarnphm, chaunceyjiang, robertgshaw2-redhat and russellb as code owners May 31, 2026 09:57

mergify Bot added frontend tool-calling bug Something isn't working labels May 31, 2026

github-project-automation Bot added this to Tool Calling May 31, 2026

QwertyJack force-pushed the fix/omit-empty-tool-calls branch from b6ca136 to 22a40e9 Compare June 1, 2026 14:48

mergify Bot added the needs-rebase label Jun 3, 2026

QwertyJack force-pushed the fix/omit-empty-tool-calls branch from 22a40e9 to ae36b1e Compare June 5, 2026 10:21

mergify Bot removed the needs-rebase label Jun 5, 2026

QwertyJack force-pushed the fix/omit-empty-tool-calls branch from ae36b1e to 03ce98d Compare June 22, 2026 03:31

chaunceyjiang reviewed Jun 22, 2026

View reviewed changes

Comment thread vllm/entrypoints/openai/chat_completion/protocol.py Outdated

chaunceyjiang reviewed Jun 22, 2026

View reviewed changes

Comment thread vllm/entrypoints/openai/engine/protocol.py Outdated

chaunceyjiang added the ready ONLY add when PR is ready to merge/full CI is needed label Jun 22, 2026

chaunceyjiang approved these changes Jun 22, 2026

View reviewed changes

QwertyJack and others added 5 commits June 23, 2026 09:36

Update vllm/entrypoints/openai/chat_completion/protocol.py

5df28e7

Signed-off-by: Chauncey <chaunceyjiang@gmail.com>

Update vllm/entrypoints/openai/engine/protocol.py

121c74a

Signed-off-by: Chauncey <chaunceyjiang@gmail.com>

fix(openai): handle unset empty tool calls in serializers

d9b92d1

Co-authored-by: OpenAI Codex <codex@openai.com> Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>

test(openai): expect omitted empty tool calls

29080b9

Co-authored-by: OpenAI Codex <codex@openai.com> Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>

QwertyJack force-pushed the fix/omit-empty-tool-calls branch from 2efc7d3 to 29080b9 Compare June 23, 2026 01:36

chaunceyjiang merged commit 6c427dd into vllm-project:main Jun 23, 2026
58 checks passed

github-project-automation Bot moved this to Done in Tool Calling Jun 23, 2026

QwertyJack mentioned this pull request Jul 3, 2026

[BugFix][Parser] Avoid partial MiniMax parameter arguments vllm-project/vllm-ascend#11384

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[BugFix] Omit empty tool_calls from OpenAI chat responses#44105

[BugFix] Omit empty tool_calls from OpenAI chat responses#44105
chaunceyjiang merged 5 commits into
vllm-project:mainfrom
QwertyJack:fix/omit-empty-tool-calls

QwertyJack commented May 31, 2026 •

edited

Loading

hclsys commented May 31, 2026

QwertyJack commented May 31, 2026

mergify Bot commented Jun 3, 2026

Uh oh!

Uh oh!

chaunceyjiang left a comment

Uh oh!

Labels

3 participants

Uh oh!

Uh oh!

Conversation

QwertyJack commented May 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this PR does / why we need it

Does this PR introduce any user-facing change?

Duplicate-work check

How was this patch tested?

AI assistance

hclsys commented May 31, 2026

QwertyJack commented May 31, 2026

mergify Bot commented Jun 3, 2026

Uh oh!

Uh oh!

chaunceyjiang left a comment

Choose a reason for hiding this comment

Uh oh!

Labels

3 participants

QwertyJack commented May 31, 2026 •

edited

Loading