Skip to content

[BugFix] Omit empty tool_calls from OpenAI chat responses#44105

Merged
chaunceyjiang merged 5 commits into
vllm-project:mainfrom
QwertyJack:fix/omit-empty-tool-calls
Jun 23, 2026
Merged

[BugFix] Omit empty tool_calls from OpenAI chat responses#44105
chaunceyjiang merged 5 commits into
vllm-project:mainfrom
QwertyJack:fix/omit-empty-tool-calls

Conversation

@QwertyJack

@QwertyJack QwertyJack commented May 31, 2026

Copy link
Copy Markdown
Contributor

Fixes #44104.

What this PR does / why we need it

This PR omits empty tool_calls arrays from serialized OpenAI chat completion responses:

  • non-stream message.tool_calls == []
  • stream delta.tool_calls == []

Non-empty tool calls are preserved unchanged.

This fixes an OpenAI API compatibility issue where final assistant responses after a tool result can contain normal text with finish_reason="stop", but still serialize tool_calls: []. OpenAI SDK clients then see message.tool_calls is not None, enter the tool-call path, and fail when indexing the empty list.

Does this PR introduce any user-facing change?

Yes. OpenAI-compatible chat completion responses no longer include empty tool_calls arrays. This aligns the response shape with the absence of tool calls while preserving non-empty tool-call payloads.

Duplicate-work check

  • gh issue view 44104 --repo vllm-project/vllm --comments
  • gh pr list --repo vllm-project/vllm --state open --search "44104 in:body"
  • gh pr list --repo vllm-project/vllm --state open --search "empty tool_calls OpenAI chat responses"

Result: #44105 is the only open PR directly addressing #44104 / empty tool_calls response serialization. Broader keyword hits are unrelated parser/model changes.

How was this patch tested?

  • PYTHONPATH=. .venv/bin/python -m pytest -q tests/entrypoints/openai/test_tool_choice_content_none.py
  • uv run --with ruff --no-project ruff check vllm/entrypoints/openai/chat_completion/protocol.py vllm/entrypoints/openai/engine/protocol.py tests/entrypoints/openai/test_tool_choice_content_none.py tests/entrypoints/openai/chat_completion/test_completion_with_function_calling.py tests/entrypoints/openai/chat_completion/test_serving_chat.py

Results:

  • pytest: 6 passed
  • ruff: passed

AI assistance

AI assistance was used to investigate the CI failure, add the serializer guard for unset default tool_calls, and align affected OpenAI SDK integration test expectations with omitted empty tool_calls. The human submitter should review every changed line and the final CI result.

@hclsys

hclsys commented May 31, 2026

Copy link
Copy Markdown
Contributor

nice — mirrors the existing ChatCompletionToolsParam._serialize shape at protocol.py:166. one note: this changes the dict shape for any in-process consumer that previously read message["tool_calls"] expecting [] (callbacks, loggers, etc.) — they'd now hit KeyError. probably fine since the public surface is JSON over the wire, but worth a quick git grep for \.tool_calls\b consumers inside vllm just to be safe.

@QwertyJack

Copy link
Copy Markdown
Contributor Author

Thanks, checked this.

Commands run:

git grep -n '\.tool_calls\b' -- vllm
git grep -n '\["tool_calls"\]' -- vllm
rg -n 'model_dump\(|model_dump_json\(' vllm/entrypoints/openai vllm/entrypoints/anthropic

Findings:

  • The .tool_calls production hits are object/attribute consumers (choice.message.tool_calls, delta_message.tool_calls, parser results, etc.) before serialization. Those still see the model field as []; the serializer only changes the dumped payload.
  • Direct message["tool_calls"] production hits are request/conversation message transforms, mostly guarded by message.get("tool_calls") or a prior presence check. I did not find a production path that calls model_dump() on ChatMessage/DeltaMessage and then directly indexes payload["tool_calls"] expecting an empty list.
  • The OpenAI chat response model_dump() / model_dump_json() call sites are the outbound API paths. The Harmony conversion paths that dump Pydantic messages use .get("tool_calls", []), so omitted empty fields are handled.

So I think the current shape is safe for in-process vLLM consumers while fixing the over-the-wire OpenAI-compatible payload.

linfeng-yuan pushed a commit to vllm-project/vllm-ascend that referenced this pull request Jun 1, 2026
…9792)

## What this PR does / why we need it?

Backport of #9791 for #9790 to `releases/v0.20.2rc`.

This adds the same local monkey patch for the upstream vLLM OpenAI API
compatibility bug where final assistant responses after a tool result
can serialize `tool_calls: []` even though `finish_reason="stop"` and
the response contains normal assistant text.

The patch removes empty `tool_calls` arrays from:

- non-stream chat completion `message`
- stream chat completion `delta`

Non-empty tool calls are preserved unchanged.

Upstream vLLM issue: vllm-project/vllm#44104
Upstream vLLM PR: vllm-project/vllm#44105

## Does this PR introduce _any_ user-facing change?

Yes. OpenAI-compatible chat completion responses no longer include empty
`tool_calls` arrays on the v0.20.2rc release branch.

## How was this patch tested?

On `fix/omit-empty-tool-calls-v0.20.2rc`:

- `ruff check
vllm_ascend/patch/platform/patch_tool_choice_none_content.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`
- `git diff --check`
- `pytest -q
tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`

The pytest run passed: `13 passed, 16 warnings`.

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
linfeng-yuan pushed a commit to vllm-project/vllm-ascend that referenced this pull request Jun 1, 2026
## What this PR does / why we need it?

Fixes #9790.

This adds a local monkey patch for an upstream vLLM OpenAI API
compatibility bug where final assistant responses after a tool result
can serialize `tool_calls: []` even though `finish_reason="stop"` and
the response contains normal assistant text.

OpenAI SDK clients treat `message.tool_calls is not None` as an active
tool-call path, so an empty list can make client loops fail with
`IndexError`.

The patch removes empty `tool_calls` arrays from:

- non-stream chat completion `message`
- stream chat completion `delta`

Non-empty tool calls are preserved unchanged.

Upstream vLLM issue: vllm-project/vllm#44104
Upstream vLLM PR: vllm-project/vllm#44105

## Does this PR introduce _any_ user-facing change?

Yes. OpenAI-compatible chat completion responses no longer include empty
`tool_calls` arrays. This aligns the payload with the absence of tool
calls and avoids OpenAI SDK clients entering a false tool-call loop.

## How was this patch tested?

- `ruff check
vllm_ascend/patch/platform/patch_tool_choice_none_content.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`
- `git diff --check`
- `pytest -q
tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`

The pytest run passed: `13 passed, 16 warnings`.

- vLLM version: v0.20.2
- vLLM main:
vllm-project/vllm@39910f2

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
@QwertyJack QwertyJack force-pushed the fix/omit-empty-tool-calls branch from b6ca136 to 22a40e9 Compare June 1, 2026 14:48
yilunh998 pushed a commit to yilunh998/vllm-ascend that referenced this pull request Jun 2, 2026
…ct#9791)

## What this PR does / why we need it?

Fixes vllm-project#9790.

This adds a local monkey patch for an upstream vLLM OpenAI API
compatibility bug where final assistant responses after a tool result
can serialize `tool_calls: []` even though `finish_reason="stop"` and
the response contains normal assistant text.

OpenAI SDK clients treat `message.tool_calls is not None` as an active
tool-call path, so an empty list can make client loops fail with
`IndexError`.

The patch removes empty `tool_calls` arrays from:

- non-stream chat completion `message`
- stream chat completion `delta`

Non-empty tool calls are preserved unchanged.

Upstream vLLM issue: vllm-project/vllm#44104
Upstream vLLM PR: vllm-project/vllm#44105

## Does this PR introduce _any_ user-facing change?

Yes. OpenAI-compatible chat completion responses no longer include empty
`tool_calls` arrays. This aligns the payload with the absence of tool
calls and avoids OpenAI SDK clients entering a false tool-call loop.

## How was this patch tested?

- `ruff check
vllm_ascend/patch/platform/patch_tool_choice_none_content.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`
- `git diff --check`
- `pytest -q
tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`

The pytest run passed: `13 passed, 16 warnings`.

- vLLM version: v0.20.2
- vLLM main:
vllm-project/vllm@39910f2

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Signed-off-by: yilunh <hanyilun1@huawei.com>
zzzzzmeng pushed a commit to zzzzzmeng/vllm-ascend that referenced this pull request Jun 2, 2026
…ct#9791)

## What this PR does / why we need it?

Fixes vllm-project#9790.

This adds a local monkey patch for an upstream vLLM OpenAI API
compatibility bug where final assistant responses after a tool result
can serialize `tool_calls: []` even though `finish_reason="stop"` and
the response contains normal assistant text.

OpenAI SDK clients treat `message.tool_calls is not None` as an active
tool-call path, so an empty list can make client loops fail with
`IndexError`.

The patch removes empty `tool_calls` arrays from:

- non-stream chat completion `message`
- stream chat completion `delta`

Non-empty tool calls are preserved unchanged.

Upstream vLLM issue: vllm-project/vllm#44104
Upstream vLLM PR: vllm-project/vllm#44105

## Does this PR introduce _any_ user-facing change?

Yes. OpenAI-compatible chat completion responses no longer include empty
`tool_calls` arrays. This aligns the payload with the absence of tool
calls and avoids OpenAI SDK clients entering a false tool-call loop.

## How was this patch tested?

- `ruff check
vllm_ascend/patch/platform/patch_tool_choice_none_content.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`
- `git diff --check`
- `pytest -q
tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`

The pytest run passed: `13 passed, 16 warnings`.

- vLLM version: v0.20.2
- vLLM main:
vllm-project/vllm@39910f2

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
@mergify

mergify Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @QwertyJack.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify Bot added the needs-rebase label Jun 3, 2026
@QwertyJack QwertyJack force-pushed the fix/omit-empty-tool-calls branch from 22a40e9 to ae36b1e Compare June 5, 2026 10:21
@mergify mergify Bot removed the needs-rebase label Jun 5, 2026
2416602906 pushed a commit to 2416602906/vllm-ascend that referenced this pull request Jun 8, 2026
…ct#9791)

## What this PR does / why we need it?

Fixes vllm-project#9790.

This adds a local monkey patch for an upstream vLLM OpenAI API
compatibility bug where final assistant responses after a tool result
can serialize `tool_calls: []` even though `finish_reason="stop"` and
the response contains normal assistant text.

OpenAI SDK clients treat `message.tool_calls is not None` as an active
tool-call path, so an empty list can make client loops fail with
`IndexError`.

The patch removes empty `tool_calls` arrays from:

- non-stream chat completion `message`
- stream chat completion `delta`

Non-empty tool calls are preserved unchanged.

Upstream vLLM issue: vllm-project/vllm#44104
Upstream vLLM PR: vllm-project/vllm#44105

## Does this PR introduce _any_ user-facing change?

Yes. OpenAI-compatible chat completion responses no longer include empty
`tool_calls` arrays. This aligns the payload with the absence of tool
calls and avoids OpenAI SDK clients entering a false tool-call loop.

## How was this patch tested?

- `ruff check
vllm_ascend/patch/platform/patch_tool_choice_none_content.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`
- `git diff --check`
- `pytest -q
tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`

The pytest run passed: `13 passed, 16 warnings`.

- vLLM version: v0.20.2
- vLLM main:
vllm-project/vllm@39910f2

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Signed-off-by: shenqiangqiang <2416602906@qq.com>
LostFox11 pushed a commit to LostFox11/vllm-ascend that referenced this pull request Jun 15, 2026
…ct#9791)

## What this PR does / why we need it?

Fixes vllm-project#9790.

This adds a local monkey patch for an upstream vLLM OpenAI API
compatibility bug where final assistant responses after a tool result
can serialize `tool_calls: []` even though `finish_reason="stop"` and
the response contains normal assistant text.

OpenAI SDK clients treat `message.tool_calls is not None` as an active
tool-call path, so an empty list can make client loops fail with
`IndexError`.

The patch removes empty `tool_calls` arrays from:

- non-stream chat completion `message`
- stream chat completion `delta`

Non-empty tool calls are preserved unchanged.

Upstream vLLM issue: vllm-project/vllm#44104
Upstream vLLM PR: vllm-project/vllm#44105

## Does this PR introduce _any_ user-facing change?

Yes. OpenAI-compatible chat completion responses no longer include empty
`tool_calls` arrays. This aligns the payload with the absence of tool
calls and avoids OpenAI SDK clients entering a false tool-call loop.

## How was this patch tested?

- `ruff check
vllm_ascend/patch/platform/patch_tool_choice_none_content.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`
- `git diff --check`
- `pytest -q
tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`

The pytest run passed: `13 passed, 16 warnings`.

- vLLM version: v0.20.2
- vLLM main:
vllm-project/vllm@39910f2

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
LostFox11 pushed a commit to LostFox11/vllm-ascend that referenced this pull request Jun 15, 2026
…ct#9791)

## What this PR does / why we need it?

Fixes vllm-project#9790.

This adds a local monkey patch for an upstream vLLM OpenAI API
compatibility bug where final assistant responses after a tool result
can serialize `tool_calls: []` even though `finish_reason="stop"` and
the response contains normal assistant text.

OpenAI SDK clients treat `message.tool_calls is not None` as an active
tool-call path, so an empty list can make client loops fail with
`IndexError`.

The patch removes empty `tool_calls` arrays from:

- non-stream chat completion `message`
- stream chat completion `delta`

Non-empty tool calls are preserved unchanged.

Upstream vLLM issue: vllm-project/vllm#44104
Upstream vLLM PR: vllm-project/vllm#44105

## Does this PR introduce _any_ user-facing change?

Yes. OpenAI-compatible chat completion responses no longer include empty
`tool_calls` arrays. This aligns the payload with the absence of tool
calls and avoids OpenAI SDK clients entering a false tool-call loop.

## How was this patch tested?

- `ruff check
vllm_ascend/patch/platform/patch_tool_choice_none_content.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`
- `git diff --check`
- `pytest -q
tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`

The pytest run passed: `13 passed, 16 warnings`.

- vLLM version: v0.20.2
- vLLM main:
vllm-project/vllm@39910f2

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
ader47 pushed a commit to ader47/vllm-ascend that referenced this pull request Jun 18, 2026
…ct#9791)

## What this PR does / why we need it?

Fixes vllm-project#9790.

This adds a local monkey patch for an upstream vLLM OpenAI API
compatibility bug where final assistant responses after a tool result
can serialize `tool_calls: []` even though `finish_reason="stop"` and
the response contains normal assistant text.

OpenAI SDK clients treat `message.tool_calls is not None` as an active
tool-call path, so an empty list can make client loops fail with
`IndexError`.

The patch removes empty `tool_calls` arrays from:

- non-stream chat completion `message`
- stream chat completion `delta`

Non-empty tool calls are preserved unchanged.

Upstream vLLM issue: vllm-project/vllm#44104
Upstream vLLM PR: vllm-project/vllm#44105

## Does this PR introduce _any_ user-facing change?

Yes. OpenAI-compatible chat completion responses no longer include empty
`tool_calls` arrays. This aligns the payload with the absence of tool
calls and avoids OpenAI SDK clients entering a false tool-call loop.

## How was this patch tested?

- `ruff check
vllm_ascend/patch/platform/patch_tool_choice_none_content.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`
- `git diff --check`
- `pytest -q
tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`

The pytest run passed: `13 passed, 16 warnings`.

- vLLM version: v0.20.2
- vLLM main:
vllm-project/vllm@39910f2

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
@QwertyJack QwertyJack force-pushed the fix/omit-empty-tool-calls branch from ae36b1e to 03ce98d Compare June 22, 2026 03:31
Comment thread vllm/entrypoints/openai/chat_completion/protocol.py Outdated
Comment thread vllm/entrypoints/openai/engine/protocol.py Outdated
@chaunceyjiang chaunceyjiang added the ready ONLY add when PR is ready to merge/full CI is needed label Jun 22, 2026

@chaunceyjiang chaunceyjiang left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

QwertyJack and others added 5 commits June 23, 2026 09:36
Remove empty tool_calls arrays from serialized chat completion messages and streaming deltas while preserving non-empty tool calls.

This keeps the response compatible with OpenAI clients that treat tool_calls=[] as an active tool-call path.

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Signed-off-by: Chauncey <chaunceyjiang@gmail.com>
Signed-off-by: Chauncey <chaunceyjiang@gmail.com>
Co-authored-by: OpenAI Codex <codex@openai.com>

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: OpenAI Codex <codex@openai.com>

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
@QwertyJack QwertyJack force-pushed the fix/omit-empty-tool-calls branch from 2efc7d3 to 29080b9 Compare June 23, 2026 01:36
@chaunceyjiang chaunceyjiang merged commit 6c427dd into vllm-project:main Jun 23, 2026
58 checks passed
nkzhenhua pushed a commit to nkzhenhua/vllm that referenced this pull request Jun 24, 2026
…ct#44105)

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Signed-off-by: Chauncey <chaunceyjiang@gmail.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
qli88 pushed a commit to qli88/vllm that referenced this pull request Jun 26, 2026
…ct#44105)

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Signed-off-by: Chauncey <chaunceyjiang@gmail.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
Signed-off-by: Qiang Li <qiang.li2@amd.com>
MengqingCao pushed a commit to vllm-project/vllm-ascend that referenced this pull request Jun 26, 2026
### What this PR does / why we need it?
Narrows `patch_tool_choice_none_content.py` after the main2main update
to vLLM `v0.23.0`.

vLLM `v0.23.0` already includes
vllm-project/vllm#40148, which tolerates forced
tool-choice parsing when reasoning extraction leaves `content=None`, so
this PR removes the local `DelegatingParser._parse_tool_calls` monkey
patch.

The patch still keeps the empty `tool_calls` serializer shim because
vllm-project/vllm#44105 is not included in vLLM
`v0.23.0`. The shim now covers both non-streaming and streaming JSON
payloads, and `model_dump_json()` defaults to `mode="json"` before
calling `json.dumps()`.

- vLLM version: v0.23.0
- vLLM main:
vllm-project/vllm@967c5c3

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
CXY-Katrina pushed a commit to CXY-Katrina/vllm-ascend-zhx that referenced this pull request Jun 27, 2026
…ct#9791)

## What this PR does / why we need it?

Fixes vllm-project#9790.

This adds a local monkey patch for an upstream vLLM OpenAI API
compatibility bug where final assistant responses after a tool result
can serialize `tool_calls: []` even though `finish_reason="stop"` and
the response contains normal assistant text.

OpenAI SDK clients treat `message.tool_calls is not None` as an active
tool-call path, so an empty list can make client loops fail with
`IndexError`.

The patch removes empty `tool_calls` arrays from:

- non-stream chat completion `message`
- stream chat completion `delta`

Non-empty tool calls are preserved unchanged.

Upstream vLLM issue: vllm-project/vllm#44104
Upstream vLLM PR: vllm-project/vllm#44105

## Does this PR introduce _any_ user-facing change?

Yes. OpenAI-compatible chat completion responses no longer include empty
`tool_calls` arrays. This aligns the payload with the absence of tool
calls and avoids OpenAI SDK clients entering a false tool-call loop.

## How was this patch tested?

- `ruff check
vllm_ascend/patch/platform/patch_tool_choice_none_content.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`
- `git diff --check`
- `pytest -q
tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`

The pytest run passed: `13 passed, 16 warnings`.

- vLLM version: v0.20.2
- vLLM main:
vllm-project/vllm@39910f2

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
CXY-Katrina pushed a commit to CXY-Katrina/vllm-ascend-zhx that referenced this pull request Jun 27, 2026
…-project#10903)

### What this PR does / why we need it?
Narrows `patch_tool_choice_none_content.py` after the main2main update
to vLLM `v0.23.0`.

vLLM `v0.23.0` already includes
vllm-project/vllm#40148, which tolerates forced
tool-choice parsing when reasoning extraction leaves `content=None`, so
this PR removes the local `DelegatingParser._parse_tool_calls` monkey
patch.

The patch still keeps the empty `tool_calls` serializer shim because
vllm-project/vllm#44105 is not included in vLLM
`v0.23.0`. The shim now covers both non-streaming and streaming JSON
payloads, and `model_dump_json()` defaults to `mode="json"` before
calling `json.dumps()`.

- vLLM version: v0.23.0
- vLLM main:
vllm-project/vllm@967c5c3

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
HaoxinZong pushed a commit to HaoxinZong/vllm-ascend that referenced this pull request Jun 27, 2026
…ct#9791)

## What this PR does / why we need it?

Fixes vllm-project#9790.

This adds a local monkey patch for an upstream vLLM OpenAI API
compatibility bug where final assistant responses after a tool result
can serialize `tool_calls: []` even though `finish_reason="stop"` and
the response contains normal assistant text.

OpenAI SDK clients treat `message.tool_calls is not None` as an active
tool-call path, so an empty list can make client loops fail with
`IndexError`.

The patch removes empty `tool_calls` arrays from:

- non-stream chat completion `message`
- stream chat completion `delta`

Non-empty tool calls are preserved unchanged.

Upstream vLLM issue: vllm-project/vllm#44104
Upstream vLLM PR: vllm-project/vllm#44105

## Does this PR introduce _any_ user-facing change?

Yes. OpenAI-compatible chat completion responses no longer include empty
`tool_calls` arrays. This aligns the payload with the absence of tool
calls and avoids OpenAI SDK clients entering a false tool-call loop.

## How was this patch tested?

- `ruff check
vllm_ascend/patch/platform/patch_tool_choice_none_content.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`
- `git diff --check`
- `pytest -q
tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`

The pytest run passed: `13 passed, 16 warnings`.

- vLLM version: v0.20.2
- vLLM main:
vllm-project/vllm@39910f2

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
pisceskkk pushed a commit to pisceskkk/vllm-ascend that referenced this pull request Jun 29, 2026
…ct#9791)

## What this PR does / why we need it?

Fixes vllm-project#9790.

This adds a local monkey patch for an upstream vLLM OpenAI API
compatibility bug where final assistant responses after a tool result
can serialize `tool_calls: []` even though `finish_reason="stop"` and
the response contains normal assistant text.

OpenAI SDK clients treat `message.tool_calls is not None` as an active
tool-call path, so an empty list can make client loops fail with
`IndexError`.

The patch removes empty `tool_calls` arrays from:

- non-stream chat completion `message`
- stream chat completion `delta`

Non-empty tool calls are preserved unchanged.

Upstream vLLM issue: vllm-project/vllm#44104
Upstream vLLM PR: vllm-project/vllm#44105

## Does this PR introduce _any_ user-facing change?

Yes. OpenAI-compatible chat completion responses no longer include empty
`tool_calls` arrays. This aligns the payload with the absence of tool
calls and avoids OpenAI SDK clients entering a false tool-call loop.

## How was this patch tested?

- `ruff check
vllm_ascend/patch/platform/patch_tool_choice_none_content.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`
- `git diff --check`
- `pytest -q
tests/ut/patch/platform/test_patch_deepseek_v4_tool_call_parser.py
tests/ut/patch/platform/test_patch_tool_choice_none_content.py`

The pytest run passed: `13 passed, 16 warnings`.

- vLLM version: v0.20.2
- vLLM main:
vllm-project/vllm@39910f2

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
wincent8 pushed a commit to wincent8/vllm that referenced this pull request Jun 29, 2026
…ct#44105)

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Signed-off-by: Chauncey <chaunceyjiang@gmail.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working frontend ready ONLY add when PR is ready to merge/full CI is needed tool-calling

3 participants