[Rust Frontend] Support thinking_token_budget for chat and completions #46137

Merged
BugenZhao merged 3 commits into vllm-project:main from ricky-chaoju:feat/rust-thinking-token-budget
Jun 22, 2026

@ricky-chaoju

Contributor

Add support for the thinking_token_budget request parameter in the Rust frontend, for both /v1/chat/completions and /v1/completions, reaching parity with the Python frontend (tracked in #44280, "Request compatibility and validation").

Previously the Rust frontend parsed thinking_token_budget on the chat endpoint but explicitly rejected it ("thinking_token_budget is not supported."), and the completions endpoint did not expose it at all. The V1 engine has supported the parameter since #20859 and the Python frontend exposes it on both endpoints, so this was a pure frontend gap.

Normalization mirrors Python's validate_thinking_token_budget (None/-1 → unlimited; other negatives rejected; no upper bound) and happens once during lowering, so chat, completions, and /inference/v1/generate behave consistently.
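The normalization rule described above can be sketched roughly as follows. This is only an illustration of the stated rule, not the code merged in this PR: the function name `normalize_thinking_token_budget`, the error type `InvalidThinkingTokenBudget`, and the `i64`/`u64` types are all assumptions made for the example. It also assumes that 0 is accepted as a valid budget, since the description only says that negatives other than -1 are rejected.

```rust
/// Hypothetical error type for this sketch; the frontend's real error
/// type is not shown in the PR description.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidThinkingTokenBudget(pub i64);

/// Normalize a user-supplied `thinking_token_budget` (illustrative only).
///
/// * `None` or `Some(-1)`  -> `Ok(None)`, meaning "unlimited"
/// * any other negative    -> `Err(..)`, the request is rejected
/// * any non-negative `n`  -> `Ok(Some(n))`, with no upper bound applied
///
/// Assumption: 0 is treated as a valid (non-negative) budget.
pub fn normalize_thinking_token_budget(
    raw: Option<i64>,
) -> Result<Option<u64>, InvalidThinkingTokenBudget> {
    match raw {
        // Absent and the -1 sentinel both mean "no limit".
        None | Some(-1) => Ok(None),
        // Every other negative value is invalid.
        Some(n) if n < 0 => Err(InvalidThinkingTokenBudget(n)),
        // Non-negative values pass through unchanged.
        Some(n) => Ok(Some(n as u64)),
    }
}
```

Calling one such function from a single lowering step, rather than from each endpoint handler, is what would keep chat, completions, and /inference/v1/generate consistent with each other.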

Signed-off-by: RickyChen / 陳昭儒 <ricky.chen@infinirc.com>
@mergify mergify Bot added the rust label Jun 19, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dac87f740d


Comment thread rust/src/text/src/lower.rs
@BugenZhao BugenZhao added the ready label (ONLY add when PR is ready to merge/full CI is needed) Jun 22, 2026

@BugenZhao BugenZhao left a comment

Member


LGTM

Comment thread rust/src/engine-core-client/src/protocol/mod.rs Outdated
@BugenZhao BugenZhao merged commit 80abe0d into vllm-project:main Jun 22, 2026
23 checks passed
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Jun 22, 2026
vllm-project#46137)

Co-authored-by: Bugen Zhao <i@bugenzhao.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Signed-off-by: RickyChen / 陳昭儒 <ricky.chen@infinirc.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
@ricky-chaoju ricky-chaoju deleted the feat/rust-thinking-token-budget branch June 22, 2026 09:12
nkzhenhua pushed a commit to nkzhenhua/vllm that referenced this pull request Jun 24, 2026
vllm-project#46137)

Co-authored-by: Bugen Zhao <i@bugenzhao.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Signed-off-by: RickyChen / 陳昭儒 <ricky.chen@infinirc.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
qli88 pushed a commit to qli88/vllm that referenced this pull request Jun 26, 2026
vllm-project#46137)

Co-authored-by: Bugen Zhao <i@bugenzhao.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Signed-off-by: RickyChen / 陳昭儒 <ricky.chen@infinirc.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Qiang Li <qiang.li2@amd.com>

Labels

ready ONLY add when PR is ready to merge/full CI is needed rust

2 participants