
[Security] Reject non-finite temperature and repetition_penalty values #45116

Merged
vllm-bot merged 1 commit into vllm-project:main from jperezdealgaba:fix/reject-non-finite-temperature on Jun 11, 2026


@jperezdealgaba

Contributor

Summary

  • Add math.isfinite() validation for temperature and repetition_penalty in SamplingParams._verify_args().
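  • NaN and Infinity slip past range checks written with Python's comparison operators (<, >): under IEEE 754 float semantics every ordered comparison with NaN evaluates to False, and +Infinity satisfies any lower-bound check. Such values can then propagate to GPU sampling kernels, where they cause undefined behavior or CUDA crashes.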
  • Addresses advisory GHSA-7h4p-rffg-7823.
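
The effect described in the summary can be sketched in isolation. The snippet below is a minimal illustration, not the actual vLLM code: the function names `range_checks_only` and `verify_sampling_args` are hypothetical, and the specific bounds (`temperature >= 0`, `repetition_penalty > 0`) are assumptions chosen for the example. It shows how range checks alone accept NaN and +Infinity, and how a `math.isfinite()` guard rejects them first.

```python
import math


def range_checks_only(temperature: float, repetition_penalty: float) -> None:
    """Hypothetical range checks WITHOUT a finiteness guard (assumed bounds)."""
    # NaN < 0.0 is False, so NaN is not caught here.
    if temperature < 0.0:
        raise ValueError(f"temperature must be non-negative, got {temperature}")
    # inf <= 0.0 is False, so +inf is not caught here either.
    if repetition_penalty <= 0.0:
        raise ValueError(
            f"repetition_penalty must be greater than zero, got {repetition_penalty}"
        )


def verify_sampling_args(temperature: float, repetition_penalty: float) -> None:
    """Same hypothetical checks, preceded by math.isfinite() validation."""
    if not math.isfinite(temperature):
        raise ValueError(f"temperature must be finite, got {temperature}")
    if not math.isfinite(repetition_penalty):
        raise ValueError(
            f"repetition_penalty must be finite, got {repetition_penalty}"
        )
    range_checks_only(temperature, repetition_penalty)


if __name__ == "__main__":
    # Passes silently: neither comparison is True for these values.
    range_checks_only(float("nan"), float("inf"))
    try:
        verify_sampling_args(float("nan"), 1.0)
    except ValueError as exc:
        print("rejected:", exc)
```

Placing the finiteness check before the range checks keeps the error message specific: a caller who sends NaN is told the value is not finite rather than receiving no error at all.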

Test plan

  • Added tests/samplers/test_non_finite_params.py with 12 parametrized tests covering NaN, +Inf, -Inf rejection and valid value acceptance for both parameters.
  • pytest tests/samplers/test_non_finite_params.py -v: all 12 tests pass.
  • pre-commit run --files vllm/sampling_params.py tests/samplers/test_non_finite_params.py — all hooks pass.

Signed-off-by: Juan Perez de Algaba Sierra <jperezde@redhat.com>

Signed-off-by: jperezde <jperezde@redhat.com>

@hmellor hmellor left a comment

Member


Would it be faster to use <= float('inf')? These checks are going to run a lot, so we should use the fastest method.

@jperezdealgaba

Contributor Author

@hmellor The problem is that <= float('inf') would still let temperature=Infinity through to the GPU kernels, which is the vulnerability I am trying to fix here.

That is why I used math.isfinite(). Can you think of a better solution? I don't know of one.
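
The point about `<=` can be checked directly. The snippet below is a small sketch (the names `CANDIDATES`, `CHECKS`, and `accepted` are made up for illustration) that lists which sample values each predicate lets through; a strict `<` is included for comparison.

```python
import math

INF = float("inf")

# Sample inputs: one ordinary value plus the three non-finite floats.
CANDIDATES = {"0.7": 0.7, "nan": float("nan"), "+inf": INF, "-inf": -INF}

# Candidate validity predicates; True means "value is accepted".
CHECKS = {
    "x <= inf": lambda x: x <= INF,
    "x < inf": lambda x: x < INF,
    "math.isfinite(x)": math.isfinite,
}

# For each predicate, collect the labels of the values it accepts.
accepted = {
    name: [label for label, value in CANDIDATES.items() if check(value)]
    for name, check in CHECKS.items()
}

for name, labels in accepted.items():
    print(f"{name:18} accepts {labels}")
```

Only `math.isfinite(x)` rejects all three non-finite values on its own. A strict upper-bound comparison rejects NaN and +Infinity but still accepts -Infinity, so it would need to be paired with a lower-bound check.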

@hmellor

hmellor commented Jun 10, 2026

Member

Oh yeah, of course: I should have suggested < rather than <=.

Anyway, a micro-benchmark suggests that the comparison is only faster when inf is bound to a variable for reuse (which is not what I originally suggested):

| Approach | ns/call |
| --- | --- |
| `x < inf` (prebound) | ~10 |
| `math.isfinite(x)` | ~15 |
| `-inf < x < inf` | ~25 |
| `x < float('inf')` (inline) | ~50 |

Let's stick with what you have
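
A comparison like the one above could be reproduced with the standard-library `timeit` module. This is a sketch under assumptions, not the script used in the review: the helper `bench`, the `SETUP` string, and the iteration counts are invented for illustration, and absolute numbers will differ by machine and Python version.

```python
import timeit

# Shared setup: a finite sample value and a prebound infinity.
SETUP = "import math; x = 0.7; inf = float('inf')"

# Statement under test for each approach from the discussion.
CASES = {
    "x < inf (prebound)": "x < inf",
    "math.isfinite(x)": "math.isfinite(x)",
    "-inf < x < inf": "-inf < x < inf",
    "x < float('inf') (inline)": "x < float('inf')",
}


def bench(stmt: str, setup: str = SETUP, number: int = 200_000, repeat: int = 5) -> float:
    """Return the best observed time per call in nanoseconds."""
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
    return best / number * 1e9


results = {label: bench(stmt) for label, stmt in CASES.items()}

for label, ns in sorted(results.items(), key=lambda item: item[1]):
    print(f"{label:28} ~{ns:.0f} ns/call")
```

Taking the minimum over several repeats reduces noise from other processes. Note that this only measures speed: as discussed above, `x < inf` on its own does not reject -Infinity.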

@hmellor hmellor enabled auto-merge (squash) June 10, 2026 22:54
@github-actions github-actions Bot added the ready ONLY add when PR is ready to merge/full CI is needed label Jun 10, 2026
@vllm-bot vllm-bot merged commit d598d23 into vllm-project:main Jun 11, 2026
63 of 65 checks passed
Saddss pushed a commit to Saddss/vllm that referenced this pull request Jun 14, 2026
vivek8123 pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Jun 18, 2026
divineearthly pushed a commit to divineearthly/vllm that referenced this pull request Jun 19, 2026
vllm-project#45116)

Signed-off-by: jperezde <jperezde@redhat.com>
Signed-off-by: divineearthly <divineearthly@gmail.com>
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Jun 22, 2026
nkzhenhua pushed a commit to nkzhenhua/vllm that referenced this pull request Jun 24, 2026

4 participants