[Serve] Downstream deployments over-provision when receiving Deployme… by abrarsheikh · Pull Request #60747 · ray-project/ray

abrarsheikh · 2026-02-04T18:55:28Z

…ntResponse arguments from slow upstream Signed-off-by: abrar <abrar@anyscale.com>

gemini-code-assist

Code Review

This pull request correctly addresses an over-provisioning issue in downstream deployments by resolving request arguments before they are counted as queued. The logic change in router.py is direct and well-commented, and the new test case in test_autoscaling_policy.py effectively validates the fix. I have one suggestion to make the test even more robust against potential timing issues.
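The ordering change described in the review can be illustrated with a toy model. This is a minimal sketch, not the code in router.py: `ToyRouter`, `resolve_args_first`, `num_queued`, and `queued_while_upstream_blocked` are hypothetical names, and plain asyncio futures stand in for DeploymentResponse arguments from a slow upstream.

```python
import asyncio


class ToyRouter:
    """Hypothetical stand-in for a router; NOT Ray Serve's actual implementation."""

    def __init__(self, resolve_args_first: bool):
        self.resolve_args_first = resolve_args_first
        self.num_queued = 0  # the metric an autoscaler would read in this toy model

    async def assign_request(self, upstream: asyncio.Future) -> str:
        if self.resolve_args_first:
            # Wait for the upstream result first, then count the request as queued.
            arg = await upstream
            self.num_queued += 1
        else:
            # Count the request as queued while the upstream is still running.
            self.num_queued += 1
            arg = await upstream
        try:
            return f"handled {arg}"
        finally:
            self.num_queued -= 1


async def queued_while_upstream_blocked(resolve_args_first: bool) -> int:
    """Return the queued count observed while 5 upstream results are pending."""
    router = ToyRouter(resolve_args_first)
    loop = asyncio.get_running_loop()
    upstreams = [loop.create_future() for _ in range(5)]
    tasks = [asyncio.ensure_future(router.assign_request(f)) for f in upstreams]
    await asyncio.sleep(0)  # let every task run up to its first await
    observed = router.num_queued
    for i, fut in enumerate(upstreams):
        fut.set_result(i)  # unblock the "slow upstream"
    await asyncio.gather(*tasks)
    return observed
```

In this model, `asyncio.run(queued_while_upstream_blocked(False))` observes 5 queued requests while the upstream is blocked, and `asyncio.run(queued_while_upstream_blocked(True))` observes 0, which mirrors why counting before resolution could push an autoscaler to add downstream replicas that have no work yet.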

gemini-code-assist · 2026-02-04T18:57:22Z

+
+        # Wait for all 5 requests to be blocked at SlowUpstream (waiting on signal)
+        wait_for_condition(lambda: ray.get(signal.cur_num_waiters.remote()) == 5)
+


To make this test more robust against timing-related flakiness, it would be beneficial to add a short time.sleep() after waiting for the requests to be blocked and before asserting the number of replicas. This ensures that the autoscaler has had sufficient time to make a (potentially incorrect) scaling decision. Given upscale_delay_s is 0.2s, a sleep of 0.5s should be adequate.

Suggested change

# Give the autoscaler time to potentially make a wrong decision.
# A sleep duration longer than upscale_delay_s (0.2s) ensures that
# we would have seen an upscale event if the fix was not effective.
time.sleep(0.5)
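A variation on the suggested sleep, offered only as a sketch and not as part of the PR: instead of sleeping once and checking afterwards, a test could poll the condition for the whole observation window so that a transient upscale is also caught. The helper name `assert_holds_for` and its parameters are hypothetical.

```python
import time
from typing import Callable


def assert_holds_for(
    predicate: Callable[[], bool],
    duration_s: float,
    interval_s: float = 0.05,
) -> int:
    """Check `predicate` repeatedly for about `duration_s` seconds.

    Raises AssertionError as soon as the predicate returns False.
    Returns the number of checks performed.
    """
    deadline = time.monotonic() + duration_s
    checks = 0
    while True:
        if not predicate():
            raise AssertionError("condition stopped holding during the window")
        checks += 1
        if time.monotonic() >= deadline:
            return checks
        time.sleep(interval_s)
```

In the test under review, the predicate would compare the downstream deployment's current replica count against the expected value, with `duration_s` chosen longer than upscale_delay_s (0.2s), for example 0.5.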

ray-project#60747) fixes ray-project#60624 Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: tiennguyentony <46289799+tiennguyentony@users.noreply.github.com>

ray-project#60747) fixes ray-project#60624 Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: Adel Nour <ans9868@nyu.edu>

[Serve] Downstream deployments over-provision when receiving Deployme…

04c2d20

…ntResponse arguments from slow upstream Signed-off-by: abrar <abrar@anyscale.com>

abrarsheikh requested a review from a team as a code owner February 4, 2026 18:55

abrarsheikh added the "go" label (add ONLY when ready to merge, run all tests) Feb 4, 2026

gemini-code-assist Bot reviewed Feb 4, 2026

ray-gardener Bot added the "serve" label (Ray Serve Related Issue) Feb 4, 2026

abrarsheikh requested a review from akyang-anyscale February 4, 2026 20:25

akyang-anyscale approved these changes Feb 4, 2026

abrarsheikh merged commit c40ef35 into master Feb 4, 2026
6 checks passed

abrarsheikh deleted the 60624-abrar-auto branch February 4, 2026 23:32

tiennguyentony pushed a commit to tiennguyentony/ray that referenced this pull request Feb 7, 2026

[Serve] Downstream deployments over-provision when receiving Deployme… (

09ec116

ray-project#60747) fixes ray-project#60624 Signed-off-by: abrar <abrar@anyscale.com>

elliot-barn pushed a commit that referenced this pull request Feb 9, 2026

[Serve] Downstream deployments over-provision when receiving Deployme… (

7a95f90

#60747) fixes #60624 Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>

elliot-barn pushed a commit that referenced this pull request Feb 9, 2026

[Serve] Downstream deployments over-provision when receiving Deployme… (

28ec995

#60747) fixes #60624 Signed-off-by: abrar <abrar@anyscale.com>

Aydin-ab pushed a commit to kunling-anyscale/ray that referenced this pull request Feb 20, 2026

[Serve] Downstream deployments over-provision when receiving Deployme… (

2010aed

ray-project#60747) fixes ray-project#60624 Signed-off-by: abrar <abrar@anyscale.com>


abrarsheikh merged 1 commit into master from 60624-abrar-auto


         # Wait for all 5 requests to be blocked at SlowUpstream (waiting on signal)
         wait_for_condition(lambda: ray.get(signal.cur_num_waiters.remote()) == 5)

+        # Give the autoscaler time to potentially make a wrong decision.
+        # A sleep duration longer than upscale_delay_s (0.2s) ensures that
+        # we would have seen an upscale event if the fix was not effective.
+        time.sleep(0.5)
