Skip to content

[Serve] Make rolling update percentage configurable (#54509)#62160

Merged
kouroshHakha merged 10 commits into
ray-project:masterfrom
suppagoddo:issue_54509_rolling_update_pct
Apr 29, 2026
Merged

[Serve] Make rolling update percentage configurable (#54509)#62160
kouroshHakha merged 10 commits into
ray-project:masterfrom
suppagoddo:issue_54509_rolling_update_pct

Conversation

@suppagoddo

Copy link
Copy Markdown
Contributor

Replace the hardcoded 20% rolling update percentage with a configurable rolling_update_percentage field on DeploymentConfig (default 0.2).

Users can now control what fraction of replicas are updated at a time during rolling updates via @serve.deployment(), REST API, or config YAML.

  • Add rolling_update_percentage to DeploymentConfig (gt=0.0, le=1.0)
  • Add to DeploymentSchema for REST/YAML/decorator support
  • Add to serve.proto for Python API round-trip
  • Handle proto3 zero-default in from_proto
  • Wire through _deployment_info_to_schema for status API
  • Add test_rolling_update_percentage_configurable unit test
  • Include 20 in expected worker thread count range (test_basic.py)
@suppagoddo suppagoddo requested review from a team as code owners March 28, 2026 18:11

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a configurable rolling_update_percentage for Ray Serve deployments, allowing users to control the number of replicas updated simultaneously during a rollout. The changes include updates to the deployment configuration, schema, and protobuf definitions, along with logic in the deployment state manager to respect this new setting. A suggestion was made to parameterize the new unit test to more robustly cover various scenarios, including default values, rounding behavior, and edge cases like 100% updates.

Comment thread python/ray/serve/tests/unit/test_deployment_state.py Outdated
Comment thread src/ray/protobuf/serve.proto Outdated
Comment thread python/ray/serve/schema.py
Comment thread python/ray/serve/tests/unit/test_deployment_state.py Outdated
Comment thread python/ray/serve/schema.py
Comment thread python/ray/serve/_private/config.py Outdated
@@ -219,6 +219,13 @@ class DeploymentConfig(BaseModel):
update_type=DeploymentOptionUpdateType.HeavyWeight,
)

rolling_update_percentage: float = Field(
default=0.2,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's move the default value as a env var in constants. That will be a nice touch if any user wishes to control this globally.

@suppagoddo suppagoddo Mar 28, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't it be hard to track if user have multiple clusters? For example lets say in one cluster if the DEFAULT env variable is 0.5 and then in another it is 0.2... I am open to discussion on this but as a service provider we should give the default which will be 0.2 but it is a tweakable feature via certain variables which we are providing in this PR [ Users will change this].

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One argument we can use against my above comment would be not hardcoding default value in code.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

env vars are scoped to a single cluster because they are generally part of the Docker image.

@abrarsheikh abrarsheikh added the go add ONLY when ready to merge, run all tests label Mar 28, 2026
Comment thread python/ray/serve/_private/config.py

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 3 total unresolved issues (including 2 from previous reviews).

Fix All in Cursor

Comment thread python/ray/serve/deployment.py
@ray-gardener ray-gardener Bot added serve Ray Serve Related Issue community-contribution Contributed by the community labels Mar 28, 2026
@suppagoddo

Copy link
Copy Markdown
Contributor Author

@abrarsheikh - feel free to merge once you get a chance 🙏

@suppagoddo suppagoddo requested a review from abrarsheikh March 30, 2026 19:46
@suppagoddo

Copy link
Copy Markdown
Contributor Author

@jeffreywang-anyscale @harshit-anyscale @kouroshHakha - Can you guys please review this PR please

@kouroshHakha kouroshHakha left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The feature is well-scoped and the core change is correct. The plumbing across all four config paths (decorator, REST, YAML, proto) is complete.

A few non-blocking suggestions:

Missing DEFAULT_ROLLING_UPDATE_PERCENTAGE constant
All other tunable Serve defaults (DEFAULT_GRACEFUL_SHUTDOWN_TIMEOUT_S, DEFAULT_HEALTH_CHECK_PERIOD_S, etc.) live as named constants in constants.py. The 0.2 is currently inlined in config.py, the api.py docstring, and the schema.py description — three places that drift independently if the default changes. (See inline comment.)

Use optional double in proto
The serve proto already uses optional double for fields that need "not set" semantics (upscale_smoothing_factor, downscaling_factor, etc.). _proto_to_dict already handles these correctly — optional fields skip the zero-default-filling loop via the containing_oneof check, so an unset field simply never appears in data and DeploymentConfig naturally falls back to 0.2. This would let you remove the special-case in from_proto entirely. (See inline comment.)

Proto round-trip test missing (Important)
The from_proto path has the most upgrade-brittleness risk (old controller → new worker during a rolling cluster upgrade). A small unit test verifying that rolling_update_percentage=0.5 survives a to_proto()from_proto() round-trip, and that an older proto (field absent) falls back to 0.2, would close this gap.

DeploymentConfig class docstring not updated
The class-level docstring (line 113–146 in config.py) lists all constructor args but rolling_update_percentage is missing.

Note

This review was co-written with AI assistance (Claude Code).

Comment thread python/ray/serve/_private/config.py Outdated
Comment thread python/ray/serve/_private/config.py Outdated
Comment thread python/ray/serve/api.py Outdated
@kouroshHakha

Copy link
Copy Markdown
Contributor

@suppagoddo any updates on this pr?

@suppagoddo

Copy link
Copy Markdown
Contributor Author

@kouroshHakha - I was on vacation the last 1.5 weeks and will be reaching home today evening, will address all your comments and will try to get it merged. Thanks again for the review 🙏

@suppagoddo

Copy link
Copy Markdown
Contributor Author

Looking into this today, will submit the requested change by eod.

Replace the hardcoded 20% rolling update percentage with a configurable
`rolling_update_percentage` field on DeploymentConfig (default 0.2).

Users can now control what fraction of replicas are updated at a time
during rolling updates via @serve.deployment(), REST API, or config YAML.

- Add `rolling_update_percentage` to DeploymentConfig (gt=0.0, le=1.0)
- Add to DeploymentSchema for REST/YAML/decorator support
- Add to serve.proto for Python API round-trip
- Handle proto3 zero-default in from_proto
- Wire through _deployment_info_to_schema for status API
- Add test_rolling_update_percentage_configurable unit test
- Include 20 in expected worker thread count range (test_basic.py)

Signed-off-by: uditagrawal <uagrawal@apple.com>
…nd deployment_to_schema

The field was added to DeploymentSchema but not passed through the
explicit conversion functions in deployment.py, causing values set
via REST API or config YAML to be silently dropped.

Signed-off-by: uditagrawal <uagrawal@apple.com>
Cover default percentage, rounding, minimum-of-1, non-clean divisors,
and 100% update scenarios.

Signed-off-by: uditagrawal <uagrawal@apple.com>
Add the parameter to the decorator signature, docstring, and
DeploymentConfig.from_default() call so users can configure it
via @serve.deployment(rolling_update_percentage=0.5).

Signed-off-by: uditagrawal <uagrawal@apple.com>
…ings

Replace the hardcoded 0.2 that was duplicated across config.py, schema.py,
and the api.py decorator docstring with a single named constant,
DEFAULT_ROLLING_UPDATE_PERCENTAGE, in _private/constants.py. This matches
the pattern used by DEFAULT_GRACEFUL_SHUTDOWN_TIMEOUT_S,
DEFAULT_HEALTH_CHECK_PERIOD_S, etc., and prevents the three call sites
from drifting independently.

Also:
- Add rolling_update_percentage to DeploymentConfig's class-level
  docstring (it was listed as a field but missing from the Args section).
- Insert the missing blank line before the Returns: section of the
  @serve.deployment() docstring to match the Google-style convention used
  throughout api.py.

Addresses review feedback on ray-project#62160.

Signed-off-by: uditagrawal <uagrawal@apple.com>
Declare rolling_update_percentage as `optional double` in serve.proto so
that "unset" (e.g. an older controller during a rolling upgrade that
predates this field) is distinguishable from an explicit value, matching
the pattern already used for upscale_smoothing_factor, downscaling_factor,
etc.

This lets us remove the special-case in DeploymentConfig.from_proto that
treated proto3's zero-value default as "use the Python default".
_proto_to_dict already skips synthetic-oneof-backed optional fields (via
the `containing_oneof` check), so an unset field never enters `data` and
Pydantic falls back to DEFAULT_ROLLING_UPDATE_PERCENTAGE on its own.

Also add test_rolling_update_percentage_proto_roundtrip covering:
- An explicit value (0.5) survives to_proto() -> from_proto() losslessly.
- A proto without the field set (simulating an older controller)
  deserializes to DEFAULT_ROLLING_UPDATE_PERCENTAGE rather than 0.0.

Addresses review feedback on ray-project#62160.

Signed-off-by: uditagrawal <uagrawal@apple.com>
@suppagoddo suppagoddo force-pushed the issue_54509_rolling_update_pct branch from 79b31df to fe8c2d4 Compare April 28, 2026 21:47
The @serve.deployment() docstring referenced
DEFAULT_ROLLING_UPDATE_PERCENTAGE via
:data:`~ray.serve._private.constants.DEFAULT_ROLLING_UPDATE_PERCENTAGE`
but `_private` modules are not indexed in Ray's published Sphinx API docs
and the docs build runs with nitpicky + warnings-as-errors, producing:

  WARNING: py:data reference target not found:
  ray.serve._private.constants.DEFAULT_ROLLING_UPDATE_PERCENTAGE

Replace the cross-reference with the literal default ("0.2 (20%)") so the
docstring still surfaces the value without creating a build-time warning.
The named constant is still referenced from code (config.py, schema.py)
so the "one place to change the default" guarantee is preserved.

Signed-off-by: uditagrawal <uagrawal@apple.com>
…e_proto_roundtrip

The multi-line assert fits within black's line limit on a single line;
black reformatted it as part of pre-commit. Apply the reformat here so
the lint-roller pre-commit job passes.

Signed-off-by: uditagrawal <uagrawal@apple.com>
…ng invalid proto

The test previously built `DeploymentConfigProto(num_replicas=1)` directly
to simulate "an older controller without this field". But that proto has
every other field at its proto3 zero default (max_ongoing_requests=0,
health_check_period_s=0.0, max_constructor_retry_count=0, ...), which
then fails Pydantic validation when `from_proto` constructs the
`DeploymentConfig` (fields require gt=0 / positive int / similar).

Start from a valid `DeploymentConfig().to_proto()` (so all non-optional
fields are populated with real defaults), then `ClearField` only the
optional `rolling_update_percentage` to simulate the older-controller
case. Clearing makes `HasField()` return False, which is exactly what a
proto serialized before this field existed would present.

Signed-off-by: uditagrawal <uagrawal@apple.com>
…cstring

Match the reST inline-literal formatting already used for the interval
``(0.0, 1.0]`` earlier in the same docstring. Purely cosmetic; also
triggers a fresh premerge run to retry a flaky docs build caused by an
upstream HuggingFace intersphinx 504.

Signed-off-by: uditagrawal <uagrawal@apple.com>
@suppagoddo

Copy link
Copy Markdown
Contributor Author

@kouroshHakha - Can you please review this again? Addressed all your recommended comments.

@suppagoddo suppagoddo requested a review from kouroshHakha April 29, 2026 17:12

@kouroshHakha kouroshHakha left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kouroshHakha kouroshHakha merged commit 5866967 into ray-project:master Apr 29, 2026
6 checks passed
Lucas61000 pushed a commit to Lucas61000/ray that referenced this pull request May 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-contribution Contributed by the community go add ONLY when ready to merge, run all tests serve Ray Serve Related Issue

3 participants