[Serve] Upcoming Changes in Ray Serve 2.55+ #61212

Description

@abrarsheikh

The following changes are planned for the next several Ray Serve releases. Please review and prepare your applications accordingly.

  • Pydantic v1 support will be removed. Ray Serve will require Pydantic v2. If you are still on Pydantic v1, upgrade now with pip install -U pydantic. See #58876 for migration details.
  • Sync deployment methods will run in a threadpool by default. Synchronous user code will be executed in a threadpool rather than blocking the event loop, improving concurrency for sync handlers. If your deployment relies on the current single-threaded behavior, you should set max_ongoing_requests=1 on the deployment.
  • The current Ray Serve HTTP Proxy will be replaced with a more optimized ingress layer. With the new layer, the ingress deployment will no longer support model multiplexing or custom request routing. The new ingress layer will be opt-in for several Ray versions, but we strongly advise moving any multiplexing logic or custom request routing policy to downstream deployments.
  • RAY_SERVE_RUN_USER_CODE_IN_SEPARATE_THREAD will default to 0. User code will run in the same thread as the replica event loop by default. If your deployment depends on user code running in a separate thread, explicitly set RAY_SERVE_RUN_USER_CODE_IN_SEPARATE_THREAD=1.
  • RAY_SERVE_RUN_ROUTER_IN_SEPARATE_LOOP will default to 0. The request router will run in the same event loop as the proxy/replica by default. If your setup requires a separate router loop, explicitly set RAY_SERVE_RUN_ROUTER_IN_SEPARATE_LOOP=1.
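
To gauge which of the changes above might affect an environment, a small pre-upgrade check along these lines could help. This is only a sketch using the Python standard library: the helper names (major_version, unpinned_flags, report) are hypothetical and not part of Ray Serve, and it assumes that a flag counts as "pinned" whenever the environment variable is set at all. It reads only the two environment variables and the Pydantic requirement named in this issue.

```python
"""Pre-upgrade check for the planned Ray Serve default changes (a sketch)."""
import os
from importlib import metadata

# Environment variables named in this issue; both are planned to default to 0.
FLAGS_CHANGING_DEFAULT = (
    "RAY_SERVE_RUN_USER_CODE_IN_SEPARATE_THREAD",
    "RAY_SERVE_RUN_ROUTER_IN_SEPARATE_LOOP",
)


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '2.7.1'."""
    return int(version.split(".", 1)[0])


def unpinned_flags(environ) -> list:
    """List the flags that are not set explicitly and would follow the new default."""
    return [name for name in FLAGS_CHANGING_DEFAULT if name not in environ]


def report(environ=os.environ) -> list:
    """Collect human-readable findings for the current environment."""
    findings = []
    try:
        pydantic_version = metadata.version("pydantic")
    except metadata.PackageNotFoundError:
        findings.append("pydantic is not installed in this environment")
    else:
        if major_version(pydantic_version) < 2:
            findings.append(f"pydantic {pydantic_version} is v1; upgrade to v2")
    for name in unpinned_flags(environ):
        findings.append(f"{name} is unset; it will follow the new default of 0")
    return findings


if __name__ == "__main__":
    for line in report():
        print(line)
```

The check is only meaningful when run in the same environment that starts the Serve processes, since that is where the installed Pydantic version and the environment variables take effect. An empty report means Pydantic is already on v2 and both flags are set explicitly.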

Metadata

Labels: docs, serve
Status: Todo