Skip to content

[Serve][3/5] Add ingress request router targets to TargetGroup#62668

Merged
kouroshHakha merged 32 commits into
ray-project:masterfrom
eicherseiji:ingress-bypass/2-controller
May 5, 2026
Merged

[Serve][3/5] Add ingress request router targets to TargetGroup#62668
kouroshHakha merged 32 commits into
ray-project:masterfrom
eicherseiji:ingress-bypass/2-controller

Conversation

@eicherseiji

@eicherseiji eicherseiji commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

Why this is needed

PR2 lets an app identify a peer ingress request router. This PR carries that role through the controller/runtime path so HAProxy can send /internal/route lookups to router replicas while forwarding data-plane traffic to the normal ingress replicas.

Summary

  • Propagates ingress_request_router through deploy args, DeploymentInfo, application state, and replica startup.
  • Adds TargetGroup.ingress_request_router_targets so HTTP target groups can expose router replicas separately from backend replicas.
  • Allows router replicas to own direct-ingress HTTP ports even though they are not the app ingress deployment.

Stack: #62667 -> #62680 -> this -> #62669 -> #62670

Test plan

  • Added / updated unit coverage in python/ray/serve/tests/unit/test_application_state.py
  • Added controller target-group coverage in python/ray/serve/tests/unit/test_controller_direct_ingress.py
  • python -m py_compile on the edited files
  • git diff --check
  • Repo commit hooks / linters on commit

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements an "ingress bypass" feature in Ray Serve, allowing direct data plane traffic to replicas via custom ASGI apps while utilizing a dedicated router deployment for Lua-based routing decisions. The changes include updates to the controller's target group management, the addition of a detached registry for direct ingress endpoints, and new replica-side logic for starting HTTP servers for custom apps. The review feedback identifies several critical issues: performance degradation caused by blocking the event loop with ray.get in asynchronous methods, a potential memory leak in the _DirectIngressRegistry due to the lack of an unregistration mechanism, missing imports for errno and retry constants, and the inclusion of unused helper functions in the controller.

Comment thread python/ray/serve/_private/replica.py Outdated
Comment thread python/ray/serve/_private/replica.py Outdated
Comment thread python/ray/serve/_private/replica.py Outdated
Comment thread python/ray/serve/_private/replica.py Outdated
Comment thread python/ray/serve/_private/replica.py Outdated
Comment thread python/ray/serve/_private/replica.py Outdated
Comment thread python/ray/serve/_private/controller.py Outdated
@eicherseiji eicherseiji force-pushed the ingress-bypass/2-controller branch from d4933ca to 07e5877 Compare April 16, 2026 17:25
@eicherseiji eicherseiji changed the title [Serve] Add ingress bypass controller and app plumbing Apr 16, 2026
@eicherseiji eicherseiji force-pushed the ingress-bypass/2-controller branch 2 times, most recently from a300185 to a65881c Compare April 16, 2026 20:19
@eicherseiji eicherseiji changed the title [Serve][2/4] Add ingress bypass controller and app plumbing Apr 16, 2026
@eicherseiji eicherseiji force-pushed the ingress-bypass/2-controller branch 5 times, most recently from cdbf07b to f1c72e6 Compare April 18, 2026 01:56
@eicherseiji eicherseiji changed the title [Serve][3/5] Add ingress bypass controller and app plumbing Apr 20, 2026
@eicherseiji eicherseiji force-pushed the ingress-bypass/2-controller branch 3 times, most recently from 20a6b28 to bbd335c Compare April 21, 2026 05:28
@eicherseiji eicherseiji changed the title [Serve][3/5] Carry http_router role through controller plumbing Apr 21, 2026
@eicherseiji eicherseiji force-pushed the ingress-bypass/2-controller branch 3 times, most recently from 7d0f0e7 to bc6fc38 Compare April 21, 2026 06:24
@eicherseiji eicherseiji force-pushed the ingress-bypass/2-controller branch 4 times, most recently from 6e07037 to d413d76 Compare April 30, 2026 21:31
@eicherseiji eicherseiji changed the base branch from master to eicherseiji/ingress-bypass/2-router-spec-review-base April 30, 2026 21:34
@eicherseiji eicherseiji changed the title [Serve][3/5] Carry ingress_request_router role through controller plumbing Apr 30, 2026
@eicherseiji eicherseiji added the go add ONLY when ready to merge, run all tests label May 1, 2026
eicherseiji added 13 commits May 1, 2026 09:35
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
This reverts commit 9600be7f0e4337fa1e8911aba1dd05425d4451f7.

Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
…d json

The TargetGroup schema gained an ingress_request_router_targets field
earlier in this PR; update the json_serializable expected output to
include the empty default for both HTTP and gRPC target groups.

Signed-off-by: Seiji Eicher <seiji@anyscale.com>
@eicherseiji eicherseiji force-pushed the ingress-bypass/2-controller branch from f1942c5 to 0ae40c1 Compare May 1, 2026 16:35
@eicherseiji eicherseiji changed the title [Serve][3/5] Separate ingress request router targets in controller May 1, 2026
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
@eicherseiji eicherseiji self-assigned this May 1, 2026
The serializer in get_serve_instance_details uses model_dump(exclude_unset=True),
which drops fields that were never explicitly passed to the constructor — even
when they have default_factory=list. Five of six TargetGroup(...) construction
sites in controller.py omitted the new field, causing
test_get_serve_instance_details_json_serializable to fail because the JSON
output was missing "ingress_request_router_targets": [].

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

@kouroshHakha kouroshHakha left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice incremental wire-up — most of the role plumbing was done in #62680, so this PR is appropriately small. The router replicas correctly receive direct-ingress HTTP ports without gRPC, and the new ingress_request_router_targets field gives HAProxy what it needs without disturbing the data-plane target list.

A few things to consider before merge:

  1. Validation in _set_target_state raises after partially clearing the cached ingress names — minor but leaves the object in a half-mutated state on exception.
  2. The "no replicas → return []" short-circuit means routers won't be advertised until ingress is also up. Probably intentional, but worth a one-line comment so future maintainers don't try to "fix" it.
  3. Naming nits inline: a few ingress_request_router / _ingress symbols would read better as is_*.
  4. Stale gemini/cursor bot comments above no longer apply — feel free to dismiss.

No critical findings.

Note

This review was co-written with AI assistance (Claude Code).

Comment thread python/ray/serve/_private/replica.py Outdated
Comment thread python/ray/serve/_private/replica.py Outdated
Comment thread python/ray/serve/_private/deployment_state.py
Comment thread python/ray/serve/_private/replica.py
Comment thread python/ray/serve/_private/application_state.py Outdated
Comment thread python/ray/serve/_private/application_state.py Outdated
Comment thread python/ray/serve/_private/controller.py
Comment thread python/ray/serve/_private/replica.py
Comment thread python/ray/serve/_private/deployment_state.py Outdated
Comment thread python/ray/serve/schema.py
- Assign _ingress*_deployment_name only after target_state is built so a
  validation failure can no longer leave the application half-mutated.
- Drop the defensive multi-router check in _set_target_state; build_app.py
  already enforces single ingress request router per app.
- Rename Replica/ReplicaActor's new boolean to is_ingress_request_router
  (parameter and self attribute) for consistency with is_ingress.
- Skip the gRPC config fetch on ingress-request-router replicas; they
  only need HTTP for /internal/route.
- Comment the load-bearing "no ingress replicas -> empty target groups"
  short-circuit in _get_target_groups_for_app.
- Note that ingress_request_router_targets is HTTP-only.
- Expand get_ingress_replicas_info docstring to reflect that routers
  are also covered.

Signed-off-by: Seiji Eicher <seiji@anyscale.com>
…er_targets

The non-direct-ingress test_controller.py mirror of
test_get_serve_instance_details_json_serializable was missed when the test in
test_direct_ingress.py was updated. After the controller.py fix, the proxy
TargetGroups now correctly emit "ingress_request_router_targets": [], so this
test's expected_json must include it too.

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

@kouroshHakha kouroshHakha left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All four points from my prior review are addressed cleanly:

  1. _set_target_state now builds target_state first and only assigns self._ingress_*_deployment_name / self._target_state after — no more half-mutated state if construction raises. New test_application_state_clears_stale_ingress_request_router covers the clearing-on-transition case.
  2. The "no ingress replicas → return []" short-circuit is now commented as load-bearing.
  3. Replica/ReplicaActor parameter and attr renamed to is_ingress_request_router, consistent with is_ingress.
  4. Bonus: gRPC config fetch skipped on router replicas, HTTP-only nature of ingress_request_router_targets documented in both schema and controller docstring, and get_ingress_replicas_info docstring expanded.

LGTM.

Note

This review was co-written with AI assistance (Claude Code).

@kouroshHakha kouroshHakha merged commit 26590b7 into ray-project:master May 5, 2026
6 checks passed
Lucas61000 pushed a commit to Lucas61000/ray that referenced this pull request May 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

go add ONLY when ready to merge, run all tests serve Ray Serve Related Issue

3 participants