[Serve][3/5] Add ingress request router targets to TargetGroup#62668
Conversation
There was a problem hiding this comment.
Code Review
This pull request implements an "ingress bypass" feature in Ray Serve, allowing direct data plane traffic to replicas via custom ASGI apps while utilizing a dedicated router deployment for Lua-based routing decisions. The changes include updates to the controller's target group management, the addition of a detached registry for direct ingress endpoints, and new replica-side logic for starting HTTP servers for custom apps. The review feedback identifies several critical issues: performance degradation caused by blocking the event loop with ray.get in asynchronous methods, a potential memory leak in the _DirectIngressRegistry due to the lack of an unregistration mechanism, missing imports for errno and retry constants, and the inclusion of unused helper functions in the controller.
d4933ca to
07e5877
Compare
a300185 to
a65881c
Compare
cdbf07b to
f1c72e6
Compare
20a6b28 to
bbd335c
Compare
7d0f0e7 to
bc6fc38
Compare
6e07037 to
d413d76
Compare
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
This reverts commit 9600be7f0e4337fa1e8911aba1dd05425d4451f7. Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
…d json The TargetGroup schema gained an ingress_request_router_targets field earlier in this PR; update the json_serializable expected output to include the empty default for both HTTP and gRPC target groups. Signed-off-by: Seiji Eicher <seiji@anyscale.com>
f1942c5 to
0ae40c1
Compare
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
The serializer in get_serve_instance_details uses model_dump(exclude_unset=True), which drops fields that were never explicitly passed to the constructor — even when they have default_factory=list. Five of six TargetGroup(...) construction sites in controller.py omitted the new field, causing test_get_serve_instance_details_json_serializable to fail because the JSON output was missing "ingress_request_router_targets": []. Signed-off-by: Seiji Eicher <seiji@anyscale.com>
kouroshHakha
left a comment
There was a problem hiding this comment.
Nice incremental wire-up — most of the role plumbing was done in #62680, so this PR is appropriately small. The router replicas correctly receive direct-ingress HTTP ports without gRPC, and the new ingress_request_router_targets field gives HAProxy what it needs without disturbing the data-plane target list.
A few things to consider before merge:
- Validation in
_set_target_stateraises after partially clearing the cached ingress names — minor but leaves the object in a half-mutated state on exception. - The "no replicas → return []" short-circuit means routers won't be advertised until ingress is also up. Probably intentional, but worth a one-line comment so future maintainers don't try to "fix" it.
- Naming nits inline: a few
ingress_request_router/_ingresssymbols would read better asis_*. - Stale gemini/cursor bot comments above no longer apply — feel free to dismiss.
No critical findings.
Note
This review was co-written with AI assistance (Claude Code).
- Assign _ingress*_deployment_name only after target_state is built so a validation failure can no longer leave the application half-mutated. - Drop the defensive multi-router check in _set_target_state; build_app.py already enforces single ingress request router per app. - Rename Replica/ReplicaActor's new boolean to is_ingress_request_router (parameter and self attribute) for consistency with is_ingress. - Skip the gRPC config fetch on ingress-request-router replicas; they only need HTTP for /internal/route. - Comment the load-bearing "no ingress replicas -> empty target groups" short-circuit in _get_target_groups_for_app. - Note that ingress_request_router_targets is HTTP-only. - Expand get_ingress_replicas_info docstring to reflect that routers are also covered. Signed-off-by: Seiji Eicher <seiji@anyscale.com>
…er_targets The non-direct-ingress test_controller.py mirror of test_get_serve_instance_details_json_serializable was missed when the test in test_direct_ingress.py was updated. After the controller.py fix, the proxy TargetGroups now correctly emit "ingress_request_router_targets": [], so this test's expected_json must include it too. Signed-off-by: Seiji Eicher <seiji@anyscale.com>
kouroshHakha
left a comment
There was a problem hiding this comment.
All four points from my prior review are addressed cleanly:
_set_target_statenow buildstarget_statefirst and only assignsself._ingress_*_deployment_name/self._target_stateafter — no more half-mutated state if construction raises. Newtest_application_state_clears_stale_ingress_request_routercovers the clearing-on-transition case.- The "no ingress replicas → return []" short-circuit is now commented as load-bearing.
- Replica/ReplicaActor parameter and attr renamed to
is_ingress_request_router, consistent withis_ingress. - Bonus: gRPC config fetch skipped on router replicas, HTTP-only nature of
ingress_request_router_targetsdocumented in both schema and controller docstring, andget_ingress_replicas_infodocstring expanded.
LGTM.
Note
This review was co-written with AI assistance (Claude Code).
…roject#62668) Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Why this is needed
PR2 lets an app identify a peer ingress request router. This PR carries that role through the controller/runtime path so HAProxy can send
/internal/routelookups to router replicas while forwarding data-plane traffic to the normal ingress replicas.Summary
ingress_request_routerthrough deploy args,DeploymentInfo, application state, and replica startup.TargetGroup.ingress_request_router_targetsso HTTP target groups can expose router replicas separately from backend replicas.Stack: #62667 -> #62680 -> this -> #62669 -> #62670
Test plan
python/ray/serve/tests/unit/test_application_state.pypython/ray/serve/tests/unit/test_controller_direct_ingress.pypython -m py_compileon the edited filesgit diff --check