[Serve][Bugfix] Fail loud when ingress request router dispatch fails#63215
Merged
kouroshHakha merged 3 commits intoMay 8, 2026
Merged
Conversation
Two related fixes for HAProxy + ingress-request-router peer deployments (e.g. LLMRouter): 1. ``_direct_ingress_asgi`` was calling ``route.startswith(self._route_prefix)`` with ``_route_prefix=None`` for ingress-request-router peer deployments (their route prefix is None), raising TypeError and surfacing as uvicorn 500. Treat a missing prefix as ``/``. 2. When the Lua dispatch failed (router unreachable / non-200 / unparseable reply / unknown replica), HAProxy was silently falling through to the primary backend, invisibly bypassing the operator's router policy. Arm ``txn.ingress_request_router_failed = "<reason>"`` in those paths and add a frontend ``http-request return 503`` (with ``X-Serve-Reason``) ordered before any ``use_backend`` so a failed dispatch cannot reach the primary backend. Tests: - ``test_router_failure_503_rule_appears_before_use_backend`` pins the template-ordering invariant. - ``test_router_failure_fails_loud_with_reason`` runs HAProxy against a stub broken router; asserts 503 + ``X-Serve-Reason``, and primary backend cumulative ``stot == 0``. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Code Review
This pull request introduces 'fail-loud' behavior for the ingress request router, ensuring that routing failures result in a 503 error with a descriptive X-Serve-Reason header rather than silently falling back to the primary backend. This is achieved through updates to HAProxy templates, Lua routing logic, and replica path matching. Feedback suggests using an empty string as the default route prefix in replica.py to more reliably match all incoming paths.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 3e7924c. Configure here.
…ing prefix
Address Gemini's review comment: ``route.startswith("")`` is always true,
which correctly handles the empty-path ASGI edge case where ``"".startswith("/")``
would 404 the request. No behavior change for any non-empty path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two issues caught in code review:
1. The empty-body early return in the Lua action was leaving
``txn.ingress_request_router_failed`` unset, so a POST with an empty
body to a router-bearing app silently fell through to the primary
backend — the exact silent bypass this change is meant to close. Arm
the sentinel with ``empty_body`` so it surfaces as 503 +
``X-Serve-Reason``. Promote the failure-semantics comment to a
top-level note describing the contract for the whole action; the
"past this point" framing no longer applies because every router
decision now arms the sentinel on failure.
2. ``test_router_failure_503_rule_appears_before_use_backend`` searched
for the substring ``use_backend``, but the rendered template includes
that string inside an explanatory comment block ("static use_backend
selection below") that renders before the 503 rule. The assertion
would fail deterministically. Switch to line-based parsing that
matches actual ``use_backend <name>`` directive lines.
Extends the integration test to also cover the empty-body path:
verified manually that an empty-body POST returns 503 with
``X-Serve-Reason: empty_body`` and primary backend stot stays 0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
akyang-anyscale
approved these changes
May 8, 2026
4 tasks
kouroshHakha
added a commit
to eicherseiji/ray
that referenced
this pull request
May 8, 2026
Brings in ray-project#63215 (fail-loud) and resolves the conflict in ``ingress_request_router.lua.tmpl`` against this branch's FORWARD_BODY gate. Resolution: drop the ``empty_body`` sentinel that ray-project#63215 added. With the FORWARD_BODY gate, an empty body is no longer a distinct failure mode -- both modes call the router with ``body=""`` and let the router decide. The "router or fail" invariant still holds because the router *is* called; we just don't synthesize a sentinel locally. The 4 remaining failure-mode sentinels from ray-project#63215 (``router_unreachable`` / ``router_non_200`` / ``unparseable_replica_id`` / ``unknown_replica_id``) are unchanged. ``test_router_failure_fails_loud_with_reason`` updates fall out of the resolution: both empty and non-empty POSTs now surface as ``router_non_200`` against the broken-router stub. The empty-body case stays in the test to pin that empty bodies are routed through the router rather than silently bypassing it. Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
Lucas61000
pushed a commit
to Lucas61000/ray
that referenced
this pull request
May 15, 2026
…ay-project#63215) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Summary
Two related fixes for HAProxy + ingress-request-router peer deployments (e.g.
LLMRouter):_direct_ingress_asgiTypeError (replica.py) — the route-prefix check didroute.startswith(self._route_prefix), but ingress-request-router peer deployments have_route_prefix=None, so this TypeError'd into a uvicorn 500. Treat missing prefix as"/".Silent bypass of router policy (Lua + HAProxy template) — when Lua dispatch failed (router unreachable / non-200 / unparseable reply / unknown replica), HAProxy fell through to the primary backend, invisibly bypassing the operator's router. Failure paths now arm
txn.ingress_request_router_failed = "<reason>"and the frontend returns 503 withX-Serve-Reason: <reason>, ordered before anyuse_backendso the request cannot reach the primary backend.Product invariant: a request to a router-bearing app is served via the router or it fails. There is no quiet alternative path. Failure-mode bucketing belongs in observability, not the data plane.
Test plan
test_router_failure_503_rule_appears_before_use_backend— template ordering invarianttest_router_failure_fails_loud_with_reason— HAProxy against a stub broken router; asserts 503 +X-Serve-Reason, primary backend cumulativestot == 0X-Serve-Reason: router_non_200, both backendsstot=0🤖 Generated with Claude Code