[Serve] Preserve user-set gRPC status codes when exceptions are raised#60482
Conversation
Signed-off-by: abrar <abrar@anyscale.com>
There was a problem hiding this comment.
Code Review
This pull request introduces a valuable feature to preserve user-set gRPC status codes when exceptions are raised in Ray Serve deployments. The implementation is well-structured, introducing a gRPCStatusError to carry the context, updating replica methods to wrap exceptions, and modifying the proxy to respect the user's intended status code. The addition of message truncation is a thoughtful improvement to prevent issues with large error messages. The documentation and tests are comprehensive and clearly cover the new functionality.
I've found one potential issue related to handling asyncio.CancelledError on older Python versions, which could lead to inconsistent behavior. My specific comments provide suggestions to address this. Overall, this is a great enhancement to Ray Serve's gRPC capabilities.
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
harshit-anyscale
left a comment
There was a problem hiding this comment.
one minor comment, else LGTM
ray-project#60482) Fixes ray-project#58851 ### Changes 1. **New `gRPCStatusError` exception class** - Wraps exceptions with user-set gRPC status codes so they flow through Ray's error handling path. 2. **Exception wrapping in replica methods** - `handle_request`, `handle_request_streaming`, and `handle_request_with_rejection` now wrap exceptions with `gRPCStatusError` when the user has set a status code on the gRPC context. 3. **Status code preservation in proxy** - `get_grpc_response_status()` now detects `gRPCStatusError` and returns the user's intended status code instead of `INTERNAL`. 4. **Message truncation** - Added `_truncate_message()` to limit error details to 4KB, avoiding HTTP/2 trailer size limits. 5. **Documentation updates** - Updated the gRPC guide to document the new behavior. --------- Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: Adel Nour <ans9868@nyu.edu>
## Why Fixes #62358. For non-gRPC Serve requests, `Replica.handle_request*` currently does: ```python raise self._maybe_wrap_grpc_exception(e, request_metadata) from e ``` When `_maybe_wrap_grpc_exception()` returns the original exception unchanged, this degenerates into `raise e from e`, creating a self-referential `__cause__` chain (`e.__cause__ is e`). That breaks error-reporting tools that walk `__cause__` recursively and can also confuse debugging and logging paths for normal HTTP and handle-based requests. ## Root cause This regression came from `#60482` (`221a19395a`), which unified exception handling across the `handle_request*` paths so gRPC requests could preserve user-set status codes. That behavior is correct when a new gRPC wrapper exception is created. It is wrong for non-gRPC requests, because `_maybe_wrap_grpc_exception()` just returns `e`, so the code effectively executes `raise e from e`. ## Fix Introduce a small `_raise_user_exception()` helper in `replica.py`: - if `_maybe_wrap_grpc_exception()` returns the original exception, raise it directly - if it returns a wrapped gRPC exception, raise the wrapped exception `from e` This preserves the intended gRPC chaining behavior while eliminating the circular `__cause__` chain on non-gRPC paths. ## Testing ```bash source /home/simple/github/ray/.venv/bin/activate PYTHONPATH=/home/simple/github/ray/.worktrees/fix-issue-62358-replica-exception-chaining/python \ python -m pytest python/ray/serve/tests/unit/test_replica_exception_chaining.py -v ``` --------- Signed-off-by: Ziy1-Tan <ajb459684460@gmail.com>
…62412) ## Why Fixes ray-project#62358. For non-gRPC Serve requests, `Replica.handle_request*` currently does: ```python raise self._maybe_wrap_grpc_exception(e, request_metadata) from e ``` When `_maybe_wrap_grpc_exception()` returns the original exception unchanged, this degenerates into `raise e from e`, creating a self-referential `__cause__` chain (`e.__cause__ is e`). That breaks error-reporting tools that walk `__cause__` recursively and can also confuse debugging and logging paths for normal HTTP and handle-based requests. ## Root cause This regression came from `ray-project#60482` (`221a19395a`), which unified exception handling across the `handle_request*` paths so gRPC requests could preserve user-set status codes. That behavior is correct when a new gRPC wrapper exception is created. It is wrong for non-gRPC requests, because `_maybe_wrap_grpc_exception()` just returns `e`, so the code effectively executes `raise e from e`. ## Fix Introduce a small `_raise_user_exception()` helper in `replica.py`: - if `_maybe_wrap_grpc_exception()` returns the original exception, raise it directly - if it returns a wrapped gRPC exception, raise the wrapped exception `from e` This preserves the intended gRPC chaining behavior while eliminating the circular `__cause__` chain on non-gRPC paths. ## Testing ```bash source /home/simple/github/ray/.venv/bin/activate PYTHONPATH=/home/simple/github/ray/.worktrees/fix-issue-62358-replica-exception-chaining/python \ python -m pytest python/ray/serve/tests/unit/test_replica_exception_chaining.py -v ``` --------- Signed-off-by: Ziy1-Tan <ajb459684460@gmail.com>
…62412) ## Why Fixes ray-project#62358. For non-gRPC Serve requests, `Replica.handle_request*` currently does: ```python raise self._maybe_wrap_grpc_exception(e, request_metadata) from e ``` When `_maybe_wrap_grpc_exception()` returns the original exception unchanged, this degenerates into `raise e from e`, creating a self-referential `__cause__` chain (`e.__cause__ is e`). That breaks error-reporting tools that walk `__cause__` recursively and can also confuse debugging and logging paths for normal HTTP and handle-based requests. ## Root cause This regression came from `ray-project#60482` (`221a19395a`), which unified exception handling across the `handle_request*` paths so gRPC requests could preserve user-set status codes. That behavior is correct when a new gRPC wrapper exception is created. It is wrong for non-gRPC requests, because `_maybe_wrap_grpc_exception()` just returns `e`, so the code effectively executes `raise e from e`. ## Fix Introduce a small `_raise_user_exception()` helper in `replica.py`: - if `_maybe_wrap_grpc_exception()` returns the original exception, raise it directly - if it returns a wrapped gRPC exception, raise the wrapped exception `from e` This preserves the intended gRPC chaining behavior while eliminating the circular `__cause__` chain on non-gRPC paths. ## Testing ```bash source /home/simple/github/ray/.venv/bin/activate PYTHONPATH=/home/simple/github/ray/.worktrees/fix-issue-62358-replica-exception-chaining/python \ python -m pytest python/ray/serve/tests/unit/test_replica_exception_chaining.py -v ``` --------- Signed-off-by: Ziy1-Tan <ajb459684460@gmail.com>
Fixes #58851
Changes
gRPCStatusErrorexception class - Wraps exceptions with user-set gRPC status codes so they flow through Ray's error handling path.handle_request,handle_request_streaming, andhandle_request_with_rejectionnow wrap exceptions withgRPCStatusErrorwhen the user has set a status code on the gRPC context.get_grpc_response_status()now detectsgRPCStatusErrorand returns the user's intended status code instead ofINTERNAL._truncate_message()to limit error details to 4KB, avoiding HTTP/2 trailer size limits.