Skip to content

[Serve] Preserve user-set gRPC status codes when exceptions are raised#60482

Merged
abrarsheikh merged 4 commits into
masterfrom
58851-abrar-grpc
Jan 28, 2026
Merged

[Serve] Preserve user-set gRPC status codes when exceptions are raised#60482
abrarsheikh merged 4 commits into
masterfrom
58851-abrar-grpc

Conversation

@abrarsheikh

Copy link
Copy Markdown
Contributor

Fixes #58851

Changes

  1. New gRPCStatusError exception class - Wraps exceptions with user-set gRPC status codes so they flow through Ray's error handling path.
  2. Exception wrapping in replica methods - handle_request, handle_request_streaming, and handle_request_with_rejection now wrap exceptions with gRPCStatusError when the user has set a status code on the gRPC context.
  3. Status code preservation in proxy - get_grpc_response_status() now detects gRPCStatusError and returns the user's intended status code instead of INTERNAL.
  4. Message truncation - Added _truncate_message() to limit error details to 4KB, avoiding HTTP/2 trailer size limits.
  5. Documentation updates - Updated the gRPC guide to document the new behavior.
Signed-off-by: abrar <abrar@anyscale.com>
@abrarsheikh abrarsheikh added the go add ONLY when ready to merge, run all tests label Jan 25, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a valuable feature to preserve user-set gRPC status codes when exceptions are raised in Ray Serve deployments. The implementation is well-structured, introducing a gRPCStatusError to carry the context, updating replica methods to wrap exceptions, and modifying the proxy to respect the user's intended status code. The addition of message truncation is a thoughtful improvement to prevent issues with large error messages. The documentation and tests are comprehensive and clearly cover the new functionality.

I've found one potential issue related to handling asyncio.CancelledError on older Python versions, which could lead to inconsistent behavior. My specific comments provide suggestions to address this. Overall, this is a great enhancement to Ray Serve's gRPC capabilities.

Comment thread python/ray/serve/_private/replica.py
Comment thread python/ray/serve/_private/replica.py
Comment thread python/ray/serve/_private/replica.py
@abrarsheikh abrarsheikh marked this pull request as ready for review January 26, 2026 17:38
@abrarsheikh abrarsheikh requested a review from a team as a code owner January 26, 2026 17:38
@ray-gardener ray-gardener Bot added serve Ray Serve Related Issue docs An issue or change related to documentation labels Jan 26, 2026
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>

@harshit-anyscale harshit-anyscale left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one minor comment, else LGTM

Comment thread python/ray/serve/_private/replica.py
@abrarsheikh abrarsheikh merged commit 221a193 into master Jan 28, 2026
6 checks passed
@abrarsheikh abrarsheikh deleted the 58851-abrar-grpc branch January 28, 2026 07:43
ans9868 pushed a commit to ans9868/ray that referenced this pull request Feb 18, 2026
ray-project#60482)

Fixes ray-project#58851

### Changes

1. **New `gRPCStatusError` exception class** - Wraps exceptions with
user-set gRPC status codes so they flow through Ray's error handling
path.
2. **Exception wrapping in replica methods** - `handle_request`,
`handle_request_streaming`, and `handle_request_with_rejection` now wrap
exceptions with `gRPCStatusError` when the user has set a status code on
the gRPC context.
3. **Status code preservation in proxy** - `get_grpc_response_status()`
now detects `gRPCStatusError` and returns the user's intended status
code instead of `INTERNAL`.
4. **Message truncation** - Added `_truncate_message()` to limit error
details to 4KB, avoiding HTTP/2 trailer size limits.
5. **Documentation updates** - Updated the gRPC guide to document the
new behavior.

---------

Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: Adel Nour <ans9868@nyu.edu>
abrarsheikh pushed a commit that referenced this pull request Apr 17, 2026
## Why

Fixes #62358.

For non-gRPC Serve requests, `Replica.handle_request*` currently does:

```python
raise self._maybe_wrap_grpc_exception(e, request_metadata) from e
```

When `_maybe_wrap_grpc_exception()` returns the original exception
unchanged, this degenerates into `raise e from e`, creating a
self-referential `__cause__` chain (`e.__cause__ is e`).

That breaks error-reporting tools that walk `__cause__` recursively and
can also confuse debugging and logging paths for normal HTTP and
handle-based requests.

## Root cause

This regression came from `#60482` (`221a19395a`), which unified
exception handling across the `handle_request*` paths so gRPC requests
could preserve user-set status codes.

That behavior is correct when a new gRPC wrapper exception is created.
It is wrong for non-gRPC requests, because
`_maybe_wrap_grpc_exception()` just returns `e`, so the code effectively
executes `raise e from e`.

## Fix

Introduce a small `_raise_user_exception()` helper in `replica.py`:

- if `_maybe_wrap_grpc_exception()` returns the original exception,
raise it directly
- if it returns a wrapped gRPC exception, raise the wrapped exception
`from e`

This preserves the intended gRPC chaining behavior while eliminating the
circular `__cause__` chain on non-gRPC paths.

## Testing

```bash
source /home/simple/github/ray/.venv/bin/activate
PYTHONPATH=/home/simple/github/ray/.worktrees/fix-issue-62358-replica-exception-chaining/python \
python -m pytest python/ray/serve/tests/unit/test_replica_exception_chaining.py -v
```

---------

Signed-off-by: Ziy1-Tan <ajb459684460@gmail.com>
HLDKNotFound pushed a commit to chichic21039/ray that referenced this pull request Apr 22, 2026
…62412)

## Why

Fixes ray-project#62358.

For non-gRPC Serve requests, `Replica.handle_request*` currently does:

```python
raise self._maybe_wrap_grpc_exception(e, request_metadata) from e
```

When `_maybe_wrap_grpc_exception()` returns the original exception
unchanged, this degenerates into `raise e from e`, creating a
self-referential `__cause__` chain (`e.__cause__ is e`).

That breaks error-reporting tools that walk `__cause__` recursively and
can also confuse debugging and logging paths for normal HTTP and
handle-based requests.

## Root cause

This regression came from `ray-project#60482` (`221a19395a`), which unified
exception handling across the `handle_request*` paths so gRPC requests
could preserve user-set status codes.

That behavior is correct when a new gRPC wrapper exception is created.
It is wrong for non-gRPC requests, because
`_maybe_wrap_grpc_exception()` just returns `e`, so the code effectively
executes `raise e from e`.

## Fix

Introduce a small `_raise_user_exception()` helper in `replica.py`:

- if `_maybe_wrap_grpc_exception()` returns the original exception,
raise it directly
- if it returns a wrapped gRPC exception, raise the wrapped exception
`from e`

This preserves the intended gRPC chaining behavior while eliminating the
circular `__cause__` chain on non-gRPC paths.

## Testing

```bash
source /home/simple/github/ray/.venv/bin/activate
PYTHONPATH=/home/simple/github/ray/.worktrees/fix-issue-62358-replica-exception-chaining/python \
python -m pytest python/ray/serve/tests/unit/test_replica_exception_chaining.py -v
```

---------

Signed-off-by: Ziy1-Tan <ajb459684460@gmail.com>
Lucas61000 pushed a commit to Lucas61000/ray that referenced this pull request May 15, 2026
…62412)

## Why

Fixes ray-project#62358.

For non-gRPC Serve requests, `Replica.handle_request*` currently does:

```python
raise self._maybe_wrap_grpc_exception(e, request_metadata) from e
```

When `_maybe_wrap_grpc_exception()` returns the original exception
unchanged, this degenerates into `raise e from e`, creating a
self-referential `__cause__` chain (`e.__cause__ is e`).

That breaks error-reporting tools that walk `__cause__` recursively and
can also confuse debugging and logging paths for normal HTTP and
handle-based requests.

## Root cause

This regression came from `ray-project#60482` (`221a19395a`), which unified
exception handling across the `handle_request*` paths so gRPC requests
could preserve user-set status codes.

That behavior is correct when a new gRPC wrapper exception is created.
It is wrong for non-gRPC requests, because
`_maybe_wrap_grpc_exception()` just returns `e`, so the code effectively
executes `raise e from e`.

## Fix

Introduce a small `_raise_user_exception()` helper in `replica.py`:

- if `_maybe_wrap_grpc_exception()` returns the original exception,
raise it directly
- if it returns a wrapped gRPC exception, raise the wrapped exception
`from e`

This preserves the intended gRPC chaining behavior while eliminating the
circular `__cause__` chain on non-gRPC paths.

## Testing

```bash
source /home/simple/github/ray/.venv/bin/activate
PYTHONPATH=/home/simple/github/ray/.worktrees/fix-issue-62358-replica-exception-chaining/python \
python -m pytest python/ray/serve/tests/unit/test_replica_exception_chaining.py -v
```

---------

Signed-off-by: Ziy1-Tan <ajb459684460@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs An issue or change related to documentation go add ONLY when ready to merge, run all tests serve Ray Serve Related Issue

2 participants