[Serve] Fix Java long poll timeout serialization #61875

Merged: abrarsheikh merged 1 commit into ray-project:master from weimingdiit:fix/long-poll-java-timeout-handling on Apr 6, 2026
@weimingdiit weimingdiit commented Mar 19, 2026


Description

Fixes a bug in LongPollHost.listen_for_change_java() when the upstream listen_for_change() call times out.

listen_for_change() can return LongPollState.TIME_OUT, but the Java wrapper path previously forwarded that value directly to _listen_result_to_proto_bytes(), which expects a dictionary of updated objects. This could cause the Java long-poll path to fail on timeout.

This PR converts LongPollState.TIME_OUT to an empty result before serialization, so Java callers receive a valid empty LongPollResult instead.
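
The conversion described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the `LongPollState` enum here is a local stand-in, `normalize_listen_result` is a hypothetical helper name, and `serialize_updates` is a toy replacement for `_listen_result_to_proto_bytes()` that only mimics its reliance on a mapping.

```python
from enum import Enum, auto
from typing import Any, Dict, Union


class LongPollState(Enum):
    # Local stand-in for the Serve enum named in the description.
    TIME_OUT = auto()


def normalize_listen_result(
    result: Union[LongPollState, Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical helper: map the timeout sentinel to an empty update set."""
    if result is LongPollState.TIME_OUT:
        return {}
    return result


def serialize_updates(updates: Dict[str, Any]) -> bytes:
    """Toy serializer (assumption): like the real one, it iterates .items()."""
    return ";".join(f"{k}={v}" for k, v in sorted(updates.items())).encode()


# On timeout the serializer now receives an empty mapping.
timeout_bytes = serialize_updates(normalize_listen_result(LongPollState.TIME_OUT))
update_bytes = serialize_updates(normalize_listen_result({"route_table": 3}))
```

Normalizing before serialization keeps the serializer's contract (it always receives a mapping) unchanged, so only the Java wrapper path needs to know about the sentinel.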

Related issues

#61874

Additional information

Added a unit test to verify that listen_for_change_java() returns an empty LongPollResult on timeout.
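
A test of this shape might look like the sketch below. It does not reproduce the test added in the PR: `FakeLongPollHost` is a hypothetical test double, the `keys_to_snapshot_ids` parameter name and the `updated_objects` key are assumptions, and a plain dict stands in for the serialized `LongPollResult`.

```python
import asyncio
from enum import Enum, auto


class LongPollState(Enum):
    # Local stand-in for the Serve enum.
    TIME_OUT = auto()


class FakeLongPollHost:
    """Hypothetical test double; not the real LongPollHost."""

    async def listen_for_change(self, keys_to_snapshot_ids):
        # Simulate the case where no update arrives before the timeout.
        return LongPollState.TIME_OUT

    async def listen_for_change_java(self, keys_to_snapshot_ids):
        result = await self.listen_for_change(keys_to_snapshot_ids)
        if result is LongPollState.TIME_OUT:
            result = {}
        # A plain dict stands in for the serialized LongPollResult (assumption).
        return {"updated_objects": dict(result)}


def run_timeout_case():
    """Drive the fake host through one timed-out long poll."""
    host = FakeLongPollHost()
    return asyncio.run(host.listen_for_change_java({"some_key": -1}))


def test_listen_for_change_java_returns_empty_result_on_timeout():
    assert run_timeout_case() == {"updated_objects": {}}
```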

Fix a bug in Serve long polling where the Python-to-Java long poll path could raise an exception on a normal timeout.

LongPollHost.listen_for_change() can legally return LongPollState.TIME_OUT when no updates arrive before the timeout window. The Java bridge method listen_for_change_java() was forwarding that result directly into protobuf serialization, but _listen_result_to_proto_bytes() assumes it always receives a mapping and unconditionally calls .items().

As a result, a normal long poll timeout on the Java path could fail the controller RPC instead of returning an empty update set.
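
The failure mode can be reproduced in isolation. The sketch below assumes only what the commit message states, namely that the serializer calls `.items()` unconditionally; `serialize_updates` and `try_serialize` are hypothetical names, and the enum is a local stand-in for the Serve one.

```python
from enum import Enum, auto


class LongPollState(Enum):
    # Local stand-in for the Serve enum.
    TIME_OUT = auto()


def serialize_updates(updates):
    """Toy serializer (assumption): calls .items() without checking the type."""
    return [(key, value) for key, value in updates.items()]


def try_serialize(result):
    """Report whether serialization succeeds for a given listen result."""
    try:
        serialize_updates(result)
        return "ok"
    except AttributeError as exc:
        # An Enum member has no .items(), so the sentinel fails here.
        return f"failed: {type(exc).__name__}"


before_fix = try_serialize(LongPollState.TIME_OUT)  # sentinel forwarded as-is
after_fix = try_serialize({})  # sentinel converted to an empty mapping first
```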

Signed-off-by: weimingdiit <weimingdiit@gmail.com>

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request addresses a bug in the Java long poll mechanism where a timeout would lead to a serialization failure. The fix involves correctly handling the LongPollState.TIME_OUT case by converting it to an empty dictionary, which prevents the error and aligns with the expected behavior of returning an empty result on timeout. A new test case has been added to verify this fix, ensuring that timeouts are handled gracefully. The changes are logical, well-implemented, and properly tested.

@weimingdiit weimingdiit marked this pull request as ready for review March 20, 2026 01:56
@weimingdiit weimingdiit requested a review from a team as a code owner March 20, 2026 01:56

weimingdiit commented Mar 20, 2026


Hi @abrarsheikh @simon-mo when you have a chance, could you take a look at this PR?

@ray-gardener ray-gardener Bot added the serve (Ray Serve Related Issue) and community-contribution (Contributed by the community) labels Mar 20, 2026
@harshit-anyscale harshit-anyscale added the go (add ONLY when ready to merge, run all tests) label Mar 26, 2026
@weimingdiit


Hi @bveeramani, when you get a chance, could you take another look at this PR?

@abrarsheikh abrarsheikh merged commit 81e3293 into ray-project:master Apr 6, 2026
9 checks passed
@weimingdiit


@harshit-anyscale @abrarsheikh Thanks for your review and merge!

@weimingdiit weimingdiit deleted the fix/long-poll-java-timeout-handling branch April 7, 2026 00:04
Lucas61000 pushed a commit to Lucas61000/ray that referenced this pull request May 15, 2026