Streaming generator doc #63791

Merged
edoakes merged 9 commits into ray-project:master from rueian:streaming-gen-doc
Jun 8, 2026

@rueian rueian commented Jun 2, 2026

Contributor

Adds a doc that explains how streaming generators work.
Preview link: https://anyscale-ray--63791.com.readthedocs.build/en/63791/ray-core/internals/streaming-generator.html

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces a new documentation page explaining the internals of streaming generator tasks in Ray. The review feedback correctly identifies that the hardcoded Ray version 2.55 and the corresponding ray-2.55.0 tags in the GitHub URLs within the new documentation do not exist, which will result in broken links. These should be updated to a valid branch or tag like master.

Comment thread doc/source/ray-core/internals/streaming-generator.rst Outdated
Comment thread doc/source/ray-core/internals/streaming-generator.rst Outdated
@rueian rueian added the core (Issues that should be addressed in Ray Core) and go (add ONLY when ready to merge, run all tests) labels Jun 2, 2026
@rueian rueian marked this pull request as ready for review June 2, 2026 06:36
@rueian rueian requested a review from a team as a code owner June 2, 2026 06:36
@rueian rueian changed the title Streaming gen doc Jun 2, 2026
Signed-off-by: Rueian Huang <rueiancsie@gmail.com>
@rueian rueian force-pushed the streaming-gen-doc branch from 4ea2998 to 328f34e Compare June 2, 2026 16:16
rueian added 3 commits June 2, 2026 10:32
Signed-off-by: Rueian Huang <rueiancsie@gmail.com>
Signed-off-by: Rueian Huang <rueiancsie@gmail.com>
Signed-off-by: Rueian Huang <rueiancsie@gmail.com>
@Yicheng-Lu-llll Yicheng-Lu-llll self-assigned this Jun 2, 2026

@sampan-s-nayak sampan-s-nayak left a comment

Contributor

Very detailed writeup! Thanks for taking the time to write this up!


The following diagram shows the reporting path for the first yielded value:

[ASCII diagram not captured in this review excerpt]

Contributor

We could consider using Mermaid UML diagrams here, though I'm not sure our docs pages support them.

Contributor Author

I think it is not supported, unfortunately.

Comment thread doc/source/ray-core/internals/streaming-generator.md Outdated
Signed-off-by: Rueian <rueiancsie@gmail.com>

@Yicheng-Lu-llll Yicheng-Lu-llll left a comment

Member

LGTM!

Comment thread doc/source/ray-core/internals/streaming-generator.md Outdated
Comment thread doc/source/ray-core/internals/streaming-generator.md Outdated
Comment thread doc/source/ray-core/internals/streaming-generator.md

@Kunchd Kunchd left a comment

Contributor

Thanks for the helpful doc on streaming generators! I left some nits and questions to help me better understand it.

Comment thread doc/source/ray-core/internals/streaming-generator.md Outdated
Comment thread doc/source/ray-core/internals/streaming-generator.md Outdated
Comment thread doc/source/ray-core/internals/streaming-generator.md Outdated

1. Insert the returned object into the caller-side `ObjectRefStream` at the yield index.
2. Handle the reported return object using the same direct-return versus plasma-return logic as normal task returns.
3. [Make the reported ObjectRef ready](https://github.com/ray-project/ray/blob/ray-2.55.0/src/ray/core_worker/task_manager.cc#L821-L847). If a caller is [waiting in next(gen)](https://github.com/ray-project/ray/blob/ray-2.55.0/python/ray/_private/object_ref_generator.py#L188-L237) for this in-order yield index, that wait can now finish.
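
The in-order wait in step 3 can be illustrated with a small toy model. This is only a sketch, not Ray's implementation: `ToyObjectRefStream`, `report`, and `next` are hypothetical names, plain Python values stand in for ObjectRefs, and the sketch assumes reports may arrive out of order.

```python
import threading


class ToyObjectRefStream:
    """Toy stand-in for the caller-side stream (hypothetical, not Ray's code)."""

    def __init__(self):
        self._slots = {}        # yield index -> reported value
        self._next_index = 0    # next in-order index the consumer will read
        self._cond = threading.Condition()

    def report(self, yield_index, value):
        """Store the item at its yield index and wake any waiting consumer."""
        with self._cond:
            self._slots[yield_index] = value
            self._cond.notify_all()

    def next(self, timeout=None):
        """Block until the item for the next in-order index has been reported."""
        with self._cond:
            ready = self._cond.wait_for(
                lambda: self._next_index in self._slots, timeout=timeout
            )
            if not ready:
                raise TimeoutError(f"index {self._next_index} not reported yet")
            value = self._slots.pop(self._next_index)
            self._next_index += 1
            return value


stream = ToyObjectRefStream()
stream.report(1, "second")      # arrives first, but index 0 is still missing
try:
    stream.next(timeout=0.01)   # the consumer is waiting for index 0
    blocked = False
except TimeoutError:
    blocked = True
stream.report(0, "first")       # now the in-order wait can finish
results = [stream.next(timeout=1), stream.next(timeout=1)]
```

In this sketch the consumer always reads index 0 before index 1, regardless of which report was handled first.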

Contributor

I might have missed something when looking through this, but the return path is actually a little involved. The report generator item returns call doesn't send the object to the caller if the object is in plasma; a get call is needed instead. Do we want to clarify this here, or is this too much detail?

Contributor Author

Changed to "Handle the reported return object as a direct return or a plasma return, using the same logic as normal task returns." Hopefully that is much clearer now.


When the `ObjectRefGenerator` goes out of scope, its [destructor](https://github.com/ray-project/ray/blob/ray-2.55.0/python/ray/_private/object_ref_generator.py#L289-L295) asks the caller-side core worker to delete the `ObjectRefStream` for the ObjectID of the generator ObjectRef.
[Deleting the stream](https://github.com/ray-project/ray/blob/ray-2.55.0/src/ray/core_worker/task_manager.cc#L690-L727) releases unconsumed streamed refs from the caller-side stream and replies to pending backpressured report RPCs with `NotFound`.
This unblocks an executor that was waiting for the caller to consume more refs.
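
A toy model may help make the deletion path concrete. This is a sketch under assumptions, not Ray's code: `ToyBackpressuredStream`, the `threshold` semantics, and the status strings are hypothetical, and a held-back reply callback stands in for a pending backpressured report RPC.

```python
class ToyBackpressuredStream:
    """Toy caller-side stream; names and structure are hypothetical."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.unconsumed = []    # reported refs the caller has not consumed yet
        self.pending = []       # reply callbacks held back by backpressure
        self.deleted = False

    def report(self, ref, reply):
        """Handle one report; `reply` is called with a status string."""
        if self.deleted:
            reply("NotFound")
            return
        self.unconsumed.append(ref)
        if len(self.unconsumed) >= self.threshold:
            self.pending.append(reply)  # withhold the reply: executor stays blocked
        else:
            reply("OK")

    def consume(self):
        """Consume the oldest ref, then release replies that now fit."""
        ref = self.unconsumed.pop(0)
        while self.pending and len(self.unconsumed) < self.threshold:
            self.pending.pop(0)("OK")
        return ref

    def delete(self):
        """Release unconsumed refs and answer every pending reply with NotFound."""
        self.deleted = True
        released, self.unconsumed = self.unconsumed, []
        for reply in self.pending:
            reply("NotFound")
        self.pending = []
        return released


replies = []
stream = ToyBackpressuredStream(threshold=2)
stream.report("ref-0", replies.append)  # under the threshold: replied immediately
stream.report("ref-1", replies.append)  # at the threshold: reply is withheld
released = stream.delete()              # generator went out of scope
```

After `delete()`, the withheld reply has been answered with `NotFound`, so the blocked executor in this model can make progress even though nothing was consumed.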

Contributor

Does the backpressure mechanism simply stop checking whether we're under the threshold once `NotFound` is returned?

Also, the caller is technically the owner of the generated objects. If the object ref stream has already gone out of scope, are the created objects immediately marked for eviction or deletion?

Contributor Author

> Does the backpressure mechanism simply stop checking whether we're under the threshold once `NotFound` is returned?

It doesn't stop checking, but each `NotFound` reply unblocks it.

> Also, the caller is technically the owner of the generated objects. If the object ref stream has already gone out of scope, are the created objects immediately marked for eviction or deletion?

The `ObjectRefStream` is simply a caller-side struct that tracks stream slots. Even if it goes out of scope, refs to the created objects can still be in scope, in which case the objects aren't immediately marked for eviction.

Comment thread doc/source/ray-core/internals/streaming-generator.md Outdated
2. The executor sends the final `PushTask` reply. This reply includes the generator ObjectRef and the [list of streamed return ObjectIDs](https://github.com/ray-project/ray/blob/ray-2.55.0/src/ray/core_worker/task_execution/task_receiver.cc#L54-L60) produced by the task.
3. The caller handles the final `PushTask` reply, records how many streaming return objects the task produced, and marks the stream as ended.
4. The caller [writes an internal end-of-stream marker](https://github.com/ray-project/ray/blob/ray-2.55.0/src/ray/core_worker/task_manager.cc#L1078-L1085) into the caller-side `ObjectRefStream`.
5. Later, when `ObjectRefGenerator` reaches the marker, it checks the generator ObjectRef.
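
The end-of-stream steps above can be sketched with a sentinel written one slot past the last item. This is a toy model under assumptions: `ToyStream`, `finish`, and `END_OF_STREAM` are hypothetical names, the sketch does not block on missing items, and it omits the generator ObjectRef check from step 5.

```python
END_OF_STREAM = object()  # hypothetical sentinel standing in for the internal marker


class ToyStream:
    """Toy caller-side stream; not Ray's ObjectRefStream."""

    def __init__(self):
        self.slots = {}         # yield index -> ref or END_OF_STREAM
        self.next_index = 0     # next index the consumer will read
        self.num_returns = None

    def report(self, index, ref):
        self.slots[index] = ref

    def finish(self, num_returns):
        # Steps 3-4: record the item count, then write the marker after the last item.
        self.num_returns = num_returns
        self.slots[num_returns] = END_OF_STREAM

    def __iter__(self):
        return self

    def __next__(self):
        item = self.slots[self.next_index]
        if item is END_OF_STREAM:
            # Step 5: the real generator checks the generator ObjectRef here;
            # this sketch simply ends iteration.
            raise StopIteration
        del self.slots[self.next_index]
        self.next_index += 1
        return item


stream = ToyStream()
stream.report(0, "ref-0")
stream.report(1, "ref-1")
stream.finish(num_returns=2)    # final reply handled: two items were produced
items = list(stream)
```

Because the marker sits at index `num_returns`, the consumer in this model sees every streamed item before it sees the end of the stream.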

Contributor

What if `StopIteration` is raised prematurely in the executor UDF?

@rueian rueian Jun 3, 2026

Contributor Author

That will be treated as an application error. You will get a ref on the caller side, and when you call `ray.get` on it, it raises a `ray.exceptions.RayTaskError` that wraps the `StopIteration`.

rueian added 4 commits June 3, 2026 15:56
Signed-off-by: Rueian Huang <rueiancsie@gmail.com>
Signed-off-by: Rueian Huang <rueiancsie@gmail.com>
Signed-off-by: Rueian Huang <rueiancsie@gmail.com>
Signed-off-by: Rueian Huang <rueiancsie@gmail.com>
@rueian rueian requested a review from Kunchd June 3, 2026 23:50

@Kunchd Kunchd left a comment

Contributor

Thanks for the doc!

@edoakes edoakes merged commit a76de30 into ray-project:master Jun 8, 2026
5 of 6 checks passed
sampan-s-nayak pushed a commit to sampan-s-nayak/ray that referenced this pull request Jun 10, 2026
limarkdcunha pushed a commit to limarkdcunha/ray that referenced this pull request Jun 30, 2026
Labels

core (Issues that should be addressed in Ray Core), go (add ONLY when ready to merge, run all tests)

5 participants