[Data] Move `requester_id` in `DefaultAutoscalingCoordinator` from Method Signatures into the Constructor by rayhhome · Pull Request #62838 · ray-project/ray

rayhhome · 2026-04-21T21:34:27Z

Description

The client wrapper DefaultAutoscalingCoordinator is always tied to a single requester for its entire lifetime—it is constructed once per execution with a fixed requester_id and never shared across multiple requesters. Despite this, requester_id was threaded through every method call (request_resources, cancel_request, get_allocated_resources), adding noise to both call sites and the abstract base class interface.

This PR moves requester_id from the method signatures into the constructor, making the single-requester ownership explicit in the type. The abstract base class (AutoscalingCoordinator) and testing implementation (FakeAutoscalingCoordinator) are updated to match. FakeAutoscalingCoordinator is also simplified from a Dict-keyed multi-tenant structure to a single Optional[Allocation], reflecting its single-tenant usage.

Additional information

API changes:

AutoscalingCoordinator abstract methods no longer take requester_id:

# Before
def request_resources(self, requester_id: str, resources, ...) -> None: ...
def cancel_request(self, requester_id: str) -> None: ...
def get_allocated_resources(self, requester_id: str) -> List[ResourceDict]: ...

# After
def request_resources(self, resources, ...) -> None: ...
def cancel_request(self) -> None: ...
def get_allocated_resources(self) -> List[ResourceDict]: ...

DefaultAutoscalingCoordinator.__init__ now requires requester_id:

# Before
DefaultAutoscalingCoordinator()

# After
DefaultAutoscalingCoordinator(requester_id="data-<execution_id>")

The _AutoscalingCoordinatorActor (underlying Ray actor that actually coordinates across multiple datasets) is unchanged. It still takes requester_id per call, as it genuinely serves multiple requesters.

Signed-off-by: Sirui Huang <ray.huang@anyscale.com>

gemini-code-assist

Code Review

This pull request refactors the autoscaling coordinator to be single-tenant by moving the requester_id from method arguments to the class constructor. This change simplifies the API across the base, default, and fake coordinator implementations. The review feedback suggests further simplifying the internal caching mechanism in DefaultAutoscalingCoordinator by replacing the dictionary-based cache with a single list, as the class no longer needs to manage resources for multiple requesters.

Copilot

Pull request overview

This PR refactors the AutoscalingCoordinator interface so requester_id is bound at construction time (single-tenant coordinator instances), removing requester_id from method signatures and updating the default + fake implementations and call sites accordingly.

Changes:

Update AutoscalingCoordinator abstract methods to drop the requester_id parameter.
Make DefaultAutoscalingCoordinator require requester_id in its constructor and thread it internally to the underlying Ray actor.
Simplify FakeAutoscalingCoordinator from multi-tenant dict storage to a single optional allocation and update tests/callers.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
`python/ray/data/_internal/cluster_autoscaler/base_autoscaling_coordinator.py`	Updates the abstract interface to remove `requester_id` from method signatures and docs.
`python/ray/data/_internal/cluster_autoscaler/default_autoscaling_coordinator.py`	Binds `requester_id` in the constructor and uses it internally for actor calls and timeout handling.
`python/ray/data/_internal/cluster_autoscaler/default_cluster_autoscaler_v2.py`	Constructs `DefaultAutoscalingCoordinator` with a per-execution requester id; updates coordinator calls to the new signatures.
`python/ray/data/_internal/cluster_autoscaler/fake_autoscaling_coordinator.py`	Updates fake coordinator to the new single-tenant interface and simplifies internal state.
`python/ray/data/tests/test_autoscaling_coordinator.py`	Adjusts tests to construct and call `DefaultAutoscalingCoordinator` using the new API.

Comments suppressed due to low confidence (1)

python/ray/data/_internal/cluster_autoscaler/default_cluster_autoscaler_v2.py:345

This warning message includes self._requester_id, but AutoscalingCoordinator.cancel_request() no longer takes a requester id, so the id in the log is no longer guaranteed to correspond to what was actually canceled (especially when a custom coordinator is injected). Consider removing the id from the message or deriving it from the coordinator instance (if available) to avoid misleading operational logs.

            msg = (
                f"Failed to cancel resource request for {self._requester_id}."
                " The request will still expire after the timeout of"
                f" {self._min_gap_between_autoscaling_requests_s} seconds."
            )

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

^{Reviewed by Cursor Bugbot for commit 0454a47. Configure here.}

…Coordinator Signed-off-by: Sirui Huang <ray.huang@anyscale.com>

…thod Signatures into the Constructor (ray-project#62838) ## Description The client wrapper `DefaultAutoscalingCoordinator` is always tied to a single requester for its entire lifetime—it is constructed once per execution with a fixed `requester_id` and never shared across multiple requesters. Despite this, `requester_id` was threaded through every method call (`request_resources`, `cancel_request`, `get_allocated_resources`), adding noise to both call sites and the abstract base class interface. This PR moves `requester_id` from the method signatures into the constructor, making the single-requester ownership explicit in the type. The abstract base class (`AutoscalingCoordinator`) and testing implementation (`FakeAutoscalingCoordinator`) are updated to match. `FakeAutoscalingCoordinator` is also simplified from a Dict-keyed multi-tenant structure to a single `Optional[Allocation]`, reflecting its single-tenant usage. ## Additional information API changes: - `AutoscalingCoordinator` abstract methods no longer take `requester_id`: ```python # Before def request_resources(self, requester_id: str, resources, ...) -> None: ... def cancel_request(self, requester_id: str) -> None: ... def get_allocated_resources(self, requester_id: str) -> List[ResourceDict]: ... # After def request_resources(self, resources, ...) -> None: ... def cancel_request(self) -> None: ... def get_allocated_resources(self) -> List[ResourceDict]: ... ``` - `DefaultAutoscalingCoordinator.__init__` now requires `requester_id`: ```python # Before DefaultAutoscalingCoordinator() # After DefaultAutoscalingCoordinator(requester_id="data-<execution_id>") ``` The `_AutoscalingCoordinatorActor` (underlying Ray actor that actually coordinates across multiple datasets) is unchanged. It still takes `requester_id` per call, as it genuinely serves multiple requesters. --------- Signed-off-by: Sirui Huang <ray.huang@anyscale.com>

Get rid of extraneous requester_id in autoscaler

88a964d

Signed-off-by: Sirui Huang <ray.huang@anyscale.com>

rayhhome self-assigned this Apr 21, 2026

Copilot AI review requested due to automatic review settings April 21, 2026 21:34

rayhhome requested a review from a team as a code owner April 21, 2026 21:34

rayhhome added data Ray Data-related issues go add ONLY when ready to merge, run all tests labels Apr 21, 2026

Copilot started reviewing on behalf of rayhhome April 21, 2026 21:35 View session

gemini-code-assist Bot reviewed Apr 21, 2026

View reviewed changes

Merge branch 'master' into requester-id-in-dac

0454a47

Copilot AI reviewed Apr 21, 2026

View reviewed changes

Comment thread python/ray/data/_internal/cluster_autoscaler/fake_autoscaling_coordinator.py Outdated

cursor Bot reviewed Apr 21, 2026

View reviewed changes

Comment thread python/ray/data/_internal/cluster_autoscaler/fake_autoscaling_coordinator.py Outdated

Comment thread python/ray/data/_internal/cluster_autoscaler/default_autoscaling_coordinator.py

Update _cached_allocated_resources to list + simplify FakeAutoscaling…

c5032e9

…Coordinator Signed-off-by: Sirui Huang <ray.huang@anyscale.com>

bveeramani approved these changes Apr 21, 2026

View reviewed changes

bveeramani enabled auto-merge (squash) April 21, 2026 22:19

bveeramani merged commit 3823f83 into ray-project:master Apr 21, 2026
7 checks passed

rayhhome deleted the requester-id-in-dac branch April 22, 2026 20:33

rayhhome mentioned this pull request Apr 24, 2026

[Data] Non-blocking Default Autoscaling Coordinator #62725

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Data] Move `requester_id` in `DefaultAutoscalingCoordinator` from Method Signatures into the Constructor#62838

[Data] Move `requester_id` in `DefaultAutoscalingCoordinator` from Method Signatures into the Constructor#62838
bveeramani merged 3 commits into
ray-project:masterfrom
rayhhome:requester-id-in-dac

rayhhome commented Apr 21, 2026 •

edited

Loading

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

cursor Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Labels

3 participants

Uh oh!

Conversation

rayhhome commented Apr 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Additional information

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Labels

3 participants

rayhhome commented Apr 21, 2026 •

edited

Loading