[Data] Fixing ReservationOpResourceAllocator to properly borrow resources for ActorPoolMapOperator#60882
Conversation
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
There was a problem hiding this comment.
Code Review
This pull request correctly fixes a resource allocation bug for ActorPoolMapOperator by using min_scheduling_resources() instead of incremental_resource_usage() for resource borrowing. This change allows actor-based operators to properly acquire resources for scaling up. The addition of a targeted regression test is excellent, and the refactoring of existing tests to remove mocking of incremental_resource_usage improves the test suite's realism and maintainability. The changes are well-implemented and look solid.
| # is still enough. | ||
| to_borrow = ( | ||
| op.incremental_resource_usage() | ||
| op.min_scheduling_resources() |
There was a problem hiding this comment.
Borrowing and task submission use mismatched resource methods
Low Severity
The borrowing logic now uses op.min_scheduling_resources() while can_submit_new_task still checks against op.incremental_resource_usage(). HashShuffleOperator overrides incremental_resource_usage() to return cpu=1 but does not override min_scheduling_resources() (inheriting the base class zero return). This means the borrowing logic will never borrow CPU for shuffle operators, yet can_submit_new_task still requires cpu >= 1 — so in resource-constrained scenarios the shuffle operator may be unable to make progress where it previously could.
Additional Locations (1)
There was a problem hiding this comment.
Yes, this behavior is correct.
- can_submit_new_task checks whether a new task can be launched which for APs won't incur new resource usage (unlike scheduling new actors)
| # is still enough. | ||
| to_borrow = ( | ||
| op.incremental_resource_usage() | ||
| op.min_scheduling_resources() |
There was a problem hiding this comment.
Do u know why it was set to incremental_resource_usage usage in the 1st place? Since this is all 0s for actors, I wonder how much this affect CPU workloads?
There was a problem hiding this comment.
min_scheduling_resources is a recent thing
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
…ources for `ActorPoolMapOperator` (#60882) ## Description `ActorPoolMapOperator.incremental_resource_usage` returns 0, since no new incremental resources are necessary to launch new tasks. However, for resource allocation to properly borrow from shared resources `min_scheduling_resources` should be used instead. ## Related issues > Link related issues: "Fixes #1234", "Closes #1234", or "Related to #1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
…ources for `ActorPoolMapOperator` (#60891) ## Description Cherry-pick of #60882 ## Related issues > Link related issues: "Fixes #1234", "Closes #1234", or "Related to #1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
…ources for `ActorPoolMapOperator` (ray-project#60882) ## Description `ActorPoolMapOperator.incremental_resource_usage` returns 0, since no new incremental resources are necessary to launch new tasks. However, for resource allocation to properly borrow from shared resources `min_scheduling_resources` should be used instead. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <ak@anyscale.com> Signed-off-by: Adel Nour <ans9868@nyu.edu>
…ources for `ActorPoolMapOperator` (ray-project#60882) ## Description `ActorPoolMapOperator.incremental_resource_usage` returns 0, since no new incremental resources are necessary to launch new tasks. However, for resource allocation to properly borrow from shared resources `min_scheduling_resources` should be used instead. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <ak@anyscale.com>


Description
ActorPoolMapOperator.incremental_resource_usagereturns 0, since no new incremental resources are necessary to launch new tasks.However, for resource allocation to properly borrow from shared resources
min_scheduling_resourcesshould be used instead.Related issues
Additional information