[data] Support multiple datasets in a cluster (2/2): partition cluster resources by subcluster label #63375
Code Review
This pull request implements label_selector and subcluster_label_key support across Ray Data and Ray Train, allowing users to constrain task and actor placement to specific labeled subsets of a cluster. The changes include updates to ExecutionOptions, the AutoscalingCoordinator for resource bucketing, and broad propagation of these selectors through physical operators, planners, and data source implementations. Feedback was provided regarding the merge_label_selector utility, suggesting that it should always return a new dictionary to resolve a contradiction in its docstring and prevent potential mutation bugs.
Signed-off-by: Timothy Seah <tseah@anyscale.com>
60ccec0 to 7061322
…resources + request_remaining=True Signed-off-by: Timothy Seah <tseah@anyscale.com>
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
Reviewed by Cursor Bugbot for commit ef0e3e5.
when is request_resources(label_selectors) used? what happens if you do request_resources(label_selectors, subcluster_selector)? Does one overwrite the other?
Is that meant to be "non-subcluster related labels"?
Can we also raise an error to explicitly disallow one requester trying to request bundles from multiple subclusters?
if subcluster_selector and label_selectors:
    req_subcluster = subcluster_selector.get(SUBCLUSTER_LABEL_KEY)
    for i, sel in enumerate(label_selectors):
        bundle_subcluster = sel.get(SUBCLUSTER_LABEL_KEY)
        if bundle_subcluster is not None and bundle_subcluster != req_subcluster:
            raise ValueError(
                f"Bundle {i} label_selector targets subcluster "
                f"{bundle_subcluster!r}, but requester is registered to "
                f"{req_subcluster!r}. Per-bundle cross-subcluster "
                f"allocation is not supported."
            )
Signed-off-by: Timothy Seah <tseah@anyscale.com>
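The check suggested in the review comment above can be tried in isolation. The following is only a sketch: the function name `validate_bundle_subclusters` and the value assigned to `SUBCLUSTER_LABEL_KEY` are placeholders chosen for illustration, not names or values taken from the PR.

```python
from typing import Dict, List, Optional

# Placeholder value for illustration; the real key is defined in the PR.
SUBCLUSTER_LABEL_KEY = "example.com/subcluster"


def validate_bundle_subclusters(
    subcluster_selector: Optional[Dict[str, str]],
    label_selectors: Optional[List[Dict[str, str]]],
) -> None:
    """Raise if any bundle targets a subcluster other than the requester's.

    Hypothetical helper: it wraps the reviewer's suggested check in a function.
    """
    if not subcluster_selector or not label_selectors:
        return
    req_subcluster = subcluster_selector.get(SUBCLUSTER_LABEL_KEY)
    for i, sel in enumerate(label_selectors):
        bundle_subcluster = sel.get(SUBCLUSTER_LABEL_KEY)
        # Bundles without a subcluster label are allowed through unchanged.
        if bundle_subcluster is not None and bundle_subcluster != req_subcluster:
            raise ValueError(
                f"Bundle {i} label_selector targets subcluster "
                f"{bundle_subcluster!r}, but requester is registered to "
                f"{req_subcluster!r}. Per-bundle cross-subcluster "
                f"allocation is not supported."
            )
```

Under these assumptions, a bundle that omits the subcluster label passes, while a bundle that names a different subcluster fails fast with a `ValueError` instead of being silently overridden.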
…ataset subcluster changes Signed-off-by: Timothy Seah <tseah@anyscale.com>
Check out #58845 and #63287. In the former PR, my goal was to support placing Ray Train workers on nodes with particular attributes. These attributes would usually be subcluster labels, but they could also pick out particular nodes within a subcluster. For example, we may want to place Ray Train workers on the GPU nodes within the training subcluster, while the Ray Data workers for the training dataset run on the CPU nodes within the same subcluster. However, I forgot to update the Right now, there are two types of requesters: datasets and Ray Train. Datasets will always request the subcluster using
subcluster_selector takes precedence: https://github.com/ray-project/ray/pull/63375/changes#diff-23e42254510d06fc2e4595cb52c69872e0b16f6c52932f06b502d63548e72067R361. I also implemented your |
justinvyu
left a comment
Thanks! Can you update the PR description?
…utoscaler_v2.py Co-authored-by: Justin Yu <justin.v.yu@gmail.com> Signed-off-by: Timothy Seah <tseah@anyscale.com>
…r resources by subcluster label (ray-project#63375) The end goal is to support 2 ray data datasets in 1 cluster with subcluster label scheduling. In such a setup, we have 2 datasets sharing the same AutoscalingCoordinator. The previous PR in this stack (ray-project#63331) made sure that each dataset's tasks ended up in the correct subcluster. This PR ensures that all requesters, whether they are trainers or datasets, only request and receive resources in their subcluster. --------- Signed-off-by: Timothy Seah <tseah@anyscale.com> Co-authored-by: Justin Yu <justin.v.yu@gmail.com>
…ter (#64003) #63375 doesn't work because subcluster is not a valid label name. I am testing whether subcluster works on this PR (#63737) and cherrypicked that change here. Merged to 2.56.0 release branch already #63982 --------- Signed-off-by: Timothy Seah <tseah@anyscale.com> Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com> Co-authored-by: Timothy Seah <tseah@anyscale.com>

Summary
The end goal is to support two Ray Data datasets in one cluster with subcluster label scheduling. In such a setup, we have 2 datasets and 2 trainers sharing the same AutoscalingCoordinator. The previous PR in this stack (#63331) made sure that each dataset's tasks ended up in the correct subcluster. This PR ensures that all requesters, whether they are trainers or datasets, only request and receive resources in their subcluster.
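As a rough illustration of what "only receive resources in their subcluster" means, the sketch below buckets node resources by a subcluster label and exposes to each requester only its own bucket. It is not the coordinator's implementation: the node dictionaries, the function names, and the value of `SUBCLUSTER_LABEL_KEY` are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Placeholder value for illustration; the real key is defined in the PR.
SUBCLUSTER_LABEL_KEY = "example.com/subcluster"

Resources = Dict[str, float]


def group_nodes_by_subcluster(nodes: List[dict]) -> Dict[Optional[str], Resources]:
    """Sum node resources into one bucket per subcluster label value.

    Nodes without the label land in the bucket keyed by None.
    """
    buckets: Dict[Optional[str], Dict[str, float]] = defaultdict(
        lambda: defaultdict(float)
    )
    for node in nodes:
        subcluster = node["labels"].get(SUBCLUSTER_LABEL_KEY)
        for name, amount in node["resources"].items():
            buckets[subcluster][name] += amount
    return {key: dict(value) for key, value in buckets.items()}


def resources_visible_to(
    requester_subcluster: Optional[str],
    buckets: Dict[Optional[str], Resources],
) -> Resources:
    """Return only the bucket that matches the requester's subcluster."""
    return buckets.get(requester_subcluster, {})


nodes = [
    {"labels": {SUBCLUSTER_LABEL_KEY: "train-a"}, "resources": {"CPU": 8, "GPU": 1}},
    {"labels": {SUBCLUSTER_LABEL_KEY: "train-a"}, "resources": {"CPU": 8}},
    {"labels": {SUBCLUSTER_LABEL_KEY: "train-b"}, "resources": {"CPU": 16}},
]
buckets = group_nodes_by_subcluster(nodes)
```

With this toy input, a requester registered to `train-a` sees the combined resources of the two `train-a` nodes and nothing from `train-b`.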
To this end, the main change was to AutoscalingCoordinator._tick, which is called at regular intervals. AutoscalingCoordinator._tick calls 3 helper methods, which this PR changes as follows:
- merge_and_send_requests: each autoscaling request now includes the subcluster label of the requester
- update_cluster_node_resources: we now group cluster nodes by subcluster
- _reallocate_resources: we now update OngoingRequests with their subcluster-scoped resources.
I also changed the try_trigger_scaling method, which creates datasets' autoscaling requests. Before this change, this method tried to scale up every node in the cluster. Now, it only scales up the relevant subcluster. Note that this only applies to dataset requesters; trainer requesters attempt scaleup by requesting resource bundles with their corresponding label selectors (which include subcluster labels), so I didn't need to touch that path.

API summary
To use subcluster scheduling, the user must set the __subcluster__ label in their compute config and the label_selector on their dataset.

Testing
Ran multitenancy stress test based on this PR (PR: #63737, test: https://buildkite.com/ray-project/release/builds/95982).