[autoscaler] Add kill and get IP commands to CLI for testing #3731
Test PASSed.
    help="Override the configured cluster name.")
def get_worker_ips(cluster_config_file, cluster_name):
    click.echo('\n'.join(
        get_worker_node_ips(cluster_config_file, cluster_name)))
nit: splitting this into two lines instead of inlining can help with debuggability
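The suggestion above might look like the following sketch. Here `get_worker_node_ips` is stubbed with made-up addresses and `print` stands in for `click.echo` so the snippet runs on its own; both are assumptions for illustration, not the PR's code.

```python
def get_worker_node_ips(cluster_config_file, cluster_name):
    # Hypothetical stub: the real function looks up worker nodes via the provider.
    return ["10.0.0.1", "10.0.0.2"]


def get_worker_ips(cluster_config_file, cluster_name, echo=print):
    # Binding the result to a name first gives a place to set a breakpoint
    # or log the list before it is joined and printed.
    worker_ips = get_worker_node_ips(cluster_config_file, cluster_name)
    echo("\n".join(worker_ips))
```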
    return provider.external_ip(head_node)


def get_worker_node_ips(config_file, override_cluster_name):
Can you add a docstring?
Test FAILed.
cli.add_command(submit)
cli.add_command(teardown)
cli.add_command(teardown, name="down")
cli.add_command(kill, name="kill_node")
Also, we should guard this somehow so that users don't use it to kill clusters; maybe name="_kill_random_node"?
    provider = get_node_provider(config["provider"], config["cluster_name"])
    nodes = provider.nodes({TAG_RAY_NODE_TYPE: "worker"})
    node = nodes[random.randint(0, len(nodes) - 1)]  # randint includes both endpoints
nit: could also use random.choice(nodes)
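A small sketch of why `random.choice` is the safer spelling: `random.randint(a, b)` includes both endpoints, so indexing with it requires an upper bound of `len(nodes) - 1`, whereas `random.choice` needs no bounds at all. The node IDs below are invented for illustration.

```python
import random

nodes = ["node-a", "node-b", "node-c"]  # hypothetical node IDs

# randint(a, b) returns a <= n <= b, so the upper bound must be len(nodes) - 1.
index = random.randint(0, len(nodes) - 1)
node_by_index = nodes[index]

# random.choice picks a uniformly random element with no index arithmetic.
node_by_choice = random.choice(nodes)


def pick_with_inclusive_bound(seq, rng):
    # Demonstrates the failure mode: an upper bound of len(seq) can yield an
    # index one past the end, which raises IndexError.
    return seq[rng.randint(0, len(seq))]
```

Note that `random.choice` raises `IndexError` on an empty sequence, so a cluster with no worker nodes still needs separate handling.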
    _exec(updater, "ray stop", False, False)

    time.sleep(5)
To give the Raylet process some time to exit. Not strictly necessary, but I copied the code for cluster teardown: https://github.com/ray-project/ray/blob/master/python/ray/autoscaler/commands.py#L93
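If a fixed sleep ever proves flaky, one alternative is to poll for a condition with a timeout. This is only a sketch: `wait_until` and the `predicate` callable are hypothetical helpers, not part of the autoscaler, and how one would actually check that the Raylet has exited on a remote node is left open.

```python
import time


def wait_until(predicate, timeout_s=5.0, interval_s=0.1):
    """Poll predicate() until it returns True or timeout_s elapses.

    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval_s)
    # One final check so a condition that became true during the last sleep counts.
    return bool(predicate())
```

With a 5 second timeout this waits no longer than the fixed sleep does, but returns as soon as the condition holds.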
richardliaw left a comment
A few nits but looks fine; will approve after questions
Co-Authored-By: stephanie-wang <swang@cs.berkeley.edu>
Test PASSed.
What do these changes do?
Adds 2 commands to the CLI that take in an autoscaler config: one that kills a random worker node and one that prints the worker node IPs.
These commands are both for testing and are not recommended for normal use.
Related issue number
Closes #3685.