Prepare socket file when starting Ray #3925
Test FAILed.
    Args:
        socket_path (string): the socket file to prepare.
    """
    if not os.path.isfile(socket_path):
Would if not os.path.exists(socket_path) be more explicit?
    shutil.rmtree(tempdir)

def test_socket_directory_none_existant(shutdown_only):
Typo: it should be "existent". Or maybe use a shorter name, such as test_socket_dir_not_existing.
def test_socket_directory_none_existant(shutdown_only):
    level1_name = ray.ObjectID(_random_string()).hex()
    level2_name = ray.ObjectID(_random_string()).hex()
    temp_raylet_socket_dir = "/tmp/{}/{}".format(level1_name, level2_name)
nit, let's put the socket under something like /tmp/ray/tests to avoid creating too many files in /tmp
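A minimal sketch of what this suggestion could look like, assuming a hypothetical helper named random_socket_dir and using uuid in place of ray.ObjectID(_random_string()) so the snippet runs without Ray. It only builds the path and does not create it, since the test needs a directory that does not exist yet:

```python
import os
import uuid


def random_socket_dir(base="/tmp/ray/tests"):
    """Build a two-level directory path under base without creating it.

    Hypothetical helper for illustration; not part of this PR.
    """
    level1_name = uuid.uuid4().hex
    level2_name = uuid.uuid4().hex
    return os.path.join(base, level1_name, level2_name)


temp_raylet_socket_dir = random_socket_dir()
```

Keeping everything under one base directory means a single cleanup of /tmp/ray/tests removes whatever the tests leave behind. Unix socket paths have a fairly small length limit on common platforms, so deeply nested test directories with long names may need shorter components.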
        if not os.path.isdir(path):
            try_to_create_directory(path)
    else:
        os.remove(socket_path)
I'm not sure it's a good idea to remove the existing socket file. If a raylet is still running and the user starts Ray again, this will break the previous raylet.
We can check if this socket file is still used by any processes before removing it. Or maybe simply raise an error if it already exists.
Because this function is used for both the raylet socket and the plasma socket, it is not easy to connect to the socket to check whether it is in use, so I raise an exception here instead. It is better than nothing. Without this check, starting a cluster with ray start reports no error after the command, and we have to look in raylet.err to find the failure. With this exception, ray start stops immediately.
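For readers following the thread, the behavior described here can be sketched roughly as below. This is an illustration under assumptions, not the code merged in this PR: the exception type and message are guesses, and os.makedirs stands in for Ray's try_to_create_directory helper so the snippet runs on its own.

```python
import os


def prepare_socket_file(socket_path):
    """Fail early if socket_path is taken; otherwise create its directory.

    Sketch only: FileExistsError and the message text are assumptions.
    """
    if os.path.exists(socket_path):
        # Refuse to touch an existing file: a running raylet or plasma
        # store may still be using it.
        raise FileExistsError(
            "Socket file {} already exists.".format(socket_path))
    parent = os.path.dirname(socket_path)
    if parent:
        # Stand-in for try_to_create_directory(path) from the diff.
        os.makedirs(parent, exist_ok=True)
```

Raising before any process is launched is what lets the failure surface in the ray start output rather than in raylet.err.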
Yep, raising the error earlier is better.
BTW, we can detect if a socket file is in use with the lsof command. But raising an error is okay as well.
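As an alternative to shelling out to lsof, a stale socket file can also be told apart from a live one by attempting a connection. The sketch below uses that connect-probe technique, not lsof, and the function name unix_socket_in_use is hypothetical. It assumes a stream socket and that a short-lived probe connection is harmless to the server, which may not hold for the raylet or plasma store:

```python
import socket


def unix_socket_in_use(socket_path, timeout=1.0):
    """Return True if some process accepts connections on socket_path.

    Hypothetical helper: a missing path or a refused connection is
    treated as "not in use". Other errors (e.g. a timeout) propagate.
    """
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    probe.settimeout(timeout)
    try:
        probe.connect(socket_path)
    except (FileNotFoundError, ConnectionRefusedError):
        # No file at all, or a leftover file with no listener behind it.
        return False
    finally:
        probe.close()
    return True
```

A caller could use this to decide between removing a stale file and raising an error, though raising unconditionally, as done in this PR, avoids sending any traffic to a running process.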
Test PASSed.
Test PASSed.
@robertnishihara The Linux Wheel test fails in all PRs. Shall we change the expected number of wheels to 4 to unblock the test?
What do these changes do?
Sometimes users want to specify their own socket file name. However, there are two cases in which this can fail.
For the first case, raylet will crash with:
For the second case, raylet will crash with:
Related issue number