[Data] Fix reference cycle in by moving to module scope#61934
Conversation
Signed-off-by: bittoby <bittoby@users.noreply.github.com>
There was a problem hiding this comment.
Code Review
This pull request fixes a reference cycle in ActorMethod.options() by moving an inner class to module scope. This prevents a closure from keeping actor-related objects alive, ensuring proper garbage collection. The change is accompanied by a new regression test that validates the fix by disabling the garbage collector and checking for leaks using weak references. The implementation appears correct and complete.
4e06aeb to
d7153f8
Compare
…r-method-options-ref-cycle
Signed-off-by: bittoby <bittoby@users.noreply.github.com>
|
@edoakes All tests passed. what else should I need to update? |
|
@codope could you please help review? I remember you working on something similar as one of your first tickets |
|
@codope would you please review this pr? welcome to any feedbacks. |
codope
left a comment
There was a problem hiding this comment.
@bittoby thanks for the fix! Overall, looks good to me. Left a couple of comments. PTAL.
| class FuncWrapper: | ||
| def remote(self, *args, **kwargs): | ||
| return func_cls._remote(args=args, kwargs=kwargs, **options) | ||
|
|
||
| @DeveloperAPI | ||
| def bind(self, *args, **kwargs): | ||
| return func_cls._bind(args=args, kwargs=kwargs, **options) | ||
|
|
||
| return FuncWrapper() | ||
| return _ActorMethodOptionsWrapper(self, options) |
There was a problem hiding this comment.
There's an identical FuncWrapper-inside-closure pattern in remote_function.py:
ray/python/ray/remote_function.py
Line 291 in fe32fba
It has the same __class__ cell cycle. The impact is lower (no ActorHandle or __ray_shutdown__ involved), but can you check if it still creates an unnecessary garbage island on every .options() call? Maybe worth a follow-up fix for consistency.
There was a problem hiding this comment.
Yes, remote_function.py has the same __class__ cell cycle. It creates a small garbage island on every .options() call, though the impact is lower since RemoteFunction objects are long-lived and no ActorHandle/__ray_shutdown__ is involved. I'll send a follow-up PR to fix it for consistency.
There was a problem hiding this comment.
sounds good, please tag me on the follow-up PR when it's ready.
…r-method-options-ref-cycle
Signed-off-by: bittoby <bittoby@users.noreply.github.com>
|
@codope Thanks for your feedback! I updated all following your feedback. Could you review again? |
|
@edoakes would you merge this PR? This testing failure is not relate to my change. Thanks |
|
Thanks @bittoby! |
## Description In #61700, a direct call to `_remote()` is used as workaround to get over the issue described in #61922. This issue has been fixed in #61934 and this PR changes the direct `_remote()` call back to the method chaining style. ## Related issues Related to #61922. --------- Signed-off-by: Sirui Huang <ray.huang@anyscale.com> Signed-off-by: HFFuture <ray.huang@anyscale.com> Co-authored-by: Hao Chen <chenh1024@gmail.com>
…61934) ## Description Fix reference cycle in `ActorMethod.options()` by moving `FuncWrapper` to module scope. The old `FuncWrapper` class was defined inside the `options()` closure, which triggered CPython's implicit `__class__` cell. This kept the `ActorMethod` and its `ActorHandle` alive until the cyclic GC ran, delaying `__ray_shutdown__` and actor cleanup. The fix replaces it with `_ActorMethodOptionsWrapper` at module scope. Same interface, no closure, no cycle. ## Related issues Fixes ray-project#61922 ## Additional information Added regression test `test_options_no_ref_cycle` that disables GC and asserts the `ActorHandle` is freed by reference counting alone after `options().remote()`. --------- Signed-off-by: bittoby <bittoby@users.noreply.github.com> Co-authored-by: bittoby <bittoby@users.noreply.github.com>
## Description In ray-project#61700, a direct call to `_remote()` is used as workaround to get over the issue described in ray-project#61922. This issue has been fixed in ray-project#61934 and this PR changes the direct `_remote()` call back to the method chaining style. ## Related issues Related to ray-project#61922. --------- Signed-off-by: Sirui Huang <ray.huang@anyscale.com> Signed-off-by: HFFuture <ray.huang@anyscale.com> Co-authored-by: Hao Chen <chenh1024@gmail.com>
…61934) ## Description Fix reference cycle in `ActorMethod.options()` by moving `FuncWrapper` to module scope. The old `FuncWrapper` class was defined inside the `options()` closure, which triggered CPython's implicit `__class__` cell. This kept the `ActorMethod` and its `ActorHandle` alive until the cyclic GC ran, delaying `__ray_shutdown__` and actor cleanup. The fix replaces it with `_ActorMethodOptionsWrapper` at module scope. Same interface, no closure, no cycle. ## Related issues Fixes ray-project#61922 ## Additional information Added regression test `test_options_no_ref_cycle` that disables GC and asserts the `ActorHandle` is freed by reference counting alone after `options().remote()`. --------- Signed-off-by: bittoby <bittoby@users.noreply.github.com> Co-authored-by: bittoby <bittoby@users.noreply.github.com>
## Description In ray-project#61700, a direct call to `_remote()` is used as workaround to get over the issue described in ray-project#61922. This issue has been fixed in ray-project#61934 and this PR changes the direct `_remote()` call back to the method chaining style. ## Related issues Related to ray-project#61922. --------- Signed-off-by: Sirui Huang <ray.huang@anyscale.com> Signed-off-by: HFFuture <ray.huang@anyscale.com> Co-authored-by: Hao Chen <chenh1024@gmail.com>
Description
Fix reference cycle in
ActorMethod.options()by movingFuncWrapperto module scope.The old
FuncWrapperclass was defined inside theoptions()closure, which triggered CPython's implicit__class__cell. This kept theActorMethodand itsActorHandlealive until the cyclic GC ran, delaying__ray_shutdown__and actor cleanup.The fix replaces it with
_ActorMethodOptionsWrapperat module scope. Same interface, no closure, no cycle.Related issues
Fixes #61922
Additional information
Added regression test
test_options_no_ref_cyclethat disables GC and asserts theActorHandleis freed by reference counting alone afteroptions().remote().