Skip to content

[Data] Fix reference cycle in by moving to module scope#61934

Merged
edoakes merged 6 commits into
ray-project:masterfrom
Crystora:fix-actor-method-options-ref-cycle
Mar 24, 2026
Merged

[Data] Fix reference cycle in by moving to module scope#61934
edoakes merged 6 commits into
ray-project:masterfrom
Crystora:fix-actor-method-options-ref-cycle

Conversation

@Crystora

Copy link
Copy Markdown
Contributor

Description

Fix reference cycle in ActorMethod.options() by moving FuncWrapper to module scope.

The old FuncWrapper class was defined inside the options() closure, which triggered CPython's implicit __class__ cell. This kept the ActorMethod and its ActorHandle alive until the cyclic GC ran, delaying __ray_shutdown__ and actor cleanup.

The fix replaces it with _ActorMethodOptionsWrapper at module scope. Same interface, no closure, no cycle.

Related issues

Fixes #61922

Additional information

Added regression test test_options_no_ref_cycle that disables GC and asserts the ActorHandle is freed by reference counting alone after options().remote().

Signed-off-by: bittoby <bittoby@users.noreply.github.com>
@Crystora Crystora requested a review from a team as a code owner March 21, 2026 08:58

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request fixes a reference cycle in ActorMethod.options() by moving an inner class to module scope. This prevents a closure from keeping actor-related objects alive, ensuring proper garbage collection. The change is accompanied by a new regression test that validates the fix by disabling the garbage collector and checking for leaks using weak references. The implementation appears correct and complete.

@ray-gardener ray-gardener Bot added core Issues that should be addressed in Ray Core data Ray Data-related issues community-contribution Contributed by the community labels Mar 21, 2026
@edoakes edoakes added the go add ONLY when ready to merge, run all tests label Mar 23, 2026
Signed-off-by: bittoby <bittoby@users.noreply.github.com>
@Crystora Crystora force-pushed the fix-actor-method-options-ref-cycle branch from 4e06aeb to d7153f8 Compare March 23, 2026 04:50
bittoby added 2 commits March 23, 2026 04:51
Signed-off-by: bittoby <bittoby@users.noreply.github.com>
@Crystora

Crystora commented Mar 23, 2026

Copy link
Copy Markdown
Contributor Author

@edoakes All tests passed. what else should I need to update?

@edoakes

edoakes commented Mar 23, 2026

Copy link
Copy Markdown
Collaborator

@codope could you please help review? I remember you working on something similar as one of your first tickets

@Crystora

Copy link
Copy Markdown
Contributor Author

@codope would you please review this pr? welcome to any feedbacks.
thank you

@codope codope left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bittoby thanks for the fix! Overall, looks good to me. Left a couple of comments. PTAL.

Comment thread python/ray/actor.py Outdated
Comment thread python/ray/actor.py
Comment on lines -935 to +953
class FuncWrapper:
def remote(self, *args, **kwargs):
return func_cls._remote(args=args, kwargs=kwargs, **options)

@DeveloperAPI
def bind(self, *args, **kwargs):
return func_cls._bind(args=args, kwargs=kwargs, **options)

return FuncWrapper()
return _ActorMethodOptionsWrapper(self, options)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an identical FuncWrapper-inside-closure pattern in remote_function.py:

class FuncWrapper:

It has the same __class__ cell cycle. The impact is lower (no ActorHandle or __ray_shutdown__ involved), but can you check if it still creates an unnecessary garbage island on every .options() call? Maybe worth a follow-up fix for consistency.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, remote_function.py has the same __class__ cell cycle. It creates a small garbage island on every .options() call, though the impact is lower since RemoteFunction objects are long-lived and no ActorHandle/__ray_shutdown__ is involved. I'll send a follow-up PR to fix it for consistency.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, please tag me on the follow-up PR when it's ready.

bittoby added 2 commits March 24, 2026 04:30
Signed-off-by: bittoby <bittoby@users.noreply.github.com>
@Crystora

Copy link
Copy Markdown
Contributor Author

@codope Thanks for your feedback! I updated all following your feedback. Could you review again?

@Crystora

Copy link
Copy Markdown
Contributor Author

@edoakes would you merge this PR? This testing failure is not relate to my change. Thanks

@edoakes edoakes merged commit 7985f5d into ray-project:master Mar 24, 2026
5 of 6 checks passed
@edoakes

edoakes commented Mar 24, 2026

Copy link
Copy Markdown
Collaborator

Thanks @bittoby!

@Crystora Crystora deleted the fix-actor-method-options-ref-cycle branch March 24, 2026 14:28
raulchen added a commit that referenced this pull request Mar 25, 2026
## Description
In #61700, a direct call to `_remote()` is used as workaround to get
over the issue described in #61922. This issue has been fixed in #61934
and this PR changes the direct `_remote()` call back to the method
chaining style.

## Related issues
Related to #61922.

---------

Signed-off-by: Sirui Huang <ray.huang@anyscale.com>
Signed-off-by: HFFuture <ray.huang@anyscale.com>
Co-authored-by: Hao Chen <chenh1024@gmail.com>
ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Mar 25, 2026
…61934)

## Description

Fix reference cycle in `ActorMethod.options()` by moving `FuncWrapper`
to module scope.

The old `FuncWrapper` class was defined inside the `options()` closure,
which triggered CPython's implicit `__class__` cell. This kept the
`ActorMethod` and its `ActorHandle` alive until the cyclic GC ran,
delaying `__ray_shutdown__` and actor cleanup.

The fix replaces it with `_ActorMethodOptionsWrapper` at module scope.
Same interface, no closure, no cycle.

## Related issues

Fixes ray-project#61922

## Additional information

Added regression test `test_options_no_ref_cycle` that disables GC and
asserts the `ActorHandle` is freed by reference counting alone after
`options().remote()`.

---------

Signed-off-by: bittoby <bittoby@users.noreply.github.com>
Co-authored-by: bittoby <bittoby@users.noreply.github.com>
ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Mar 25, 2026
## Description
In ray-project#61700, a direct call to `_remote()` is used as workaround to get
over the issue described in ray-project#61922. This issue has been fixed in ray-project#61934
and this PR changes the direct `_remote()` call back to the method
chaining style.

## Related issues
Related to ray-project#61922.

---------

Signed-off-by: Sirui Huang <ray.huang@anyscale.com>
Signed-off-by: HFFuture <ray.huang@anyscale.com>
Co-authored-by: Hao Chen <chenh1024@gmail.com>
Lucas61000 pushed a commit to Lucas61000/ray that referenced this pull request May 15, 2026
…61934)

## Description

Fix reference cycle in `ActorMethod.options()` by moving `FuncWrapper`
to module scope.

The old `FuncWrapper` class was defined inside the `options()` closure,
which triggered CPython's implicit `__class__` cell. This kept the
`ActorMethod` and its `ActorHandle` alive until the cyclic GC ran,
delaying `__ray_shutdown__` and actor cleanup.

The fix replaces it with `_ActorMethodOptionsWrapper` at module scope.
Same interface, no closure, no cycle.

## Related issues

Fixes ray-project#61922

## Additional information

Added regression test `test_options_no_ref_cycle` that disables GC and
asserts the `ActorHandle` is freed by reference counting alone after
`options().remote()`.

---------

Signed-off-by: bittoby <bittoby@users.noreply.github.com>
Co-authored-by: bittoby <bittoby@users.noreply.github.com>
Lucas61000 pushed a commit to Lucas61000/ray that referenced this pull request May 15, 2026
## Description
In ray-project#61700, a direct call to `_remote()` is used as workaround to get
over the issue described in ray-project#61922. This issue has been fixed in ray-project#61934
and this PR changes the direct `_remote()` call back to the method
chaining style.

## Related issues
Related to ray-project#61922.

---------

Signed-off-by: Sirui Huang <ray.huang@anyscale.com>
Signed-off-by: HFFuture <ray.huang@anyscale.com>
Co-authored-by: Hao Chen <chenh1024@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-contribution Contributed by the community core Issues that should be addressed in Ray Core data Ray Data-related issues go add ONLY when ready to merge, run all tests

3 participants