[llm] lazy-load batch stage and processor submodules by kouroshHakha · Pull Request #62861 · ray-project/ray

kouroshHakha · 2026-04-22T21:50:48Z

Why

ray.llm._internal.batch.stages and ray.llm._internal.batch.processor previously imported every stage / engine submodule eagerly from their __init__.py. Several of those submodules pull in heavy optional dependencies (transformers, vllm, sglang, mistral_common, huggingface_hub, ...).

As a result, even importing a lightweight piece like HttpRequestProcessorConfig — which only needs an aiohttp client and a few pydantic models — was loading the entire ML stack. On a typical machine that meant ~7s of import latency, a multi-hundred-MB process footprint, and a hard ImportError whenever any of those optional deps were not installed.

Closes #52632

Concretely, before this change:

import sys, time
t = time.perf_counter()
import ray.llm._internal.batch.processor.http_request_proc  # noqa
print(f"{time.perf_counter() - t:.2f}s")
print("transformers loaded:", "transformers" in sys.modules)
print("vllm.transformers_utils loaded:", "vllm.transformers_utils" in sys.modules)
print("sglang loaded:", "sglang" in sys.modules)
print("mistral_common loaded:", "mistral_common" in sys.modules)

prints something like:

7.46s
transformers loaded: True
vllm.transformers_utils loaded: True
sglang loaded: True
mistral_common loaded: True

After this change the same script prints ~1.1s and all four flags are False.

What

Convert both __init__.py files to PEP 562 __getattr__ lazy re-exports:

- Cheap symbols (StatefulStage, Processor, ProcessorBuilder, ProcessorConfig, the various *StageConfig classes) stay eagerly imported because they have no heavy transitive deps and are used everywhere.
- Engine-specific stage / processor classes are listed in a small _LAZY_ATTRS map and resolved on first attribute access via __getattr__. The result is then cached in globals() so subsequent lookups are free.
- __dir__ is overridden to keep tab-completion / introspection working.
- A TYPE_CHECKING block re-exports the lazy names statically so type checkers / IDEs continue to see them.
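
For readers unfamiliar with PEP 562, the pattern described in the list above can be sketched roughly as follows. This is an illustration, not the code from the PR: the contents of `_LAZY_ATTRS`, the `demo_pkg` module, and the helper names are assumptions, and stdlib modules stand in for the heavy engine submodules. In a real `__init__.py` the two hooks would be top-level functions named `__getattr__` and `__dir__`, and the cache would be `globals()`.

```python
import importlib
import types
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by type checkers / IDEs; never executed at runtime.
    from decimal import Decimal  # noqa: F401
    from fractions import Fraction  # noqa: F401

# Hypothetical map: exported name -> (defining module, attribute name).
_LAZY_ATTRS = {
    "Fraction": ("fractions", "Fraction"),
    "Decimal": ("decimal", "Decimal"),
}

# Stand-in for the namespace of a package's __init__.py.
pkg = types.ModuleType("demo_pkg")


def _lazy_getattr(name):
    """Module-level __getattr__: called only when normal lookup fails."""
    try:
        module_name, attr = _LAZY_ATTRS[name]
    except KeyError:
        # AttributeError keeps hasattr() / getattr(..., default) working.
        raise AttributeError(
            f"module {pkg.__name__!r} has no attribute {name!r}"
        ) from None
    value = getattr(importlib.import_module(module_name), attr)
    # Cache in the module namespace so the hook is not hit again.
    pkg.__dict__[name] = value
    return value


def _lazy_dir():
    """Module-level __dir__: advertise lazy names before they are resolved."""
    return sorted(set(pkg.__dict__) | set(_LAZY_ATTRS))


pkg.__getattr__ = _lazy_getattr
pkg.__dir__ = _lazy_dir
```

Accessing `pkg.Fraction` triggers the import once; afterwards the name lives in the module namespace and the hook is bypassed.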

Side-effect note

Each *_proc.py calls ProcessorBuilder.register(...) at import time. With this change the registration happens the first time the corresponding config is accessed via the package, which is exactly when a user constructs that config and then calls ProcessorBuilder.build, so the registry is populated in time for every realistic use. This is exercised in the tests — see test_lazy_imports.py and the existing ProcessorBuilder.build(HttpRequestProcessorConfig(...)) path.
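
A toy reproduction of this side effect, assuming a throwaway package written to a temporary directory. The names `lazy_demo_pkg`, `REGISTRY`, and `HttpConfig` are hypothetical; a plain dict stands in for the ProcessorBuilder registry. It only illustrates the timing of the registration, not Ray's actual code.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Build a throwaway package on disk (all names here are hypothetical).
root = Path(tempfile.mkdtemp())
pkg_dir = root / "lazy_demo_pkg"
pkg_dir.mkdir()

(pkg_dir / "registry.py").write_text("REGISTRY = {}\n")

(pkg_dir / "http_proc.py").write_text(textwrap.dedent('''
    from .registry import REGISTRY

    class HttpConfig:
        pass

    # Import-time side effect, analogous to ProcessorBuilder.register(...).
    REGISTRY[HttpConfig] = "build_http_processor"
'''))

(pkg_dir / "__init__.py").write_text(textwrap.dedent('''
    import importlib

    _LAZY_ATTRS = {"HttpConfig": ".http_proc"}

    def __getattr__(name):
        if name not in _LAZY_ATTRS:
            raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
        module = importlib.import_module(_LAZY_ATTRS[name], __name__)
        value = getattr(module, name)
        globals()[name] = value
        return value
'''))

sys.path.insert(0, str(root))
pkg = importlib.import_module("lazy_demo_pkg")
registry = importlib.import_module("lazy_demo_pkg.registry").REGISTRY

registered_before = len(registry)  # package imported, nothing registered yet
config_cls = pkg.HttpConfig        # first access imports http_proc, which registers
registered_after = len(registry)
```

In this toy, the registry entry appears at the moment the config class is first accessed through the package, which is before any code could pass an instance of it to a builder.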

Test plan

New regression test file python/ray/llm/tests/batch/cpu/processor/test_lazy_imports.py (17 cases, all green) pinning the new behavior:
- importing HttpRequestProcessorConfig must not pull transformers, vllm, sglang, mistral_common, tokenizers, huggingface_hub or any non-HTTP stage / processor submodule into sys.modules (verified in a clean subprocess).
- importing HttpRequestStage from the stages package must only load http_request_stage.py and not any other stage submodule.
- every lazy attr resolves to the right class from the right submodule.
- unknown attributes raise AttributeError (so hasattr etc. behave correctly).
- dir(pkg) lists all lazy attrs.
Pre-existing python/ray/llm/tests/batch/cpu/processor/ and python/ray/llm/tests/batch/cpu/stages/ tests pass with this change identically to baseline (73 passed, same 13 pre-existing env-skew failures unrelated to this change).
Sanity-checked the public API end-to-end: from ray.llm._internal.batch import HttpRequestProcessorConfig, then ProcessorBuilder.build(cfg) builds an HTTP processor with the expected stages.
pre-commit passes on all changed files (black, ruff, pydoclint, import order, etc.).
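
The clean-subprocess check in the first bullet could look roughly like the sketch below. The helper name `loaded_after_import` is an assumption and this is not the content of test_lazy_imports.py; stdlib modules stand in for the Ray import and the heavy dependencies.

```python
import json
import subprocess
import sys


def loaded_after_import(import_stmt, candidates):
    """Run import_stmt in a fresh interpreter; return the candidates it loaded."""
    script = (
        "import json, sys\n"
        f"{import_stmt}\n"
        f"candidates = {list(candidates)!r}\n"
        "print(json.dumps([m for m in candidates if m in sys.modules]))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        check=True,
    )
    # Take the last stdout line in case the import itself prints something.
    return set(json.loads(result.stdout.strip().splitlines()[-1]))


# Stand-in usage: "fractions" plays the lightweight import, "wave" a module
# that must not be pulled in as a side effect.
loaded = loaded_after_import("import fractions", ["fractions", "wave"])
```

Running the import in a subprocess matters because the parent test process has usually loaded many modules already, so checking its own `sys.modules` would not prove anything.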

Made with Cursor

The ``ray.llm._internal.batch.stages`` and ``ray.llm._internal.batch.processor`` packages previously imported every stage / engine submodule eagerly from their ``__init__.py``. Several of those submodules pull in heavy optional dependencies (``transformers``, ``vllm``, ``sglang``, ``mistral_common``, ``huggingface_hub`` etc.).

As a result, even importing a lightweight piece like ``HttpRequestProcessorConfig`` -- which only needs an ``aiohttp`` client and a few pydantic models -- was loading the entire ML stack. On a typical machine that meant ~7s of import latency, a multi-hundred-MB process footprint, and a hard ``ImportError`` whenever any of those optional deps were not installed.

This change converts both ``__init__.py`` files to PEP 562 ``__getattr__`` lazy re-exports:

* Cheap symbols (``StatefulStage``, ``Processor``, ``ProcessorBuilder``, ``ProcessorConfig``, the various ``*StageConfig`` classes) stay eagerly imported because they have no heavy transitive deps and are used everywhere.
* Engine-specific stage / processor classes are listed in a small ``_LAZY_ATTRS`` map and resolved on first attribute access via ``__getattr__``. The result is then cached in ``globals()`` so subsequent lookups are free.
* ``__dir__`` is overridden to keep tab-completion working.
* A ``TYPE_CHECKING`` block re-exports the lazy names statically so type checkers / IDEs continue to see them.

Side-effect note: each ``*_proc.py`` calls ``ProcessorBuilder.register`` at import time. With this change the registration happens the first time the corresponding config is accessed via the package, which is exactly when a user constructs that config and calls ``ProcessorBuilder.build``, so the registry is populated in time for every realistic use.

Adds ``test_lazy_imports.py`` to pin the new behaviour: importing ``HttpRequestProcessorConfig`` must not pull ``transformers``, ``vllm``, ``sglang``, ``mistral_common`` or any non-HTTP stage / processor submodule into ``sys.modules``.

Made-with: Cursor
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>

gemini-code-assist

Code Review

This pull request implements lazy loading for batch processors and stages using PEP 562 __getattr__ to avoid unnecessary loading of heavy ML dependencies. It also includes regression tests to ensure that importing lightweight components does not trigger heavy imports and that attribute resolution works correctly. I have no feedback to provide.


jeffreywang88 · 2026-04-27T17:15:59Z

+}
+
+
+def __getattr__(name):


This is quite similar to stages/__init__.py's __getattr__. Consider refactoring to a shared utility.
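
One possible shape for such a shared utility, sketched under assumptions: `make_lazy_hooks` is a hypothetical helper that does not exist in Ray, and the two throwaway modules below stand in for the stages and processor packages, with stdlib classes in place of the real lazy exports.

```python
import importlib
import types


def make_lazy_hooks(module_globals, lazy_attrs):
    """Return (__getattr__, __dir__) hooks for a module namespace.

    Hypothetical helper. lazy_attrs maps an exported name to the absolute
    name of the module that defines it.
    """
    module_name = module_globals["__name__"]

    def __getattr__(name):
        if name not in lazy_attrs:
            raise AttributeError(
                f"module {module_name!r} has no attribute {name!r}"
            )
        value = getattr(importlib.import_module(lazy_attrs[name]), name)
        module_globals[name] = value  # cache for subsequent lookups
        return value

    def __dir__():
        return sorted(set(module_globals) | set(lazy_attrs))

    return __getattr__, __dir__


# In a real package each __init__.py would contain a single line such as:
#   __getattr__, __dir__ = make_lazy_hooks(globals(), _LAZY_ATTRS)
# Here two throwaway modules play the role of the two packages.
stages = types.ModuleType("demo_stages")
processor = types.ModuleType("demo_processor")
stages.__getattr__, stages.__dir__ = make_lazy_hooks(
    vars(stages), {"Fraction": "fractions"}
)
processor.__getattr__, processor.__dir__ = make_lazy_hooks(
    vars(processor), {"Decimal": "decimal"}
)
```

Each package keeps its own `_LAZY_ATTRS` map and cache; only the lookup logic is shared.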


kouroshHakha requested a review from a team as a code owner April 22, 2026 21:50

gemini-code-assist Bot reviewed Apr 22, 2026


Retrigger CI

db13e78

Made-with: Cursor Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>

ray-gardener Bot added the "serve" label (Ray Serve Related Issue) Apr 23, 2026

kouroshHakha added the "go" label (add ONLY when ready to merge, run all tests) Apr 23, 2026

kouroshHakha self-assigned this Apr 23, 2026

kouroshHakha requested a review from jeffreywang88 April 23, 2026 19:09

kouroshHakha assigned jeffreywang88 Apr 23, 2026

jeffreywang88 approved these changes Apr 27, 2026


kouroshHakha merged commit b3eb203 into ray-project:master Apr 27, 2026
8 checks passed

Lucas61000 pushed a commit to Lucas61000/ray that referenced this pull request May 15, 2026

[data][llm] lazy-load batch stage and processor submodules (ray-proje…

6c66018

…ct#62861) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>

kouroshHakha merged 2 commits into ray-project:master from kouroshHakha:lazy-batch-stages
