[data][llm] Promote max_tasks_in_flight_per_actor to a first-class config field and adjust defaults#63214

Merged
kouroshHakha merged 3 commits into master from data-llm-max-tasks-in-flight on May 8, 2026

@jeffreywang88 jeffreywang88 commented May 8, 2026

Contributor

Why

Ray Data LLM hardcoded DEFAULT_MAX_TASKS_IN_FLIGHT = 16 instead of using Ray Data's actor-pool fallback. As a result, the in-flight cap (a) didn't track max_concurrent_batches when users tuned it and (b) bypassed both DataContext.max_tasks_in_flight_per_actor and the env-var override of the factor.

What changes?

  • New top-level field OfflineProcessorConfig.max_tasks_in_flight_per_actor: Optional[int] = None.
  • Removed the DEFAULT_MAX_TASKS_IN_FLIGHT = 16 constant; engine processors pass config.max_tasks_in_flight_per_actor straight through to ActorPoolStrategy (including None).
  • Default in-flight cap: changed from the hardcoded 16 to max_concurrent_batches × FACTOR, resolved by Ray Data's actor pool.
  • DataContext.max_tasks_in_flight_per_actor and RAY_DATA_ACTOR_DEFAULT_MAX_TASKS_IN_FLIGHT_TO_MAX_CONCURRENCY_FACTOR are now honored (previously bypassed by the explicit 16).
  • experimental["max_tasks_in_flight_per_actor"] is deprecated: migrated to the new field at construction with a logger.warning. Top-level field wins if both are set.
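The deprecation path in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `migrate_experimental_key` is a hypothetical helper, and the warning text is invented for the example. It only shows the stated rules: the experimental key is migrated with a warning, and the top-level field wins if both are set.

```python
import logging
from typing import Any, Dict, Optional

logger = logging.getLogger(__name__)

_KEY = "max_tasks_in_flight_per_actor"


def migrate_experimental_key(
    top_level: Optional[int], experimental: Dict[str, Any]
) -> Optional[int]:
    """Return the value the top-level field should hold after migration.

    Hypothetical helper for illustration; not the function used in the PR.
    """
    if _KEY not in experimental:
        # Nothing to migrate; keep whatever the top-level field holds.
        return top_level
    # Assumed wording; the PR only states that a logger.warning is emitted.
    logger.warning(
        "experimental[%r] is deprecated; set %s on the config instead.", _KEY, _KEY
    )
    # The top-level field wins if both are set.
    return top_level if top_level is not None else experimental[_KEY]
```

With this sketch, `migrate_experimental_key(None, {"max_tasks_in_flight_per_actor": 32})` yields 32, while passing 64 as the top-level value alongside the same dict yields 64.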

Original API

OfflineProcessorConfig(
    ...,
    experimental={"max_tasks_in_flight_per_actor": 32},  # only knob
)

New API

OfflineProcessorConfig(
    ...,
    max_concurrent_batches=8,           # unchanged
    max_tasks_in_flight_per_actor=32,   # new top-level field, Optional[int]
)

Behavior changes

  • Users who set RAY_DATA_ACTOR_DEFAULT_MAX_TASKS_IN_FLIGHT_TO_MAX_CONCURRENCY_FACTOR now have their override honored by Ray Data LLM (previously ignored).
  • Setting via experimental still works but logs a deprecation warning. The top-level field overrides experimental if both are set.

| max_concurrent_batches | max_tasks_in_flight_per_actor | Ray actor max_concurrency | Effective in-flight cap |
| --- | --- | --- | --- |
| unset (default 8) | unset (None) | 8 | 16 |
| 16 | unset (None) | 16 | 32 |
| unset (default 8) | 50 | 8 | 50 |
| 16 | 50 | 16 | 50 |
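The rows above follow from a simple rule: an explicit value is used as-is, otherwise the cap is max_concurrent_batches times the factor. The sketch below reproduces that arithmetic under stated assumptions. `effective_in_flight_cap` and `factor_from_env` are hypothetical names, the default factor of 2 is inferred from the 8×2=16 case discussed in review, and the env-var parsing is a guess at the behavior, not Ray Data's actual implementation.

```python
import os
from typing import Optional

FACTOR_ENV_VAR = "RAY_DATA_ACTOR_DEFAULT_MAX_TASKS_IN_FLIGHT_TO_MAX_CONCURRENCY_FACTOR"


def factor_from_env(default: float = 2.0) -> float:
    """Read the factor override from the environment (assumed parsing)."""
    raw = os.environ.get(FACTOR_ENV_VAR)
    return float(raw) if raw is not None else default


def effective_in_flight_cap(
    max_concurrent_batches: int = 8,
    max_tasks_in_flight_per_actor: Optional[int] = None,
    factor: float = 2.0,
) -> int:
    """Hypothetical helper mirroring the table, not Ray Data's code."""
    if max_tasks_in_flight_per_actor is not None:
        # An explicit value is used unchanged.
        return max_tasks_in_flight_per_actor
    # Fallback: scale with concurrency and truncate to an integer.
    return int(max_concurrent_batches * factor)


rows = [
    effective_in_flight_cap(),                                  # 16
    effective_in_flight_cap(max_concurrent_batches=16),         # 32
    effective_in_flight_cap(max_tasks_in_flight_per_actor=50),  # 50
    effective_in_flight_cap(16, 50),                            # 50
]
```

To model the env-var override, pass `factor=factor_from_env()`.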

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
@jeffreywang88 jeffreywang88 requested a review from a team as a code owner May 8, 2026 00:46
@jeffreywang88 jeffreywang88 added the go add ONLY when ready to merge, run all tests label May 8, 2026
@jeffreywang88 jeffreywang88 changed the title [data][llm] Promote to a first-class config field and adjust defaults May 8, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request promotes max_tasks_in_flight_per_actor from an experimental configuration to a top-level field in OfflineProcessorConfig and its subclasses. It introduces deprecation warnings for the experimental key and implements a resolution strategy that defaults to a calculated value based on max_concurrent_batches. Feedback identifies a potential type mismatch where a float could be assigned to an integer field when bypassing Pydantic validation, suggesting an explicit integer cast to ensure compatibility with Ray Data's actor pool.

* DEFAULT_ACTOR_MAX_TASKS_IN_FLIGHT_TO_MAX_CONCURRENCY_FACTOR,
)
# Bypass `validate_assignment=True` so we don't re-fire the deprecation warning
object.__setattr__(self, "max_tasks_in_flight_per_actor", resolved)

Contributor

high

The resolved value for max_tasks_in_flight_per_actor can be a float, particularly when calculated using DEFAULT_ACTOR_MAX_TASKS_IN_FLIGHT_TO_MAX_CONCURRENCY_FACTOR, which is defined as a float. The max_tasks_in_flight_per_actor field is typed as Optional[int], but using object.__setattr__ bypasses Pydantic's type coercion. This could result in a float value being passed to ray.data.ActorPoolStrategy, which expects an integer and may lead to unexpected behavior or a runtime error.

To ensure an integer is always assigned, the resolved value should be explicitly cast to int. This would also align with the original logic in Ray Data's actor pool, which performs this integer conversion.

Suggested change
- object.__setattr__(self, "max_tasks_in_flight_per_actor", resolved)
+ object.__setattr__(self, "max_tasks_in_flight_per_actor", int(resolved))
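The concern can be reproduced without Pydantic. In the sketch below, `StrictConfig` is a hypothetical stand-in that coerces on normal assignment, loosely imitating validate-on-assignment behavior; it is not Pydantic and not the class in this PR. It shows that `object.__setattr__` skips the class's own `__setattr__`, so a float produced by multiplying by a float factor is stored unchanged unless it is cast first.

```python
class StrictConfig:
    """Stand-in for a model that validates on assignment (not Pydantic itself)."""

    def __setattr__(self, name: str, value: object) -> None:
        if name == "max_tasks_in_flight_per_actor" and value is not None:
            value = int(value)  # coercion happens only on this code path
        super().__setattr__(name, value)


cfg = StrictConfig()
resolved = 8 * 2.0  # int times a float factor yields a float

# Normal assignment goes through StrictConfig.__setattr__ and is coerced.
cfg.max_tasks_in_flight_per_actor = resolved
normal_path = cfg.max_tasks_in_flight_per_actor

# object.__setattr__ bypasses the override, so the float is stored as-is.
object.__setattr__(cfg, "max_tasks_in_flight_per_actor", resolved)
bypassed = cfg.max_tasks_in_flight_per_actor

# The suggested fix: cast explicitly before bypassing.
object.__setattr__(cfg, "max_tasks_in_flight_per_actor", int(resolved))
fixed = cfg.max_tasks_in_flight_per_actor
```

Here `normal_path` and `fixed` are ints, while `bypassed` remains a float.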
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
@ray-gardener ray-gardener Bot added the data Ray Data-related issues label May 8, 2026

@Aydin-ab Aydin-ab left a comment

Contributor

Consider making it a bit more explicit in the docs that the default is 2 * max_concurrent_batches.

Comment thread python/ray/data/llm.py Outdated
Comment thread python/ray/data/llm.py Outdated
Comment thread python/ray/llm/_internal/batch/processor/base.py Outdated

@kouroshHakha kouroshHakha left a comment

Contributor

The approach is sound — using a Pydantic mode="after" validator to eagerly resolve the None sentinel is clean, and the resolution order (explicit > experimental > formula) is implemented correctly. The behavioral no-op for default users (8×2=16) is a good property.

Note

This review was co-written with AI assistance (Claude Code).

Comment thread python/ray/llm/_internal/batch/processor/base.py
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
@kouroshHakha kouroshHakha enabled auto-merge (squash) May 8, 2026 21:09
@kouroshHakha kouroshHakha merged commit 75f55e3 into master May 8, 2026
7 checks passed
@kouroshHakha kouroshHakha deleted the data-llm-max-tasks-in-flight branch May 8, 2026 21:42
Lucas61000 pushed a commit to Lucas61000/ray that referenced this pull request May 15, 2026
…config field and adjust defaults (ray-project#63214)

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>