[core] Warn when runtime_env package approaches upload size limit#63404
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a warning mechanism to alert users when a zipped runtime_env package approaches the GCS upload size limit, specifically addressing cases where many small files (e.g., .git directories) might bypass existing per-file size checks. The changes include a new PACKAGE_SIZE_WARNING constant, a helper function _warn_if_package_size_near_limit, and comprehensive regression tests. The reviewer feedback suggests improving the actionability of the warning by including the source module_path in the log message, which helps users identify the specific directory causing the bloat.
| def _warn_if_package_size_near_limit( | ||
| package_path: Path, | ||
| logger: Optional[logging.Logger] = default_logger, | ||
| ) -> None: | ||
| """Warn if the zipped package is approaching the GCS upload size limit. | ||
|
|
||
| The per-file warning in `_zip_files` does not fire for directories that | ||
| contain many small files (e.g. `.git`), so the user has no signal that | ||
| they are about to hit `GCS_STORAGE_MAX_SIZE` until the upload itself | ||
| fails. This warning closes that gap and includes the local package path | ||
| so the user can inspect its contents. See GH #45602. | ||
| """ | ||
| if logger is None: | ||
| logger = default_logger | ||
| try: | ||
| package_size = package_path.stat().st_size | ||
| except OSError: | ||
| return | ||
| if package_size < PACKAGE_SIZE_WARNING: | ||
| return | ||
| logger.warning( | ||
| f"The runtime_env package at '{package_path}' is " | ||
| f"{_mib_string(package_size)}, approaching the maximum upload size " | ||
| f"of {_mib_string(GCS_STORAGE_MAX_SIZE)}. If the upload fails, exclude " | ||
| "large directories (commonly '.git', '.venv', or build artifacts) via " | ||
| "the 'excludes' option in runtime_env, or list them in '.gitignore' / " | ||
| "'.rayignore'. For details see " | ||
| "https://docs.ray.io/en/latest/ray-core/handling-dependencies.html" | ||
| "#api-reference" | ||
| ) |
There was a problem hiding this comment.
Including the module_path (the source directory or file being packaged) in the warning message would significantly improve actionability. Currently, the warning only shows the path to the temporary zip file, which often has an obscure name (e.g., including timestamps and PIDs) and is deleted shortly after creation in the upload_package_if_needed flow. Providing the source path helps users immediately identify which part of their runtime_env is causing the bloat.
def _warn_if_package_size_near_limit(
package_path: Path,
logger: Optional[logging.Logger] = default_logger,
module_path: Optional[str] = None,
) -> None:
"""Warn if the zipped package is approaching the GCS upload size limit.
The per-file warning in `_zip_files` does not fire for directories that
contain many small files (e.g. `.git`), so the user has no signal that
they are about to hit `GCS_STORAGE_MAX_SIZE` until the upload itself
fails. This warning closes that gap and includes the local package path
so the user can inspect its contents. See GH #45602.
"""
if logger is None:
logger = default_logger
try:
package_size = package_path.stat().st_size
except OSError:
return
if package_size < PACKAGE_SIZE_WARNING:
return
source_info = f" for '{module_path}'" if module_path else ""
logger.warning(
f"The runtime_env package{source_info} at '{package_path}' is "
f"{_mib_string(package_size)}, approaching the maximum upload size "
f"of {_mib_string(GCS_STORAGE_MAX_SIZE)}. If the upload fails, exclude "
"large directories (commonly '.git', '.venv', or build artifacts) via "
"the 'excludes' option in runtime_env, or list them in '.gitignore' / "
"'.rayignore'. For details see "
"https://docs.ray.io/en/latest/ray-core/handling-dependencies.html"
"#api-reference"
)There was a problem hiding this comment.
Good catch — applied in 64e1dd0. _warn_if_package_size_near_limit now takes an optional module_path and includes it in the message when available, and the test was extended to assert the source path is surfaced. The temp zip path is also kept (still useful for inspection when the user catches the warning before upload_package_if_needed unlinks it).
| @@ -697,6 +733,7 @@ def create_package( | |||
| include_parent_dir=include_parent_dir, | |||
| logger=logger, | |||
| ) | |||
| _warn_if_package_size_near_limit(target_path, logger) | |||
There was a problem hiding this comment.
Done in 64e1dd0 — wired module_path through at the call site as suggested.
The existing per-file warning in `_zip_files` (`FILE_SIZE_WARNING`) only triggers when an individual file exceeds 10 MiB. A working_dir made up of many small files — most commonly a `.git` directory — can grow past `GCS_STORAGE_MAX_SIZE` without ever tripping that warning, and the user only finds out when the upload itself fails with a hard `ValueError`. This change emits an actionable warning from `create_package` whenever the resulting zip is at least half of `GCS_STORAGE_MAX_SIZE`, including the local package path so the user can inspect its contents. The check is a single `stat()` call (no directory walk) and runs on both the `ray.init(runtime_env=...)` path and the job-submit / dashboard SDK path, since both call `create_package`. Closes ray-project#45602 Signed-off-by: lonexreb <reach2shubhankar@gmail.com>
Address gemini-code-assist review feedback on ray-project#63404: The warning previously named only the local zip path, which has an auto-generated `<time_ns>_<pid>_<name>.zip` form and is unlinked seconds later in `upload_package_if_needed`. That makes it hard to identify which `runtime_env` input is responsible for the bloat. Thread the user-supplied `module_path` through to the warning and include it in the message when available, so users can act on the warning even after the temp zip is gone. Extend the regression test to assert the source path is surfaced. Signed-off-by: lonexreb <reach2shubhankar@gmail.com>
64e1dd0 to
89e4446
Compare
| # If the resulting zipped package is at least this large, emit a warning before | ||
| # attempting upload. This catches cases the per-file `FILE_SIZE_WARNING` misses, | ||
| # notably large directories of many small files (e.g. `.git`). See GH #45602. | ||
| PACKAGE_SIZE_WARNING = GCS_STORAGE_MAX_SIZE // 2 |
There was a problem hiding this comment.
Let's make this configurable via env var, and if users set it to -1 disable the warning. Also mention how to disable it in the warning that's logged.
Suggest RAY_PACKAGE_SIZE_WARNING_MIB
There was a problem hiding this comment.
Done in 6f2221a. Implemented RAY_PACKAGE_SIZE_WARNING_MIB (MiB units, read at call time so users can flip it without restarting). Setting it to -1 disables the warning. The warning message now ends with an inline hint pointing at the env var so the disable mechanism is discoverable from the log line itself. A malformed value falls back to the default threshold rather than silently suppressing — wanted to avoid a typo accidentally disabling the warning. Added 4 tests for: disable, high-threshold suppress, low-threshold warn-with-disable-hint, and malformed-value fallback. All 7 tests pass; lint clean.
Address @edoakes review on ray-project#63404: the half-of-GCS-limit threshold is a reasonable default but inevitably wrong for some users. Add a runtime override via `RAY_PACKAGE_SIZE_WARNING_MIB` (in MiB); setting it to `-1` disables the warning entirely. The warning message now ends with an inline hint pointing at the env var so users discover both knobs without having to read source. The env var is read at call time, so users can flip it from inside the driver without restarting. A malformed value falls back to the default threshold rather than silently suppressing — a noisy default is safer than a silent miss for an upload-size limit. Four new tests cover: disable (`-1`), high-threshold suppress, low-threshold warn-with-disable-hint, and malformed-value fallback. Signed-off-by: lonexreb <reach2shubhankar@gmail.com>
…y-project#63404) ## Description Adds an actionable warning when a `runtime_env` package is approaching `GCS_STORAGE_MAX_SIZE`, so users can react before the upload itself fails with a hard `ValueError`. Today, `_zip_files` already warns about *individual* files over 10 MiB (`FILE_SIZE_WARNING`). It does **not** warn about a working_dir built from many small files — most commonly a `.git` directory — so a user can silently zip past `GCS_STORAGE_MAX_SIZE` and only find out when the upload fails. The error message they hit (`Package size (… MiB) exceeds the maximum size of …`) does not tell them where the temp zip lives, so they can't inspect what bloated it. This PR closes that gap: - Adds a `PACKAGE_SIZE_WARNING = GCS_STORAGE_MAX_SIZE // 2` threshold next to the existing `FILE_SIZE_WARNING`. - Adds `_warn_if_package_size_near_limit(...)` which calls `package_path.stat().st_size` (single syscall — no directory walk) and emits a warning naming the local zip path and the most common offenders (`.git`, `.venv`, build artifacts). - Calls it from `create_package` *after* `_zip_files`. That single call site covers **both** entry points: `ray.init(runtime_env=...)` (which goes through `upload_package_if_needed` → `create_package`) and `ray job submit` / dashboard SDK (which goes through `_upload_package` → `create_package`). The hard limit behavior in `_store_package_in_gcs` is unchanged — this is purely an earlier, friendlier warning. ## Related issues Closes ray-project#45602 This picks up the work from the now-stale ray-project#45818, which a maintainer (@edoakes) explicitly offered to shepherd if a contributor resumed it. The previous attempt only modified the dashboard SDK path; this PR puts the check in `create_package` so the warning also fires for the in-process `ray.init(runtime_env=...)` upload path. ## Additional information ### Test plan New `TestCreatePackageSizeWarning` class in `test_runtime_env_packaging.py`: - `test_warns_when_zip_exceeds_threshold` — builds a directory of many small incompressible files, lowers `PACKAGE_SIZE_WARNING` to 1 byte via monkeypatch, calls `create_package`, asserts the warning was emitted and includes the local package path. - `test_does_not_warn_when_zip_below_threshold` — same fixture but threshold set to 10 MiB, asserts no warning. - `test_warn_handles_missing_file` — calls `_warn_if_package_size_near_limit` on a path that doesn't exist; the helper must not raise (defensive against concurrent cleanup between zip creation and stat). Each test passes its own `logging.Logger` with a list handler, because the default Ray logger doesn't propagate to `caplog`. All three tests pass locally: ``` TestCreatePackageSizeWarning::test_warns_when_zip_exceeds_threshold PASSED TestCreatePackageSizeWarning::test_does_not_warn_when_zip_below_threshold PASSED TestCreatePackageSizeWarning::test_warn_handles_missing_file PASSED ``` `ruff check` and `ruff format --check` are clean on both modified files (the only remaining `ruff format` diff in the repo is on pre-existing code I didn't touch). --------- Signed-off-by: lonexreb <reach2shubhankar@gmail.com> Signed-off-by: phattruong <23120318@student.hcmus.edu.vn>
Description
Adds an actionable warning when a
runtime_envpackage is approachingGCS_STORAGE_MAX_SIZE, so users can react before the upload itself fails with a hardValueError.Today,
_zip_filesalready warns about individual files over 10 MiB (FILE_SIZE_WARNING). It does not warn about a working_dir built from many small files — most commonly a.gitdirectory — so a user can silently zip pastGCS_STORAGE_MAX_SIZEand only find out when the upload fails. The error message they hit (Package size (… MiB) exceeds the maximum size of …) does not tell them where the temp zip lives, so they can't inspect what bloated it.This PR closes that gap:
PACKAGE_SIZE_WARNING = GCS_STORAGE_MAX_SIZE // 2threshold next to the existingFILE_SIZE_WARNING._warn_if_package_size_near_limit(...)which callspackage_path.stat().st_size(single syscall — no directory walk) and emits a warning naming the local zip path and the most common offenders (.git,.venv, build artifacts).create_packageafter_zip_files. That single call site covers both entry points:ray.init(runtime_env=...)(which goes throughupload_package_if_needed→create_package) andray job submit/ dashboard SDK (which goes through_upload_package→create_package).The hard limit behavior in
_store_package_in_gcsis unchanged — this is purely an earlier, friendlier warning.Related issues
Closes #45602
This picks up the work from the now-stale #45818, which a maintainer (@edoakes) explicitly offered to shepherd if a contributor resumed it. The previous attempt only modified the dashboard SDK path; this PR puts the check in
create_packageso the warning also fires for the in-processray.init(runtime_env=...)upload path.Additional information
Test plan
New
TestCreatePackageSizeWarningclass intest_runtime_env_packaging.py:test_warns_when_zip_exceeds_threshold— builds a directory of many small incompressible files, lowersPACKAGE_SIZE_WARNINGto 1 byte via monkeypatch, callscreate_package, asserts the warning was emitted and includes the local package path.test_does_not_warn_when_zip_below_threshold— same fixture but threshold set to 10 MiB, asserts no warning.test_warn_handles_missing_file— calls_warn_if_package_size_near_limiton a path that doesn't exist; the helper must not raise (defensive against concurrent cleanup between zip creation and stat).Each test passes its own
logging.Loggerwith a list handler, because the default Ray logger doesn't propagate tocaplog.All three tests pass locally:
ruff checkandruff format --checkare clean on both modified files (the only remainingruff formatdiff in the repo is on pre-existing code I didn't touch).