[Docs] Note that /home/ray is not a personal-path leak #63646

Merged
elliot-barn merged 1 commit into ray-project:master from dstrodtman:doc-1054-home-ray-anon-rule
May 26, 2026

Conversation

@dstrodtman

Description

Adds a documentation-authoring rule to doc/.claude/CLAUDE.md: /home/ray is Ray's runtime home directory in containers and clusters, not a personal-path leak. Notebook-output anonymization passes must not rewrite it. Anonymize only real user identifiers (/Users/<name>, /home/<person>) and experiment or output dirs that encode a person or the deprecated AIR runtime.
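
The rule above could be expressed as a pair of regex substitutions. This is a minimal sketch under stated assumptions, not the script used in #63464: the helper name `anonymize_path_text`, the segment character class, and the rename table are all hypothetical and chosen for illustration.

```python
import re

# One path-segment character: anything except "/", whitespace, quotes, or angle brackets.
# (Assumed delimiter set; adjust for the text being processed.)
_SEG = r"""[^/\s"'<>]"""

# Personal home prefixes: /Users/<name> or /home/<person>.
# The nested lookahead skips exactly "/home/ray" (Ray's runtime home) while
# still matching names that merely start with "ray", such as /home/raymond.
_PERSONAL_HOME = re.compile(rf"/(?:Users/{_SEG}+|home/(?!ray(?!{_SEG})){_SEG}+)")

# Hypothetical rename table for experiment dirs that encode a person or AIR.
_EXPERIMENT_DIR_RENAMES = {"christy-air": "ray_results"}


def anonymize_path_text(text: str) -> str:
    """Rewrite personal paths in text; leave /home/ray untouched."""
    text = _PERSONAL_HOME.sub("~", text)
    for old, new in _EXPERIMENT_DIR_RENAMES.items():
        # Replace only a whole path segment, never a substring of a longer name.
        text = re.sub(rf"(?<=/){re.escape(old)}(?!{_SEG})", new, text)
    return text
```

With this sketch, `/Users/rdecal/ray_results` becomes `~/ray_results`, while `/home/ray/ray_results` passes through unchanged.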

Related issues

Surfaced during review of #63464, where an anonymization pass over Tune example notebook outputs incorrectly rewrote /home/ray to ~. [DOC-1054]

Long-term recurrence prevention (stripping or anonymizing notebook outputs in the test/refresh pipeline) is tracked in DOC-907.

Additional information

The companion content fix — restoring the wrongly-anonymized /home/ray paths and anonymizing the leaked christy-air experiment dir to ray_results — is in #63464.

🤖 Generated with Claude Code

Adds a rule to doc/.claude/CLAUDE.md so notebook-output anonymization
passes don't treat /home/ray (Ray's runtime home in containers and
clusters) as a personal path. Anonymize only real user identifiers and
experiment dirs that encode a person or the deprecated AIR runtime.

Surfaced during review of ray-project#63464.

[DOC-1054]

Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
@dstrodtman dstrodtman requested a review from a team as a code owner May 26, 2026 20:24

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request adds a new section to doc/.claude/CLAUDE.md outlining guidelines for anonymizing paths in notebook output cells. The reviewer suggested condensing this new section to keep the file under its 50-line limit.

Comment thread doc/.claude/CLAUDE.md
Comment on lines +49 to +51
## Anonymizing paths in notebook output cells

`/home/ray` is Ray's runtime home directory in containers and clusters, not a personal-path leak — never anonymize it. Anonymize only real user identifiers: personal home prefixes (`/Users/<name>`, `/home/<person>`) become `~`, and experiment or output dirs that encode a person or the deprecated AIR runtime (e.g. `christy-air`) become a neutral equivalent such as `ray_results`. Precedent: #63464.

medium

The file doc/.claude/CLAUDE.md now exceeds the 50-line limit specified in the header instruction (<!-- Keep under 50 lines. Multi-step procedures → skills. Code style → rules/. -->). Consider condensing this new section to help keep the file size closer to the limit.

Suggested change
- ## Anonymizing paths in notebook output cells
- `/home/ray` is Ray's runtime home directory in containers and clusters, not a personal-path leak — never anonymize it. Anonymize only real user identifiers: personal home prefixes (`/Users/<name>`, `/home/<person>`) become `~`, and experiment or output dirs that encode a person or the deprecated AIR runtime (e.g. `christy-air`) become a neutral equivalent such as `ray_results`. Precedent: #63464.
+ ## Anonymizing paths in notebook output cells
+ Never anonymize /home/ray (Ray's container/cluster runtime home). Only anonymize real user identifiers: personal home prefixes (/Users/<name>, /home/<person>) to ~, and personal/deprecated AIR experiment dirs (e.g., christy-air) to ray_results. Precedent: #63464.
@elliot-barn elliot-barn merged commit 11d9be4 into ray-project:master May 26, 2026
3 of 5 checks passed
dstrodtman added a commit to dstrodtman/ray that referenced this pull request May 27, 2026
The anonymized Tune log path in lightgbm_example.ipynb output used
/Users/user/ray_results. Switch to ~/ray_results to match the
convention in ray-project#63464 and codified in ray-project#63646 (personal home prefixes
become ~).

[DOC-991]

Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
matthewdeng pushed a commit that referenced this pull request Jun 2, 2026

## Update (post-review)

Three changes since the original commit, in response to review feedback:

1. **`/home/ray` restored (not a leak).** `/home/ray` is Ray's runtime
home directory in containers and clusters, not a personal path. The
first commit wrongly anonymized it to `~`; reverted across
`batch_tuning`, `pbt_guide`, and `tune-pytorch-lightning` (26 paths).
2. **`christy-air` → `ray_results`.** That experiment dir in
`batch_tuning` encodes a person's name plus the deprecated AIR runtime
tag, so it stays anonymized — to `ray_results` (Tune's default storage
dir), across 18 checkpoint paths.
3. **`pbt_visualization.ipynb` folded in.** Adjacent file with 90
`/Users/rdecal/ray_results` leaks in output cells, anonymized to
`~/ray_results`. Brings the total to 10 notebooks.

The `/Users/<name>` leaks (kai, rdecal) remain anonymized to `~`. A
companion agent rule capturing the `/home/ray` guidance is in #63646.

The original description below predates this update; its "substitute `~`
for `/home/ray`" method note no longer applies.

---

## Description

Cleans up personal-path leaks (`/Users/<name>/...`, `/home/ray/...`) in
**output cells** of nine Tune example notebooks under
`doc/source/tune/examples/`. 127 leaks removed across 9 files; cell
sources untouched.

Surfaced by the
[DOC-991](https://anyscale1.atlassian.net/browse/DOC-991)
(#36167) resolving agent — flagged as adjacent rot during
the `pbt_transformers.ipynb` / `lightgbm_example.ipynb` structural fix.

## Related issues

[DOC-1054]

## Additional information

Method: a one-shot Python script anonymized the leaks (substitute `~`
for `/Users/<name>` and `/home/ray` in output-cell text and HTML,
preserving per-file JSON indentation 1/2/4-space). Diff is 126±/126±
lines across the 9 files, proportional to the original leak count.
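
A pass of this kind might look like the following sketch. It is not the script from this PR: the function names are hypothetical, the indentation detection is a heuristic that reads the first indented key of the file, and the trailing newline is an assumption. It relies only on the standard notebook JSON layout (code cells carry an `outputs` list whose entries hold `text`, `traceback`, or a `data` mapping keyed by MIME type).

```python
import json
import re
from typing import Any, Callable


def map_strings(value: Any, fn: Callable[[str], str]) -> Any:
    """Apply fn to every string inside nested lists and dicts."""
    if isinstance(value, str):
        return fn(value)
    if isinstance(value, list):
        return [map_strings(v, fn) for v in value]
    if isinstance(value, dict):
        return {k: map_strings(v, fn) for k, v in value.items()}
    return value


def detect_indent(raw: str) -> int:
    """Guess the file's JSON indent from its first indented key (heuristic)."""
    match = re.match(r'\A\{\r?\n( +)"', raw)
    return len(match.group(1)) if match else 1


def rewrite_notebook_outputs(raw: str, fn: Callable[[str], str]) -> str:
    """Run fn over output-cell text and HTML only; cell sources stay untouched."""
    nb = json.loads(raw)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            for key in ("text", "traceback"):  # stream and error outputs
                if key in output:
                    output[key] = map_strings(output[key], fn)
            data = output.get("data", {})
            for mime in data:
                # Skip binary payloads such as image/png (base64 strings).
                if mime.startswith("text/"):
                    data[mime] = map_strings(data[mime], fn)
    return json.dumps(nb, indent=detect_indent(raw), ensure_ascii=False) + "\n"
```

Passing the rewrite function as a parameter keeps the traversal separate from the anonymization rule, so the rule can change (for example, to stop rewriting `/home/ray`) without touching the notebook handling.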

The 9 affected notebooks:
- `ax_example.ipynb` (orthogonal to DOC-1019 ax-platform 1.0.0 API
change)
- `bayesopt_example.ipynb` (orthogonal to DOC-77 numpy.float
deprecation)
- `bohb_example.ipynb`
- `nevergrad_example.ipynb`
- `tune-xgboost.ipynb`
- `batch_tuning.ipynb`
- `pbt_guide.ipynb`
- `tune-pytorch-lightning.ipynb`
- `tune_mnist_keras.ipynb`

Long-term: leaks will recur until the notebook test/refresh pipeline
strips outputs or anonymizes paths before commit. Out of scope for this
PR — see DOC-907 for the broader notebook-test-coverage work.
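
The "strip outputs" option could be sketched as follows. This is an illustration only, not the pipeline change tracked in DOC-907: `strip_outputs` is a hypothetical helper, and the one-space indent and trailing newline are assumptions about the desired file layout.

```python
import json


def strip_outputs(raw: str) -> str:
    """Return notebook JSON with code-cell outputs and execution counts cleared."""
    nb = json.loads(raw)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    # Assumed layout: one-space indent, non-ASCII kept as-is, trailing newline.
    return json.dumps(nb, indent=1, ensure_ascii=False) + "\n"
```

Stripping removes every path from the outputs at once, at the cost of losing the rendered results that the published examples display; anonymizing keeps the outputs but depends on the rule set being complete.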

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rueian pushed a commit to rueian/ray that referenced this pull request Jun 4, 2026
limarkdcunha pushed a commit to limarkdcunha/ray that referenced this pull request Jun 30, 2026
limarkdcunha pushed a commit to limarkdcunha/ray that referenced this pull request Jun 30, 2026