
[Data] Fix wide_schema_pipeline_tensors cloudpickle deserialization #62149

Merged
goutamvenkat-anyscale merged 4 commits into ray-project:master from goutamvenkat-anyscale:fix-wide-schema-pipeline-tensors-deserialization on Mar 30, 2026


@goutamvenkat-anyscale

Contributor

Summary

  • The wide_schema_pipeline_tensors release test fails because the S3 test data was written by Ray 2.49-2.54, which serialized tensor shape metadata with cloudpickle. Current Ray expects JSON and raises ValueError.
  • Sets RAY_DATA_AUTOLOAD_CLOUDPICKLE_TENSOR_METADATA=1 in the test script to enable the cloudpickle fallback for this trusted internal dataset.
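The JSON-first behavior with an opt-in fallback described above can be sketched as follows. This is a minimal illustration, not Ray's implementation: `decode_shape_metadata` is a hypothetical helper, the standard-library `pickle` stands in for cloudpickle, and the check for the exact value `"1"` is an assumption about how the flag is read.

```python
import json
import os
import pickle  # stand-in for cloudpickle in this sketch

ENV_VAR = "RAY_DATA_AUTOLOAD_CLOUDPICKLE_TENSOR_METADATA"


def decode_shape_metadata(raw: bytes) -> tuple:
    """Hypothetical decoder: try JSON first, unpickle only when opted in."""
    try:
        return tuple(json.loads(raw.decode("utf-8")))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        # Unpickling can execute arbitrary code, so the fallback is gated
        # behind an explicit opt-in meant for trusted data only.
        if os.environ.get(ENV_VAR) != "1":
            raise ValueError(
                f"tensor metadata is not JSON; set {ENV_VAR}=1 to allow "
                "the pickle fallback for trusted data"
            ) from exc
        return tuple(pickle.loads(raw))
```

With the variable unset, legacy pickled bytes raise `ValueError`; with it set to `1`, the same bytes decode. The opt-in matters because unpickling untrusted bytes is unsafe, which is why the summary stresses that the dataset is trusted and internal.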

Test plan

  • Re-run the wide_schema_pipeline_tensors release test and verify it passes
  • Confirm other wide_schema_pipeline_* variants (primitives, objects, nested_structs) are unaffected

🤖 Generated with Claude Code

…tensor metadata deserialization

The test data in S3 was written by Ray 2.49-2.54 which used cloudpickle
for tensor shape metadata. Current Ray expects JSON and fails. Setting
RAY_DATA_AUTOLOAD_CLOUDPICKLE_TENSOR_METADATA=1 enables the fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Goutam <goutam@anyscale.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor


Code Review

This pull request updates the release_data_tests.yaml file to include the RAY_DATA_AUTOLOAD_CLOUDPICKLE_TENSOR_METADATA=1 environment variable when running the wide_schema_pipeline_benchmark.py script. I have no feedback to provide.

@goutamvenkat-anyscale goutamvenkat-anyscale added the "data" label (Ray Data-related issues) on Mar 28, 2026
goutamvenkat-anyscale and others added 2 commits March 27, 2026 19:02
The previous fix only set the env var on the driver. The deserialization
also happens in worker tasks (_fetch_parquet_file_info), so propagate
via ray.init(runtime_env=...) for the tensors data type.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Goutam <goutam@anyscale.com>
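
The driver-versus-worker distinction in the commit message above can be sketched like this. It is a rough illustration under stated assumptions: `runtime_env_for` and `init_ray` are hypothetical helper names, and the `data_type` argument mirrors the test variants named in the PR (tensors, primitives, objects, nested_structs). The `runtime_env={"env_vars": ...}` argument to `ray.init` is the documented way to set environment variables for worker processes.

```python
from typing import Optional

ENV_VAR = "RAY_DATA_AUTOLOAD_CLOUDPICKLE_TENSOR_METADATA"


def runtime_env_for(data_type: str) -> Optional[dict]:
    """Hypothetical helper: only the tensors variant needs the fallback flag."""
    if data_type == "tensors":
        # env_vars in a runtime_env are applied to worker processes,
        # not just to the driver that calls ray.init().
        return {"env_vars": {ENV_VAR: "1"}}
    return None


def init_ray(data_type: str) -> None:
    # Imported lazily so runtime_env_for() is usable without Ray installed.
    import ray

    ray.init(runtime_env=runtime_env_for(data_type))
```

Setting the variable with `os.environ` in the driver script alone would not reach tasks scheduled on workers, which is the gap this commit addresses.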
Setting the env var only on the driver script doesn't reach worker
tasks. Use cluster.byod.runtime_env which is the established pattern
in release tests for setting env vars across all nodes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Goutam <goutam@anyscale.com>

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Comment thread release/release_data_tests.yaml
@goutamvenkat-anyscale goutamvenkat-anyscale added the "go" label (add ONLY when ready to merge, run all tests) on Mar 30, 2026
…de_schema_pipeline tests

Signed-off-by: Goutam <goutam@anyscale.com>
@goutamvenkat-anyscale goutamvenkat-anyscale enabled auto-merge (squash) March 30, 2026 16:54
@goutamvenkat-anyscale goutamvenkat-anyscale merged commit 8b07fc4 into ray-project:master Mar 30, 2026
7 checks passed
mancfactor pushed a commit to mancfactor/ray that referenced this pull request Apr 2, 2026
…ay-project#62149)

## Summary
- The `wide_schema_pipeline_tensors` release test fails because the S3
test data was written by Ray 2.49-2.54, which serialized tensor shape
metadata with cloudpickle. Current Ray expects JSON and raises
`ValueError`.
- Sets `RAY_DATA_AUTOLOAD_CLOUDPICKLE_TENSOR_METADATA=1` in the test
script to enable the cloudpickle fallback for this trusted internal
dataset.

## Test plan
- [ ] Re-run the `wide_schema_pipeline_tensors` release test and verify
it passes
- [ ] Confirm other `wide_schema_pipeline_*` variants (primitives,
objects, nested_structs) are unaffected

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Signed-off-by: Goutam <goutam@anyscale.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Frank Mancina <fmancina@haproxy.com>
Lucas61000 pushed a commit to Lucas61000/ray that referenced this pull request May 15, 2026
…ay-project#62149)

2 participants