[Data] Fix Timer JSON serialization broken by DistributionTracker by xinyuangui2 · Pull Request #63825 · ray-project/ray

xinyuangui2 · 2026-06-03T17:05:33Z

Why

#63586 added a DistributionTracker to Timer for percentile tracking. DistributionTracker is not JSON-serializable, which broke the training-ingest-benchmark release test (image_classification.full_training, s3_url + parquet):

TypeError: Object of type DistributionTracker is not JSON serializable

The benchmark checkpoints each Timer via v.__dict__.copy() + json.dump, and __dict__ now carries the tracker. The tracker's __getstate__/__setstate__ only cover pickle, not json, so they don't help here.

Fix

Add Timer.as_dict() / Timer.from_dict() that round-trip only the scalar fields (_total, _min, _max, _total_count). _distribution is excluded — the KLL sketch isn't reconstructable from summary stats and isn't meant to persist across checkpoints, so it restarts empty on restore (same graceful degradation as when datasketches isn't installed).
Switch the benchmark runner to use these instead of poking __dict__.

🤖 Generated with Claude Code

…ngest benchmark PR ray-project#63586 added a DistributionTracker to Timer for percentile tracking. DistributionTracker is not JSON-serializable, which broke the training-ingest benchmark: its checkpointing code dumps each Timer via `v.__dict__.copy()` + `json.dump`, raising `TypeError: Object of type DistributionTracker is not JSON serializable`. (The tracker's __getstate__/__setstate__ only cover pickle, not json.) Add `Timer.as_dict()` / `from_dict()` that round-trip only the scalar fields (the sketch is not persisted; it restarts empty on restore, same graceful degradation as when datasketches is unavailable), and switch the benchmark runner to use them instead of poking `__dict__`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: xgui <xgui@anyscale.com>

gemini-code-assist

Code Review

This pull request introduces as_dict and from_dict methods to safely serialize and deserialize scalar stats, resolving JSON serialization issues caused by the non-serializable DistributionTracker in the benchmark runner. Direct __dict__ access has been replaced with these new methods, and comprehensive unit tests have been added. The review feedback recommends improving the robustness of from_dict by adding defensive checks to handle cases where the input state is None, not a dictionary, or contains None values for _total and _total_count.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Ignore a non-dict ``state`` and treat explicit ``None`` values as the empty-Timer defaults (``dict.get(k, 0)`` only defaults a missing key, not a present ``None``), so a partial/corrupt metrics.json can't crash or leave a Timer in an invalid state on restore. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: xgui <xgui@anyscale.com>

…y-project#63825) ## Why ray-project#63586 added a `DistributionTracker` to `Timer` for percentile tracking. `DistributionTracker` is not JSON-serializable, which broke the `training-ingest-benchmark` release test (`image_classification.full_training`, s3_url + parquet): ``` TypeError: Object of type DistributionTracker is not JSON serializable ``` The benchmark checkpoints each `Timer` via `v.__dict__.copy()` + `json.dump`, and `__dict__` now carries the tracker. The tracker's `__getstate__`/`__setstate__` only cover **pickle**, not `json`, so they don't help here. ## Fix - Add `Timer.as_dict()` / `Timer.from_dict()` that round-trip only the scalar fields (`_total`, `_min`, `_max`, `_total_count`). `_distribution` is excluded — the KLL sketch isn't reconstructable from summary stats and isn't meant to persist across checkpoints, so it restarts empty on restore (same graceful degradation as when `datasketches` isn't installed). - Switch the benchmark runner to use these instead of poking `__dict__`. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Signed-off-by: xgui <xgui@anyscale.com> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

xinyuangui2 requested a review from a team as a code owner June 3, 2026 17:05

gemini-code-assist Bot reviewed Jun 3, 2026

View reviewed changes

Comment thread python/ray/data/_internal/stats.py Outdated

liulehui approved these changes Jun 3, 2026

View reviewed changes

xinyuangui2 added the go add ONLY when ready to merge, run all tests label Jun 3, 2026

ray-gardener Bot added data Ray Data-related issues release-test release test labels Jun 3, 2026

elliot-barn merged commit 691df86 into ray-project:master Jun 3, 2026
10 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Data] Fix Timer JSON serialization broken by DistributionTracker#63825

[Data] Fix Timer JSON serialization broken by DistributionTracker#63825
elliot-barn merged 2 commits into
ray-project:masterfrom
xinyuangui2:xgui/fix-timer-json-serialization

xinyuangui2 commented Jun 3, 2026

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Labels

3 participants

Uh oh!

Conversation

xinyuangui2 commented Jun 3, 2026

Why

Fix

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Labels

3 participants