[Data] Ensure consistent nan_is_null semantics in unique_post_fn of encoder#62623

Merged
goutamvenkat-anyscale merged 1 commit into ray-project:master from ayushk7102:fix_encoder_follow_up
Apr 15, 2026

Conversation

@ayushk7102 ayushk7102 commented Apr 15, 2026

Contributor

Description

This is a follow-up PR to #62618 to ensure consistent behavior in `gen_value_index_arrow_from_arrow` of `encoder.py`. Previously, if `drop_na_values` was True, we dropped nulls but not NaNs. This PR ensures that NaNs are dropped as well.
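To make the behavior change concrete, here is a minimal pure-Python sketch of the two semantics. It is not the code in `encoder.py`: `is_missing` and `build_value_index` are hypothetical helpers, and the `nan_is_null` flag is only an illustration of the semantics named in the PR title.

```python
import math


def is_missing(value, nan_is_null=True):
    """Return True for None, and also for float NaN when nan_is_null is set."""
    if value is None:
        return True
    return nan_is_null and isinstance(value, float) and math.isnan(value)


def build_value_index(values, drop_na_values=True, nan_is_null=True):
    """Map each distinct value to a dense integer index (hypothetical helper)."""
    index = {}
    for v in values:
        if drop_na_values and is_missing(v, nan_is_null):
            continue  # skip values treated as missing
        index.setdefault(v, len(index))
    return index


column = [1.0, None, float("nan"), 2.0, 1.0]

# Behavior described as "previous": only None is dropped, so NaN gets an index.
nulls_only = build_value_index(column, nan_is_null=False)

# Behavior described as "new": None and NaN are both dropped.
nulls_and_nans = build_value_index(column, nan_is_null=True)

print(len(nulls_only))   # 3
print(nulls_and_nans)    # {1.0: 0, 2.0: 1}
```

With nulls-only dropping, NaN ends up as its own category in the index, which is the inconsistency the description refers to.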

…ncoder.py

Signed-off-by: Ayush Kumar <ayushk7102@gmail.com>
@ayushk7102 ayushk7102 requested a review from a team as a code owner April 15, 2026 00:05

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor


Code Review

This pull request updates the null-dropping logic in the gen_value_index_arrow_from_arrow function within the Ray Data encoder preprocessor. The change replaces pc.drop_null with a manual filter that explicitly treats NaNs as nulls when drop_na_values is enabled, ensuring more robust handling of missing data. I have no feedback to provide as there are no review comments to evaluate.

@goutamvenkat-anyscale goutamvenkat-anyscale enabled auto-merge (squash) April 15, 2026 00:14
@github-actions github-actions Bot added the `go` label (add ONLY when ready to merge, run all tests) Apr 15, 2026
@goutamvenkat-anyscale goutamvenkat-anyscale merged commit f7bf8e0 into ray-project:master Apr 15, 2026
6 of 7 checks passed
HLDKNotFound pushed a commit to chichic21039/ray that referenced this pull request Apr 22, 2026
…ncoder (ray-project#62623)

Lucas61000 pushed a commit to Lucas61000/ray that referenced this pull request May 15, 2026
…ncoder (ray-project#62623)

Labels

`go` (add ONLY when ready to merge, run all tests)

2 participants