[Data] Add file partitioning for DataSourceV2 [3/n]#61997
Signed-off-by: Goutam <goutam@anyscale.com>
Code Review
The pull request introduces the FilePartitioner abstract base class and its concrete implementation, RoundRobinPartitioner, to handle file partitioning for DataSourceV2. The RoundRobinPartitioner effectively groups files into manifests based on estimated in-memory sizes, ensuring balanced read tasks. The overall design is clear and addresses the stated goal of adding file partitioning abstractions.
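The shape of such an abstraction can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `FileEntry`, the `partition` signature, and the choice to sort by estimated size before dealing files out round-robin are all assumptions made for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical record: a file path plus its estimated in-memory size."""

    path: str
    estimated_bytes: int


class FilePartitioner(ABC):
    """Hypothetical interface: split files into groups, one per read task."""

    @abstractmethod
    def partition(
        self, files: Sequence[FileEntry], num_partitions: int
    ) -> List[List[FileEntry]]:
        ...


class RoundRobinPartitioner(FilePartitioner):
    """Deal files out to partitions in turn (assumed strategy, for illustration)."""

    def partition(
        self, files: Sequence[FileEntry], num_partitions: int
    ) -> List[List[FileEntry]]:
        if num_partitions <= 0:
            raise ValueError("num_partitions must be positive")
        buckets: List[List[FileEntry]] = [[] for _ in range(num_partitions)]
        # Assumption: visit the largest files first so that each round of
        # dealing hands out files of similar size.
        ordered = sorted(files, key=lambda f: f.estimated_bytes, reverse=True)
        for index, entry in enumerate(ordered):
            buckets[index % num_partitions].append(entry)
        return buckets
```

Plain round-robin keeps the file count per partition within one of each other, but it does not guarantee equal byte totals; a size-aware greedy assignment would be a different technique.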
    InMemorySizeEstimator,
)

logger = logging.getLogger(__name__)
Unused logger variable defined but never referenced
Low Severity
The logging import and logger variable on line 13 are unused — no logger.debug(...), logger.warning(...), or any other call appears anywhere in this file. Other files in this module (e.g. file_indexer.py) define logger and actually use it, so this looks like copy-paste scaffolding that was never wired up.
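A finding like this can be checked mechanically. The sketch below uses the standard-library `ast` module to test whether a name is ever read in a module's source; the helper name `is_name_read` is made up for this example, and it is not part of the PR or of any review tool.

```python
import ast


def is_name_read(source: str, name: str) -> bool:
    """Return True if `name` is loaded (read) anywhere in `source`.

    Assignments such as `logger = ...` use a Store context and do not count.
    """
    tree = ast.parse(source)
    return any(
        isinstance(node, ast.Name)
        and node.id == name
        and isinstance(node.ctx, ast.Load)
        for node in ast.walk(tree)
    )


# A module that defines `logger` but never calls it.
unused = "import logging\nlogger = logging.getLogger(__name__)\n"

# The same module with one call wired up.
used = unused + "logger.debug('partitioning files')\n"
```

Here `is_name_read(unused, "logger")` is false and `is_name_read(used, "logger")` is true, because only the second source reads the variable.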
## Description
Add the requisite abstractions for file partitioning, in particular the RoundRobinPartitioner.

