Add logical range partitioning representation #22777
Thank you for opening this pull request! Reviewer note: cargo-semver-checks reported that the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch).
cc: @gabotechs @stuhood
    return not_impl_err!(
        "Physical plan does not support Range repartitioning"
    );
This is a TODO, right? Should it point at a ticket?
Yes, good catch. I can add the epic: #22395
I created a small issue for each `not_impl_err!`. The descriptions are not great, but they are enough for tracking for now.
7f0949d to d3907a8
Just left some non-blocking comments. Pretty straightforward PR! Thanks @gene-bordegaray and @stuhood.
Sorry for not responding while travelling, but I have addressed the comments. Thanks @gabotechs 😄
I know 👍 It's completely fine, you responded with code so that's good enough!
## Which issue does this PR close?

- Closes apache#22778.
- Related: apache#21992, apache#22395.
- Needed by apache#22657.

## Rationale for this change

Declared scan output partitioning should use logical partitioning metadata, not physical partitioning types. This adds logical range partitioning so range-partitioned sources can declare their layout at the logical layer.

## What changes are included in this PR?

- Add logical `Partitioning::Range` and `RangePartitioning`.
- Move `SplitPoint` and shared split-point validation to `datafusion-common`.
- Wire logical range partitioning through expression traversal, rewrites, and display.
- Keep planning, logical proto, and Substrait support explicitly unsupported for now.

## Are these changes tested?

Yes. Unit tests added.

## Are there any user-facing changes?

Yes. This adds a public logical range partitioning API. No breaking API changes.
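For readers unfamiliar with range partitioning, here is a minimal, std-only sketch of the idea. It is not the code from this PR: the type names mirror the ones listed above, but the fields, the `i64` key, the `try_new`, `partition_count`, `partition_for`, and `plan_physical` helpers, and the rule that split points must be strictly increasing are all assumptions made for illustration. The real `SplitPoint` and its validation in `datafusion-common` may look quite different.

```rust
/// Hypothetical stand-in for a split point. Assumed to be a single i64 key;
/// the real `SplitPoint` in `datafusion-common` may hold other value types.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct SplitPoint(i64);

/// Illustrative logical range partitioning: N split points describe N + 1 partitions.
#[derive(Debug, Clone, PartialEq)]
struct RangePartitioning {
    split_points: Vec<SplitPoint>,
}

impl RangePartitioning {
    /// Assumed validation rule: split points must be strictly increasing.
    fn try_new(split_points: Vec<SplitPoint>) -> Result<Self, String> {
        for pair in split_points.windows(2) {
            if pair[0] >= pair[1] {
                return Err(format!(
                    "split points must be strictly increasing, got {:?} before {:?}",
                    pair[0], pair[1]
                ));
            }
        }
        Ok(Self { split_points })
    }

    /// Number of partitions described by the split points.
    fn partition_count(&self) -> usize {
        self.split_points.len() + 1
    }

    /// Index of the partition that would hold `key`. Assumes each split point
    /// is the inclusive lower bound of the partition to its right.
    fn partition_for(&self, key: i64) -> usize {
        self.split_points.partition_point(|sp| sp.0 <= key)
    }
}

/// Illustrative logical partitioning enum with a `Range` variant.
#[derive(Debug, Clone, PartialEq)]
enum Partitioning {
    RoundRobin(usize),
    Range(RangePartitioning),
}

/// Hypothetical planner hook: range partitioning can be declared at the
/// logical layer, but physical planning rejects it for now (tracked in #22395).
fn plan_physical(partitioning: &Partitioning) -> Result<usize, String> {
    match partitioning {
        Partitioning::RoundRobin(n) => Ok(*n),
        Partitioning::Range(_) => {
            Err("Physical plan does not support Range repartitioning".to_string())
        }
    }
}
```

With split points `[10, 20]`, this sketch describes three partitions: keys below 10, keys from 10 up to but not including 20, and keys of 20 or above. Whether the boundaries are inclusive or exclusive in the actual implementation is not stated in this PR.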