[feat] Support columns_sorted in row_filters with pageIndex #3477

Closed
wants to merge 1 commit into from


Ted-Jiang
Member

Which issue does this PR close?

Closes #3476.

I think we can set it to true when there is only one column, and that column has a pageIndex that is ordered.
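
To make the idea above concrete, here is a minimal sketch of how such a check might look. It is not the code from this PR: the `BoundaryOrder` enum is a local stand-in modeled on the `boundary_order` field of the Parquet column index, and the signatures of `check_is_ordered` and `columns_sorted` are assumptions chosen for illustration only.

```rust
/// Local stand-in for the `boundary_order` field of a Parquet column index.
/// (Assumption: the real type lives in the parquet / parquet-format crates.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BoundaryOrder {
    Unordered,
    Ascending,
    Descending,
}

/// Hypothetical helper: a column counts as ordered when its page index
/// reports ascending or descending page boundaries.
fn check_is_ordered(order: BoundaryOrder) -> bool {
    order != BoundaryOrder::Unordered
}

/// Hypothetical sketch: report `true` only when the filter touches exactly
/// one column and that column's page index is ordered.
/// `filter_columns` holds column indices; `orders` holds one entry per column.
fn columns_sorted(filter_columns: &[usize], orders: &[BoundaryOrder]) -> bool {
    match filter_columns {
        // Exactly one column: look up its boundary order (false if missing).
        [col] => orders.get(*col).map_or(false, |o| check_is_ordered(*o)),
        // Zero or several columns: conservatively report "not sorted".
        _ => false,
    }
}
```

With this sketch, `columns_sorted(&[0], &[BoundaryOrder::Ascending])` would be true, while a filter over two columns, or over a single column whose page index is `Unordered`, would be false.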

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

Signed-off-by: yangjiang <yangjiang@ebay.com>
@@ -79,6 +79,7 @@ object_store = "0.5.0"
ordered-float = "3.0"
parking_lot = "0.12"
parquet = { version = "22.0.0", features = ["arrow", "async"] }
parquet-format = "4.0.0"
Member Author

Ted-Jiang commented Sep 14, 2022

Should we move this logic to arrow-rs to avoid importing parquet-format?

Contributor

I think that makes sense

Contributor

I think parquet 23.0.0 will have properly exposed arrow definitions (added by @tustvold in apache/arrow-rs#2626)

Though now that I see this, the proposal is probably to move the functions columns_sorted and check_is_ordered into the parquet crate, which makes sense to me (and will likely get reviewed by others with more parquet knowledge).

@Ted-Jiang can you file a ticket / PR to do so?

Member Author

Sure!

github-actions bot added the core (Core datafusion crate) label Sep 14, 2022
Ted-Jiang
Member Author

@alamb @thinkharderdev @tustvold PTAL

Contributor

thinkharderdev left a comment

LGTM

Contributor

alamb commented Sep 16, 2022

Marking this PR as a draft pending resolution of @liukun4515's review: #3477 (comment)

Contributor

alamb commented Nov 28, 2023

Closing as this PR is over a year old. Please feel free to reopen or rebase it if you plan to keep working on it.

alamb closed this Nov 28, 2023
Labels
core Core datafusion crate
Development

Successfully merging this pull request may close these issues.

Support columns_sorted in row_filters
4 participants