New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move intersect_row_selections
from datafusion to arrow-rs.
#3047
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Ted-Jiang !
@@ -590,6 +590,66 @@ impl ParquetRecordBatchReader { | |||
} | |||
} | |||
|
|||
// Combine two lists of `RowSelection` return the intersection of them |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend we put this function in parquet/src/arrow/arrow_reader/selection.rs
that way it is logically grouped with similar functionality as well as is in the same module as the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alamb I used to put it in selection.rs
refactor avoid pub mod,
but it need pub the selection
mod avoid method not used, it this ok?🤔 or any method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps you could extend
https://github.com/apache/arrow-rs/blob/4f525fe/parquet/src/arrow/arrow_reader/mod.rs#L46
pub use selection::{RowSelection, RowSelector};
to
pub use selection::{intersect_row_selections, RowSelection, RowSelector};
So it is exposed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or perhaps we could rework this method as a member function on RowSelection? i.e. fn intersection(&self, other: &RowSelection) -> RowSelection
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or perhaps we could rework this method as a member function on RowSelection? i.e.
fn intersection(&self, other: &RowSelection) -> RowSelection
?
This is a good idea , but it may call from(selectors: Vec<RowSelector>)
more times🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps you could extend
https://github.com/apache/arrow-rs/blob/4f525fe/parquet/src/arrow/arrow_reader/mod.rs#L46
pub use selection::{RowSelection, RowSelector};to
pub use selection::{intersect_row_selections, RowSelection, RowSelector};So it is exposed?
@alamb 😂 yes, thanks notice its ok only one method in one private mod
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion
7627627
to
8a2b540
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me -- any concerns @tustvold ?
Maybe as a follow on we could make a wrapper around Vec to add such functions
/// a list of RowSelector
pub struct RowSelections {
selections: Vec<RowSelector>
}
impl RowSelections {
fn intersect(other: &Self) -> Self {
...
}
}
🤔
Benchmark runs are scheduled for baseline = 4dd7fea and contender = 9f14683. 9f14683 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Closes #3003.
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?