Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Builtin Window Function Implementation #4441

Merged
merged 1 commit into from Dec 1, 2022

Conversation

mustafasrepo
Copy link
Contributor

Which issue does this PR close?

Closes #4440.

Rationale for this change

Handling of the Builtin Window Functions (such as ROW_NUMBER, FIRST_VALUE, etc.) are quite different than accumulators (such as SUM, AVG, etc.) even though they have significant commonalities. We can increase readability if we make their implementation similar in spirit; and we can increase code reuse in this way too. This PR changes the PartitionEvaluator trait such that it resembles the Accumulator trait.

What changes are included in this PR?

The create_evaluator method of BuiltInWindowFunctionExpr no longer takes a record batch during creation. Instead; when we are doing evaluation, we pass the relevant batch section to the evaluate method of the PartitionEvaluator trait.

Are these changes tested?

N.A (Existing tests should work after refactor)

Are there any user-facing changes?

@github-actions github-actions bot added the physical-expr Physical Expressions label Nov 30, 2022
@mustafasrepo mustafasrepo changed the title Feature/partition evaluator Refactor Builtin Window Function Implementation Nov 30, 2022
@ozankabak
Copy link
Contributor

FYI, we did an internal review of this -- please let us know if there are any concerns and we will be happy to address them if necessary.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good change to me -- thank you @mustafasrepo and @metesynnada


/// Get values columns(argument of Window Function)
/// and order by columns (columns of the ORDER BY expression)used in evaluators
fn get_values_orderbys(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 this is a nice refactor of common functionality

&self,
_batch: &RecordBatch,
) -> Result<Box<dyn PartitionEvaluator>> {
fn create_evaluator(&self) -> Result<Box<dyn PartitionEvaluator>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@alamb alamb merged commit 09aea09 into apache:master Dec 1, 2022
@ursabot
Copy link

ursabot commented Dec 1, 2022

Benchmark runs are scheduled for baseline = a0485e7 and contender = 09aea09. 09aea09 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@mustafasrepo mustafasrepo deleted the feature/partition_evaluator branch December 6, 2022 05:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
physical-expr Physical Expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor Built-in, Aggregate window functions to increase code reuse.
4 participants