Fix IsNull pruning expression generation without null_count statistics #3044

Merged
merged 1 commit into from Aug 8, 2022

@alamb alamb commented Aug 5, 2022

(this is a one line code change PR, the rest is tests)

Which issue does this PR close?

Closes #3042

Rationale for this change

See #3042

What changes are included in this PR?

  1. Set the nullability annotation correctly
  2. Add test coverage

Are there any user-facing changes?

@@ -639,7 +639,7 @@ fn build_is_null_column_expr(
         Expr::Column(ref col) => {
             let field = schema.field_with_name(&col.name).ok()?;

-            let null_count_field = &Field::new(field.name(), DataType::UInt64, false);
+            let null_count_field = &Field::new(field.name(), DataType::UInt64, true);

This is the fix
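
The third argument to `Field::new` is the nullable flag, so the change declares that the null-count statistics column may itself contain nulls (for example, when a statistic is missing). The exact failure path inside DataFusion is not shown in this thread; the sketch below only illustrates what the flag declares, using a simplified stand-in type (`FieldModel`, `validate`, and the column name `i_null_count` are hypothetical, not Arrow or DataFusion APIs).

```rust
/// Hypothetical, simplified model of a schema field; this is NOT Arrow's `Field`.
struct FieldModel {
    name: String,
    nullable: bool,
}

/// Check a statistics column against its declared nullability.
/// `None` stands for a missing (null) statistic.
fn validate(field: &FieldModel, values: &[Option<u64>]) -> Result<(), String> {
    if !field.nullable && values.iter().any(|v| v.is_none()) {
        return Err(format!(
            "column '{}' is declared non-nullable but contains nulls",
            field.name
        ));
    }
    Ok(())
}

fn main() {
    // Assumed example data: the middle container has no null-count statistic.
    let stats: [Option<u64>; 3] = [Some(0), None, Some(3)];

    // Declaration before the fix (nullable = false): missing statistics are rejected.
    let before = FieldModel { name: "i_null_count".to_string(), nullable: false };
    assert!(validate(&before, &stats).is_err());

    // Declaration after the fix (nullable = true): missing statistics are allowed.
    let after = FieldModel { name: "i_null_count".to_string(), nullable: true };
    assert!(validate(&after, &stats).is_ok());
}
```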

@github-actions github-actions bot added the core Core datafusion crate label Aug 5, 2022

// i IS NULL, no null statistics
let expr = col("i").is_null();
let p = PruningPredicate::try_new(expr, schema.clone()).unwrap();

Prior to the fix, this line would panic.

],
);

let expected_ret = vec![false, true, true, true, false];

This case simply didn't have coverage before that I could find
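
As a rough illustration of the pruning behaviour this test exercises, here is a minimal standard-library sketch. It assumes that `i IS NULL` is evaluated per container as "null count greater than zero", and that a container with unknown statistics must be kept. The function names and the input null counts are hypothetical, chosen only so that the result matches the `expected_ret` vector above; this is not DataFusion's implementation.

```rust
/// Hypothetical stand-in for the rewritten `i IS NULL` predicate:
/// keep a container unless its statistics prove it holds no nulls.
fn keep_for_is_null(null_count: Option<u64>) -> bool {
    match null_count {
        Some(n) => n > 0, // statistics known: keep only if at least one null
        None => true,     // statistics missing: cannot rule the container out
    }
}

/// Apply the predicate to each container's null-count statistic.
fn prune_is_null(null_counts: &[Option<u64>]) -> Vec<bool> {
    null_counts.iter().map(|nc| keep_for_is_null(*nc)).collect()
}

fn main() {
    // Assumed per-container null counts (illustrative, not taken from the PR).
    let null_counts: [Option<u64>; 5] = [Some(0), Some(1), None, None, Some(0)];
    let kept = prune_is_null(&null_counts);
    assert_eq!(kept, vec![false, true, true, true, false]);
    println!("{:?}", kept);
}
```

The conservative `None => true` branch is the important design choice: pruning may only skip a container when the statistics prove no row can match.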

@alamb alamb marked this pull request as ready for review August 5, 2022 15:31
@codecov-commenter

Codecov Report

Merging #3044 (a7ff091) into master (581934d) will increase coverage by 0.01%.
The diff coverage is 100.00%.

❗ Current head a7ff091 differs from pull request most recent head ceb9bfd. Consider uploading reports for the commit ceb9bfd to get more accurate results

@@            Coverage Diff             @@
##           master    #3044      +/-   ##
==========================================
+ Coverage   85.85%   85.86%   +0.01%     
==========================================
  Files         286      286              
  Lines       51670    51704      +34     
==========================================
+ Hits        44359    44395      +36     
+ Misses       7311     7309       -2     
| Impacted Files | Coverage Δ | |
|---|---|---|
| datafusion/core/src/physical_optimizer/pruning.rs | 94.75% <100.00%> (+0.58%) | ⬆️ |
| datafusion/core/src/physical_plan/metrics/value.rs | 86.93% <0.00%> (-0.51%) | ⬇️ |
| datafusion/expr/src/logical_plan/plan.rs | 77.60% <0.00%> (ø) | |
| datafusion/expr/src/window_frame.rs | 93.27% <0.00%> (+0.84%) | ⬆️ |

@Dandandan Dandandan left a comment

LGTM

alamb commented Aug 8, 2022

Thanks for the review @Dandandan

@alamb alamb merged commit 6e6f3bf into apache:master Aug 8, 2022

ursabot commented Aug 8, 2022

Benchmark runs are scheduled for baseline = acd1f40 and contender = 6e6f3bf. 6e6f3bf is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@alamb alamb deleted the alamb/null_count branch August 8, 2023 20:13
Labels
core Core datafusion crate
4 participants