Add top-level Like, ILike, SimilarTo expressions in logical plan #3298

Merged
andygrove merged 2 commits into apache:master from new-like-expr
Sep 2, 2022

Conversation

andygrove
Member

Which issue does this PR close?

Part of #3099

Rationale for this change

This is a subset of #3101, which has been open for a few weeks and depends on other PRs.

This adds new Like, ILike, and SimilarTo expressions to the logical plan but does not plumb them all the way through, so the DataFusion SQL query planner continues to use binary expressions with Operator::Like and Operator::NotLike for now.

There should be no functional changes with this PR, but it allows other query engines (such as Dask SQL) to add support for these expressions with escaped characters.
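
To illustrate the difference between the two representations, here is a minimal sketch using a simplified, hypothetical `Expr` model (it is not DataFusion's actual type). The `escape_char` field and the helper function names are assumptions made for illustration; the point is only that a binary expression has no slot for an escape character, while a dedicated `Like` node can carry one.

```rust
// Simplified, hypothetical model of the two representations.
// This is NOT DataFusion's real `Expr`; names and fields are illustrative.

#[derive(Debug, Clone, PartialEq)]
enum Operator {
    Like,
    NotLike,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(String),
    // Existing style: LIKE is just another binary operator.
    BinaryExpr {
        left: Box<Expr>,
        op: Operator,
        right: Box<Expr>,
    },
    // New style: a dedicated node. `escape_char` is an assumed field.
    Like {
        negated: bool,
        expr: Box<Expr>,
        pattern: Box<Expr>,
        escape_char: Option<char>,
    },
}

/// Builds `expr LIKE pattern` (or `NOT LIKE`) as a binary expression.
fn binary_like(expr: Expr, pattern: Expr, negated: bool) -> Expr {
    Expr::BinaryExpr {
        left: Box::new(expr),
        op: if negated { Operator::NotLike } else { Operator::Like },
        right: Box::new(pattern),
    }
}

/// Builds the same predicate as a top-level `Like` node.
fn top_level_like(expr: Expr, pattern: Expr, negated: bool, escape_char: Option<char>) -> Expr {
    Expr::Like {
        negated,
        expr: Box::new(expr),
        pattern: Box::new(pattern),
        escape_char,
    }
}

/// Returns the escape character if the representation can carry one.
fn escape_char_of(e: &Expr) -> Option<char> {
    match e {
        Expr::Like { escape_char, .. } => *escape_char,
        _ => None,
    }
}

fn main() {
    let col = Expr::Column("name".to_string());
    let pat = Expr::Literal("100\\%".to_string());

    let old_style = binary_like(col.clone(), pat.clone(), false);
    let new_style = top_level_like(col, pat, false, Some('\\'));

    println!("{:?}", old_style);
    println!("{:?}", new_style);
    println!("escape (binary): {:?}", escape_char_of(&old_style));
    println!("escape (Like):   {:?}", escape_char_of(&new_style));
}
```

In this model, negation moves from the operator (`NotLike`) to a `negated` flag, which is how the `Like` fragment quoted in the review is shaped.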

What changes are included in this PR?

New Expr types.

Are there any user-facing changes?

No

@github-actions github-actions bot added the core (Core datafusion crate), logical-expr (Logical plan and expressions), optimizer (Optimizer rules), and sql labels Aug 30, 2022
@andygrove
Member Author

@alamb @tustvold @jdye64 PTAL when you can. Ideally, I would like to get this merged by Friday before cutting the 12.0.0 RC.

@codecov-commenter

codecov-commenter commented Aug 30, 2022

Codecov Report

Merging #3298 (73db5b6) into master (516ad0d) will decrease coverage by 0.32%.
The diff coverage is 0.96%.

@@            Coverage Diff             @@
##           master    #3298      +/-   ##
==========================================
- Coverage   85.75%   85.42%   -0.33%     
==========================================
  Files         294      294              
  Lines       53749    53955     +206     
==========================================
  Hits        46091    46091              
- Misses       7658     7864     +206     
Impacted Files Coverage Δ
datafusion/core/src/datasource/listing/helpers.rs 95.01% <ø> (ø)
datafusion/core/src/physical_plan/planner.rs 77.39% <0.00%> (-2.06%) ⬇️
datafusion/expr/src/expr_rewriter.rs 77.01% <0.00%> (-6.74%) ⬇️
datafusion/expr/src/expr_schema.rs 61.67% <0.00%> (-1.91%) ⬇️
datafusion/expr/src/expr_visitor.rs 53.68% <0.00%> (-5.62%) ⬇️
datafusion/expr/src/utils.rs 91.10% <ø> (ø)
...tafusion/optimizer/src/common_subexpr_eliminate.rs 90.77% <0.00%> (-2.09%) ⬇️
datafusion/optimizer/src/simplify_expressions.rs 83.26% <0.00%> (-0.27%) ⬇️
datafusion/proto/src/from_proto.rs 34.26% <0.00%> (-0.87%) ⬇️
datafusion/proto/src/to_proto.rs 48.89% <0.00%> (-2.19%) ⬇️
... and 4 more


Contributor

@jdye64 jdye64 left a comment

Changes look good to me. I had a question about the anticipated Expr type that I couldn't quite derive from the rest of the PR.

Note I didn't really look at the physical planner piece as I am not as familiar with those bits.

Like {
negated: bool,
expr: Box<Expr>,
pattern: Box<Expr>,
Contributor

Curious, would Expr map to an Expr::ScalarValue for a pattern, i.e. would the user expect a scalar value, or would it map to some sort of function? Just curious what to expect here when using this.

Member Author

Typically this would be an Expr::Literal, but it can be any expression that evaluates to a string. It could be another column, for example.
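
As a rough sketch of what "any expression that evaluates to a string" means in practice, the following uses a hypothetical two-variant `Expr` and a row represented as a `HashMap` (both are assumptions for illustration, not DataFusion types). The pattern is evaluated per row, so a literal and a column reference go through the same path. The matcher supports only `%` and `_` and ignores escape characters.

```rust
use std::collections::HashMap;

// Hypothetical, simplified expression type; not DataFusion's `Expr`.
enum Expr {
    Column(String),
    Literal(String),
}

// A row maps column names to string values (illustrative only).
type Row = HashMap<String, String>;

/// Evaluates an expression to a string for one row.
/// Returns `None` if a referenced column is missing.
fn eval_str(e: &Expr, row: &Row) -> Option<String> {
    match e {
        Expr::Column(name) => row.get(name).cloned(),
        Expr::Literal(s) => Some(s.clone()),
    }
}

/// Minimal SQL LIKE matcher: `%` matches any sequence of characters,
/// `_` matches exactly one character. No escape handling.
fn like_match(value: &[char], pattern: &[char]) -> bool {
    match pattern.split_first() {
        None => value.is_empty(),
        Some((&'%', rest)) => (0..=value.len()).any(|i| like_match(&value[i..], rest)),
        Some((&'_', rest)) => !value.is_empty() && like_match(&value[1..], rest),
        Some((p, rest)) => value.first() == Some(p) && like_match(&value[1..], rest),
    }
}

/// Evaluates `expr [NOT] LIKE pattern` for one row.
fn eval_like(expr: &Expr, pattern: &Expr, negated: bool, row: &Row) -> Option<bool> {
    let value: Vec<char> = eval_str(expr, row)?.chars().collect();
    let pat: Vec<char> = eval_str(pattern, row)?.chars().collect();
    Some(like_match(&value, &pat) != negated)
}

fn main() {
    let mut row = Row::new();
    row.insert("name".to_string(), "datafusion".to_string());
    row.insert("pat".to_string(), "data%".to_string());

    let name = Expr::Column("name".to_string());

    // Pattern given as a literal.
    let literal = Expr::Literal("%fusion".to_string());
    println!("{:?}", eval_like(&name, &literal, false, &row));

    // Pattern taken from another column of the same row.
    let column = Expr::Column("pat".to_string());
    println!("{:?}", eval_like(&name, &column, false, &row));
}
```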

@andygrove andygrove merged commit a5d6ae4 into apache:master Sep 2, 2022
@andygrove andygrove deleted the new-like-expr branch September 2, 2022 14:18
@andygrove andygrove added the api change (Changes the API exposed to users of the crate) label Sep 2, 2022
@ursabot

ursabot commented Sep 2, 2022

Benchmark runs are scheduled for baseline = dadd2dc and contender = a5d6ae4. a5d6ae4 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

kmitchener pushed a commit to kmitchener/arrow-datafusion that referenced this pull request Sep 4, 2022
MazterQyou pushed a commit to cube-js/arrow-datafusion that referenced this pull request Dec 1, 2022
MazterQyou pushed a commit to cube-js/arrow-datafusion that referenced this pull request Dec 1, 2022
Labels
api change (Changes the API exposed to users of the crate), core (Core datafusion crate), logical-expr (Logical plan and expressions), optimizer (Optimizer rules), sql
4 participants