New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for aggregate expressions with filters #585
Conversation
Pull Request Test Coverage Report for Build 2980739859
💛 - Coveralls |
Code needs cleaning up but maybe this approach can work? |
tests/sqlparser_hive.rs
Outdated
|
||
#[test] | ||
fn filter_as_alias() { | ||
let rename = "SELECT name filter FROM region"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need a different style test - this fails because it is re-written as SELECT name AS filter FROM region
This was the approach I was crawling towards. Hopefully I'm not being too obvious, but the
fails with
|
Ugh, this was passing at one point. I will try and take another look over the weekend. |
I think this is failing because the
|
Hmm, I think the FILTER is applied to the specific expression in the select. It's not for the whole select, but for each column. The example of this SQL that led me down this rabbit hole:
so I think the logic has to be pushed "deeper" in to the parser logic, pushed from |
@@ -36,4 +36,8 @@ impl Dialect for HiveDialect { | |||
|| ch == '{' | |||
|| ch == '}' | |||
} | |||
|
|||
fn supports_filter_during_aggregation(&self) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we would want to add this to src/dialect/postgres.rs
as well since postgresql supports FILTER
also https://www.postgresql.org/docs/current/sql-expressions.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
I will resume work on this tomorrow |
This work got me closer but I'm unable to parse expressions with the filter proceeded a
using this PR as a git crate dependency now yields:
background: https://trino.io/docs/current/functions/aggregate.html
|
Should the select item |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Hi @xrl. I am not sure I understand this. Is the goal to extract one array element from the aggregated array? I tried this in postgres and it also did not like this syntax:
Is this query working for you in Hive/Trino? Maybe you could do this with a subquery for the aggregation? This worked for me in Postgres:
|
@alamb Could you review when you have time |
Interesting. Ok, I filed a new issue for supporting this. I think this is separate from the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every time I look at a PR in this repo I feel like I learn something new about SQL. TIL FILTER (where)
🤯
Thanks @andygrove this PR looks reasonable to me
@@ -42,6 +42,10 @@ impl Dialect for PostgreSqlDialect { | |||
None | |||
} | |||
} | |||
|
|||
fn supports_filter_during_aggregation(&self) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what it is worth, I think the DataFusion actually uses the Generic Dialect, https://github.com/apache/arrow-datafusion/blob/0084aeb686b318cbdb49cab00cb8f15c9f520d1e/datafusion/sql/src/parser.rs#L102
Thus we may want to add return true from supports_filter_during_aggregation
in https://github.com/sqlparser-rs/sqlparser-rs/blob/main/src/dialect/generic.rs as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Dask SQL has their own dialect where they want to use this, but it makes sense to add to the dialect that DataFusion is using as well.
Closes #577