Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support PostgreSQL array subquery constructor #566

Merged
merged 1 commit into from Aug 12, 2022

Conversation

MazterQyou
Copy link
Contributor

This PR adds support for PostgreSQL array subquery constructor (e.g. SELECT ARRAY(SELECT 1 UNION SELECT 2)), as well as a related test.

@MazterQyou MazterQyou force-pushed the upstream-patch/array-subquery branch from c6edf5f to 7ed261e Compare August 10, 2022 00:58
@coveralls
Copy link

coveralls commented Aug 10, 2022

Pull Request Test Coverage Report for Build 2848516561

  • 62 of 63 (98.41%) changed or added relevant lines in 5 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.07%) to 85.444%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/parser.rs 8 9 88.89%
Totals Coverage Status
Change from base Build 2842935516: 0.07%
Covered Lines: 9498
Relevant Lines: 11116

💛 - Coveralls

@@ -2518,7 +2518,7 @@ fn parse_bad_constraint() {

#[test]
fn parse_scalar_function_in_projection() {
let names = vec!["sqrt", "array", "foo"];
let names = vec!["sqrt", "foo"];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be a controversial change for this test. PostgreSQL doesn't have a function-based array constructor (array(1, 2, 3)), nor can I find information on other databases using such a syntax. Some, like BigQuery, use the ARRAY(subquery) syntax implemented there. It'd be great to hear some thoughts on this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was added in #432

Can you review the description on that PR and let me know what you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, postgres doesn't allow the syntax used in the DataFusion test (and referred to in #432)

alamb=# SELECT array(c1, cast(c2 as varchar)) FROM test;
ERROR:  syntax error at or near "c1"
LINE 1: SELECT array(c1, cast(c2 as varchar)) FROM test;
                     ^

https://github.com/apache/arrow-datafusion/blame/834924f274d9e04444030b8eb3e07e1035fcd3cf/datafusion/core/tests/sql/functions.rs#L134 doesn't really help me figure out from who the test originated from as it was moved (maybe some more poking around could find the original author)

@ovr do you remember any of the previous history here?

🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I poked around and found this commit: apache/datafusion@98b710a

Copy link
Collaborator

@alamb alamb Aug 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I poked around and found this commit: apache/datafusion@98b710a

Nice -- that came in from apache/arrow#8102 / https://issues.apache.org/jira/browse/ARROW-9902 by @jorgecarleitao and approved by @andygrove

Can either if you remember if/how using array as a function (rather than the more specialized SQL support offered by postgres) was important? We are triyng to sort out how important backwards compatibility is in this case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I support this as long as DataFusion team's okay with that. I haven't been able to find a dialect with built-in array function, but at least two of them with built-in array subquery constructor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb do you feel it's a good idea to remove the GenericDialiect | PostgreSqlDialect check now so later someone can make negated checks for dialects that shouldn't go the subquery route?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't been able to find a dialect with built-in array function, but at least two of them with built-in array subquery constructor.

Here is the spark reference docs: https://spark.apache.org/docs/2.4.1/api/sql/index.html#array (though perhaps you mean there is no SparkDialect in this crate)?

do you feel it's a good idea to remove the GenericDialiect | PostgreSqlDialect check now so later someone can make negated checks for dialects that shouldn't go the subquery route?

Sounds reasonable to me -- can you also possibly link one or two examples from other dialects that support the array subquery constructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PostgreSQL array subquery constructor (in fact, all of array constructors): https://www.postgresql.org/docs/current/sql-expressions.html#SQL-SYNTAX-ARRAY-CONSTRUCTORS

I haven't been able to find info on other dialects present in the crate to support array subquery, but at least CockroachDB and BigQuery support array subquery.

However while searching around I found that ClickHouse (which we do have a dialect for) actually has array function constructor built in: https://clickhouse.com/docs/en/sql-reference/functions/array-functions/#arrayx1--operator-x1-

So we can either make a decision for leaving this on for specific dialects (GenericDialect | PostgreSqlDialect for instance) or negate this to support ARRAY(x1, x2, …) for dialects known to support array function constructor (currently ClickHouseDialect only).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's negate it for ClickHouseDialect and call it a day

Copy link
Collaborator

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @MazterQyou -- the code looks great but as you have identified I am not sure about the array test change. I left some comments

@@ -2518,7 +2518,7 @@ fn parse_bad_constraint() {

#[test]
fn parse_scalar_function_in_projection() {
let names = vec!["sqrt", "array", "foo"];
let names = vec!["sqrt", "foo"];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was added in #432

Can you review the description on that PR and let me know what you think?

@@ -2518,7 +2518,7 @@ fn parse_bad_constraint() {

#[test]
fn parse_scalar_function_in_projection() {
let names = vec!["sqrt", "array", "foo"];
let names = vec!["sqrt", "foo"];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, postgres doesn't allow the syntax used in the DataFusion test (and referred to in #432)

alamb=# SELECT array(c1, cast(c2 as varchar)) FROM test;
ERROR:  syntax error at or near "c1"
LINE 1: SELECT array(c1, cast(c2 as varchar)) FROM test;
                     ^

https://github.com/apache/arrow-datafusion/blame/834924f274d9e04444030b8eb3e07e1035fcd3cf/datafusion/core/tests/sql/functions.rs#L134 doesn't really help me figure out from who the test originated from as it was moved (maybe some more poking around could find the original author)

@ovr do you remember any of the previous history here?

🤔

@MazterQyou MazterQyou force-pushed the upstream-patch/array-subquery branch 2 times, most recently from 13ce6f5 to 01a69b2 Compare August 12, 2022 18:00
@MazterQyou MazterQyou force-pushed the upstream-patch/array-subquery branch from 01a69b2 to 76e5030 Compare August 12, 2022 18:04
@MazterQyou
Copy link
Contributor Author

@alamb updated the if clause with dialects and added a test for ClickHouse's array function (renaming the file in the process due to a typo 😶). Please re-review

@MazterQyou MazterQyou requested a review from alamb August 12, 2022 18:08
Copy link
Collaborator

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great -- thanks for sticking with it @MazterQyou

@@ -121,6 +121,25 @@ fn parse_array_expr() {
)
}

#[test]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌 Very nice @MazterQyou

@alamb alamb merged commit 31ba001 into sqlparser-rs:main Aug 12, 2022
@MazterQyou MazterQyou deleted the upstream-patch/array-subquery branch August 12, 2022 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants