Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to parse a combination of WITH, CASE and EXTRACT(... FROM ...) #291

Closed
qs opened this issue Jul 1, 2021 · 16 comments
Closed

Failed to parse a combination of WITH, CASE and EXTRACT(... FROM ...) #291

qs opened this issue Jul 1, 2021 · 16 comments

Comments

@qs
Copy link

qs commented Jul 1, 2021

Example (for BigQuery):

with t as (CASE EXTRACT(dayofweek FROM CURRENT_DATETIME()) when 1 then "S" end) select * from t

error:

  File "/Library/Python/3.8/site-packages/pyparsing.py", line 1955, in parseString
    raise exc
  File "/Library/Python/3.8/site-packages/pyparsing.py", line 2969, in parseImpl
    raise ParseException(instring, loc, self.errmsg, self)
pyparsing.ParseException: Expected {select statement | {Suppress:("(") select statement Suppress:(")")}}, found 'w'  (at char 13), (line:2, col:13)
@ptmcg
Copy link
Member

ptmcg commented Aug 15, 2021

I would welcome a PR to make these enhancements, but cannot put time into this just now. Not likely to be in 3.0 otherwise.

@ptmcg
Copy link
Member

ptmcg commented Nov 13, 2021

@smedberg - I did some minor tuning and reformatting (mostly suppressing black) in this example, but did not see any obvious fix. Can you take a look at this issue?

@smedberg
Copy link
Contributor

@ptmcg I'm retired and not really doing programming anymore, so it's not super-easy for me to work on this. Is it important?

@ptmcg
Copy link
Member

ptmcg commented Nov 13, 2021

No problem - as the original author I thought you might have an insight on where these clauses might be overlooked or misparsed. Does not seem to be urgent, I'm just trying to clear out some old tickets.

Glad to hear that people can still retire. I expect I'll be working until my dying day.

@klahnakoski
Copy link

I assume this issue relates to one of the examples. Which one? the simpleSQL.py? Is there some docs that may confirm this is valid SQL? The parser is "Expected {select statement", which is also what I expect in the WITH clause.

@ptmcg
Copy link
Member

ptmcg commented Nov 16, 2021

This is on the bigquery_view_parser.py. I looked at it briefly, it looks like the parser should handle this as-is, but didn't find where it is going off the rails. Here is a link to BQ's SELECT grammar: https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax At minimum, we should add this link at the top of the example code.

@klahnakoski
Copy link

The link you provided indicates that the WITH clause must have the form

WITH t AS (SELECT...

The example is not of this form

 with t as (CASE ...

The BigQuery parser is parsing correctly

    with_clause = Group(
        identifier.setParseAction(record_with_alias)
        + AS
        + LPAR
        + select_stmt
        + RPAR
    )

@ptmcg
Copy link
Member

ptmcg commented Nov 17, 2021

The current parser includes CASE as a keyword and a CASE construct, but I can't find that in that SQL reference anywhere either.

@qs can you locate a BiqQuery SQL reference that includes these grammar elements?

@klahnakoski
Copy link

Case is here:

    case_stmt = (
        CASE
        + Optional(case_expr.copy())
        + case_clauses("case_clauses")
        + Optional(case_else)
        + END
    )("case")

@smedberg
Copy link
Contributor

FWIW, I originally wrote this code for my employer, and was parsing pre-existing BigQuery SQL statements. I think there were a few cases where the SQL clearly worked in BigQuery, but I couldn't find documentation saying that it should work. I.e. I think the parser currently handles some undocumented, but working, BigQuery constructs. It was years ago, so I don't remember the details.

@ptmcg
Copy link
Member

ptmcg commented Nov 17, 2021

@klahnakoski yes, I know. I was looking for it in that web page I referenced above. I didn't see it anywhere in those docs, so this parser must come from a different reference, which, hopefully, would shed light on why @qs thinks a CASE clause inside a WITH should be valid in their example.

@klahnakoski
Copy link

@ptmcg You are probably correct I have a SQL parser allowing both queries and expressions

    with_expr = delimited_list(Group(
        (
            (var_name("name") + Optional(LB + delimited_list(ident("col")) + RB))
            / to_alias
        )("name")
        + (AS + LB + (query | expr)("value") + RB)
    ))

@ptmcg
Copy link
Member

ptmcg commented Apr 30, 2022

I've updated the bigquery_select_parser.py example to match the grammar at https://cloud.google.com/bigquery/docs/reference/legacy-sql, to be released in 3.0.9. It does not parse the example you posted. Please verify that it is a valid BigQuery legacy SQL statement.

@smedberg
Copy link
Contributor

FWIW, I'm the original author of this code (well, not really, since it started as a copy of SimpleSQL.py- thanks to that author!) Historically, I based it on actual queries in a production system, rather than basing it on the query syntax documentation, which may explain why there are sometimes weird edge cases that are documented in the query syntax but not handled or vice versa.
In the example of

with w as (CASE x when y then z end) select ...

I don't think I've seen that kind of query before, and as I read the doc at https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#sql_syntax I don't expect it to work. I don't have easy access to a live system so I haven't tested it- maybe it does work?
Instead, I'd naively expect that this is the correct syntax:

with w as (SELECT CASE x when y then z end) select ...

That is, the WITH_CLAUSE in with w as (WITH_CLAUSE) select has to be a "query expression" and hence has to either be a SELECT or a set operation (and the set operation in turn has to have SELECTs.)

I think I'm just agreeing with @ptmcg that it seems like the documentation does not say that the query in question is supported. Which isn't to say that it doesn't work! And I could totally be reading the doc wrong, of course.

@ptmcg
Copy link
Member

ptmcg commented Apr 30, 2022

Thanks @smedberg for chiming in. With your change, and looking a little further at the CASE examples on that web page, this version of that statement works:

with t as (select CASE when EXTRACT(dayofweek FROM CURRENT_DATETIME()) == 1 then "S" end) select * from t

This parses successfully with the latest version of the example code. I'm going to chalk this issue up to a typo and close it out. The updated parser will go out in the next release, and should be pretty faithful to the legacy BigQuery SELECT SQL.

@smedberg
Copy link
Contributor

Thanks @ptmcg !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants