New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[postgres] Add support for custom binary operators #548
Conversation
More details about operators in general are at: https://www.postgresql.org/docs/current/sql-createoperator.html. This patch attempts to parse `SELECT` queries that reference an operator using `OPERATOR(<optional_schema>.<operator_name>)` syntax. This is a PostgreSQL extension. There are no provisions for user-defined operators in the SQL standard.
@alamb could you please help with triggering the workflow? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution @iskakaushik -- I'll trigger the tests
Pull Request Test Coverage Report for Build 2805285978
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @iskakaushik -- this is looking close.
src/parser.rs
Outdated
@@ -1159,6 +1159,32 @@ impl<'a> Parser<'a> { | |||
} | |||
} | |||
Keyword::XOR => Some(BinaryOperator::Xor), | |||
Keyword::OPERATOR if dialect_of!(self is PostgreSqlDialect) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given @ovr 's comment here, perhaps we should also add support when parsing in GenericDialect
? apache/datafusion#3037 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! I kept the name as PGCustomOperator
but happy to change it as well.
src/ast/operator.rs
Outdated
pub struct PGCustomOperator { | ||
pub schema: Option<String>, | ||
pub name: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would you think about using the pre-existing Identifier
here (that supports various qualified names)
like:
pub struct PGCustomOperator { | |
pub schema: Option<String>, | |
pub name: String, | |
pub struct PGCustomOperator { | |
pub ident: ObjectName |
Thank you could use parse_object_name
instead of custom parsing logic.
Lines 3114 to 3125 in 076b587
/// Parse a possibly qualified, possibly quoted identifier, e.g. | |
/// `foo` or `myschema."table" | |
pub fn parse_object_name(&mut self) -> Result<ObjectName, ParserError> { | |
let mut idents = vec![]; | |
loop { | |
idents.push(self.parse_identifier()?); | |
if !self.consume_token(&Token::Period) { | |
break; | |
} | |
} | |
Ok(ObjectName(idents)) | |
} |
This would likely require less code as well as handle parsing more cases (like OPERATOR("~")
-- aka quoted strings)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried doing this, but looks like Identifier
doesn't allow for Token
s that are un-quoted operators, for e.g: ~
is not a valid identifier. I am not sure if extending the parse_identifier
to allow for un-quoted operators is the right call here.
IMO we could start with the existing mechanism of schema qualifier operator name (though I would've liked to use the parse_object_name
logic) and can extend it as needed later. Let me know.
@iskakaushik I am planning to make a sqlparser release sometime soon -- I think this PR has just a few more items and it could be ready to merge. Let me know what you think. |
I will address the feedback today 😊 |
src/ast/operator.rs
Outdated
pub schema: Option<String>, | ||
pub name: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for trying -- Something about hard coding "schema" seems not right to me here -- at the very least because there can also be a database in identifiers https://www.postgresql.org/docs/current/ddl-schemas.html
-- e.g. database.schema.table
, so maybe it should be a Vec<String>
I think given the special set of allowed characters described in https://www.postgresql.org/docs/current/sql-createoperator.html in The operator name is a sequence of up to NAMEDATALEN-1 (63 by default) characters from the following list:
to properly parse this limitation some extra code would be needed
Let me give it a shot
@iskakaushik here is my proposal: iskakaushik#1 let me know what you think |
@alamb I like you approach. The comment you made about converting to Thanks for fixing up my patch, really appreciate it! |
I will fix CI |
Thanks again @iskakaushik |
More details about operators in general are at: https://www.postgresql.org/docs/current/sql-createoperator.html. This patch attempts to parse
SELECT
queries that reference an operator usingOPERATOR(<optional_schema>.<operator_name>)
syntax. Tests added in the patch illustrate a common use-case of this feature.This is a PostgreSQL extension. There are no provisions for user-defined operators in the SQL standard.