fix; do not add double quotes to identifiers already double quoted fixes Issue #2223 #2224
needs tests |
try {
sbuf.append('"');
if ( !alreadyQuoted ) {
Hm, I thought it should be more like
if (alreadyQuoted) {
return;
}
Otherwise you might quote already quoted names, because the for loop is still executed. This would result in badly quoted values that fail when executed, e.g. "column" as input will become ""column"" because of the for loop.
Well, we need to deal with internally quoted names.
Well OK, but now valid input will become broken escaped output, e.g. "column" will become ""column"", which always fails. It would make more sense if the for loop simply copied the first and last quote when alreadyQuoted is true, without quoting them again. That way "column" stays "column" and "column"name" becomes "column""name".
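To make the suggestion above concrete, here is a minimal Java sketch of a variant that treats a value wrapped in double quotes as already quoted and only doubles the quotes found between the wrappers. The class and method names are hypothetical; this is neither the code in this PR nor pgjdbc's existing Utils method.

```java
final class WrappedQuoteSketch {
    // Hypothetical helper: keeps an existing outer pair of quotes and
    // doubles only the quote characters found between them.
    static String quoteUnlessWrapped(String value) {
        boolean alreadyQuoted = value.length() >= 2
            && value.charAt(0) == '"'
            && value.charAt(value.length() - 1) == '"';
        // Strip the outer pair so it is not escaped a second time.
        String inner = alreadyQuoted
            ? value.substring(1, value.length() - 1)
            : value;
        StringBuilder sb = new StringBuilder(inner.length() + 2);
        sb.append('"');
        for (int i = 0; i < inner.length(); i++) {
            char c = inner.charAt(i);
            if (c == '"') {
                sb.append('"'); // double every embedded quote
            }
            sb.append(c);
        }
        sb.append('"');
        return sb.toString();
    }
}
```

With this sketch "column" stays "column" and "column"name" becomes "column""name". Note that an input which is already fully escaped, such as "column""name", would have its inner quotes doubled again, so this approach still has to guess what the caller meant.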
This will break other oddball cases like identifiers that start and end with quotes. As dumb as it is to define a schema via something like I don't think this method needs to change; its only job should be to blindly escape the values provided. The fix for the problem referenced in #2223 would be to change any internal call sites for this method to check whether the source values parsed from the user SQL are already escaped and, if so, skip calling this method entirely. |
@sehrope There is also pgjdbc/pgjdbc/src/main/java/org/postgresql/PGConnection.java Lines 193 to 202 in 151b287
which calls that same Utils method, however JavaDoc states quotes are only added if needed. So either the javadoc is wrong or it can not use the Utils method. I think at some point you need to assume a String is correctly quoted if it starts and ends with quotes and thus do nothing (thus my comment above). If it is not correctly quoted, e.g. |
ugh ! |
That comment refers to adding surrounding quotes if the parameter to be escaped requires any escaping. So Well mostly ... there's a separate issue that depending on the context there are some reserved words that must always be quoted (https://www.postgresql.org/docs/current/sql-keywords-appendix.html) but the driver has never handled those correctly. For example try creating a table with a column named
... but that's a separate issue.
It's the caller's responsibility to know whether the value being handled is already escaped. Having the driver guess based on whether the value appears quoted unfortunately leads to broken edge cases because the server allows just about anything other than zero bytes. |
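A minimal sketch of the "blind" escaping described here, assuming the method's only job is to wrap the raw name in double quotes and double any embedded quote. The class name and the exception type are illustrative choices, not pgjdbc's actual implementation.

```java
final class BlindEscapeSketch {
    // Hypothetical blind escaper: it never inspects whether the input
    // "looks" quoted, it always treats it as a raw identifier name.
    static String escapeIdentifier(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 2);
        sb.append('"');
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\0') {
                // The server rejects zero bytes in identifiers.
                throw new IllegalArgumentException("Zero bytes may not occur in identifiers.");
            }
            if (c == '"') {
                sb.append('"'); // escape an embedded quote by doubling it
            }
            sb.append(c);
        }
        sb.append('"');
        return sb.toString();
    }
}
```

Under this model a raw name that really does start and end with a quote character, such as "column" including the quotes, is escaped to """column""". That is correct for a raw name and wrong for a value the caller had already quoted, which is why only the caller can decide whether to call it.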
Well that's what I meant. Since there are no checks even
Yes that's the case we stumbled upon using a table having a column named
So pgjdbc needs a helper method that checks whether or not inputs are correctly quoted, and that helper should then be used to fix various locations within pgjdbc. So it would make sense to add
Then this new method can be used in all places that currently use |
Okay I see what you're saying here. Yes that does not match the documented behavior. Given that not adding quotes would be a breaking change I think the fix for that would be to update the comment to reflect what it is currently doing.
Again, I don't think this is possible. Only the caller would know if the value is already escaped, it can't be inferred from the value because things like If the value is coming from something like parsing user SQL to infer a Similarly, if the value is coming from a column name we queried from the data dictionary, we know it's the raw value and must be escaped prior to usage. I don't see a situation where we'd be guessing based on the value. It's always something you have to do or you must not do.
That would be a good check to add to code that is parsing user SQL for column tokens (e.g. the RETURNING clause stuff). I'm not sure how that's implemented today but something of that form would be a sensible addition. |
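A check of that form could look roughly like the sketch below, which tests whether a token taken from user SQL is a well-formed double-quoted identifier: it starts and ends with a quote, is not empty, and every quote in between is doubled. The class and method names are hypothetical, and Unicode-escaped identifiers (U&"...") are deliberately ignored.

```java
final class QuotedTokenCheckSketch {
    // Hypothetical check for a token parsed out of user SQL.
    static boolean isWellFormedQuoted(String token) {
        int n = token.length();
        // Needs an opening quote, at least one character, and a closing quote;
        // PostgreSQL rejects zero-length delimited identifiers.
        if (n < 3 || token.charAt(0) != '"' || token.charAt(n - 1) != '"') {
            return false;
        }
        for (int i = 1; i < n - 1; i++) {
            if (token.charAt(i) == '"') {
                // An embedded quote must be followed by a second quote
                // that is still inside the body of the token.
                if (i + 1 >= n - 1 || token.charAt(i + 1) != '"') {
                    return false;
                }
                i++; // skip the second half of the doubled quote
            }
        }
        return true;
    }
}
```

Such a check only makes sense where the caller knows the token came from SQL text; it cannot tell a quoted token apart from a raw column name that happens to start and end with quote characters.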
Added tests for MixedCase return. If anyone knows a better way to deal with parameterized tests that I want to ignore, please let me know. |
@sehrope Well but the |
I think that's the only solution, and I also agree with jnehlmeier that user input from the connection API should be trusted. A specific case in the API, for example, is:
Currently every value in columnNames is escaped. |
At this point, I think leaving the code the way it is and having users fix their code is where I'm headed. Personally, I would have avoided the use of the column user. Most ORMs allow you to specify a different name if there is a conflict. As for the argument that other drivers seem to do the right thing: do other databases allow quoted identifiers? |
@davecramer So you are saying every JDBC API user that uses PostgreSQL JDBC implementation must know that IMHO this current inconsistency needs to be fixed and a new major version of pgjdbc released indicating that breaking change. |
@jnehlmeier Well I'm really trying to get a clear understanding of the issue. Initially it was proposed that if an identifier had double quotes, then do nothing. Apparently that doesn't work for all cases. So if we can clarify this problem then we can find a solution. |
@davecramer In the issue report @amirnajmi discovered a quoting issue while using Datanucleus JDO. We discovered the same issue with Ebean. Both are ORMs. Both use JDBC API Because PG JDBC unconditionally quotes So ORMs do for example: prepareStatement("INSERT INTO \"table\" (\"column\") VALUES ('...')", new String[]{ "\"id\"" }) However when doing so Postgres looks for column named So the ultimate root cause is that |
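The failure described above can be reproduced without a database by sketching how a RETURNING clause would be built if every entry of the column array were escaped unconditionally. The helper names below are hypothetical and only model the behavior discussed in this thread; they are not the driver's code.

```java
final class ReturningSketch {
    // Hypothetical blind escaper: wrap in quotes, double embedded quotes.
    static String escapeIdentifier(String raw) {
        return '"' + raw.replace("\"", "\"\"") + '"';
    }

    // Hypothetical model of appending RETURNING with every name escaped.
    static String appendReturning(String sql, String[] columnNames) {
        StringBuilder sb = new StringBuilder(sql).append(" RETURNING ");
        for (int i = 0; i < columnNames.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(escapeIdentifier(columnNames[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String sql = "INSERT INTO \"table\" (\"column\") VALUES ('...')";
        // Raw name, as the driver expects today.
        System.out.println(appendReturning(sql, new String[] {"id"}));
        // Pre-quoted name, as the ORMs in this thread pass it.
        System.out.println(appendReturning(sql, new String[] {"\"id\""}));
    }
}
```

For the pre-quoted name the generated statement ends in RETURNING """id""", which asks the server for a column whose name literally contains the quote characters.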
OK, thank you, the issue is now clear. So the challenge is that we don't actually look at the SQL query or do anything with it. We quote the returning columns in case the user has put mixed-case columns in their SQL query. As @sehrope has pointed out, Postgres allows very unusual names if the user really wants them. I'm not sure how we please everyone here? |
Yes, but you do that always, regardless of the actual column name. There is no check in the code for whether or not quoting is required. That's why the first suggestion to fix the issue was to check the first and last char of the column name to see if it is already quoted and, if it is, to trust it.
And that's totally fine. PG JDBC now has to decide whether it asks the developer to provide correctly quoted column names when calling into the JDBC API or whether it accepts the raw column name and quotes it internally automatically. Currently the
Make both parameters of the Given that the parser likely relies on the fact that the provided SQL is correctly quoted in order to simplify parsing a bit, it is probably easier to require the
|
This would be about the only solution possible since we don't parse the query. I'm going to have a look to see what prompted us to quote the returning columns. Dave |
So, looking at git blame, this code has been here for around 5 years... That's not an argument, but an observation. At this point, I think the best thing to do is add a connection parameter which will not quote the returning columns. The default will remain the same. |
Should be fine to make the behavior configurable via connection parameter / datasource property. That way ORMs can configure the datasource to their needs internally or ask the developer to do so. |
… double quotes around identifiers that are provided in the returning array. There are some ORMs that now quote all identifiers and if we in turn quote them this will cause an error.
I took a look at how other drivers handle this. Here's a summary:
So it's a toss up for support for quoted identifiers on other platforms. If we do end up adding a flag to handle this, does it also handle I think that's the only other place where a user supplied list of columns has the driver generating custom SQL from column names. |
Looks like it should always end up in |
So if nobody else does this why are we ? |
@sehrope Are you sure about Oracle? Because the main reason that I opened issue #2223 was that we did not have any problem on Oracle, and when we changed the driver to PostgreSQL, this problem appeared. |
@amirnajmi Yes. Here's my basic test case for Oracle:

    private static void testGeneratedKeys(Connection conn, String sql, String[] columnNames) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(sql, columnNames)) {
            stmt.execute();
            ResultSet rs = stmt.getGeneratedKeys();
            rs.next();
            System.out.println("Success for columnNames: " + StringUtils.join(columnNames, ','));
            for (int i = 1; i <= columnNames.length; i++) {
                System.out.println(" KEY[" + i + "] = " + rs.getInt(i));
            }
        } catch (SQLException e) {
            System.out.println("Error for columnNames: " + StringUtils.join(columnNames, ','));
            e.printStackTrace();
        }
    }

    @Test
    public void testOracle() throws Exception {
        String url = "jdbc:oracle:thin:@localhost:32770:xe";
        String user = "sys as sysdba";
        String password = "dbpass";
        try (Connection conn = DriverManager.getConnection(url, user, password)) {
            executeSqlIgnoreErrors(conn, "DROP TABLE foo");
            executeSql(conn, "CREATE TABLE foo (id NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY, x NUMBER)");
            String insertSql = "INSERT INTO foo (x) VALUES (1)";
            testGeneratedKeys(conn, insertSql, new String[] {"ID"});
            testGeneratedKeys(conn, insertSql, new String[] {"ID", "ID"});
            testGeneratedKeys(conn, insertSql, new String[] {"id"});
            testGeneratedKeys(conn, insertSql, new String[] {"id", "ID"});
            testGeneratedKeys(conn, insertSql, new String[] {"id", "ID", "iD"});
            testGeneratedKeys(conn, insertSql, new String[] {"\"ID\""});
            testGeneratedKeys(conn, insertSql, new String[] {"bad"});
        }
    }

And here's the output:
Any mixed-case combination that gets case-folded is fine. Invalid or already quoted column names are rejected. |
@sehrope This is the part of my code where I called conn.prepareStatement() (I used number instead of x for the identity field): And the part for getting the result: Output: |
@davecramer As I mentioned in my last message, Oracle works with double quotes, and other drivers either ignore them or handle them as well. |
@davecramer, when is this PR getting merged? Is there anything else to check? |
Hmmm ya, I need to get back to this. There are changes in this PR that don't belong here. |
2cbd84e to 5f537af
@davecramer It would be greatly appreciated if you kindly put a bit higher priority on this PR. This is becoming a blocking issue for some of our customers and they may start considering switching their database from PostgreSQL to another. |
@davecramer Really appreciated! Thank you very much! |
@TakahikoKawasaki I'd love it if you could test our snapshots. I'd like to release 43.0.0 https://oss.sonatype.org/content/repositories/snapshots/org/postgresql/postgresql/42.3.0-SNAPSHOT/postgresql-42.3.0-20211013.161600-11.jar |
@davecramer Sure. We'll test it. |
@TakahikoKawasaki have you tested it ? |
@davecramer We have tested the snapshot jar file you sent, and the problem that we had with Datanucleus JDO was resolved by setting the new flag (quoteReturningIdentifiers) so that no additional escaping is added to identifiers in returning columns. |
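For readers who want to try the same thing, the sketch below shows one way the flag might be supplied. It assumes quoteReturningIdentifiers is passed like any other connection property and that the value false turns the extra quoting off. The class name, user, password, and URL are placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

final class QuoteReturningConfigSketch {
    // Builds connection properties with the quoting flag switched off.
    static Properties buildProperties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Assumption: "false" tells the driver not to quote the names
        // passed in the returning column array.
        props.setProperty("quoteReturningIdentifiers", "false");
        return props;
    }

    // Opens a connection; requires a reachable server and the driver on the classpath.
    static Connection connect(String url, Properties props) throws SQLException {
        return DriverManager.getConnection(url, props);
    }
}
```

A caller would then use something like connect("jdbc:postgresql://localhost:5432/test", buildProperties("app", "secret")) and pass already quoted names such as "\"id\"" to prepareStatement. With the flag off, the caller becomes responsible for quoting mixed-case or otherwise unusual column names.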
Thank you. I want to release as soon as possible |
Sounds great. |