Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lost update after successfull commit #1697

Closed
1 of 2 tasks
tl-sfo opened this issue Feb 5, 2020 · 42 comments · Fixed by #1729
Closed
1 of 2 tasks

Lost update after successfull commit #1697

tl-sfo opened this issue Feb 5, 2020 · 42 comments · Fixed by #1729
Assignees

Comments

@tl-sfo
Copy link

tl-sfo commented Feb 5, 2020

I'm submitting a ...

  • bug report
  • feature request

Describe the issue
At the beginning we define a table with a not null column of type integer. Afterwards we insert a valid integer and then null which is invalid. At the end a commit on the given connection is executed und successfull, but the valid insert is lost.
If we change the order of the insertion i.e. execute the invalid insertion first, then the commit fails, which is expected.

Driver Version?
42.2.9

Java Version?
jdk1.8.0_172

OS Version?
Windows 10 Pro

PostgreSQL Version?
12

To Reproduce

statement.executeUpdate("CREATE TABLE TEST_COMMIT_AFTER_SQL(COLUMN1 INT NOT NULL)");
connection.commit();

// Successfull insert which is lost
statement.executeUpdate("INSERT INTO TEST_COMMIT_AFTER_SQL(COLUMN1) VALUES(5)");

try {
  // Invalid insert that should make the transaction fail
  statement.executeUpdate("INSERT INTO TEST_COMMIT_AFTER_SQL(COLUMN1) VALUES (NULL)");
} catch (SQLException ex) {
  // Exptected, null value can not be inserted in a not null column.
}

try {
  // Exptected failure
  connection.commit();

  ResultSet result = statement.executeQuery("SELECT * FROM TEST_COMMIT_AFTER_SQL");
	
  // Contrary to expectation the commit succeeds but the successfully inserted value is not present
  boolean next = result.next();
  assertTrue(next);
  int value = result.getInt(1);
  assertEquals(5, value);
} catch (SQLException ex) {
  // If commit is executed then we expect a failure
}

Expected behaviour
Commit should fail.

@davecramer
Copy link
Member

in order for this to work as you expect you need to turn autocommit off.
connection.autocommit(false);

@tl-sfo
Copy link
Author

tl-sfo commented Feb 5, 2020

Autocommit is already set to off.

connection.getAutoCommit()

gives false.

@davecramer
Copy link
Member

send me a small example that fails.

@davecramer
Copy link
Member

import com.sun.javafx.binding.StringFormatter;

import java.sql.*;
import java.util.Properties;


public class TestNullColumnCommit {
    public static void main(String []args) {
        String url = "jdbc:postgresql://localhost:5432/test";

        Properties props = new Properties();
        props.setProperty("user", "test");
        props.setProperty("password", "test");
        try {
            Connection conn = DriverManager.getConnection(url, props);
            conn.setAutoCommit(false);
            try (Statement statement = conn.createStatement() ) {
                statement.executeUpdate("CREATE TABLE if not exists  TEST_COMMIT_AFTER_SQL(COLUMN1 INT NOT NULL)");
                conn.commit();

// Successfull insert which is lost
                statement.executeUpdate("INSERT INTO TEST_COMMIT_AFTER_SQL(COLUMN1) VALUES(5)");

                try {
                    // Invalid insert that should make the transaction fail
                    statement.executeUpdate("INSERT INTO TEST_COMMIT_AFTER_SQL(COLUMN1) VALUES (NULL)");
                } catch (SQLException ex) {
                   ex.printStackTrace();
                }

                try {
                    // Expected failure
                    conn.commit();

                    ResultSet result = statement.executeQuery("SELECT * FROM TEST_COMMIT_AFTER_SQL");


                    if (result.next() ) {
                        int value = result.getInt(1);
                        System.out.println(StringFormatter.format( "Value is {0}", value));
                    } else {
                        System.out.println("There is nothing in the table");
                    }

                } catch (SQLException ex) {
                    ex.printStackTrace();
                }
            }
        } catch ( Exception ex ) {
            ex.printStackTrace();
        }


    }
}

output

org.postgresql.util.PSQLException: ERROR: null value in column "column1" violates not-null constraint
  Detail: Failing row contains (null).
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2497)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2233)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:310)
	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:446)
	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:370)
	at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:311)
	at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:297)
	at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:274)
	at org.postgresql.jdbc.PgStatement.executeUpdate(PgStatement.java:246)
	at TestNullColumnCommit.main(TestNullColumnCommit.java:26)
There is nothing in the table

@haumacher
Copy link

Yes, this exactly reproduces the problematic behavior. Since conn.commit() succeeds, I would expect that either all successful updates I've done before (statement.executeUpdate("INSERT INTO TEST_COMMIT_AFTER_SQL(COLUMN1) VALUES(5)")) are persisted or the final commit to fail. The code that causes the intermediate SQLException (statement.executeUpdate("INSERT INTO TEST_COMMIT_AFTER_SQL(COLUMN1) VALUES (NULL)")) might be executed from some call I don't have under control - in the worst case some library call that contributes to my transaction.

Therefore, accepted behavior would be either the final commit to fail (which seems to be the PostgreSQL way) or the final commit to succeed and all successful inserts to be persisted (which would be the Oracle way).

@davecramer
Copy link
Member

well I get what you are trying to say however here are the javadocs:

void commit()
throws SQLException
Makes all changes made since the previous commit/rollback permanent and releases any database locks currently held by this Connection object. This method should be used only when auto-commit mode has been disabled.
Throws:
SQLException - if a database access error occurs, this method is called while participating in a distributed transaction, if this method is called on a closed connection or this Connection object is in auto-commit mode

Note the function returns void so not much we can do there and as per the docs we can only throw an exception under two cases.

@bokken
Copy link
Member

bokken commented Feb 5, 2020 via email

@davecramer
Copy link
Member

also for more information:

begin;
BEGIN
test=# insert into test_commit_after_sql (column1) values (NULL);
ERROR:  null value in column "column1" violates not-null constraint
DETAIL:  Failing row contains (null).
test=# commit;
ROLLBACK

The commit executes with no issues.

@haumacher
Copy link

haumacher commented Feb 5, 2020

Note the function returns void so not much we can do there and as per the docs we can only throw an exception under two cases.

Don't understand that. If I do the following:

con = getConnection()
try {
   con.executeUpdate(failingStatement)
} catch {...}
con.executeUpdate(succeedingStatement)
con.commit()

The commit() fails with the message that the transaction has partially rolled back. The exactly same result I expect regardless of the order of the two updates (the failing and the succeeding one).

And no, auto-commit is not an option since a all-or-nothing semantics is required. Suppose the slightly modified code:

con = getConnection();
try {
  doTheFirstThing(con);
  doTheSecondThing(con);
  doTheThirdThing(con);
  con.commit();
} finally {
  con.rollback();
}

You would expect that either all changes from the three steps are persistet, or nothing, right? That's what the ACID properties of a database should guarantee. This must hold even if one of the steps fail. And that's what all other RDBMs I've seen so far do - including MySQL, Oracle, DB2, MSSQL, H2, Derby. With PostgreSQL however, it depends on the order of the failing and succeeding steps.

@davecramer
Copy link
Member

Well, pretty sure once the transaction has failed nothing will be committed. In fact I'll give you long odds on it, unless you set autosave.

In the second example you get all or nothing, again presuming autosave has not been set.

My guess is the first example the commit is not failing but the second executeUpdate.

@bokken
Copy link
Member

bokken commented Feb 5, 2020

@haumacher, setting autosave to always results in behavior basically same as ojdbc.

@davecramer, I think a plausible reading of the commit javadoc javadoc:

Makes all changes made since the previous commit/rollback permanent...
throws SQLException - if a database access error occurs

This could plausibly be interpreted that a "database access error" has occurred because not all changes since the previous commit/rollback are being made permanent.

To make sure I am understanding what is going on under the covers, in this scenario is the database actually rolling back to the previous savepoint (in this case the transaction start?).
In the absence of an actual save point (whether implicit or explicit), should the "transaction" basically go into a rollback only mode (i.e. any call to commit would fail)?

@davecramer
Copy link
Member

@bokken

begin;
BEGIN
test=# insert into test_commit_after_sql (column1) values (NULL);
ERROR:  null value in column "column1" violates not-null constraint
DETAIL:  Failing row contains (null).
test=# commit;
ROLLBACK

commit doesn't actually fail and there is no access error so AFAICT we are doing the right thing here.

@bokken
Copy link
Member

bokken commented Feb 5, 2020

@davecramer I am not saying that current behavior is definitively wrong, but it is surprising to those coming from other jdbc driver implementations like ojdbc.

It seems that postgres basically rollsback on error. In the case of an open transaction with no save points, this involves rolling back to the beginning of the transaction. This does not cleanly align to the jdbc api. A bit of this is thinking out loud, but I wonder if failing any call to commit after this type of failure in this scenario would be less surprising.

@davecramer
Copy link
Member

I'm torn between least surprising and looking less like postgres

@davecramer
Copy link
Member

found this https://crate.io/docs/sql-99/en/latest/chapters/36.html#commit-statement

What’s supposed to happen with COMMIT is:

SQL Cursors are closed, SQL statements are unprepared and locks are released, as noted earlier. Any savepoints established in the current transaction are also destroyed.
Any temporary Table whose definition includes ON COMMIT DELETE ROWS gets its rows deleted.
All deferred Constraints are checked. Any Constraint that is found to be violated will cause a “failed COMMIT”: your DBMS will implicitly and automatically do a ROLLBACK.

while it may be surprising it is the SQL spec.

@tl-sfo
Copy link
Author

tl-sfo commented Feb 6, 2020

Using the snippet above

con = getConnection();

try {
  ok1 = doTheFirstThing(con);
  ok2 = doTheSecondThing(con);
  ok3 = doTheThirdThing(con);

  con.commit();
  // Reaching this point
} finally {
  con.rollback();
}

doTheFirstThing(con) is for instance a valid insertion and succeeds (ok1 is true), doTheSecondThing(con) is a invalid insertion and throws a SQLException which is catched (ok2 is false) and doTheThirdThing(con) is a valid insertion but throws a subsequent SQLException ("current transaction is aborted, commands ignored until end of transaction block") caused by the doTheSecondThing which is also catched (ok3 is false).

If I execute now the commit then it succeeds and i expect due to https://crate.io/docs/sql-99/en/latest/chapters/36.html#commit-statement

If all goes well, any changes to your SQL-data – whether they are changes to Objects or to data values – become “persistent”, and thus visible to subsequent transactions.

the successfull changes, here doTheFirstThing, are applied.

@davecramer
Copy link
Member

So let me understand this.

You would expect doTheFirstThing to be succeed in this scenario ?

Thoughts on this statement from the same page.

"An SQL-transaction (transaction) is a sequence of executions of SQL-statements that is atomic with respect to recovery. That is to say: either the execution result is completely successful, or it has no effect on any SQL-schemas or SQL-data.”

– The SQL Standard

@tl-sfo
Copy link
Author

tl-sfo commented Feb 6, 2020

If con.commit() succeeds (no SQLException is thrown) then i expect

You would expect doTheFirstThing to be succeed in this scenario ?

Yes.

Another plausible option is that the con.commit() fails (SQLException is thrown) and nothing is applied.

But none of the above mentioned options occurs. Instead, PostgreSQL uses a mix of the two.

To make it clear, dependent on the result of con.commit(), i expect either successfull changes are made or nothing is applied and a SQLException is thrown.

Based on your argument, one might rather wonder why the con.commit() succeeds even though changes failed.

@davecramer
Copy link
Member

Pretty sure nothing is committed. The example I provided above does not commit anything.
commit does a rollback at that point, as per the sql example I showed above.
If you have code that does a half commit, can you please post the entire example?

@tl-sfo
Copy link
Author

tl-sfo commented Feb 6, 2020

Pretty sure nothing is committed. The example I provided above does not commit anything.

Exactly, but the commit con.commit() does not throw a SQLException. It appears that the commit have been successfull and so according to

If all goes well, any changes to your SQL-data – whether they are changes to Objects or to data values – become “persistent”, and thus visible to subsequent transactions.

i expect that changes to the database are made which is wrong.

@davecramer
Copy link
Member

ah, ok. The problem is that the JDBC API specifies that an exception is only thrown in a few cases as I eluded to above:

SQLException - if a database access error occurs, this method is called while participating in a distributed transaction, if this method is called on a closed connection or this Connection object is in auto-commit mode

None of the above has occurred.

@bokken
Copy link
Member

bokken commented Feb 6, 2020

@tl-sfo, set the autosave property to always. In the scenario you described, both doTheFirstThing(con) and doTheThirdThing(con) would succeed and then become visible after commit (which would also succeed). Note that this is not the same as auto-commit. If you chose to rollback the transaction rather than commit then none of the changes would be visible.

@davecramer, as I stated earlier, I think a liberal reading of the spec would allow for a SQLException to be thrown in the case that commit is called on a transaction but a previously successful change (i.e. doTheFirstThing(con)) is not actually made visible. Basically once a failure occurs with no savepoint, the connection enters a "rollback only" mode. Any use of the connection other than rollback would result in an exception.

@davecramer
Copy link
Member

@bokken I'd be OK if we had a property which changed this behaviour. Given that the driver has had this behaviour for 20 odd years, changing it now would likely break a few apps :)

@tl-sfo
Copy link
Author

tl-sfo commented Feb 6, 2020

@bokken Does setting the autosave option have any perfomance impacts?

@davecramer
Copy link
Member

Yes, a savepoint is created for each transaction.

@bokken
Copy link
Member

bokken commented Feb 6, 2020

@tl-sfo, my observation is that while there is work associated with the save points, the observed time impact is negligible.
I have a solution which supports both oracle and postgresql. To achieve similar behavior we run with autosave set to always with pgjdbc. The query performance for similar workloads on similar table sizes is still slightly faster in postgresql than oracle. I do not have performance comparisons with the default autosave (disabled), because we would have had to make code changes (manually set savepoints and rollback when needed) for our solution to still function correctly in that setting.

@haumacher
Copy link

haumacher commented Feb 7, 2020

Let me explain the background of our request in a little bit more detail. We are not facing a concrete single problem, where we could easily add an explicit savepoint to fix it as described above.

Upon request from a large customer, we are currently adding postgresql support to the database abstraction layout of our app framework Top-Logic. Our hope was to support postgresql "without any restriction" - not imposing any constraints on the connection settings. Unfortunately, this seems not to be possible due to the issue described here. Let me show:

The central transaction handler of the the framework in principal looks like this:

con = getConnection();
try {
   insertInformationOfTheNewlyCreatedRevision(con);
   makeChangesToObjectsPersistent(con);
   addApplicationSpecials(con);
   con.commit();
   publishLocalChangesToGlobalCaches();
} finally {
   con.rollback();
}

The assumption behind the code above is that if commit() succeeds, all changes made so far are made persistent. With postgresql, this seems not to be true in non-autosafe mode. If addApplicationSpecials() calls custom code that has a bug (not handling safepoints properly) this central transaction handler has no chance to detect this situation. If it publishes local changes to global caches after a successful commit (which has not made changes from the first two steps persistent) then things go terribly wrong. Consistency between transient caches and persistent state is no longer guaranteed causing the application to crash in the mid-term.

If we announce "postgresql support" for our framework, we not really like to add a side-note that one has to configure the connection to autosafe mode "for safety reasons". If the user fails to do that configuration, things might work perfectly well for the next two years, until this buggy user-specific exception handler is triggered causing the app to crash and waiting for a reboot.

@bokken's suggestion "Once a failure occurs with no savepoint, the connection enters a "rollback only" mode. Any use of the connection other than rollback would result in an exception." would be the ideal solution.

Of cause, changing anything has the potential to break something - but in this situation, I don't think that the impact would be broadly noticable. The suggested change only affects the response to a situation where the application has encountered an error anyway (invalid exception handling without safepoints). The change only affects the response to that error (reporting it instead of simply ignoring it).

@davecramer
Copy link
Member

So I'm looking at this code and trying to figure out why you always rollback the connection. Luckily no other driver throws an exception for that. That said are you catching the SQLException thrown by the bad statement in makeChangesToObjectsPersistent? If not then this should just work as is.

@haumacher
Copy link

haumacher commented Feb 7, 2020

As said, we are developing a framework. Not I'm catching this exception, but some application specific code possibly could catch the SQLException without propper safepoint handling. The problem is that this minor bug would completely ruin the consistency of the framwork's caches and cause the app to crash waiting for a reboot.

What should be the reason for throwing an exception from rollback() with a transaction containing no changes (started after the successful commit)? If something goes wrong, rollback() is neccessary, if everything works fine, rollback() is a no-op.

@davecramer
Copy link
Member

OK, So I get your conundrum however as a driver we write to the spec, so please understand mine.

Here's the javadoc for when commit throws an exception:

SQLException - if a database access error occurs, this method is called while participating in a distributed transaction, this method is called on a closed connection or this Connection object is in auto-commit mode

Here's the javadoc for when rollback throws an exception

SQLException - if a database access error occurs, this method is called while participating in a distributed transaction, this method is called on a closed connection or this Connection object is in auto-commit mode

Note: they are identical. Based on your logic rollback() should also throw an exception in the case that the transaction failed.

Have a read https://debezium.io/blog/2018/12/05/automating-cache-invalidation-with-change-data-capture/ for a different way to handle caches.

Now it appears that others have interpreted the spec differently to suit themselves. I would still be willing to accept a PR that threw an exception in the case of the commit failing but this can't be the default behaviour. It would have to be a property provided on the connection URL or in the properties of the connection.

@vlsi
Copy link
Member

vlsi commented Mar 6, 2020

I've implemented a workaround in #1729, reviews are welcome.

vlsi added a commit to vlsi/pgjdbc that referenced this issue Mar 6, 2020
@vlsi vlsi closed this as completed in #1729 Mar 6, 2020
42.3.0 Release automation moved this from To do to Done Mar 6, 2020
@haumacher
Copy link

Cool, thanks for the fix!

What do you think is the best way for an application to determine the PostgreSQL driver version (to check that the fix is applied and no auto-safe handling must be enforced)? I think of a configuration, where the database driver is configured in the web server and made available to the application through JNDI.

@davecramer
Copy link
Member

The API has getMajorVersion and getMinorVersion. Seems like that should be enough

vlsi added a commit that referenced this issue Mar 7, 2020
…mit command (#1729)

The feature is enabled by default.
It can be disabled by setting raiseExceptionOnSilentRollback=false connection property.

See discussion in pgsql-hackers: https://www.postgresql.org/message-id/b9fb50dc-0f6e-15fb-6555-8ddb86f4aa71%40postgresfriends.org

fixes #1697
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging a pull request may close this issue.

6 participants