Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQLServerBulkCSVFileRecord doesn't revert IDENTITY_INSERT to OFF after a failure #2380

Closed
ingvard opened this issue Apr 4, 2024 · 5 comments
Assignees
Labels
As designed The issue in question is as designed, and will not be addressed
Projects

Comments

@ingvard
Copy link

ingvard commented Apr 4, 2024

SQLServerBulkCopy doesn't revert IDENTITY_INSERT to OFF after a failure

Driver version

mssql-jdbc-12.6.1.jre11

SQL Server version

Microsoft SQL Server 2017 (RTM-CU31-GDR) (KB5021126) - 14.0.3460.9 (X64) Developer Edition (64-bit) on Linux (Ubuntu 18.04.6 LTS)

Client Operating System

Linux (Ubuntu 18.04.6 LTS)

JAVA/JVM version

Kotlin 1.9.22 (Open JDK 17)

Table schema

CREATE TABLE "TEST_IDENTITY_TABLE" (
    "ID" "int"  IDENTITY (1, 1) NOT NULL,
    "TEXT" DATETIME
)

CREATE TABLE "TEST_IDENTITY_TABLE2" (
    "ID" "int"  IDENTITY (1, 1) NOT NULL
)

Problem description

After using SQLServerBulkCopy and it finishes with an exception, a connection will have IDENTITY_INSERT ON. The example below shows and repeats the problem step by step:

  1. We had to make two tables. One for data and the other to try to find out if the first one has IDENTITY_INSERT ON state.
CREATE TABLE "TEST_IDENTITY_TABLE" (
    "ID" "int"  IDENTITY (1, 1) NOT NULL,
    "TEXT" DATETIME
)

CREATE TABLE "TEST_IDENTITY_TABLE2" (
    "ID" "int"  IDENTITY (1, 1) NOT NULL
)
  1. Open the connection. In this example, we're using one connection for all actions. However, in real production code, the connection is usually returned to the connection pool with a faulty state.
val conn: Connection = DriverManager.getConnection(
        "jdbc:sqlserver://localhost:1433;encrypt=false;databaseName=test;username=sa;password=A_Str0ng_Required_Password"
    )
  1. Make a CSV file for inserting data at once, and then insert it into the database.
val csvFile = Files.createTempFile("bulk_csv", ".csv")
csvFile.toFile().writeText("1, 2022/7/28 12:21:00.0000000")


val bulkRecord = SQLServerBulkCSVFileRecord(
    csvFile.toAbsolutePath().toString(),
    StandardCharsets.UTF_8.name(),
     ",",
     false
).apply {
    addColumnMetadata(1, "ID", 4, 1, 1)
     addColumnMetadata(2, "TEXT", 93 /* timestamp */, 50, 3)
}

try {
   SQLServerBulkCopy(conn).use { bulkCopy ->
      bulkCopy.bulkCopyOptions = SQLServerBulkCopyOptions().apply {
         isKeepIdentity = true // Insert identity keys as is
       }
       bulkCopy.destinationTableName = "TEST_IDENTITY_TABLE"
       bulkCopy.writeToServer(bulkRecord)
    }
} catch (e: Exception) {
   e.printStackTrace()

    // Ignore exception ...
}

I have examined various situations where the insertion process breaks. From what I observe, it seems necessary to insert data that the server will accept, and then throw an exception during the response parsing stage.

com.microsoft.sqlserver.jdbc.SQLServerException: Conversion failed when converting date and/or time from character string.
	at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:261)
	at com.microsoft.sqlserver.jdbc.TDSTokenHandler.onEOF(tdsparser.java:316)
	at com.microsoft.sqlserver.jdbc.TDSParser.parse(tdsparser.java:137)
	at com.microsoft.sqlserver.jdbc.TDSParser.parse(tdsparser.java:42)
	at com.microsoft.sqlserver.jdbc.TDSParser.parse(tdsparser.java:31)
	at com.microsoft.sqlserver.jdbc.SQLServerBulkCopy.doInsertBulk(SQLServerBulkCopy.java:1605)
	at com.microsoft.sqlserver.jdbc.SQLServerBulkCopy$1InsertBulk.doExecute(SQLServerBulkCopy.java:673)
	at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7739)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:4384)
	at com.microsoft.sqlserver.jdbc.SQLServerBulkCopy.sendBulkLoadBCP(SQLServerBulkCopy.java:707)
	at com.microsoft.sqlserver.jdbc.SQLServerBulkCopy.writeToServer(SQLServerBulkCopy.java:1670)
	at com.microsoft.sqlserver.jdbc.SQLServerBulkCopy.writeToServer(SQLServerBulkCopy.java:630)
	at db.SQLServerBulkCopyReproducerKt.main(SQLServerBulkCopyReproducer.kt:53)
	at db.SQLServerBulkCopyReproducerKt.main(SQLServerBulkCopyReproducer.kt)
  1. Attempt to reuse the connection, but with a different identity.
val statement = conn.createStatement()

statement.execute("SET IDENTITY_INSERT dbo.TEST_IDENTITY_TABLE2 ON;")

The code above leads to the following exception:

Exception in thread "main" com.microsoft.sqlserver.jdbc.SQLServerException: IDENTITY_INSERT is already ON for table 'test.dbo.TEST_IDENTITY_TABLE'. Cannot perform SET operation for table 'dbo.TEST_IDENTITY_TABLE2'.
	at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:261)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1752)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:946)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:840)
	at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7739)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:4384)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:293)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:263)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.execute(SQLServerStatement.java:813)
	at db.SQLServerBulkCopyReproducerKt.main(SQLServerBulkCopyReproducer.kt:64)
	at db.SQLServerBulkCopyReproducerKt.main(SQLServerBulkCopyReproducer.kt)

because SQLServerBulkCSVFileRecord doesn't revert IDENTITY_INSERT ON after the failure.

Expected behavior

The expected behavior is that the connection returns to its original state as it was before starting. For instance, in a successful case, it reverts back to its initial state.

Actual behavior

The actual behavior is that the connection state for each conn.createStatement() or another SQLServerBulkCSVFileRecord will remain as IDENTITY_INSERT dbo.TEST_IDENTITY_TABLE ON.

@Jeffery-Wasty Jeffery-Wasty added this to Under Investigation in MSSQL JDBC via automation Apr 4, 2024
@Jeffery-Wasty
Copy link
Member

Hi @ingvard,

I'm not sure this is a mistake within the driver.

It's in the nature of IDENTITY_INSERT to persist for that connection, regardless of whether the connection is being re-used or not. If we're 'resetting' the value of IDENTITY_INSERT after failure, we're changing the state of the connection as its being returned to the pool, and that's not something the driver should do.

Reading up on best practices for IDENTITY_INSERT, the recommendation is to be sure to set IDENTITY_INSERT to OFF for a table, if there is a plan to set it to ON for another. This can easily be added to your catch clause for the case above.

Let us know if you have more to add regarding this topic. If not, we will move forward with closing the issue.

@Jeffery-Wasty Jeffery-Wasty moved this from Under Investigation to Waiting for Customer in MSSQL JDBC Apr 4, 2024
@ingvard
Copy link
Author

ingvard commented Apr 5, 2024

@Jeffery-Wasty, thank you for your response.

Your argument seems logical, but it's worth noting that SQLServerBulkCopy for SQLServerBulkCSVFileRecord already restores the state to its original form upon successful execution. This results in inconsistent behavior between two scenarios:

  1. Returning to the original state after a successful execution.
  2. Remaining in a broken state after a failed execution.

Reading up on best practices for IDENTITY_INSERT, the recommendation is to be sure to set IDENTITY_INSERT to OFF for a table, if there is a plan to set it to ON for another. This can easily be added to your catch clause for the case above.

However, it's important to acknowledge that checking whether the current connection has IDENTITY_INSERT set to ON for a specific table isn't a trivial task, as per information from StackOverflow.

It looks like the leaked abstraction, as the user of the SQLServerBulkCopy API needs to remember that certain errors during its execution can lead to the IDENTITY_INSERT state being left hanging, and they must manually reset it. This requirement adds complexity and burden to the user, which could be avoided with a more robust abstraction.

What do you think about it?

@Jeffery-Wasty
Copy link
Member

Hi @ingvard,

I agree that the inconsistency you have mentioned is not something we desire, and you make a good point about the the burden of resetting IDENTITY_INSERT for the user. I'll spend more time looking into this/check in with the team, and see what more we can do from our end.

@Jeffery-Wasty Jeffery-Wasty moved this from Waiting for Customer to Under Investigation in MSSQL JDBC Apr 8, 2024
@Jeffery-Wasty
Copy link
Member

Jeffery-Wasty commented Apr 23, 2024

Hi @ingvard,

So I spent additional time looking into this. After trying this other drivers and with just SQL through SSMS, I was able to replicate the same behavior as seen in JDBC. That is JDBC was consistent with other sources. Furthermore, the driver does no modification to IDENTITY_INSERT, regardless of success or failure. We just pass along the value and use it to determine whether additional actions are taken against the identity column.

I did try to add in behavior to set IDENTITY_INSERT for the most recent used table to OFF, in case of a failure, but was not able to find a solution.

Addressing your concerns from you last comment:

it's worth noting that SQLServerBulkCopy for SQLServerBulkCSVFileRecord already restores the state to its original form upon successful execution

As noted above. This 'inconsistent' behavior is consistent with other sources, and we can not determine how to change this in the desired way.

it's important to acknowledge that checking whether the current connection has IDENTITY_INSERT set to ON for a specific table isn't a trivial task

No, but it would be trivial to remember to set IDENTITY_INSERT to OFF after setting it to ON. In fact, I would urge that this behavior be extended to all scenarios. If you set to ON at the start of a program or query, set to OFF after transactions with that table have completed.

Given the above, I'm inclined to chalk this up as 'intended driver behavior' and mark the issue closed. Please let me know if you have any additional questions, concerns, or information.

@Jeffery-Wasty Jeffery-Wasty moved this from Under Investigation to Waiting for Customer in MSSQL JDBC Apr 23, 2024
@Jeffery-Wasty Jeffery-Wasty added the Waiting for Response Waiting for a reply from the original poster, or affiliated party label Apr 23, 2024
@Jeffery-Wasty Jeffery-Wasty self-assigned this Apr 23, 2024
@Jeffery-Wasty
Copy link
Member

Closing for above reasons.

MSSQL JDBC automation moved this from Waiting for Customer to Closed Issues Apr 29, 2024
@Jeffery-Wasty Jeffery-Wasty added As designed The issue in question is as designed, and will not be addressed and removed Waiting for Response Waiting for a reply from the original poster, or affiliated party labels Apr 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
As designed The issue in question is as designed, and will not be addressed
Projects
MSSQL JDBC
  
Closed Issues
Development

No branches or pull requests

2 participants