
Very confused by SqlBulkCopy documentation #2350

Open
Korporal opened this issue Feb 8, 2024 · 6 comments
Labels: ⏳ Waiting for Customer (we're waiting for your response)

Comments

Korporal commented Feb 8, 2024

This is .NET Framework, but I hope someone can help.

I started seeing timeouts in some code we have here. It has run well, but we're starting to see issues as volumes increase.

Let me show you the code:

```csharp
using (var bulk = new SqlBulkCopy(connection, SqlBulkCopyOptions.FireTriggers, null))
{
    bulk.BulkCopyTimeout = 60 * 20; // 20 minutes
    bulk.DestinationTableName = DataType.Name;
    bulk.BatchSize = 5000;
    bulk.WriteToServer(table);
}
```

Recently the size of some of these bulk inserts has reached 15 million or more rows, and sometimes this throws a timeout exception.

Now, whenever this does fail, we never see rows in the target table, ever. Yet the documentation says that, when coded as shown, the inserts take place without any kind of transaction, so I'd expect data from the first batches, written prior to the timeout exception, to be present.

The behavior looks as if the entire operation were inside a transaction, yet the code itself contains no explicit transactions.

The docs are bewildering; both of the following statements appear in the documentation:

> By default, a bulk copy operation is performed as an isolated operation. The bulk copy operation occurs in a non-transacted way, with no opportunity for rolling it back.

and

> By default, a bulk copy operation is its own transaction.

So I have no idea whether this code performs the huge inserts inside a transaction, but it seems to behave as if it does.

Questions

  1. Does the timeout apply to the writing of each distinct batch or to the entire WriteToServer call?
  2. What effect would setting UseInternalTransaction have on the above code?
  3. Why the apparent contradiction in the documentation?
  4. When the docs say "a bulk copy operation is its own transaction" is an "operation" a batch or the entire write?

David-Engel (Contributor) commented:

  1. The entire WriteToServer call ("the operation").

  2. From the API docs:

    > UseInternalTransaction - When specified, each batch of the bulk-copy operation will occur within a transaction. If you indicate this option and also provide a SqlTransaction object to the constructor, an ArgumentException occurs.

    In your scenario, you'd get data committed to the database through the last completed batch prior to the timeout.

  3. There is no contradiction. "An isolated operation" is the same as "one transaction", with the limitation that there is no controlling the commit or rollback of that "one transaction". It either succeeds and commits everything, or it fails and rolls back everything. Without some level of transaction (internal to the database), there would be no way to guarantee data integrity on the server.

  4. An operation is the entire write.

What's your goal? To write all 15 million rows at once? Increase your timeout until it succeeds. Trying not to increase the timeout? Either use UseInternalTransaction and figure out how to pick up where it left off, or split your data into smaller chunks for each WriteToServer operation.
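
For example, here's a minimal untested sketch of your snippet with the per-batch option turned on (reusing your connection/DataType/table names):

```csharp
// Untested sketch: same settings as the original snippet, but with
// UseInternalTransaction so each 5,000-row batch commits on its own.
// A failure rolls back only the batch in flight; earlier batches stay committed.
var options = SqlBulkCopyOptions.FireTriggers | SqlBulkCopyOptions.UseInternalTransaction;

using (var bulk = new SqlBulkCopy(connection, options, null))
{
    bulk.BulkCopyTimeout = 60 * 20; // 20 minutes
    bulk.DestinationTableName = DataType.Name;
    bulk.BatchSize = 5000; // here the batch size also defines the commit unit
    bulk.WriteToServer(table);
}
```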

Korporal (Author) commented Feb 9, 2024

Thank you.

The goal is to tighten up that code so that ideally we write all rows or none (i.e., transaction-protect the update), set the batch size to a sensible value, and avoid the timeout.

I don't really want to see a partial write at all: either all rows get written or none at all. So I guess I need a separate transaction that I can pass into the bulk copy, something like the sketch below.
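
A minimal sketch of what I have in mind (untested, reusing the names from my snippet above):

```csharp
// Untested sketch: one external transaction around the whole bulk copy,
// so either every row is committed or none are. Note that an external
// transaction cannot be combined with UseInternalTransaction.
using (var transaction = connection.BeginTransaction())
{
    try
    {
        using (var bulk = new SqlBulkCopy(connection, SqlBulkCopyOptions.FireTriggers, transaction))
        {
            bulk.BulkCopyTimeout = 60 * 20; // 20 minutes
            bulk.DestinationTableName = DataType.Name;
            bulk.BatchSize = 5000;
            bulk.WriteToServer(table);
        }
        transaction.Commit();   // all rows become visible at once
    }
    catch
    {
        transaction.Rollback(); // timeout or any other failure: nothing persists
        throw;
    }
}
```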

That code has been fine for years, but two changes are taking place:

  1. The volumes have increased over the years, from less than a million rows five years ago to 20 million rows today.
  2. The database is now in Azure, whereas all prior testing was against our own hosted servers.

I am concerned that writing 20 million inserts as a single transaction might place a large burden on the server, whereas multiple smaller "batched" transactions would not. Still, ideally we want a write-all-or-none design.

Several large writes failed recently under test. These were multiple millions of rows, and the operation timed out. Now the code as it stands (and you seem to agree) implies that each batch would be written/committed one at a time, so experiencing a 20-minute timeout implies that several of the 5,000-row writes must have worked. Yet the table was empty, and I've never seen a timeout occur with some rows written.

The system seems to behave as if it were an all-or-nothing operation, but that isn't consistent with the documentation. This is very puzzling.

David-Engel (Contributor) commented:

> Now the code as it stands (and you seem to agree) implies that each batch would be written/committed one at a time,

I don't see UseInternalTransaction set in your code, so I would NOT expect each batch to be committed separately.

Korporal (Author) commented:

Yes, I do not (knowingly) use any transactions here, so I'd expect writes performed in distinct batches never to be rolled back. If twenty batch writes succeed but the 21st fails (an exception, etc.), then I'd expect to see data in the table, but we never do; we never have.

David-Engel (Contributor) commented:

> Yes, I do not (knowingly) use any transactions here, so I'd expect writes performed in distinct batches never to be rolled back. If twenty batch writes succeed but the 21st fails (an exception, etc.), then I'd expect to see data in the table, but we never do; we never have.

I'm confused by your last statement. You just said you aren't using UseInternalTransaction but you expect each batch to be committed. That's the opposite of how the docs describe the functionality.

A batch != a transaction. By default, you get one transaction across all batches and nothing is persisted if the operation fails. If you set UseInternalTransaction = true, you get one transaction per batch, and everything up to the last completed batch is persisted upon failure.
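
In code terms, an untested sketch of the default case (names from your original snippet); my earlier sketch shows the per-batch variant:

```csharp
// Default (no UseInternalTransaction, no external transaction):
// one transaction spans all batches. If any batch fails, everything
// rolls back and the table is left untouched, which matches what you saw.
using (var bulk = new SqlBulkCopy(connection, SqlBulkCopyOptions.Default, null))
{
    bulk.DestinationTableName = DataType.Name;
    bulk.BatchSize = 5000; // affects round-trips to the server, not commit boundaries
    bulk.WriteToServer(table);
}
```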


Dean-NC commented Mar 21, 2024

@David-Engel I also think the documentation is confusing, specifically the page at https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql/transaction-and-bulk-copy-operations

I fully get that, by default, a batch != a transaction. I think you're saying that by default (not using UseInternalTransaction and not using my own transaction), if an error happens nothing is persisted. But the doc page above, under the section "Performing a Non-transacted Bulk Copy Operation", says:

> When the operation encounters the invalid row, an exception is thrown. In this first example, the bulk copy operation is non-transacted. All batches copied up to the point of the error are committed; the batch containing the duplicate key is rolled back...

In the next section, "Performing a Dedicated Bulk Copy Operation in a Transaction", when it talks about using UseInternalTransaction, it says: "All batches copied up to the point of the error are committed; the batch containing the duplicate key is rolled back". That sounds the same as the first section, where the internal transaction is not used. I've seen numerous people on Stack Overflow say the same.

My guess is that the statement under the first section is incorrect about previous batches being committed; otherwise there would be no need for UseInternalTransaction, as far as I can see.

I think the third section of that doc (Using Existing Transactions) is fine and makes good sense.
