
Invalid Operation, Error connection is closed. Azure Function with Azure SQL serverless #19842

Closed
NitinNotebook opened this issue Feb 8, 2020 · 16 comments

Comments

@NitinNotebook

NitinNotebook commented Feb 8, 2020

Client: Azure Functions v3 on an App Service Plan (Linux, Premium v2)
Codebase: .NET Core 3.1 / Entity Framework Core 3.1.0
Database: Azure SQL (serverless, min 2 cores / max 6 cores)

Intermittently, but with high frequency, we are getting the error: Invalid operation. The connection is closed. Complete error details are below.

System.InvalidOperationException: Invalid operation. The connection is closed.
   at Microsoft.Data.SqlClient.TdsParserStateObject.WritePacket(Byte flushMode, Boolean canAccumulate)
   at Microsoft.Data.SqlClient.TdsParser.WriteInt(Int32 v, TdsParserStateObject stateObj)
   at Microsoft.Data.SqlClient.Server.ValueUtilsSmi.SetCompatibleValue(SmiEventSink_Default sink, ITypedSettersV3 setters, Int32 ordinal, SmiMetaData metaData, Object value, ExtendedClrTypeCode typeCode, Int32 offset)
   at Microsoft.Data.SqlClient.Server.ValueUtilsSmi.SetDataTable_Unchecked(SmiEventSink_Default sink, SmiTypedGetterSetter setters, Int32 ordinal, SmiMetaData metaData, DataTable value)
   at Microsoft.Data.SqlClient.TdsParser.WriteSmiParameter(SqlParameter param, Int32 paramIndex, Boolean sendDefault, TdsParserStateObject stateObj)
   at Microsoft.Data.SqlClient.TdsParser.TdsExecuteRPC(SqlCommand cmd, _SqlRPC[] rpcArray, Int32 timeout, Boolean inSchema, SqlNotificationRequest notificationRequest, TdsParserStateObject stateObj, Boolean isCommandProc, Boolean sync, TaskCompletionSource`1 completion, Int32 startRpc, Int32 startParam)
   at Microsoft.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean isAsync, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)
   at Microsoft.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry, String method)
   at Microsoft.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry, String methodName)
   at Microsoft.Data.SqlClient.SqlCommand.BeginExecuteNonQueryInternal(CommandBehavior behavior, AsyncCallback callback, Object stateObject, Int32 timeout, Boolean inRetry, Boolean asyncWrite)
   at Microsoft.Data.SqlClient.SqlCommand.BeginExecuteNonQueryAsync(AsyncCallback callback, Object stateObject)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncImpl(Func`3 beginMethod, Func`2 endFunction, Action`1 endAction, Object state, TaskCreationOptions creationOptions)
   at Microsoft.Data.SqlClient.SqlCommand.ExecuteNonQueryAsync(CancellationToken cancellationToken)

EF Core version: 3.1.0
Database provider: (Microsoft.EntityFrameworkCore.SqlServer) UseSqlServer
Target framework: (.NET Core 3.1)
Operating system: Azure App Service Plan Premium v2 on Linux
IDE: (e.g. Visual Studio 2019 16.3)

@ErikEJ
Contributor

ErikEJ commented Feb 8, 2020

Are you using the retry execution strategy?

@NitinNotebook
Author

NitinNotebook commented Feb 8, 2020

Is the below enough, or are we missing something?

We are using the following retry strategy in Startup.cs:

builder.Services.AddDbContext<QuotaDbContext>(
    options =>
    {
        options.UseSqlServer(
            config["QuotaDB"],
            sqlServerOptionsAction: sqlOptions =>
            {
                sqlOptions.EnableRetryOnFailure(
                    maxRetryCount: 3,
                    maxRetryDelay: TimeSpan.FromSeconds(30),
                    errorNumbersToAdd: null);
            });

        options.UseLoggerFactory(new LoggerFactory(
            new[] { new DebugLoggerProvider() },
            new LoggerFilterOptions { MinLevel = LogLevel.Information }));
    },
    ServiceLifetime.Transient);

Then below code in the repository class:

using var quotaDbContext = this.serviceProvider.GetService<QuotaDbContext>();
await quotaDbContext.Database.ExecuteSqlRawAsync("StoredProcName").ConfigureAwait(false);

@NitinNotebook
Author

NitinNotebook commented Feb 9, 2020

To provide more context:
We have multiple micro-services interacting with different SQL databases. The problem happens in only one place: we call a stored procedure and pass a DataTable as a parameter, which is used for a bulk insert (a sketch of that call shape is shown below).

The lower environment runs on a Windows App Service Plan, and this problem has not been reported there so far.

This stored procedure also has an output parameter.
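
For reference, a minimal sketch of that call shape; the table type, procedure, and parameter names (dbo.QuotaRowType, dbo.BulkInsertQuotas, @Rows, @Inserted) are illustrative assumptions, not taken from the actual code, and quotaDbContext is the context from the earlier snippet:

using System.Data;
using Microsoft.Data.SqlClient;
using Microsoft.EntityFrameworkCore;

// Inside an async method; build the DataTable sent as a table-valued parameter (TVP).
var rows = new DataTable();
rows.Columns.Add("Id", typeof(int));
rows.Columns.Add("Amount", typeof(decimal));
rows.Rows.Add(1, 100m);

// TVP parameter: SqlDbType.Structured plus the server-side table type name.
var tvp = new SqlParameter("@Rows", SqlDbType.Structured)
{
    TypeName = "dbo.QuotaRowType",   // hypothetical user-defined table type
    Value = rows
};

// Output parameter returned by the stored procedure.
var inserted = new SqlParameter("@Inserted", SqlDbType.Int)
{
    Direction = ParameterDirection.Output
};

await quotaDbContext.Database.ExecuteSqlRawAsync(
    "EXEC dbo.BulkInsertQuotas @Rows, @Inserted OUTPUT", tvp, inserted);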

@NitinNotebook
Author

After changing ExecuteSqlRawAsync to the synchronous version, the issue is no longer occurring:
await quotaDbContext.Database.ExecuteSqlRaw("StoredProcName"); //No async

@ErikEJ
Contributor

ErikEJ commented Feb 10, 2020

Why do you still have "await" ??

@NitinNotebook
Author

Why do you still have "await" ??

That's a copy-paste mistake from when I posted the comment :)

@ajcvickers
Member

@cheenamalhotra Does this look like a duplicate of one of the open TdsParser issues?

@cheenamalhotra
Member

Intermittently, but with high frequency, we are getting the error

This seems to be a problem with the environment; in cluster environments, limits are sometimes applied to the number of TCP connections that can be created at a time.

There has been a very active issue with Kubernetes pods, where the client (the k8s instance) drops connections to the server host machine by sending RST (reset) packets, and I think this might be conceptually related.

Do you know if there is a backend service running on your client that drops connections without the driver's knowledge, so that when the driver tries to use the connection it can no longer find an active one?

This is observable when a long query/write operation is in progress and the connection has been dropped by a backend service.

SqlClient handles most other connection-resiliency drops, but not within the same execution cycle, e.g. while waiting for the response to a long query from SQL Server.
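
As an aside, that idle-connection resiliency is controlled by connection-string keywords. A minimal sketch, assuming the connection string comes from config["QuotaDB"] as in the earlier Startup.cs snippet:

using Microsoft.Data.SqlClient;

// ConnectRetryCount/ConnectRetryInterval govern reconnecting a broken *idle*
// connection between executions; they do not recover a connection that is
// dropped in the middle of an execution, which is the limitation noted above.
var csb = new SqlConnectionStringBuilder(config["QuotaDB"])
{
    ConnectRetryCount = 3,      // number of reconnect attempts
    ConnectRetryInterval = 10   // seconds between attempts
};

// options is the DbContextOptionsBuilder from the Startup.cs snippet above.
options.UseSqlServer(csb.ConnectionString, sql => sql.EnableRetryOnFailure());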

@NitinNotebook
Author

NitinNotebook commented Feb 11, 2020

Do you know if there is a backend service running on your client that drops connections without the driver's knowledge, so that when the driver tries to use the connection it can no longer find an active one?

[Nitin] No, there is no backend service that drops connections. Three components are involved:
UI (web app) -> Azure Function v3 (HTTP) -> serverless SQL DB. Also, per telemetry, the total number of active connections on the DB was under 30.

This is observable when a long query/write operation is in progress and the connection has been dropped by a backend service.

[Nitin] It is really not very long-running; it is a user transaction that takes at most 8-10 seconds for a large volume of data. Large volume means 3K-5K records.

@cheenamalhotra
Member

cheenamalhotra commented Feb 11, 2020

To figure out issues like these, you need to collect TCP network traces and look at what communication is happening between client and server, what happens to the connections, and why they close. Also compare those traces against traces from a regular Windows/Linux machine to capture any packet-level differences.

A driver issue is possible if the same problem is also reproducible outside any special environment, in a normal setup on a Windows/Linux OS. Have you tried running the same app on your own desktop, and does it fail the same way?
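
A minimal standalone repro along those lines might look like the sketch below (the connection string is a placeholder and dbo.BulkInsertQuotas is a hypothetical procedure name). If it also fails on a plain desktop, a driver issue becomes more likely:

using System;
using System.Data;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

class Repro
{
    static async Task Main()
    {
        // Same connection string as the failing environment.
        using var conn = new SqlConnection("<connection string>");
        await conn.OpenAsync();

        using var cmd = new SqlCommand("dbo.BulkInsertQuotas", conn)
        {
            CommandType = CommandType.StoredProcedure
        };
        // Add the same DataTable TVP and output parameter here as in production.
        await cmd.ExecuteNonQueryAsync();

        Console.WriteLine("Async execution completed; the connection was not dropped.");
    }
}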

@Kiechlus

Kiechlus commented Feb 11, 2020

We have the same issue on AKS with Kubernetes 1.13.10, with Linux containers running EF Core 3.1 connected to Azure SQL Server. A somewhat long-running query works locally on Windows against the Azure SQL database, but the same call fails on the cluster with the above error. As soon as I change to a synchronous call, it works in both cases.

@ajcvickers
Member

EF Team Triage: Closing this issue as the requested additional details have not been provided and we have been unable to reproduce it.

BTW this is a canned response and may have info or details that do not directly apply to this particular issue. While we'd like to spend the time to uniquely address every incoming issue, we get a lot of traffic on the EF projects and that is not practical. To ensure we maximize the time we have to work on fixing bugs, implementing new features, etc., we use canned responses for common triage decisions.

@simader

simader commented Sep 9, 2020

@Kiechlus: Did you find a solution for this? I think we are just running into the same problem.

@Kiechlus

@simader We have not tried switching back to async calls so far.

@mehdipanjwani-ains

This could happen if you do not implement async operations all the way up the call chain.
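
To illustrate that point, a minimal sketch with hypothetical names (QuotaRepository, dbo.BulkInsertQuotas): the first method awaits end to end, while the second blocks on the async call partway up the chain.

using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public class QuotaRepository
{
    private readonly QuotaDbContext db;
    public QuotaRepository(QuotaDbContext db) => this.db = db;

    // Async all the way: every caller up to the Function entry point awaits this.
    public Task SaveAsync() =>
        db.Database.ExecuteSqlRawAsync("EXEC dbo.BulkInsertQuotas");

    // Anti-pattern: blocking on the async call from a synchronous method.
    public void SaveBlocking() =>
        db.Database.ExecuteSqlRawAsync("EXEC dbo.BulkInsertQuotas").GetAwaiter().GetResult();
}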

@cheenamalhotra
Member

Hi.

We later reproduced and fixed this issue in Microsoft.Data.SqlClient v2.1.0 (PR dotnet/SqlClient#796); please give it a try.
