Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

After migration from .NET7 to .NET8 SqlException 0x80131904 started to appear randomly when connecting to Azure SQL databases #2400

Open
JiriZidek opened this issue Mar 12, 2024 · 16 comments

Comments

@JiriZidek
Copy link

JiriZidek commented Mar 12, 2024

After migration from .NET7 to .NET8 SqlException 0x80131904 started to appear randomly when connecting to Azure SQL databases

The error happens in two flavors:

  • Microsoft.Data.SqlClient.SqlException (0x80131904): A connection was successfully established with the server, but then an error occurred during the login process. (provider: TCP Provider, error: 35 - An internal exception was caught)
  • Microsoft.Data.SqlClient.SqlException (0x80131904): A connection was successfully established with the server, but then an error occurred during the login process. (provider: TCP Provider, error: 0 - Success)

We observed that this error happens usually after a longer (>minutes) period of inactivity (kestrell web app not being hit). After first occurence, the successive DB operations succeed. We use SqlClient in our code indirectly through EF.

Version information

Microsoft.Data.SqlClient version: 5.2.0 (but same observed for 5.1.5) - Microsoft.Data.SqlClient.dll 5.20.324059.2 dated 28-FEB-2024 size 911480 bytes
Target framework: .NET 8 - 8.0.3 (but same observed for 8.0.2, 8.0.1 & 8.0.0)
Operating system: Linux docker image - same result on mcr.microsoft.com/dotnet/aspnet:8.0 and mcr.microsoft.com/dotnet/aspnet:8.0-jammy - hosted in AKS, similar behavior for backend service using image runtime:8.0

Relevant code

Original code - in .NET7 - now randomly failing .NET8:

        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder) {
            if(!optionsBuilder.IsConfigured) {
                optionsBuilder.UseSqlServer(DbConfiguration.ConnectionString);
            }
        }

Improved code - no better, same problems:

        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder) {
            if(!optionsBuilder.IsConfigured) {
                optionsBuilder.UseSqlServer(DbConfiguration.ConnectionString,
                    opt => opt.UseAzureSqlDefaults(DbConfiguration.UseAzureDefaults));
            }
        }

Failing place examples (they are ranadom in fact, in various programes):

using(var ctx = new LicensingDbContext()) {
    var header = await (from h in ctx.Headers.AsNoTracking() where h.Id == licenceId && h.Status == Status.Issued select h).FirstOrDefaultAsync();

using var ctx = new DeploymentDbContext();
ctx.Events.Add(new Event { EventType = EventType.TenantResource_Operational, ScriptId = ScriptId, Reference = (int)ScriptData.Process, ResourceType = trdb.ResourceType, ResourceId = trdb.TenantResourceId, TenantId = trdb.TenantId, TimeUtc = DateTime.UtcNow, DetailText = $"PG SQL K8s Database: {dbname}" });
await ctx.SaveChangesAsync(cancel);

Connection string

Connection string looks similar to
Server=tcp:SOME-sql.database.windows.net,1433;Initial Catalog=MyDB;Persist Security Info=False;User ID=some-admin;Password=123456789;MultipleActiveResultSets=False;Encrypt=True;TrustServerCertificate=False;Connection Timeout=30;

  • We have tried to set TrustServerCertificate=True with no effect.
  • We have tried to set MultipleActiveResultSets=True with no effect.

Full stack traces

fail: Microsoft.AspNetCore.Server.Kestrel[13]
      Connection id "0HN1UUOR4NG60", Request id "0HN1UUOR4NG60:00000002": An unhandled exception was thrown by the application.
      Microsoft.Data.SqlClient.SqlException (0x80131904): A connection was successfully established with the server, but then an error occurred during the login process. (provider: TCP Provider, error: 35 - An internal exception was caught)
       ---> System.IO.IOException: Unable to write data to the transport connection: Connection reset by peer.
       ---> System.Net.Sockets.SocketException (104): Connection reset by peer
         at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 count)
         --- End of inner exception stack trace ---
         at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 count)
         at System.IO.Stream.Write(ReadOnlySpan`1 buffer)
         at Microsoft.Data.SqlClient.SNI.SslOverTdsStream.Write(ReadOnlySpan`1 buffer)
         at System.Net.Security.SslStream.WriteSingleChunk[TIOAdapter](ReadOnlyMemory`1 buffer, CancellationToken cancellationToken)
         at System.Net.Security.SslStream.WriteAsyncInternal[TIOAdapter](ReadOnlyMemory`1 buffer, CancellationToken cancellationToken)
         at Microsoft.Data.SqlClient.SNI.SNITCPHandle.Send(SNIPacket packet)
         at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
         at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, SqlCommand command, Boolean callerHasConnectionLock, Boolean asyncClose)
         at Microsoft.Data.SqlClient.TdsParserStateObject.SNIWritePacket(PacketHandle packet, UInt32& sniError, Boolean canAccumulate, Boolean callerHasConnectionLock, Boolean asyncClose)
         at Microsoft.Data.SqlClient.TdsParserStateObject.WriteSni(Boolean canAccumulate)
         at Microsoft.Data.SqlClient.TdsParserStateObject.WritePacket(Byte flushMode, Boolean canAccumulate)
         at Microsoft.Data.SqlClient.TdsParser.TdsLogin(SqlLogin rec, FeatureExtension requestedFeatures, SessionData recoverySessionData, FederatedAuthenticationFeatureExtensionData fedAuthFeatureExtensionData, SqlConnectionEncryptOption encrypt)
         at Microsoft.Data.SqlClient.SqlInternalConnectionTds.Login(ServerInfo server, TimeoutTimer timeout, String newPassword, SecureString newSecurePassword, SqlConnectionEncryptOption encrypt)
         at Microsoft.Data.SqlClient.SqlInternalConnectionTds.AttemptOneLogin(ServerInfo serverInfo, String newPassword, SecureString newSecurePassword, TimeoutTimer timeout, Boolean withFailover)
         at Microsoft.Data.SqlClient.SqlInternalConnectionTds.LoginNoFailover(ServerInfo serverInfo, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance, SqlConnectionString connectionOptions, SqlCredential credential, TimeoutTimer timeout)
         at Microsoft.Data.SqlClient.SqlInternalConnectionTds.OpenLoginEnlist(TimeoutTimer timeout, SqlConnectionString connectionOptions, SqlCredential credential, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance)
         at Microsoft.Data.SqlClient.SqlInternalConnectionTds..ctor(DbConnectionPoolIdentity identity, SqlConnectionString connectionOptions, SqlCredential credential, Object providerInfo, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance, SqlConnectionString userConnectionOptions, SessionData reconnectSessionData, Boolean applyTransientFaultHandling, String accessToken, DbConnectionPool pool, Func`3 accessTokenCallback)
         at Microsoft.Data.SqlClient.SqlConnectionFactory.CreateConnection(DbConnectionOptions options, DbConnectionPoolKey poolKey, Object poolGroupProviderInfo, DbConnectionPool pool, DbConnection owningConnection, DbConnectionOptions userOptions)
         at Microsoft.Data.ProviderBase.DbConnectionFactory.CreatePooledConnection(DbConnectionPool pool, DbConnection owningObject, DbConnectionOptions options, DbConnectionPoolKey poolKey, DbConnectionOptions userOptions)
         at Microsoft.Data.ProviderBase.DbConnectionPool.CreateObject(DbConnection owningObject, DbConnectionOptions userOptions, DbConnectionInternal oldConnection)
         at Microsoft.Data.ProviderBase.DbConnectionPool.UserCreateRequest(DbConnection owningObject, DbConnectionOptions userOptions, DbConnectionInternal oldConnection)
         at Microsoft.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)
         at Microsoft.Data.ProviderBase.DbConnectionPool.WaitForPendingOpen()
      --- End of stack trace from previous location ---
         at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenInternalAsync(Boolean errorsExpected, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenInternalAsync(Boolean errorsExpected, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenAsync(CancellationToken cancellationToken, Boolean errorsExpected)
         at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.InitializeReaderAsync(AsyncEnumerator enumerator, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.<>c__DisplayClass30_0`2.<<ExecuteAsync>b__0>d.MoveNext()
      --- End of stack trace from previous location ---
         at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementationAsync[TState,TResult](Func`4 operation, Func`4 verifySucceeded, TState state, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementationAsync[TState,TResult](Func`4 operation, Func`4 verifySucceeded, TState state, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func`4 operation, Func`4 verifySucceeded, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.MoveNextAsync()
         at Microsoft.EntityFrameworkCore.Query.ShapedQueryCompilingExpressionVisitor.SingleOrDefaultAsync[TSource](IAsyncEnumerable`1 asyncEnumerable, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Query.ShapedQueryCompilingExpressionVisitor.SingleOrDefaultAsync[TSource](IAsyncEnumerable`1 asyncEnumerable, CancellationToken cancellationToken)
         at Atlantis.LicensingWeb.Controllers.TelemetryController.FindLicenceIdDetails(Guid licenceId, LicensingDbContext ctx) in /src/Cloud/LicensingWeb/Controllers/TelemetryController.cs:line 113
         at Atlantis.LicensingWeb.Controllers.TelemetryController.GetMyIp(Guid licenceId, String appName) in /src/Cloud/LicensingWeb/Controllers/TelemetryController.cs:line 91
         at lambda_method262(Closure, Object)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor.AwaitableObjectResultExecutor.Execute(ActionContext actionContext, IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeActionMethodAsync>g__Awaited|12_0(ControllerActionInvoker invoker, ValueTask`1 actionResultValueTask)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeNextActionFilterAsync>g__Awaited|10_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Rethrow(ActionExecutedContextSealed context)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeInnerFilterAsync>g__Awaited|13_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeNextResourceFilter>g__Awaited|25_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Rethrow(ResourceExecutedContextSealed context)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeFilterPipelineAsync>g__Awaited|20_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeAsync>g__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeAsync>g__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope)
         at Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context)
         at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
         at Atlantis.General.Web.SecurityFeaturesMiddleware.InvokeAsync(HttpContext context) in /src/General/Web/SecurityFeaturesMiddleware.cs:line 38
         at Microsoft.AspNetCore.Builder.Extensions.UsePathBaseMiddleware.InvokeCore(HttpContext context, PathString matchedPath, PathString remainingPath)
         at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication`1 application)

2024-03-12T10:20:22+01:00       Microsoft.Data.SqlClient.SqlException (0x80131904): A connection was successfully established with the server, but then an error occurred during the login process. (provider: TCP Provider, error: 0 - Success)
2024-03-12T10:20:22+01:00          at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
2024-03-12T10:20:22+01:00          at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, SqlCommand command, Boolean callerHasConnectionLock, Boolean asyncClose)
2024-03-12T10:20:22+01:00          at Microsoft.Data.SqlClient.TdsParserStateObject.ThrowExceptionAndWarning(Boolean callerHasConnectionLock, Boolean asyncClose)
2024-03-12T10:20:22+01:00          at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)
2024-03-12T10:20:22+01:00          at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()
2024-03-12T10:20:22+01:00          at Microsoft.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()
2024-03-12T10:20:22+01:00          at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
2024-03-12T10:20:22+01:00          at Microsoft.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
2024-03-12T10:20:22+01:00          at Microsoft.Data.SqlClient.SqlInternalConnectionTds.CompleteLogin(Boolean enlistOK)
2024-03-12T10:20:22+01:00          at Microsoft.Data.SqlClient.SqlInternalConnectionTds.AttemptOneLogin(ServerInfo serverInfo, String newPassword, SecureString newSecurePassword, TimeoutTimer timeout, Boolean withFailover)
2024-03-12T10:20:22+01:00          at Microsoft.Data.SqlClient.SqlInternalConnectionTds.LoginNoFailover(ServerInfo serverInfo, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance, SqlConnectionString connectionOptions, SqlCredential credential, TimeoutTimer timeout)
2024-03-12T10:20:22+01:00          at Microsoft.Data.SqlClient.SqlInternalConnectionTds.OpenLoginEnlist(TimeoutTimer timeout, SqlConnectionString connectionOptions, SqlCredential credential, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance)
2024-03-12T10:20:22+01:00          at Microsoft.Data.SqlClient.SqlInternalConnectionTds..ctor(DbConnectionPoolIdentity identity, SqlConnectionString connectionOptions, SqlCredential credential, Object providerInfo, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance, SqlConnectionString userConnectionOptions, SessionData reconnectSessionData, Boolean applyTransientFaultHandling, String accessToken, DbConnectionPool pool, Func`3 accessTokenCallback)
2024-03-12T10:20:22+01:00          at Microsoft.Data.SqlClient.SqlConnectionFactory.CreateConnection(DbConnectionOptions options, DbConnectionPoolKey poolKey, Object poolGroupProviderInfo, DbConnectionPool pool, DbConnection owningConnection, DbConnectionOptions userOptions)
2024-03-12T10:20:22+01:00          at Microsoft.Data.ProviderBase.DbConnectionFactory.CreatePooledConnection(DbConnectionPool pool, DbConnection owningObject, DbConnectionOptions options, DbConnectionPoolKey poolKey, DbConnectionOptions userOptions)
2024-03-12T10:20:22+01:00          at Microsoft.Data.ProviderBase.DbConnectionPool.CreateObject(DbConnection owningObject, DbConnectionOptions userOptions, DbConnectionInternal oldConnection)
2024-03-12T10:20:22+01:00          at Microsoft.Data.ProviderBase.DbConnectionPool.UserCreateRequest(DbConnection owningObject, DbConnectionOptions userOptions, DbConnectionInternal oldConnection)
2024-03-12T10:20:22+01:00          at Microsoft.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)
2024-03-12T10:20:22+01:00          at Microsoft.Data.ProviderBase.DbConnectionPool.WaitForPendingOpen()
2024-03-12T10:20:22+01:00       --- End of stack trace from previous location ---
2024-03-12T10:20:22+01:00          at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenInternalAsync(Boolean errorsExpected, CancellationToken cancellationToken)
2024-03-12T10:20:22+01:00          at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenInternalAsync(Boolean errorsExpected, CancellationToken cancellationToken)
2024-03-12T10:20:22+01:00          at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenAsync(CancellationToken cancellationToken, Boolean errorsExpected)
2024-03-12T10:20:22+01:00          at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.BeginTransactionAsync(IsolationLevel isolationLevel, CancellationToken cancellationToken)
2024-03-12T10:20:22+01:00          at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.BeginTransactionAsync(CancellationToken cancellationToken)
2024-03-12T10:20:22+01:00          at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
2024-03-12T10:20:22+01:00          at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
2024-03-12T10:20:22+01:00          at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
2024-03-12T10:20:22+01:00          at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(IList`1 entriesToSave, CancellationToken cancellationToken)
2024-03-12T10:20:22+01:00          at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(StateManager stateManager, Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
2024-03-12T10:20:22+01:00          at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func`4 operation, Func`4 verifySucceeded, CancellationToken cancellationToken)
2024-03-12T10:20:22+01:00          at Microsoft.EntityFrameworkCore.DbContext.SaveChangesAsync(Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
2024-03-12T10:20:22+01:00          at Microsoft.EntityFrameworkCore.DbContext.SaveChangesAsync(Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
2024-03-12T10:20:22+01:00          at Atlantis.DeploymentService.Processes.K8sNameSpaceWithPg_UpDown_Process.SetUpAsync(CancellationToken cancel) in /src/Cloud/DeploymentService/Processes/K8sNameSpaceWithPg_UpDown_Process.cs:line 230
2024-03-12T10:20:22+01:00          at Atlantis.DeploymentService.DeploymentProcessor.PerformSetUp(IProcess process) in /src/Cloud/DeploymentService/DeploymentProcessor.cs:line 358

We hoped the problem would be solved by 5.2.0, but it does not look to be the case.

@JiriZidek
Copy link
Author

Please remove lavel "sqlite" - it was added by mistake, but I cannot change it anymore to "ms sqlclient" or similar.

@ajcvickers ajcvickers transferred this issue from dotnet/efcore Mar 12, 2024
@JRahnama JRahnama added this to Needs triage in SqlClient Triage Board via automation Mar 12, 2024
@JRahnama
Copy link
Member

@JiriZidek Can you capture some EventSource logs. Also can you confirm by reverting back to Net7 the issue does not happen.

@JRahnama JRahnama moved this from Needs triage to Needs More Info in SqlClient Triage Board Mar 12, 2024
@JRahnama
Copy link
Member

I recommend you contact Azure support and have the networking team look into what is going on during those periods of failures.

@JiriZidek
Copy link
Author

I recommend you contact Azure support and have the networking team look into what is going on during those periods of failures.

I am afraid there are no "periods". The problem appears randomly, but always after restarting the container. So from this point of view it is reproducible.

@JiriZidek
Copy link
Author

@JiriZidek Can you capture some EventSource logs. Also can you confirm by reverting back to Net7 the issue does not happen.

Yes - we tested old version, which differs only by being .NET7 and all the nuget modules from December (EF, SqlClient, image aspnet:7.0) and yes, when restarted it does not yield this error.
EventSource - I'll try my best, but I am in Kubernetes POD - so in fact I have really limited ways how to do that. In fact I use only text log output captured Kubernetes. The pages refered are more for Windows installation.

@JiriZidek
Copy link
Author

I realized one important point - I run the very same code as Azure Service, using Linux plan and .NET8 - and in this case there is not such problem with SqlClient.

@JRahnama
Copy link
Member

I realized one important point - I run the very same code as Azure Service, using Linux plan and .NET8 - and in this case there is not such problem with SqlClient.

This confirms that your best option would be contacting Azure support.

@JiriZidek
Copy link
Author

This confirms that your best option would be contacting Azure support.

But how this plays with .NET7 works vs. .NET8 does not ? In same place, same AKS cluster ? I can put two PODs, one old, one new in parallel to decide if it is likely the network problem. Let you know. I'd like to avoid being pingponged back from Azure support.

@Crossbow78
Copy link

Just a note to confirm we're seeing the exact same issue in almost an identical setup:

  • Upgraded from .NET 7 to .NET 8 (and EF Core 7 to EF Core 8)
  • Using Microsoft.EntityFrameworkCore.SqlServer 8.0.2, Microsoft.Data.SqlClient 5.1.4
  • Running on linux pods in an AKS cluster, Kubernetes 1.27
  • Connecting to a managed Azure SQL instance

Exception message:

A connection was successfully established with the server, but then an error occurred during the login process. (provider: TCP Provider, error: 35 - An internal exception was caught) Unable to write data to the transport connection: Connection reset by peer. Connection reset by peer

Note that the error occurs incidentally (less than 0.5% of the calls), and we're seeing the exceptions since the day that the .NET upgrade was deployed.

@JiriZidek
Copy link
Author

So positively confirmed. AKS, same K8s namespace. One POD with .NET8 and .NET 7.

.NET8 - multiple errors:
image

.NET7 - no problem
image

So for me this is NOT Azure problem. It is problem if .NET8 or problem of Debian 12 (which is bases of the image). But since I have observed same bad behavior on Unbuntu based images I bet on .NET8 or Microsoft.Data.SqlClient. I guess there is something wrong with TLS handshake.

@popcatalin81
Copy link

I'm having the same issue, there are no concurrent connections to the DB and no async operations happening in parallel on the same context.

An exception occurred while iterating over the results of a query for context type 'DataAccess.ChassiContext'.
      System.InvalidOperationException: The connection does not support MultipleActiveResultSets.
         at Microsoft.Data.SqlClient.SqlCommand.<>c.<ExecuteDbDataReaderAsync>b__211_0(Task`1 result)
         at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
         at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
      --- End of stack trace from previous location ---
         at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
         at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
      --- End of stack trace from previous location ---
         at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.InitializeReaderAsync(AsyncEnumerator enumerator, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.<>c__DisplayClass30_0`2.<<ExecuteAsync>b__0>d.MoveNext()
      --- End of stack trace from previous location ---
         at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementationAsync[TState,TResult](Func`4 operation, Func`4 verifySucceeded, TState state, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementationAsync[TState,TResult](Func`4 operation, Func`4 verifySucceeded, TState state, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func`4 operation, Func`4 verifySucceeded, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.MoveNextAsync()
      System.InvalidOperationException: The connection does not support MultipleActiveResultSets.
         at Microsoft.Data.SqlClient.SqlCommand.<>c.<ExecuteDbDataReaderAsync>b__211_0(Task`1 result)
         at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
         at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
      --- End of stack trace from previous location ---
         at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
         at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
      --- End of stack trace from previous location ---
         at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.InitializeReaderAsync(AsyncEnumerator enumerator, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.<>c__DisplayClass30_0`2.<<ExecuteAsync>b__0>d.MoveNext()
      --- End of stack trace from previous location ---
         at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementationAsync[TState,TResult](Func`4 operation, Func`4 verifySucceeded, TState state, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementationAsync[TState,TResult](Func`4 operation, Func`4 verifySucceeded, TState state, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func`4 operation, Func`4 verifySucceeded, CancellationToken cancellationToken)
         at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.MoveNextAsync()

@patohara-rkt
Copy link

We experienced the same SqlException (0x80131904), however in a slightly different environment, and I beleive I understand the source of this exception, at least in our environment.

Microsoft.Data.SqlClient.SqlException (0x80131904): A connection was successfully established with the server, but then an error occurred during the login process. (provider: TCP Provider, error: 35 - An internal exception was caught)

TLDR

This exception occurs when there is a proxy between the client and the SQL Server AG Listener, and the proxy accepts TCP connections for both the primary and secondary SQL Server instances. The mechanism SqlClient uses to determine which host is primary is by opening a Socket to all IPs returned by the DNS lookup, and the first socket to open is considered the primary and the other connections are disposed. (See ParallelConnectAsync and ParallelConnectHelper from SNITcpHandle.cs). Under normal circumstances, without the proxy, the secondary instance is not listening on port <listener-ip>:1433 and therefore the client cannot open a socket. But when a proxy is between the client and server, the client can open the socket and even perform the TCP handshake with the proxy for both the primary and secondary SQL Server Instances. If the socket is opened to the secondary first, then as soon as any data is sent this exception is thrown.

Environment

  • Kubernetes (AWS EKS 1.26)
    • Istio Sidecar Proxy Enabled.
    • Istio DNS Proxy enabled.
  • Microsoft.Data.SqlClient version: 5.2.0
  • Images:
    • mcr.microsoft.com/dotnet/aspnet:8.0
    • mcr.microsoft.com/dotnet/sdk:8.0
  • Connection String:
    • Note: Adding "Pooling=False;" to the connection string will turn off the connection pooling and cause the issue to occur more frequently.
"Data Source=<sql-server-ag-listener.fqdn>;Initial Catalog=<DatabaseInAG>;MultiSubnetFailover=True;Persist Security Info=False;User ID=<user>;Password=<password>;MultipleActiveResultSets=False;Encrypt=True;TrustServerCertificate=True;Connection Timeout=30;";

pod.yaml

---
apiVersion: v1
kind: Pod
metadata:
  labels:
    sidecar.istio.io/inject: "true"
  annotations:
    proxy.istio.io/config: |
      proxyMetadata:
        ISTIO_META_DNS_CAPTURE: "true"
        ISTIO_META_DNS_AUTO_ALLOCATE: "true"
  name: dotnet8
spec:
  containers:
  - image: mcr.microsoft.com/dotnet/sdk:8.0
    name: dotnet8
    imagePullPolicy: Always
    command: ["sleep"]
    args: ["3600"]

service.yaml

---
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: sql-ag
spec:
  hosts:
  - "sql-server-ag-listener.fqdn"
  exportTo:
  - "."
  location: MESH_EXTERNAL
  ports:
  - number: 1433
    name: sql
    protocol: TCP
  resolution: DNS

Description

The first issue with this, is that Istio will create a Virtual IP (VIP) for the ServiceEntry ex: 240.240.240.10 and any DNS queries performed within the pod for the listener name will resolve to this VIP. In addition to the VIP, the Istio proxy sidecar will do DNS resolution on that hostname and put both the primary and secondary IPs into a pool and perform load balancing between all IPs in the pool. This means sometimes when SqlClient will get sent to the secondary instance.

Disable the VIP and Set the Service resolution to NONE

  annotations:
  proxy.istio.io/config: |
    proxyMetadata:
      ISTIO_META_DNS_CAPTURE: "true"
      ISTIO_META_DNS_AUTO_ALLOCATE: "false"
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
spec:
  resolution: NONE

With the VIP disabled the the pod's DNS queries will now resolve to the 2 SQL Listener IPs, and with the ServiceEntry resolution set to NONE, connections will be routed/forwarded to the IP requested.

This however does not solve the root of the issue. Istio Proxy is listening and accepting connection on 1433. SqlClient attempts to open 2 connections to the 2 IPs and because Istio is listening both sockets will succeed. If the primary is opened first then all is good. If the secondary is opened first then will perform the TCP handshake and then TLS / TDS handshake which is where the exception is thrown.

Bypass Istio

    annotations:
      traffic.sidecar.istio.io/excludeOutboundPorts: '1433'

This will cause any traffic on port 1433 to completely bypass Istio and traffic will go directly to SQL Server without issue.

Possible Fix

In SqlClient, delay the decision for which IP/Host is considered primary to after the TLS handshake / TDS Handshake is attempted instead of using Socket.Connect() and perform the handshake in parallel for all IPs. If a socket is opened to the secondary listener ip it will fail at the handshake, otherwise the Socket.Connect() will timeout and in the mean time the primary handshake can proceed, in which case discard the rest of the connections.

@JiriZidek
Copy link
Author

  • Note: Adding "Pooling=False;" to the connection string will turn off the connection pooling and cause the issue to occur more frequently.

👍 In my recent experiments this makes app almost unusable with Azure SQL. I planned to use this in POC project.

@patohara-rkt
Copy link

The only reason I mentioned Pooling=False is because it is useful if anyone attempts to re-create the issue and wants to reliably reproduce it.

It is unlikely you would want to Pooling=False in a real world application

@chticer
Copy link

chticer commented Apr 13, 2024

I am also facing the same issue as well.

An exception occurred while iterating over the results of a query for context type 'musicstreamingplayer.DatabaseModels.igdbapi.IGDBAPIAppContext'.
Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
 ---> System.ComponentModel.Win32Exception (258): The wait operation timed out.
   at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at Microsoft.Data.SqlClient.SqlDataReader.TryReadInternal(Boolean setTimeout, Boolean& more)
   at Microsoft.Data.SqlClient.SqlDataReader.Read()
   at Microsoft.EntityFrameworkCore.Storage.RelationalDataReader.Read()
   at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.Enumerator.MoveNext()
ClientConnectionId:d2f1ae4f-531d-47a7-90f3-97eca52f1501
Error Number:-2,State:0,Class:11

@chticer
Copy link

chticer commented Apr 14, 2024

Ignore my comment from before, the issue I am facing is not related to this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
SqlClient Triage Board
  
Needs More Info
Development

No branches or pull requests

6 participants