Synchronously opening multiple ActiveDirectoryManagedIdentity connections in parallel is very slow #2470

Open
TomGathercole opened this issue Apr 22, 2024 · 2 comments

TomGathercole commented Apr 22, 2024

Describe the bug

When multiple connections are opened concurrently using Authentication=ActiveDirectoryManagedIdentity via .Open(), the time taken increases significantly depending on the overall count.

With the default Connect Timeout value (15s), as few as ~20 concurrent attempts are enough to prevent any connection from opening.

To reproduce

This can be reproduced using code similar to the following, running on a freshly-rebooted Azure App Service:

using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

const string connectionString = @"Data Source=managedidentityrepro-sql.database.windows.net;Initial Catalog=managedidentityrepro-sqldb;Connect Timeout=60;Encrypt=True;Trust Server Certificate=True;Authentication=ActiveDirectoryManagedIdentity;Application Name=EntityFramework";

// Open 16 connections at once; each one blocks a thread in the synchronous Open() call.
Parallel.For(0, 16, new ParallelOptions { MaxDegreeOfParallelism = 16 }, _ =>
{
    using var connection = new SqlConnection(connectionString);
    connection.Open();
});

Because I have not been able to test ActiveDirectoryManagedIdentity auth locally, I've put together a repro that can be deployed to an Azure App Service + Azure SQL Database:
https://github.com/TomGathercole/ManagedIdentityRepro

The issue only affects new connections, so the app service must be rebooted before each test with a particular number of threads. This is a bit of a pain, and you may not want to deploy this in the first place, so I have included the results I collected below:

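| async | threads | totalMilliseconds |
|-------|---------|-------------------|
| False | 1       | 2,584.46          |
| False | 2       | 2,787.73          |
| False | 4       | 4,301.57          |
| False | 8       | 10,845.30         |
| False | 16      | 13,047.61         |
| False | 32      | 26,729.01         |
| True  | 1       | 2,020.73          |
| True  | 2       | 2,098.92          |
| True  | 4       | 2,208.14          |
| True  | 8       | 2,176.08          |
| True  | 16      | 2,230.75          |
| True  | 32      | 2,384.87          |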

(async = false uses the synchronous .Open() method. async = true uses .OpenAsync(), and is included to illustrate the difference in behaviour)

I realize this is fairly complicated - if there's a way I can easily test ActiveDirectoryManagedIdentity auth in a console app, then please let me know and I'll try and simplify the repro.

Expected behavior

.Open() should perform similarly to .OpenAsync() when multiple connections are opened in parallel.
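
For comparison, the async = true measurements correspond to starting every open with .OpenAsync() and awaiting them together. Below is a minimal sketch of a timing harness for that case. `ParallelOpenTimer` and `openOneAsync` are hypothetical names used for illustration and are not taken from the linked repro; the delegate is assumed to open and dispose one connection.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelOpenTimer
{
    // openOneAsync is a hypothetical delegate supplied by the caller, for example:
    //   async () => { await using var c = new SqlConnection(connectionString); await c.OpenAsync(); }
    public static async Task<TimeSpan> TimeAsync(Func<Task> openOneAsync, int count)
    {
        var stopwatch = Stopwatch.StartNew();

        // ToList() forces every open to start before any of them is awaited,
        // so no thread is blocked while the opens are in progress.
        var opens = Enumerable.Range(0, count)
            .Select(_ => openOneAsync())
            .ToList();

        await Task.WhenAll(opens);
        return stopwatch.Elapsed;
    }
}
```

Calling `await ParallelOpenTimer.TimeAsync(openOneAsync, 16)` returns the wall-clock time for 16 concurrent opens, which is the quantity reported in the totalMilliseconds column.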

Further technical details

Microsoft.Data.SqlClient version: (5.2.0)
.NET target: (.net8.0)
SQL Server version: (Azure SQL)
Operating system: (Windows on Azure App Service)

@DavoudEshtehari DavoudEshtehari added this to Needs triage in SqlClient Triage Board via automation Apr 23, 2024
@JRahnama JRahnama moved this from Needs triage to Needs Investigation in SqlClient Triage Board Apr 23, 2024
David-Engel (Contributor) commented

What happens to your test when you open a single sync connection first, and then perform your parallel connections loop?

The reason I ask is that the first successful connection should obtain an access token via Managed Identity (this can take a second or two) that is then reused for subsequent connections. When you start off with a bunch of parallel connections, I think those connection open requests pass the point in the driver where they could take advantage of a cached access token and are simultaneously hitting the code that requests a new Managed Identity access token.
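
The experiment suggested above amounts to opening one connection on its own before fanning out. A minimal sketch of that ordering follows. `ConnectionWarmup` and `openOne` are hypothetical names for illustration, and the comment about the token being cached after the first open is the assumption being tested, not confirmed driver behaviour.

```csharp
using System;
using System.Threading.Tasks;

public static class ConnectionWarmup
{
    // openOne is a hypothetical delegate supplied by the caller, for example:
    //   () => { using var c = new SqlConnection(connectionString); c.Open(); }
    public static void OpenOneThenMany(Action openOne, int parallelCount)
    {
        if (parallelCount < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(parallelCount));
        }

        // Open a single connection first. Assumption: this acquires the
        // Managed Identity access token once, so later opens can reuse it.
        openOne();

        // Then run the same parallel loop as in the original repro.
        Parallel.For(0, parallelCount,
            new ParallelOptions { MaxDegreeOfParallelism = parallelCount },
            _ => openOne());
    }
}
```

If the parallel phase is fast after the single warm-up open, that would support the idea that the slowdown comes from many simultaneous token requests rather than from the connections themselves.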

TomGathercole (Author) commented

Yeah, this is in line with what I've observed. Once any number of sync or async connections succeed (either because you only attempt to make one, or because they all finish before Connect Timeout), then any future connections will be very fast. This is why the script in my repro has to keep rebooting the app service to trigger the slow connection.

It primarily seems to affect our app services rebooting under load, where we're seeing enough req/s at peak times that they are never able to come back up.

This is unconfirmed, but I suspect that we also hit the same situation when the managed identity token expires and a new one has to be retrieved (based only on the anecdotal evidence that we had an app service exhibit these same symptoms randomly without a reboot having occurred).
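
To make the suspected failure mode concrete: if every concurrent open requests its own token whenever none is cached (on startup or after expiry), the requests pile up. One common way to avoid that is a "single-flight" cache, where concurrent callers share one in-flight request. The sketch below is purely illustrative and is not how Microsoft.Data.SqlClient is implemented; `SingleFlightTokenCache`, `CachedToken`, `fetch` and `refreshMargin` are all hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical token shape for this sketch only.
public readonly record struct CachedToken(string Value, DateTimeOffset ExpiresOn);

public sealed class SingleFlightTokenCache
{
    private readonly Func<Task<CachedToken>> _fetch;   // hypothetical token request
    private readonly TimeSpan _refreshMargin;          // refresh this long before expiry
    private readonly object _gate = new object();
    private Task<CachedToken>? _current;

    public SingleFlightTokenCache(Func<Task<CachedToken>> fetch, TimeSpan refreshMargin)
    {
        _fetch = fetch;
        _refreshMargin = refreshMargin;
    }

    public Task<CachedToken> GetAsync()
    {
        lock (_gate)
        {
            if (_current is { } current && IsReusable(current))
            {
                return current;
            }

            // Only one caller reaches this point per refresh; the others share
            // the returned task. The sketch assumes fetch returns quickly with
            // an incomplete task rather than blocking inside the lock.
            _current = _fetch();
            return _current;
        }
    }

    private bool IsReusable(Task<CachedToken> task)
    {
        if (!task.IsCompleted) return true;                           // request in flight: share it
        if (task.Status != TaskStatus.RanToCompletion) return false;  // failed or cancelled: retry
        return task.Result.ExpiresOn - _refreshMargin > DateTimeOffset.UtcNow;
    }
}
```

With this shape, 32 simultaneous callers on a cold cache trigger a single token request, and the same holds when the token nears expiry. How such a cache would be wired into connection opening is outside the scope of this sketch.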
