Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some questions on AccessToken, pooling and performance #767

Closed
roji opened this issue Oct 21, 2020 · 11 comments
Closed

Some questions on AccessToken, pooling and performance #767

roji opened this issue Oct 21, 2020 · 11 comments

Comments

@roji
Copy link
Member

roji commented Oct 21, 2020

There have been some ongoing discussions about how to use access tokens efficiently with EF Core (see this blog post for an example), and I'm trying to figure out the right way to use access tokens efficiently. Since I'm a beginner I wanted to try to clear up a thing or two :)

  • Once a physical connection has been established to SQL Server/SQL Azure, does it stay valid until physically closed, or is some sort of access token refresh/renegotiation have to occur? In other words, is the access token only important for initially creating/opening the connection, and no longer relevant afterwards?
  • From a cursory look at the SqlClient source code, it seems that the access token (and also SqlCredential) is part of the pooling key (SqlConnectionPoolKey). If I understand correctly, this means that if the access token expires and needs to be rotated, a new pool is created, and all the physical connections in the old pool would never get used again, is that right? If so, I'd be interested in understanding why - would it not be possible to omit the access token from the pooling key, reusing the same pool as the token changes during the lifetime of an application? Is this a sort of security measure to require code using SqlClient to always know the access token, i.e. not return an already-open, pooled connection to code which doesn't know it?
  • My actual concern here is with the performance implications of getting an access token each time an SqlConnection needs to be opened. Since this may be a heavy operation (i.e. with network traffic), this could be significant, especially where the whole point of pooling is to avoid heavy operations on Open.
  • Depending on the answers to the above, would it make sense to imagine SqlConnection accepting a callback that returns an access token, which gets called only if a new physical connection needs to be created? For pooling scenarios, this callback would never be invoked, saving the overhead of fetching a token. With the current situation, user code has no way of knowing whether calling Open will create a new physical connection or not (as pooling is an internal concern), and so must always populate the AccessToken property via what is potentially a heavy operation.

Thanks for taking the time to read and answer the above...!

/cc @ajcvickers

@roji
Copy link
Member Author

roji commented Oct 21, 2020

Note: @ErikEJ has just referred me to #730 (comment), which may be relevant in this context - could this be used to only acquire access tokens when they are actually needed?

@cheenamalhotra
Copy link
Member

cheenamalhotra commented Oct 21, 2020

Hi @roji

Let me provide you answers as per my best understanding:

Once a physical connection has been established to SQL Server/SQL Azure, does it stay valid until physically closed, or is some sort of access token refresh/renegotiation have to occur? In other words, is the access token only important for initially creating/opening the connection, and no longer relevant afterwards?

The answer is No. Once a physical connection is created, as long as it is not "RESET" (happens during reactivation of pooled connection), it will stay valid. if it is being reset and access token is not valid in that case, the server may choose to "Revalidate" the access token, and reject the connection request. This is what we also did in PR #635 to renew access token if it's expired when acquiring connection from pool. If user provides access token with SqlConnection API, it is "expected" to be always valid, whether or not pooling is enabled.

From a cursory look at the SqlClient source code, it seems that the access token (and also SqlCredential) is part of the pooling key (SqlConnectionPoolKey). If I understand correctly, this means that if the access token expires and needs to be rotated, a new pool is created, and all the physical connections in the old pool would never get used again, is that right?

Yes, it is required to be part of pool key, to avoid re-using connections that were created with an "expired" access token, as Server will revalidate, and that should be "expected". The revalidation is not clearly defined yet but will likely become more strict in future. We discovered it happened every 10 hours in a span of "active" physical connection when connections were reset. But we must expect it to occur anytime a new connection is created/pooled connection is activated and when that happens "access token" must be "valid".

Depending on the answers to the above, would it make sense to imagine SqlConnection accepting a callback that returns an access token, which gets called only if a new physical connection needs to be created? For pooling scenarios, this callback would never be invoked, saving the overhead of fetching a token.

A callback would be always needed to be called, as I mentioned above the importance of acquiring access token even when activating a pooled connection. We can expect the user will update their access tokens when opening a pooled connection if they're expired or near expiry.

@cheenamalhotra
Copy link
Member

cheenamalhotra commented Oct 22, 2020

Also to mention, there are possibilities of accepting a function callback and acquiring access token from the callback as and when needed. But that would require driver to validate access token everytime before making a connection request (pooled or non pooled) which is possible using JWT APIs and if token is found expired, driver could acquire fresh from callback.

It would become similar to design of AzureKeyVaultProvider that accepts authenticationCallback in it's constructor that only provides an implementation and not an actual access token.

The new API trends are however moving ahead to a Credential based construction patterns as most APIs in Azure.Identity library are designed now. Would recommend going through those once.

But yes, you'll have to see if that actually gives any perf benefit v/s current design and if it's worth the complexity from user point of view.

@roji
Copy link
Member Author

roji commented Oct 22, 2020

Thanks for the details responses @cheenamalhotra.

The revalidation is not clearly defined yet but will likely become more strict in future. We discovered it happened every 10 hours in a span of "active" physical connection when connections were reset

So to be sure I understand... After 10 hours of a physical connection lifetime, SQL Server will trigger some sort of revalidation? Or does the revalidation happen only when SqlClient initiates a RESET as part of the pooling? In other words, say my application opens a new connection (with an access token), and doesn't return it to the pool for 2 days - is it guaranteed to stay usable and not to error because of token expiry?

A callback would be always needed to be called, as I mentioned above the importance of acquiring access token even when activating a pooled connection.

Again - just to be sure I understand: does this mean that every time a pooled connection is handed out (i.e. when SqlConnection.Open is called), the access token is sent to SQL Server as part of a RESET? Does this imply an extra database roundtrip for every pooled Open, and does this roundtrip need to happen even when access tokens aren't in use? I'm asking because my expectation when using pooling is for DbConnection.Open to be an extremely short operation with no networking.

From the SQL Server Connection Pooling docs:

If a connection exists to a server that has disappeared, this connection can be drawn from the pool even if the connection pooler has not detected the severed connection and marked it as invalid. This is the case because the overhead of checking that the connection is still valid would eliminate the benefits of having a pooler by causing another round trip to the server to occur. When this occurs, the first attempt to use the connection will detect that the connection has been severed, and an exception is thrown.

Although this paragraph talks about severed connections, it seems relevant (perf-wise) also for any roundtrips done for token revalidation etc.

@cheenamalhotra
Copy link
Member

cheenamalhotra commented Oct 22, 2020

After 10 hours of a physical connection lifetime, SQL Server will trigger some sort of revalidation? Or does the revalidation happen only when SqlClient initiates a RESET as part of the pooling?

Revalidation only occurs when "RESET" bit is set in header information, only after pooled connection reactivation.
We have tested non-pooled connection lifetime too, they stay open forever, as long as they are active and not getting stale.

does this mean that every time a pooled connection is handed out (i.e. when SqlConnection.Open is called), the access token is sent to SQL Server as part of a RESET?

Let me explain more in detail.

So, let's say you created a new pooled connection and closed it, driver will pool this connection but the physical connection is already active on server side. When the driver reactivates this pooled connection, it's only done on client side by taking it out of pool stack and setting it's state Active. But when next query is executed, we ensure a "RESET" bit is set in TDS header with next command on this newly reactivated pooled connection to cleanup any stale state, as per standard behavior.

Now, Access Token information is already available with the server, since it was part of "login credential" with this which connection was instantiated. The server revalidates credentials whenever RESET bit is received after 10 hours, no matter what mode of authentication is chosen. If user specified username/password and let's say password expires after 5 hours, the server will revalidate credentials anytime 10 hours the first RESET is triggered from client. And if password is found expired, command will not execute and exception will be thrown making connection non-usable. All future commands on this connection will continue to fail.

@roji
Copy link
Member Author

roji commented Oct 22, 2020

@cheenamalhotra thank you very much, that's very helpful!

I guess I have one more question. When you say that the server "revalidates credentials" (after seeing a query with the RESET bit on), does that mean it requests the credentials again from the client (access token, user/password), and SqlClient sends it at that point? In other words, does SqlClient basically re-authenticate in that scenario (RESET bit on after 10 hours)? Or does SQL Server just check if the original credentials it received are still valid, and basically break the connection? I'm trying to understand if SqlClient has an opportunity to supply new credentials when revalidation occurs, or is the connection basically doomed 10 hours after the original credentials are no longer valid.

Thanks again for your patient and detailed responses.

@cheenamalhotra
Copy link
Member

SQL Server checks if the original credentials it received are still valid, and will throw exception because RESET bit is set when executing a new SqlCommand on the same physical connection.

I'm not sure if we have an opportunity after receiving exception to update login credentials on an active connection, since that would mean we need to execute Login Flow once again - same as a new connection request.

@saurabh500
Are you aware of any such possibility?

@roji
Copy link
Member Author

roji commented Oct 22, 2020

Thanks again - I think I have a clear picture, and the SqlClient behavior makes total sense (including why an access token callback would have to be called every time).

@saurabh500 I'm still interested if a re-login after expiration is at least a theoretical possibility - it could allow a single pool, without the access token being part of the key, and with the callback only being called when Login Flow is initiated again. But it really is more of an academic interest at this point.

@saurabh500
Copy link

Hello @cheenamalhotra and @roji

The TDS protocol doesn't allow reauthentication on the same physical connection. So there is no protocol support for re-login on a physical connection. Any login today, requires a new physical TCP connection.

PS: I have been working on internal repositories only and not paying much attention to GH mentions. In case there is something that requires an answer from me, it would be best to ping me on Teams. I happened to have installed the github app and I was going through the mentions and this caught my eye.

@cheenamalhotra
Copy link
Member

Thanks for confirming @saurabh500
Sure, will ping you when needed :)

@roji
Copy link
Member Author

roji commented Oct 23, 2020

Thanks @saurabh500 and @cheenamalhotra, I have all I need here!

To sum it up, SQL Server revalidates the initially-provided credentials at RESET after 10 hours. Since there's no opportunity to rotate credentials with SQL Server (i.e. provide new ones), connections basically have a maximum lifetime of around 10 hours (assuming normal pooling). This means that to avoid errors, it's the user's responsibility to always provide a valid access token when calling Open, and SqlClient internally ensures this works by having separate pools for different access tokens.

For context, see dotnet/efcore#13261 (comment) - it's bad perf-wise to actually fetch the token (i.e. call Azure.Identity GetTokenAsync) every single time - it's the user's responsibility to cache tokens. Following the request for SqlClient to integrate with Azure.Identity made in #730 (comment), I opened #771. We have dotnet/efcore#13261 tracking doing the same in EF Core, but it seems that SqlClient is the better place to do this (i.e. it would be usable by non-EF users).

One last academic note: I find it really strange that SQL Server revalidates credentials only when RESET is sent by client... In other words, from a security perspective it makes little sense for a server to require pooled connections to revalidate but to never revalidate a connection as long as its not returned to the pool... But that has nothing to do with SqlClient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants