Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have GetSigningCredentialsAsync refresh its key earlier #359

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

j3parker
Copy link
Member

@j3parker j3parker commented Feb 7, 2024

Only refreshing after the key is expired will cause us to sign tokens that might not validate for their entire lifetime.

The spirit was to have GetSigningCredentialsAsync call RefreshAsync even if there was no background job calling it -- so this change makes it call it at the same cadence/time we expect a background service to.

This hasn't been a problem in production for various reasons but it was for a specific use-case in the dev environment, but regardless its a bug.

Only refreshing after the key is expired will cause us to sign tokens
that might not validate for their entire lifetime.

The spirit was to have GetSigningCredentialsAsync call RefreshAsync even
if there was no background job calling it -- so this change makes it
call it at the same cadence/time we expect a background service to.

This hasn't been a problem in production for various reasons but it was
for a specific use-case in the dev environment, but regardless its a
bug.
@j3parker j3parker marked this pull request as ready for review February 7, 2024 18:10
// do a foreground Refresh
private static readonly TimeSpan GetSigningCredentialsRefreshGracePeriod
= BackgroundRefreshDelay
+ BackgroundRefreshRetryDelay + BackgroundRefreshRetryDelay;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't do 2* 😒

@@ -51,7 +65,9 @@ OAuth2Configuration config

var now = m_clock.UtcNow;

if ( current == null || current.ValidTo <= now ) {
if ( current == null
|| ExpectedTimeOfNewUsableKey( current ) + GetSigningCredentialsRefreshGracePeriod < now
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implication here is that once this period is passed every call to GetSigningCredentials is going to fight to update that key.

That basically matches how it worked before the background refresh anyway. The only things without background refresh for us are Lambda functions though, which are single threaded anyway.

if ( current == null || current.ValidTo <= now ) {
if ( current == null
|| ExpectedTimeOfNewUsableKey( current ) + GetSigningCredentialsRefreshGracePeriod < now
) {
// Slow path: RefreshKeyAsync() wasn't called on boot and/or it
// isn't being called in a background job.
await RefreshKeyAsync( now )
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i thought about listening to the response from RefreshKeyAsync here, but the approach I went with was "stateless" and doesn't require having previously called RefreshKeyAsync 🤷 the logic is all in this file and shared so it doesn't seem too bad to me?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If key rotation fails for 3 minutes I think the results are pretty bad here? Taking a look now at whether dev key rotation ever gets delayed...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dev dynamo metrics show writes every 3 hours in the first 5 minutes of the hour steady for the past two months. I think we're still in pretty rough shape if cert generation gets delayed, but hopefully that won't happen too often.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants