Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

knox: handle race condition on concurrent logout call #186

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

xrmx
Copy link
Contributor

@xrmx xrmx commented Jul 19, 2019

With AUTO_REFRESH enabled authentication is racing against token removal of logout. If a token gets removed updating its expiry would led to a DatabaseError raised by the Django ORM.

Fix that by ignoring DatabaseError exception returned by renew_token so that the last request will get a AuthenticationFailed exception instead of a 500 error.

Copy link
Collaborator

@johnraz johnraz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello 👋 and thanks for the contribution!

Your branch seems to be based on master and not develop, could you please rebase on develop?

I believe the changelog would also need a note about the bug fix.

I left a comment for a potential improvement in the code as well.

Thanks again!

@@ -73,7 +74,11 @@ def authenticate_credentials(self, token):
raise exceptions.AuthenticationFailed(msg)
if compare_digest(digest, auth_token.digest):
if knox_settings.AUTO_REFRESH and auth_token.expiry:
self.renew_token(auth_token)
try:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather move this try/except around the save call within renew_token. Also, I’m worried that swallowing every database error like this could lead to some other unrelated error being swallowed. Can we check the error code ? The error message maybe ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .save is the only database interaction in that hunk of code so there's not much chances to swallow unrelated exceptions. Regarding moving the handling inside renew_token I don't see much value in it: it will duplicate the error handling as we really can do something sensible only in authenticate_credentials.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could swallow any other database error and mistaken it with an authentication error which would be confusing in a debug session. renew_token could change in the future and have more database interactions so I think moving it around the save makes sense. I don’t understand your argument about code duplication as this is exactly what I’m trying to avoid, using refresh token somewhere else would require to duplicate the error handling, right? Moving the error handling inside refresh token makes it safe to be used anywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By duplicating error handling i mean that on top of the exception handling i have to check the return value. In my opinion it's making the code less explicit.

diff --git a/knox/auth.py b/knox/auth.py
index 1d453ca..5345175 100644
--- a/knox/auth.py
+++ b/knox/auth.py
@@ -74,10 +74,8 @@ class TokenAuthentication(BaseAuthentication):
                 raise exceptions.AuthenticationFailed(msg)
             if compare_digest(digest, auth_token.digest):
                 if knox_settings.AUTO_REFRESH and auth_token.expiry:
-                    try:
-                        self.renew_token(auth_token)
-                    except DatabaseError:
-                        # avoid race condition with concurrent logout calls
+                    renewed = self.renew_token(auth_token)
+                    if not renewed:
                         continue
                 return self.validate_user(auth_token)
         raise exceptions.AuthenticationFailed(msg)
@@ -89,7 +87,12 @@ class TokenAuthentication(BaseAuthentication):
         # Throttle refreshing of token to avoid db writes
         delta = (new_expiry - current_expiry).total_seconds()
         if delta > knox_settings.MIN_REFRESH_INTERVAL:
-            auth_token.save(update_fields=('expiry',))
+            try:
+                auth_token.save(update_fields=('expiry',))
+            except DatabaseError:
+                # avoid race condition with concurrent logout calls
+                return False
+        return True

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I see your point now. I still have a code smell around this. I’m even wondering if the authentication failed error makes sense in this case? You logged in but it failed to refresh... Maybe raising a 409 (conflict) would be better, we could do a get right before the save and trigger the exception if the get fails ? That would remove the database error check I don’t like and the error checking duplication you don’t like.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xrmx this is a very valid point, thanks for the valuable link as well which helped refresh my mind on this topic 👍

Another approach we could then take and which would probably make more sense would be to do a select for update when getting the token on login.

The rationale would be that as soon as a login is in progress, no change should be applied to a token. Basically, one can not change / destroy a token if someone is actually using it to log in but they should instead retry after the login has completed properly.

What do you think ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are overthinking this a bit :) I think in this case taking locks will create more chances to mess things than what are preventing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, what do you have in mind that could make this worse?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh and I just wanted to add that I agree on the overthinking bit as well, this is the feeling I have too but I don't see how to solve it simply while being "correct". Maybe we should leave it as is 😝

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A worse scenario for me is code taking locks and not releasing them

@xrmx
Copy link
Contributor Author

xrmx commented Jul 19, 2019

@johnraz Thanks for the review, it's based on master because develop is missing some 4.1.0 commits. Will add changelog.

@johnraz
Copy link
Collaborator

johnraz commented Jul 19, 2019

If you base your work on develop and create the PR against develop, missing commits from master should not be an issue. Those are only release related commits I believe.

Writing from my phone so I may be missing a point. Maybe some other contributors could have a look. Cheers.

@xrmx
Copy link
Contributor Author

xrmx commented Jul 19, 2019

@johnraz yeah, but master commits should be merged back to develop or you'll have conflicts, e.g. like in this specific case in the CHANGELOG

@johnraz
Copy link
Collaborator

johnraz commented Jul 19, 2019

Yes agreed on the merge back to develop but please let’s do that in a separate PR, maybe @belugame can help, won’t have access to my computer soon (travelling right now).

@xrmx xrmx changed the title knox: handle race condition on concurrent logout calls knox: handle race condition on concurrent logout call Jul 23, 2019
@xrmx xrmx force-pushed the expiryrace branch 2 times, most recently from b61e9dc to bd4e825 Compare July 23, 2019 15:44
@xrmx
Copy link
Contributor Author

xrmx commented Jul 23, 2019

Updated PR:

  • rebased on top of develop
  • renamed tests and added some comments to help understand the scenario
  • rewritten commit message to make it hopefully more understandable
  • changed the continue to a break when handling the exception as we found the token

With AUTO_REFRESH enabled authentication is racing against token
removal of logout. If a token gets removed updating its expiry
would led to a DatabaseError raised by the Django ORM.

Fix that by ignoring DatabaseError exception returned by renew_token
so that the last request will get a AuthenticationFailed exception
instead of a 500 error.
@johnraz
Copy link
Collaborator

johnraz commented May 26, 2024

I added the help wanted on this PR to get it rebased and merged.

@giovannicimolin if you feel like coding be my guest 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants