Add LastError in the TargetSnapShot for debugging and monitoring #420
I'm just getting over being sick, so I'm not going to make any technical decisions yet, but there are a few things I can answer immediately.
Thanks for the quick reply! Yeah there was a race condition in the previously added test and I forgot to use After rethinking about our need, the existing Since now nothing goes inside the Speaking of |
Adding LastError seems to be minimal and if it fits your needs I'm generally okay with this approach. One last comment I'd make is about
This is close to what I was suggesting when I said "maybe pass a channel?"
type AutoRefreshError struct {
    URL   string
    Error error
}
ch := make(chan AutoRefreshError) // Note: the user must drain it, otherwise AutoRefresh will block.
ar := jwk.NewAutoRefresh(jwk.WithAutoRefreshErrorChannel(ch))
// ... normal operation ...
// somewhere else
for arErr := range ch {
    log.Printf("error while fetching %s: %s", arErr.URL, arErr.Error)
}
I don't know. I think we can keep the LastError bit regardless, but if your use case benefits from an asynchronous error stream, I thought it might be worth thinking about.
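The blocking concern in the note above can be avoided with a non-blocking send. The following is only a minimal sketch of that idea: `refresher` and `report` are hypothetical names invented for illustration, not the library's AutoRefresh API, and the URL is a placeholder. A nil channel turns reporting off, and an error is dropped when no receiver is ready.

```go
package main

import (
	"errors"
	"fmt"
)

// AutoRefreshError mirrors the struct proposed in the comment above.
type AutoRefreshError struct {
	URL   string
	Error error
}

// refresher is a hypothetical stand-in for the background fetcher.
// It is NOT the library's AutoRefresh type.
type refresher struct {
	errCh chan<- AutoRefreshError // nil means error reporting is off
}

// report hands a fetch error to the listener and says whether it was
// delivered. The select/default makes the send non-blocking: if the
// channel cannot accept the value right now, the error is dropped.
func (r *refresher) report(url string, err error) bool {
	if r.errCh == nil {
		return false
	}
	select {
	case r.errCh <- AutoRefreshError{URL: url, Error: err}:
		return true
	default:
		return false
	}
}

func main() {
	// A buffer of 1 lets a single error wait for a slow reader.
	ch := make(chan AutoRefreshError, 1)
	r := &refresher{errCh: ch}

	delivered := r.report("https://example.com/jwks.json", errors.New("connection refused"))
	dropped := r.report("https://example.com/jwks.json", errors.New("timeout"))
	fmt.Println(delivered, dropped) // first fits the buffer, second is dropped

	arErr := <-ch
	fmt.Printf("error while fetching %s: %s\n", arErr.URL, arErr.Error)
}
```

The trade-off is that a slow consumer loses errors instead of stalling the refresh loop, which seems acceptable when the stream is used for monitoring rather than for correctness.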
Codecov Report
@@            Coverage Diff             @@
##             main     #420      +/-   ##
==========================================
+ Coverage   69.61%   69.72%   +0.10%
==========================================
  Files          80       80
  Lines        9186     9189       +3
==========================================
+ Hits         6395     6407      +12
+ Misses       1942     1933       -9
  Partials      849      849
I was thinking of something like this:
which is basically the same idea as what you suggested. This wouldn't need any new structs or options. Otherwise, what you proposed makes more sense to me. An async error stream means less polling, so it would surely bring benefits, though perhaps marginal ones. If the current PR looks fine to you, I'd love to see it merged, and I'm happy to contribute the async error streaming as a separate task.
Hmmmm, okay, so I was going to merge, but then I thought, |
@rockspore here's a possible solution from me: #421. This doesn't block, you can turn it on and off, and it should be race free. Please check if this works for you.
Yes, I became aware of the race condition when I tested this on the client side. I thought it would be very rare and would not cause a panic, so I didn't pay attention to it. My purpose is mostly to catch incorrect JWKS URL configuration. But you're right, it'd be nice to catch intermittent errors as long as that doesn't require frequent polling. #421 looks good! I will do some manual testing with it shortly and get back to you. Thanks for the swift turnaround ;-)
It seems any error that occurs while fetching the JWKS in the background gets swallowed.
In my use case of this library, the JWKS URL is provided by users, and a misconfigured JWKS URL would be very difficult to troubleshoot because those errors are not visible unless Fetch() or Refresh() is invoked explicitly. So I added an option which allows library users to inject an error handling function to deal with the error with their own logic.
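As a rough illustration of what "inject an error handling function" could look like, here is a minimal functional-option sketch. Every name in it (`ErrorHandler`, `WithErrorHandler`, `fetcher`, `refresh`) is hypothetical and chosen for this example; it is not the option this PR or the library actually defines.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a sample failure used by the demo below.
var errNotFound = errors.New("404 Not Found")

// ErrorHandler is a hypothetical callback invoked when a background
// fetch fails.
type ErrorHandler func(url string, err error)

// fetcher is a hypothetical stand-in for the background refresher.
type fetcher struct {
	onError ErrorHandler
}

// Option configures a fetcher (the usual functional-option pattern).
type Option func(*fetcher)

// WithErrorHandler registers the callback that receives fetch errors.
func WithErrorHandler(h ErrorHandler) Option {
	return func(f *fetcher) { f.onError = h }
}

func newFetcher(opts ...Option) *fetcher {
	f := &fetcher{}
	for _, opt := range opts {
		opt(f)
	}
	return f
}

// refresh simulates one background fetch. Without a handler the error
// is silently swallowed, which is the situation described above.
func (f *fetcher) refresh(url string, fetch func(string) error) {
	if err := fetch(url); err != nil && f.onError != nil {
		f.onError(url, err)
	}
}

func main() {
	var seen []string
	f := newFetcher(WithErrorHandler(func(url string, err error) {
		seen = append(seen, fmt.Sprintf("%s: %v", url, err))
	}))

	// Simulate a misconfigured JWKS URL (placeholder address).
	f.refresh("https://example.com/bad-jwks", func(string) error {
		return errNotFound
	})
	fmt.Println(seen)
}
```

One caveat with a callback like this is that it runs on the refresher's goroutine, so a slow handler delays the next fetch and any shared state it touches needs its own synchronization.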
One open question outside this PR: it seems the backoff interval is ineffective if it's smaller than MinRefreshInterval. This basically means that a failure caused by a network interruption or temporary server downtime cannot be recovered from asynchronously until quite some time later. Is this intended? Thanks.
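To make the concern concrete, here is a toy model of the behaviour the question describes. It assumes the retry delay is floored at the minimum refresh interval; `nextWait` is a hypothetical helper and the 15-minute value is an arbitrary example, neither taken from the library's scheduling code.

```go
package main

import (
	"fmt"
	"time"
)

// nextWait models the suspected behaviour: the delay before the next
// attempt is never shorter than minRefreshInterval, so any smaller
// backoff value has no effect. This is an assumption for illustration.
func nextWait(backoff, minRefreshInterval time.Duration) time.Duration {
	if backoff < minRefreshInterval {
		return minRefreshInterval
	}
	return backoff
}

func main() {
	minInterval := 15 * time.Minute // arbitrary example value

	backoffs := []time.Duration{
		time.Second,
		2 * time.Second,
		4 * time.Second,
		30 * time.Minute,
	}
	for _, b := range backoffs {
		fmt.Printf("backoff=%v wait=%v\n", b, nextWait(b, minInterval))
	}
}
```

Under that assumption, the first three retries all wait the full 15 minutes, so a brief outage is not retried any sooner than a regular refresh would be.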