
Watch starts receiving HTTP 401 after GKE/OAuth token expires #2612

Closed
rtsio opened this issue Nov 19, 2020 · 10 comments · Fixed by #2731

Comments

@rtsio (Contributor) commented Nov 19, 2020

Using 4.12.0, I have a couple of Watches configured like this:

Watch watch = client.v1().events().inAnyNamespace().watch(new Watcher<Event>() {
  ...
});
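
For completeness, the elided Watcher is roughly the following shape (a simplified sketch; my real handlers do more than print, and Action / KubernetesClientException are the usual io.fabric8.kubernetes.client types):

Watch watch = client.v1().events().inAnyNamespace().watch(new Watcher<Event>() {
  @Override
  public void eventReceived(Action action, Event event) {
    // react to ADDED / MODIFIED / DELETED notifications for cluster events
    System.out.println(action + " " + event.getMetadata().getName());
  }

  @Override
  public void onClose(KubernetesClientException cause) {
    // called when the watch is closed, e.g. after an unrecoverable error
    System.out.println("Watch closed: " + cause);
  }
});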

After a while, what seems to be a temporary network connectivity issue (that I can't reproduce exactly) causes the following:

2020-11-18 20:01:59.305 WARN 1 --- [.204.54.132/...] i.f.k.c.d.i.WatchConnectionManager : Exec Failure

java.io.IOException: canceled due to java.lang.IllegalStateException: cannot make a new request because the previous response is still open: please call response.close()
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:185) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[okhttp-3.14.9.jar!/:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[na:na]
    at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]
    Suppressed: java.lang.IllegalStateException: cannot make a new request because the previous response is still open: please call response.close()
        at okhttp3.internal.connection.Transmitter.newExchange(Transmitter.java:164) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:41) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:94) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:88) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[okhttp-3.14.9.jar!/:na]
        at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:135) ~[kubernetes-client-4.12.0.jar!/:na]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[okhttp-3.14.9.jar!/:na]
        at io.fabric8.kubernetes.client.utils.OIDCTokenRefreshInterceptor.intercept(OIDCTokenRefreshInterceptor.java:52) ~[kubernetes-client-4.12.0.jar!/:na]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[okhttp-3.14.9.jar!/:na]
        at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68) ~[kubernetes-client-4.12.0.jar!/:na]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[okhttp-3.14.9.jar!/:na]
        at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:151) ~[kubernetes-client-4.12.0.jar!/:na]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:229) ~[okhttp-3.14.9.jar!/:na]
        at okhttp3.RealCall$AsyncCall.execute(RealCall.java:172) ~[okhttp-3.14.9.jar!/:na]
        ... 4 common frames omitted

Exception in thread "OkHttp Dispatcher" java.lang.IllegalStateException: cannot make a new request because the previous response is still open: please call response.close()
    at okhttp3.internal.connection.Transmitter.newExchange(Transmitter.java:164)
    at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:41)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
    at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:94)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
    at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
    at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:88)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
    at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:135)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
    at io.fabric8.kubernetes.client.utils.OIDCTokenRefreshInterceptor.intercept(OIDCTokenRefreshInterceptor.java:52)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
    at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
    at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:151)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
    at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:229)
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:172)
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)

After this occurs, the Watch goes into an exception loop:

2020-11-18 20:02:01.311 WARN 1 --- [.204.54.132/...] i.f.k.c.d.i.WatchConnectionManager : Exec Failure

java.net.ProtocolException: Unexpected status line: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}
    at okhttp3.internal.http.StatusLine.parse(StatusLine.java:69) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.java:213) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.connection.Exchange.readResponseHeaders(Exchange.java:115) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:94) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:43) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:94) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:88) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[okhttp-3.14.9.jar!/:na]
    at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:135) ~[kubernetes-client-4.12.0.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[okhttp-3.14.9.jar!/:na]
    at io.fabric8.kubernetes.client.utils.OIDCTokenRefreshInterceptor.intercept(OIDCTokenRefreshInterceptor.java:41) ~[kubernetes-client-4.12.0.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[okhttp-3.14.9.jar!/:na]
    at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68) ~[kubernetes-client-4.12.0.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[okhttp-3.14.9.jar!/:na]
    at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:153) ~[kubernetes-client-4.12.0.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:229) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:172) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[okhttp-3.14.9.jar!/:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[na:na]
    at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]

2020-11-18 20:02:05.393 WARN 1 --- [.204.54.132/...] i.f.k.c.d.i.WatchConnectionManager : Exec Failure: HTTP 403, Status: 403 - events is forbidden: User "system:anonymous" cannot watch resource "events" in API group "" at the cluster scope

java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden'
    at okhttp3.internal.ws.RealWebSocket.checkUpgradeSuccess(RealWebSocket.java:224) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.ws.RealWebSocket$1.onResponse(RealWebSocket.java:195) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:174) ~[okhttp-3.14.9.jar!/:na]
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[okhttp-3.14.9.jar!/:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[na:na]
    at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]

This repeats forever until the service is restarted. Any idea how to fix this?

@rohanKanojia (Member)

Is this happening only with version 4.12.0, or did it happen with previous versions too? How much time does it usually take for the client to start throwing these errors?

@rtsio (Contributor, Author) commented Nov 19, 2020

@rohanKanojia I have only used 4.12.0; I can test previous versions if necessary. The most recent time, this happened about an hour after starting the service, but there seems to be no pattern and it happens randomly.

Would switching to informers solve intermittent network issues? I have read they can gracefully handle these sorts of conditions.

@rohanKanojia (Member)

Informers are able to handle 410 HTTP_GONE errors, but right now I'm not sure what the root cause of your watch terminating unexpectedly is.

@rtsio (Contributor, Author) commented Dec 19, 2020

Okay, I finally had time to look at this and identified the issue, but I'm still thinking about the best way to solve it. This is probably related to #2112 and somewhat to #2111.

I am using GKE, authenticating via a Google service account and an OAuth token (the recommended path for "bot" users). Once the initial token in the KubernetesClient expires after one hour, it is never refreshed, so the Watch starts erroring out.

First off, here is my code that creates the Watch:

        String oauthToken = KubeUtils.getGcloudToken();
        String kubectlContext = KubeUtils.getGkeContextName(...);
        Config config = Config.autoConfigure(kubectlContext);
        config.setOauthToken(oauthToken);

        final CountDownLatch isWatchClosed = new CountDownLatch(1);
        try (KubernetesClient client = new DefaultKubernetesClient(config)) {

            Watch watch = client.v1().events().inAnyNamespace().watch(new Watcher<Event>() {
            ....

I create a Config object using autoConfigure() and manually set the OAuth token by calling a gcloud wrapper. For some background: this is the only way I found to get fabric8 working with GKE authentication. If you follow the recommended steps here, your initial kubeconfig for a GKE service account will look something like this:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: ...
    server: https://1.2.3.4
  name: gke_test-google-project_europe-west4-c_test-cluster
contexts:
- context:
    cluster: gke_test-google-project_europe-west4-c_test-cluster
    user: gke_test-google-project_europe-west4-c_test-cluster
  name: gke_test-google-project_europe-west4-c_test-cluster
current-context: ""
kind: Config
preferences: {}
users:
- name: gke_test-google-project_europe-west4-c_test-cluster
  user:
    auth-provider:
      config:
        cmd-args: config config-helper --format=json
        cmd-path: /usr/lib/google-cloud-sdk/bin/gcloud
        expiry-key: '{.credential.token_expiry}'
        token-key: '{.credential.access_token}'
      name: gcp

This is suitable for kubectl, which has a GCP authentication plugin that recognizes the gcp auth-provider type. Once you run any kubectl command, the plugin calls cmd-path with cmd-args and the kubeconfig is rewritten to look like this:

users:
- name: gke_test-google-project_europe-west4-c_test-cluster
  user:
    auth-provider:
      config:
        access-token: ya29.a0A...
        cmd-args: config config-helper --format=json
        cmd-path: /usr/lib/google-cloud-sdk/bin/gcloud
        expiry: "2020-12-12T01:34:34Z"
        expiry-key: '{.credential.token_expiry}'
        token-key: '{.credential.access_token}'
      name: gcp

So kubectl takes care of generating the access-token and refreshing it, using expiry, by calling the gcloud utility when necessary. fabric8 doesn't do this: at most, it will parse the access-token field if it's already present, so you must always make some no-op kubectl call before creating your Config to ensure a fresh access-token is there. Another option is to call the gcloud utility directly and then use config.setOauthToken(), which is what I did (roughly the sketch below). This works fine for single requests, but obviously not for a long-running KubernetesClient used by a Watch.
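
For reference, my gcloud wrapper does roughly the following (a simplified sketch with no error handling; it assumes gcloud is on the PATH, the JSON field names come from the token-key/expiry-key paths in the kubeconfig above, and Jackson is just what I happen to use for parsing):

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class KubeUtils {
  // Shells out to `gcloud config config-helper --format=json` and extracts
  // {.credential.access_token}, i.e. the same value kubectl writes into access-token.
  public static String getGcloudToken() throws Exception {
    Process process = new ProcessBuilder("gcloud", "config", "config-helper", "--format=json").start();
    try (java.io.InputStream stdout = process.getInputStream()) {
      JsonNode json = new ObjectMapper().readTree(stdout);
      return json.get("credential").get("access_token").asText();
    }
  }
}

The token is then handed to the client via config.setOauthToken(KubeUtils.getGcloudToken()), as in the snippet above.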

I also looked through OIDCTokenRefreshInterceptor and OpenIDConnectionUtils, which will refresh OpenID tokens automatically, but these are obviously different from OAuth, and GKE doesn't actually support OpenID. resolveOIDCTokenFromAuthConfig() will look for id-token and refresh-token but not access-token, so I cannot just write out the OAuth token to the kubeconfig on disk and wait for OIDCTokenRefreshInterceptor to pick it up.

The only solution I can think of right now is to tear down each Watch every hour and re-create it with a new Config/KubernetesClient before the token expires (roughly the sketch below). Maybe there's a way to mutate the KubernetesClient used by a Watch without recreating all the objects from scratch. Do you have any ideas, @rohanKanojia?
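
To illustrate, the hourly tear-down/re-create workaround would look roughly like this (sketch only; it reuses kubectlContext/KubeUtils from the snippet above, MyEventWatcher is a placeholder for the real Watcher, and the 50-minute interval is only chosen to stay under the 1-hour token lifetime):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
AtomicReference<KubernetesClient> currentClient = new AtomicReference<>();

scheduler.scheduleAtFixedRate(() -> {
  try {
    KubernetesClient previous = currentClient.getAndSet(null);
    if (previous != null) {
      previous.close(); // closing the client tears down its connections, including the watch
    }
    Config config = Config.autoConfigure(kubectlContext);
    config.setOauthToken(KubeUtils.getGcloudToken()); // fresh OAuth token from gcloud
    KubernetesClient client = new DefaultKubernetesClient(config);
    client.v1().events().inAnyNamespace().watch(new MyEventWatcher());
    currentClient.set(client);
  } catch (Exception e) {
    e.printStackTrace(); // placeholder error handling
  }
}, 0, 50, TimeUnit.MINUTES);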

@rtsio changed the title from "IllegalStateException in WatchConnectionManager causes exception loop" to "Watch starts receiving HTTP 401 after KubernetesClient OAuth expires" on Dec 19, 2020
@rtsio changed the title from "Watch starts receiving HTTP 401 after KubernetesClient OAuth expires" to "Watch starts receiving HTTP 401 after KubernetesClient OAuthToken expires" on Dec 19, 2020
@rohanKanojia (Member)

Oh, it looks a bit similar to client-go-credential-plugins. I think we should support generating and refreshing tokens for the GCP auth provider like kubectl does.

@rtsio (Contributor, Author) commented Dec 21, 2020

Well, there does seem to be support in the lib for the exec format linked above, but the gcloud utility does not use this yet (see https://github.com/jglick/gke-exec-credential as a possible workaround). I do have an idea to fix this: 1) make OIDCTokenRefreshInterceptor look for the oidc AuthProvider, rather than assuming all 401 responses are related to OpenID, and 2) add a new Interceptor that attempts to resolve 401s by simply reloading the Config from disk, similar to how the OIDC Interceptor does it (rough sketch below). If this sounds reasonable, I can submit a PR, @rohanKanojia?
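
A rough sketch of what I have in mind for (2): an OkHttp interceptor that, on a 401, re-reads the kubeconfig from disk and retries once with whatever token is there. The class name and the wiring into the client are illustrative only, not the final PR:

import java.io.IOException;
import okhttp3.Interceptor;
import okhttp3.Request;
import okhttp3.Response;
import io.fabric8.kubernetes.client.Config;

public class ReloadConfigOn401Interceptor implements Interceptor {
  @Override
  public Response intercept(Chain chain) throws IOException {
    Request request = chain.request();
    Response response = chain.proceed(request);
    if (response.code() == 401) {
      // Re-read the kubeconfig; if kubectl/gcloud has refreshed the token on disk,
      // retry the request once with the new credentials.
      Config refreshed = Config.autoConfigure(null);
      if (refreshed.getOauthToken() != null) {
        response.close();
        Request retried = request.newBuilder()
            .header("Authorization", "Bearer " + refreshed.getOauthToken())
            .build();
        return chain.proceed(retried);
      }
    }
    return response;
  }
}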

@rohanKanojia (Member)

@rtsio: We would really appreciate it if you could submit a PR :-)

@philmtd commented Jan 13, 2021

We have the same issue with GCP authentication.

> I think we should support generating and refreshing tokens for the GCP auth provider like kubectl does.

That would be great!

@andreaTP (Member)

Same issue here; fabric8 is not refreshing the credentials.
Looking forward to the fix!
Very likely related as well:
fabric8io/fabric8-maven-plugin#1349

@rtsio changed the title from "Watch starts receiving HTTP 401 after KubernetesClient OAuthToken expires" to "Watch starts receiving HTTP 401 after GKE/OAuth token expires" on Jan 15, 2021
rtsio added a commit to rtsio/kubernetes-client that referenced this issue Jan 18, 2021
@rtsio (Contributor, Author) commented Jan 18, 2021

Submitted a PR. Just to be clear, I'm going for the simplest solution with the fewest changes; providing a full GCP auth provider that automatically refreshes credentials is out of scope (it is not trivial to add). My proposed solution simply checks the latest kubeconfig available for a valid token and updates the HTTP client to use that token. You will still have to keep the token in your kubeconfig valid via a separate thread, for example by calling the gcloud utility or using the Google OAuth library (see my previous comment and kubernetes-client/java#290); a sketch of such a refresher is below.
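
For anyone hitting this in the meantime, the "separate thread" can be as simple as periodically running any kubectl command that touches the API server, so kubectl's gcp auth-provider rewrites access-token/expiry in the kubeconfig for the client to pick up. A sketch with arbitrary names and interval; kubectl version is just one cheap call that authenticates against the cluster:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class KubeconfigTokenRefresher {
  public static void start() {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleAtFixedRate(() -> {
      try {
        // Any no-op call against the cluster forces the gcp auth-provider to refresh
        // the access-token stored in the kubeconfig when it is close to expiry.
        new ProcessBuilder("kubectl", "version").inheritIO().start().waitFor();
      } catch (Exception e) {
        e.printStackTrace(); // placeholder error handling
      }
    }, 0, 30, TimeUnit.MINUTES);
  }
}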
