[EKS 1.21/1.22][BoundServiceAccountTokenVolume] Refresh AWS service account tokens automatically #74

Closed
prashil-g opened this issue May 19, 2022 · 9 comments · Fixed by #80

@prashil-g

Describe the bug
I am using logzio/logzio-fluentd:1.0.2 on AWS EKS 1.21 and received the following email from AWS:

`Description
We have identified applications running in one or more of your Amazon EKS clusters that are not refreshing service account tokens. Applications making requests to Kubernetes API server with expired tokens will fail. You can resolve the issue by updating your application and its dependencies to use newer versions of Kubernetes client SDK that automatically refreshes the tokens.

What is the problem?

Kubernetes version 1.21 graduated BoundServiceAccountTokenVolume feature [1] to beta and enabled it by default. This feature improves security of service account tokens by requiring a one hour expiry time, over the previous default of no expiration. This means that applications that do not refetch service account tokens periodically will receive an HTTP 401 unauthorized error response on requests to Kubernetes API server with expired tokens. You can learn more about the BoundServiceAccountToken feature in EKS Kubernetes 1.21 release notes [2].

To enable a smooth migration of applications to the newer time-bound service account tokens, EKS v1.21+ extends the lifetime of service account tokens to 90 days. Applications on EKS v1.21+ clusters that make API server requests with tokens that are older than 90 days will receive an HTTP 401 unauthorized error response.

We recommend that you update your applications and its dependencies that are using stale service accounts tokens to use one of the newer Kubernetes Client SDKs that refetches tokens.

If the service account token used is close to expiry (<90 days) and you do not have sufficient time to update your client SDK versions before expiry, then you can terminate existing pods and create new ones. This results in refetching of the service account token, giving you additional time (90 days) to update your client SDKs.
`
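To illustrate what "refetching" the token means in practice, here is a minimal Python sketch (not taken from fluentd or the kubernetes_metadata_filter plugin, which are written in Ruby). It assumes the token is mounted at the standard in-pod path and that re-reading the file once a minute is often enough; the class name `RefreshingToken` and the 60-second interval are illustrative choices, not part of any SDK.

```python
import time
from pathlib import Path

# Standard mount location of the projected service account token inside a pod.
TOKEN_PATH = Path("/var/run/secrets/kubernetes.io/serviceaccount/token")


class RefreshingToken:
    """Caches the token and re-reads it from disk once the cached copy is too old.

    Hypothetical helper for illustration; max_age_seconds is an assumed value.
    """

    def __init__(self, path=TOKEN_PATH, max_age_seconds=60.0, clock=time.monotonic):
        self._path = Path(path)
        self._max_age = max_age_seconds
        self._clock = clock
        self._token = None
        self._read_at = None

    def get(self):
        now = self._clock()
        if self._token is None or now - self._read_at >= self._max_age:
            # The kubelet rotates the file contents, so a fresh read
            # picks up the current token.
            self._token = self._path.read_text().strip()
            self._read_at = now
        return self._token

    def auth_header(self):
        # Build the header for each request instead of storing it at startup.
        return {"Authorization": f"Bearer {self.get()}"}
```

The point of the sketch is that the token is read at request time rather than once at process start, which is the behavior the email says older clients lack.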

A similar issue has also been reported for fluent-bit (fluent/fluent-bit#5445), but I didn't see anything in the fluentd GitHub repository.

To Reproduce

Deploy a logzio/logzio-fluentd:1.0.2 pod on AWS EKS 1.21

Expected behavior
I think the fluentd pod should automatically refresh its IRSA credentials.

Your Environment

Version used:
AWS EKS 1.21
logzio/logzio-fluentd:1.0.2

@prashil-g prashil-g changed the title Refresh AWS service account tokens automatically [EKS 1.21/1.22][BoundServiceAccountTokenVolume] Refresh AWS service account tokens automatically May 19, 2022
@mirii1994
Contributor

Hi @prashil-sophos, thanks for reaching out!
Our solution uses the fluentd plugin kubernetes_metadata_filter. This plugin is the component in our integration that talks to the Kubernetes API.
There's an open issue about this in the plugin's repo, and it seems there's a PR in progress to solve it.
We're following both the issue and the PR. Once a fix is released, we will use it.

@prashil-g
Author

Any update on this?
I see that the fluent-bit fix (fluent/fluent-bit#5445) shipped with 1.9.4.

@mirii1994
Contributor

@prashil-g the kubernetes_metadata_filter plugin is external to fluentd, and is not related to fluent-bit.
The PR I mentioned in my previous comment is still in progress, and I noticed that another PR to fix this issue was created.
Once a new version with a fix is released, we'll update our dependencies.

@mirii1994
Contributor

Hi @prashil-g ,
The plugin that caused your issue released a new version (2.11.1) that should handle that bug.
We created a new image (logzio/logzio-fluentd:1.1.1) with the updated plugin.
Please upgrade to our latest version, and let us know if this solves your issue.

@prashil-g
Author

Thanks @mirii1994, I'll try upgrading to logzio/logzio-fluentd:1.1.1.

@prashil-g
Author

prashil-g commented Oct 18, 2022

After updating to the new version, we see a surge of the following type of log entries in CloudWatch:

{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Request","auditID":"b8eb4e26-38fb-4e51-8fa9-92da00a750b8","stage":"ResponseStarted","requestURI":"/api/v1/watch/pods?resourceVersion=53014397","verb":"watch","user":{},"sourceIPs":["10.0.83.63"],"userAgent":"http.rb/4.4.1","objectRef":{"resource":"pods","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","reason":"Unauthorized","code":401},"requestReceivedTimestamp":"2022-10-18T07:32:35.092761Z","stageTimestamp":"2022-10-18T07:32:35.094359Z"}
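For anyone trying to measure how many of these entries a cluster produces, the following Python sketch counts 401 responses per user agent and source IP. It assumes the audit events have been exported from CloudWatch as one JSON object per line; the function name `count_unauthorized` is hypothetical, and only fields visible in the entry above are used.

```python
import json
from collections import Counter


def count_unauthorized(lines):
    """Count audit events with a 401 response, keyed by (userAgent, source IP)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        event = json.loads(line)
        if event.get("responseStatus", {}).get("code") != 401:
            continue
        agent = event.get("userAgent", "unknown")
        for ip in event.get("sourceIPs", ["unknown"]):
            counts[(agent, ip)] += 1
    return counts
```

Grouping by user agent helps confirm which client (here `http.rb/4.4.1`) is sending the rejected requests.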

The pod is also printing the following message continuously:
2022-10-18 07:52:44 +0000 [info]: #0 [filter_kube_metadata] Encountered '401 Unauthorized' exception in watch, recreating client to refresh token
2022-10-18 07:52:44 +0000 [info]: [filter_kube_metadata] Encountered '401 Unauthorized' exception in watch, recreating client to refresh token
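The log message describes a "recreate the client on 401" pattern. A rough Python sketch of that pattern is shown here for readers unfamiliar with it; it is not the plugin's code. `Unauthorized`, `watch_with_refresh`, `make_client`, and `read_token` are hypothetical names, and the retry cap and linear backoff are assumptions added so that repeated 401 responses cannot loop without pause.

```python
import time


class Unauthorized(Exception):
    """Stand-in for a client library's HTTP 401 error (hypothetical)."""


def watch_with_refresh(make_client, read_token, handle_event,
                       max_recreations=3, backoff_seconds=1.0, sleep=time.sleep):
    """Run a watch; on 401, rebuild the client with a freshly read token.

    make_client(token) must return an object whose watch() yields events.
    """
    recreations = 0
    while True:
        # Read the token again for every new client so a rotated token is used.
        client = make_client(read_token())
        try:
            for event in client.watch():
                handle_event(event)
            return  # the watch ended normally
        except Unauthorized:
            recreations += 1
            if recreations > max_recreations:
                raise  # give up instead of retrying forever
            sleep(backoff_seconds * recreations)  # linear backoff between attempts
```

If the freshly read token is also rejected, a loop without the cap and backoff would produce exactly the kind of continuous 401 traffic seen in the audit log.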

@mirii1994
Contributor

@prashil-g I see that there was a PR in the plugin's repo to reinitialize the pod watcher on a 401.
I'll create a new version of our image that uses the plugin's latest version; I think that should solve it.

@prashil-g
Author

Thanks @mirii1994 for the quick response, as always!

@mirii1994
Contributor

@prashil-g logzio/logzio-fluentd:1.2.0 is out. Please use that version.
If the issue still occurs, let us know :)
