1.13.2: aws-iam-authenticator now finding /root/.aws/config #2989

Open
nairb774 opened this issue Apr 7, 2023 · 2 comments
Labels
area/core Issues core to the OS (variant independent) status/icebox Things we think would be nice but are not prioritized type/bug Something isn't working


nairb774 commented Apr 7, 2023

Image I'm using:

Upgrading Bottlerocket OS 1.13.1 (aws-k8s-1.22) to Bottlerocket OS 1.13.2 (aws-k8s-1.22)

What I expected to happen:

A successful upgrade.

What actually happened:

Nodes failed to successfully authenticate with the cluster.

How to reproduce the problem:

I apologize that this might be long, and that the config/goals might feel a little odd.

Start nodes with the following configuration:

[settings.aws]
config = "W3Byb2ZpbGUgZGVmYXVsdF0KY3JlZGVudGlhbF9zb3VyY2UgPSBFYzJJbnN0YW5jZU1ldGFkYXRhCnJvbGVfYXJuID0gYXJuOmF3czppYW06OjEyMzQ1Njc4OTAxMjpyb2xlL015RXh0cmFSb2xlCg=="
profile = "default"
region = "us-east-1"

Where the settings.aws.config is something like:

[profile default]
credential_source = Ec2InstanceMetadata
role_arn = arn:aws:iam::123456789012:role/MyExtraRole
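For reference, the `settings.aws.config` value above is the base64 encoding of this config file text. A minimal Python sketch of the round trip, assuming standard base64 and a trailing newline at the end of the file (the function and constant names are illustrative, not part of Bottlerocket):

```python
import base64

# Config file text from this issue, including the trailing newline.
CONFIG_TEXT = (
    "[profile default]\n"
    "credential_source = Ec2InstanceMetadata\n"
    "role_arn = arn:aws:iam::123456789012:role/MyExtraRole\n"
)


def encode_config(text: str) -> str:
    """Return the base64 string to place in settings.aws.config."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_config(encoded: str) -> str:
    """Recover the config file text from a settings.aws.config value."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    print(encode_config(CONFIG_TEXT))
```

Decoding the setting on a node that misbehaves is a quick way to confirm which profile name the config actually defines.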

The role attached to the instance and the additional role referenced above both have the permissions needed for the assume-role call to succeed. Additionally, the instance role and the extra role have the same set of policies attached to them (the required AmazonEKSWorkerNodePolicy and AmazonEC2ContainerRegistryReadOnly).

When a v1.13.1 instance starts up, the older aws-iam-authenticator v0.6.2 uses the instance's role to generate the token needed to authenticate with the control plane. Version 1.13.2 ships aws-iam-authenticator v0.6.8, which discovers the /root/.aws/config file and uses the default profile to create the token for authenticating the node with the control plane. This change in behavior is due to aws/aws-sdk-go#4519, which was picked up in the aws-sdk-go library upgrades. That PR adds a fallback that reads the home directory from /etc/passwd when the HOME environment variable is not set.
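The lookup change described above can be sketched roughly as follows. This is an illustration of the described behavior in Python, not the aws-sdk-go code; `shared_config_path`, `passwd_home`, and the `use_passwd_fallback` switch are hypothetical names used only to contrast the old and new behavior.

```python
import os
import pwd  # Unix-only; reads the passwd database (/etc/passwd on most systems)
from typing import Callable, Mapping, Optional


def passwd_home() -> Optional[str]:
    """Look up the current user's home directory in the passwd database."""
    try:
        return pwd.getpwuid(os.getuid()).pw_dir
    except KeyError:
        return None


def shared_config_path(
    env: Mapping[str, str],
    lookup_home: Callable[[], Optional[str]] = passwd_home,
    use_passwd_fallback: bool = True,
) -> Optional[str]:
    """Return the path where the shared AWS config would be searched for.

    With use_passwd_fallback=False this models the older behavior: no HOME
    in the environment means no home directory, so no config file is found.
    """
    home = env.get("HOME")
    if not home and use_passwd_fallback:
        home = lookup_home()
    if not home:
        return None
    return os.path.join(home, ".aws", "config")
```

Under this model, a process started as root without HOME set resolves nothing when the fallback is off, and resolves /root/.aws/config when it is on, which matches the difference seen between the two releases.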

The assumed role is not able to authenticate with the control plane, even if the aws-auth ConfigMap is updated to support the different role, because the SessionName used by the assumed role does not contain the instance ID.
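To make the session name problem concrete, here is a minimal sketch of pulling the session name out of an STS assumed-role ARN and checking whether it looks like an EC2 instance ID. The regular expressions and function names are assumptions for illustration; the actual check inside aws-iam-authenticator may differ.

```python
import re
from typing import Optional

# arn:aws:sts::<account>:assumed-role/<role-name>/<session-name>
ASSUMED_ROLE_ARN = re.compile(
    r"^arn:aws[\w-]*:sts::(?P<account>\d{12}):"
    r"assumed-role/(?P<role>[^/]+)/(?P<session>[^/]+)$"
)

# EC2 instance IDs are "i-" followed by 8 or 17 lowercase hex characters.
INSTANCE_ID = re.compile(r"^i-(?:[0-9a-f]{8}|[0-9a-f]{17})$")


def session_name(arn: str) -> Optional[str]:
    """Return the session name portion of an assumed-role ARN, if any."""
    match = ASSUMED_ROLE_ARN.match(arn)
    return match.group("session") if match else None


def instance_id_from_arn(arn: str) -> Optional[str]:
    """Return the session name only when it has the shape of an instance ID."""
    name = session_name(arn)
    if name and INSTANCE_ID.match(name):
        return name
    return None
```

Credentials obtained directly from the instance role carry the instance ID as the session name, so a lookup like this succeeds. A role assumed through the config file gets a session name chosen by the SDK, so there is no instance ID to resolve to a private DNS name.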

To get the instance to connect correctly, the AWS config file can be changed to use a profile with a name other than default. The use of default was suggested in #2885 (comment). With #2904 and #2924, we were able to go back to using a name other than default, which works.

I can see a few possible resolutions to this issue, and I am mostly filing it to get feedback and consideration.

  • Using a non-default profile is the best option. In other words, this isn't an issue, but just an odd interaction of components.
  • Support getting the aws-iam-authenticator to set the session name to the right value so that the EC2PrivateDNSName substitution works.
  • Implement a more direct approach than the AWS config for achieving "kubelet: Support running under a separate role" (#1624).
  • Some other option I'm not seeing.

It would be nice if we could get the aws-iam-authenticator to use the separated role for auth, but since that is just identity auth it isn't as immediately interesting as being able to remove all the policies attached to the instance role.

References:

@nairb774 nairb774 added status/needs-triage Pending triage or re-evaluation type/bug Something isn't working labels Apr 7, 2023
@stmcginnis
Contributor

> With #2904 and #2924 we were able to go back to using a name other than default which works.

I'm glad there's a workaround! Though still not a great situation...

I'll see if I can add a note to the setting for now that default may cause unexpected behavior with other AWS services. I wouldn't consider that a "fix", but hopefully it's enough of a breadcrumb to help avoid getting in this situation for some.

@stmcginnis stmcginnis added area/core Issues core to the OS (variant independent) status/icebox Things we think would be nice but are not prioritized and removed status/needs-triage Pending triage or re-evaluation labels Apr 7, 2023
Author

nairb774 commented Apr 7, 2023

Thanks for at least considering this odd use case. I really should find some time to see if a proper solution for #1624 could be implemented, and get out of the business of trying to do this indirectly and causing the self-inflicted pain. :)

Really appreciate the experience of filing issues here, even if they are sometimes self-induced foot-guns. Makes me feel good about using Bottlerocket in production and gives me lots of confidence in the project. Thanks.
