Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inability to pass env vars to native engine breaks AWS authentication for assumed roles #20624

Open
serjx86 opened this issue Feb 27, 2024 · 11 comments · May be fixed by #20667
Open

Inability to pass env vars to native engine breaks AWS authentication for assumed roles #20624

serjx86 opened this issue Feb 27, 2024 · 11 comments · May be fixed by #20667
Labels

Comments

@serjx86
Copy link

serjx86 commented Feb 27, 2024

Describe the bug

TL;DR

There has to be a way to propagate env vars to native code. Either it exists and we did not figure out how to do it, or it needs to be added, otherwise native code is not implementing AWS authentication routine correctly (env vars should take precedence over other auth methods).


We were successfully using resource targets with http_source pointing to s3 URI. However, recently we were experimenting with permissions management on our CI and decided to use AWS IAM role chaining to avoid duplicating policies for different build environments that were running same builds.

For the best of my knowledge, credentials for assumed role can be consumed either via env vars of updating AWS profile configs in user's home dir. On CI we naturally don't use local config profiles, so the only way to switch to assumed role is by making an API call to assume the role and setting the session credentials we get back in env vars, something to the effect of

export $(printf "AWS_ACCESS_KEY_ID=%s AWS_SECRET_ACCESS_KEY=%s AWS_SESSION_TOKEN=%s" \
$(aws sts assume-role \
--role-arn arn:aws:iam::123456789012:role/MyAssumedRole \
--role-session-name MySessionName \
--query "Credentials.[AccessKeyId,SecretAccessKey,SessionToken]" \
--output text))

(stolen form stackoverflow as an example)

Our setup for CI env we were testing is as follows. EC2 instance has an IAM role profile A attached, the role A does not have s3 access. Profile role A is granted permission to assume role B. Role B has s3 access.

In testing this setup we saw that after successfully assuming the role and setting env vars for role B, pants ignored the env vars and instead queried the metadate server, which gave it credentials for role A. (Which is perfectly normal, metadata server has no knowledge of locally assumed roles, as I understand it).

Native engine is not the only pants component we rely on to access s3 URIs. We also have a plugin that relies on backend.url_handlers.s3. Looking at debug logs for botocore used by backend.url_handlers.s3 we saw that env vars credentials were checked first, but then metadata server was still queried. Using this clue, we set

[subprocess-environment]
env_vars = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
]

and we saw that indeed, this fixed the issue for backend.url_handlers.s3 and the credentials from env vars took precedence over metadata server issued credentials.

However, we still could not get resource targets with http_source pointing to s3 URIs to work. Looking at the debug logs we noted that AWS_... variables were logged for for PEX processes that were used by our plugin, but these variables were not reported for calls used to resolve s3 resource targets. We thus concluded that env vars are not propagated to the native code that resolves http_source.

We are not sure if there is already a way to pass variable to the native code or if this is a bug. And we believe this is a bug (and not a feature request), because this makes native code's implementation of AWS authentication incorrect: credentials in env vars, if present, should take precedence over metadata server issued credentials.

Pants version
2.19.0rc5

OS
Linux (Ubuntu Focal)

Additional info

Not sure how to reproduce, since the problem here is something is not happening. The fix, however, should be easy to observe: an env var specifically set/forwarded to native code should be logged in debug logs as passed to the process, even if that env var is not used for anything in the code itself.

@serjx86 serjx86 added the bug label Feb 27, 2024
@serjx86
Copy link
Author

serjx86 commented Feb 27, 2024

One thing that nags me is whether these env vars are going to be used to calculate cache keys. If they are, we are going to be invalidating cache for everything every time, these creds are set anew for every build on CI.

@huonw
Copy link
Contributor

huonw commented Feb 27, 2024

Thanks for filing an issue!

@thejcannon I know you have done a bunch of work with files and http_sources; do you have insight about how to handle this?

@thejcannon
Copy link
Member

I'm on mobile, so I can't be as in depth.

The native code responsible for the underlying resource handling is DownloadFile. It's logically just a GET request, so env vars are largely irrelevant.

The url_handlers are responsible for taking a URL and turning it into a DownloadFile, with optional (not cached) headers.

So I think you should poke at those bits. See how your URI is being translated into a URL+headers.

@benjyw
Copy link
Sponsor Contributor

benjyw commented Feb 28, 2024

Related: [subprocess-environment] is a lie - it only applies to pex processes, and by its own documentation it should apply to all processes (see #14809). But if we're doing a GET request to S3 instead of going through an AWS API then that will defeat authentication regardless of the subprocess issue.

@serjx86
Copy link
Author

serjx86 commented Feb 29, 2024

@benjyw , yes, I concur on a lie = )
I also don't know where too look for rust code that does AWS auth. At least in Python I can see botocore is used to talk to s3, so I know that the auth protocol will be followed. In rust, even if env vars are passed, will they be used?? So I am not sure if fixing the subprocess thing will be enough to make rust s3 code behave correctly

@serjx86
Copy link
Author

serjx86 commented Feb 29, 2024

@thejcannon, so it seems what you are saying the rust gets a GET request that is already authenticated and signed? so then the question is what code does authentication and signing, because that is where the env vars vs other types of credentials come into play

@thejcannon
Copy link
Member

This class I think spells out most of the pieces:

class URLDownloadHandler:

And here's the Pants-provided S3 URL Backend implementation:

async def download_from_s3(request: S3DownloadFile, aws_credentials: AWSCredentials) -> Digest:

So:

  • Pants sees a URL

  • Pants sees if any of the registered URL Handlers "claim" based on scheme/authority/etc...

  • If so, now they are responsible for downloading the URL

  • They can do that however they want, but they probably want to issue a NativeDownloadFile request

    • Which is equivalent to a "cached" URL GET
    • Where the cache is based on URL+expected checksum, but not headers

    Note that that implementation doesn't take into account any env vars. You might want to roll your own implementation which does so before importing/invoking botocore, or make a PR to Pants to this backend to see if those have been provided via one channel or another.

@serjx86
Copy link
Author

serjx86 commented Feb 29, 2024

thank you @thejcannon!

by the looks of it, botocore will do the authentication, so if the env vars were set for this process, it would all work just fine. Then the question is, why aren't the env vars from [subprocess-environment] propagated to this invocation?

and since we have the same python code working fine in one of our plugins pulling pre-built artifacts from s3 without using resource\http_source, that corroborates my assumption even farther

@thejcannon
Copy link
Member

This isn't a process. It's a rule.

@serjx86
Copy link
Author

serjx86 commented Mar 7, 2024

can we do anything to let the rule see the env vars?

@thejcannon
Copy link
Member

You can request them as a rule input, and see where you can pass their values into botocore.

Grep around for CompleteEnvironmentVars (I think that's the name)

@TansyArron TansyArron linked a pull request Mar 13, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants