
S3 error : Unable to locate credentials #558

Closed
jmleoni opened this issue Nov 19, 2021 · 28 comments · Fixed by #622

Comments

@jmleoni
Contributor

jmleoni commented Nov 19, 2021

Hello,

Since aiohttp v3.8.1 was deployed on conda-forge three days ago, we have been encountering the following errors while trying to use s3fs with IAM-role-based credentials on AWS EC2 servers:

  File "fastparquet/api.py", line 132, in __init__
  File "fsspec/asyn.py", line 91, in wrapper
  File "fsspec/asyn.py", line 71, in sync
  File "fsspec/asyn.py", line 25, in _runner
  File "s3fs/core.py", line 1128, in _isdir
  File "s3fs/core.py", line 580, in _lsdir
  File "aiobotocore/paginate.py", line 32, in __anext__
  File "aiobotocore/client.py", line 142, in _make_api_call
  File "aiobotocore/client.py", line 161, in _make_request
  File "aiobotocore/endpoint.py", line 77, in _send_request
  File "aiobotocore/endpoint.py", line 71, in create_request
  File "aiobotocore/hooks.py", line 27, in _emit
  File "aiobotocore/signers.py", line 16, in handler
  File "aiobotocore/signers.py", line 63, in sign
  File "botocore/auth.py", line 373, in add_auth
botocore.exceptions.NoCredentialsError: Unable to locate credentials

It seems that pinning aiohttp to version 3.7.4.post0 solves this issue.

Note also that we are not encountering the same errors with the same codebase for tasks running on Fargate instead of EC2.

I guess it would be good to investigate what happened with aiohttp (which is required by aiobotocore).

Any better solution than pinning the aiohttp version is welcome!
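
For reference, a minimal sketch of the call pattern that hits the traceback above (the bucket and path are hypothetical; no explicit keys are configured, so credentials must come from the instance's IAM role):

import fastparquet

# Hypothetical dataset path; on EC2 the only credential source is the
# instance's IAM role (resolved via the instance metadata service).
pf = fastparquet.ParquetFile("s3://my-bucket/path/to/dataset.parquet")
df = pf.to_pandas()  # raised NoCredentialsError with aiohttp 3.8.x installed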

@martindurant
Member

Please cross-post to aiobotocore and/or botocore - they might be aware of this.

@martindurant
Member

PS: worth trying whether using boto directly works, or causes the same exception.

@achimgaedke
Contributor

I stumbled into this issue while using S3 for a dvc remote storage.

I isolated the failure to s3fs; an equivalent test with boto3 did not fail - test code below. Changing the requirement to aiohttp<3.8.0 solves the problem.

By the way, a very similar problem is described in iterative/dvc#6899 - and apparently fixed upstream?!

import unittest


class test_list_bucket(unittest.TestCase):

    test_bucket = "test-bucket-achim"
    test_prefix = "/"

    def test_s3fs(self):
        import s3fs
        fs = s3fs.S3FileSystem()
        fs.ls(f"s3://{self.test_bucket}/{self.test_prefix}")

    def test_boto3(self):

        import boto3
        s3c = boto3.client("s3")
        s3c.list_objects(Bucket=self.test_bucket, Prefix=self.test_prefix)

@martindurant
Member

The linked issue seems to be around azure-blob, not s3 - are you sure it is similar? If yes, please tag a couple of the people there to see if they can help.

Note that aiohttp is now at 3.8.1; can you generate a set of versions which do/do not work?

@achimgaedke
Contributor

achimgaedke commented Nov 22, 2021

Hey @martindurant

Re the DVC issue: I noticed that. It is similar from an "end-user" perspective - using DVC and s3fs and getting the same error. I've added the word "surprisingly"; I can't say whether this is a red herring.

Re the version comparison: The setup I'm debugging is a bit more complex, so I will boil it down over the next day or two. Here is a diff between the two rather convoluted conda environments:

diff s3test_pass.txt s3test_fail.txt 
11c11
< aiohttp                   3.7.4.post0      py39h3811e60_1    conda-forge
---
> aiohttp                   3.8.1            py39h3811e60_0    conda-forge
13a14
> aiosignal                 1.2.0              pyhd8ed1ab_0    conda-forge
21c22
< async-timeout             3.0.1                   py_1000    conda-forge
---
> async-timeout             4.0.1              pyhd8ed1ab_0    conda-forge
108a110
> frozenlist                1.2.0            py39h3811e60_1    conda-forge

The passing environment contains the aiohttp<3.8.0 requirement. aiohttp<3.8.1 is not sufficient to make the test pass.

The failing test produced this traceback:

======================================================================
ERROR: test_s3fs (test_s3fs.test_list_bucket)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/notebooks/test_s3fs.py", line 12, in test_s3fs
    fs.ls(f"s3://{self.test_bucket}/{self.test_prefix}")
  File "/opt/conda-envs/envs/s3test/lib/python3.9/site-packages/fsspec/asyn.py", line 91, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/opt/conda-envs/envs/s3test/lib/python3.9/site-packages/fsspec/asyn.py", line 71, in sync
    raise return_result
  File "/opt/conda-envs/envs/s3test/lib/python3.9/site-packages/fsspec/asyn.py", line 25, in _runner
    result[0] = await coro
  File "/opt/conda-envs/envs/s3test/lib/python3.9/site-packages/s3fs/core.py", line 795, in _ls
    files = await self._lsdir(path, refresh)
  File "/opt/conda-envs/envs/s3test/lib/python3.9/site-packages/s3fs/core.py", line 578, in _lsdir
    async for i in it:
  File "/opt/conda-envs/envs/s3test/lib/python3.9/site-packages/aiobotocore/paginate.py", line 32, in __anext__
    response = await self._make_request(current_kwargs)
  File "/opt/conda-envs/envs/s3test/lib/python3.9/site-packages/aiobotocore/client.py", line 141, in _make_api_call
    http, parsed_response = await self._make_request(
  File "/opt/conda-envs/envs/s3test/lib/python3.9/site-packages/aiobotocore/client.py", line 161, in _make_request
    return await self._endpoint.make_request(operation_model, request_dict)
  File "/opt/conda-envs/envs/s3test/lib/python3.9/site-packages/aiobotocore/endpoint.py", line 77, in _send_request
    request = await self.create_request(request_dict, operation_model)
  File "/opt/conda-envs/envs/s3test/lib/python3.9/site-packages/aiobotocore/endpoint.py", line 70, in create_request
    await self._event_emitter.emit(event_name, request=request,
  File "/opt/conda-envs/envs/s3test/lib/python3.9/site-packages/aiobotocore/hooks.py", line 27, in _emit
    response = await handler(**kwargs)
  File "/opt/conda-envs/envs/s3test/lib/python3.9/site-packages/aiobotocore/signers.py", line 16, in handler
    return await self.sign(operation_name, request)
  File "/opt/conda-envs/envs/s3test/lib/python3.9/site-packages/aiobotocore/signers.py", line 63, in sign
    auth.add_auth(request)
  File "/opt/conda-envs/envs/s3test/lib/python3.9/site-packages/botocore/auth.py", line 373, in add_auth
    raise NoCredentialsError()
botocore.exceptions.NoCredentialsError: Unable to locate credentials

@raybellwaves
Contributor

raybellwaves commented Nov 22, 2021

If you pip install s3fs, does it work? (#553 (comment) and #554 (comment) for reference)

@achimgaedke
Contributor

achimgaedke commented Nov 22, 2021

I had problems with the pip installation as well. With both pip and conda, it worked with only the minimal environment definition and the latest versions of all packages, but stopped working with a richer (more realistic) environment.

So that's why I intend to provide an environment definition soon... I just don't have enough time at hand right now, sorry.

@achimgaedke
Contributor

A minimal conda environment to reproduce the error. NB: dvc and the version numbers seem to be critical.
The tests are run with mamba 0.18.1 and Ubuntu 20.04.

name: s3test
channels:
  - conda-forge
  - nodefaults
dependencies:
  # uncomment this pin to make the test pass
  # - aiohttp<3.8.0
  - boto3>=1.17
  - dvc>=2.8
  - dvc-s3>=2.8
  - python>=3.9
  - s3fs>=2021.0.0

The differences between pass and fail are:

$ diff s3test_pass.txt s3test_fail.txt 
7c7
< aiohttp                   3.7.4.post0     py310h6acc77f_1    conda-forge
---
> aiohttp                   3.8.1           py310h6acc77f_0    conda-forge
9a10
> aiosignal                 1.2.0              pyhd8ed1ab_0    conda-forge
11c12
< async-timeout             3.0.1                   py_1000    conda-forge
---
> async-timeout             4.0.1              pyhd8ed1ab_0    conda-forge
55a57
> frozenlist                1.2.0           py310h6acc77f_1    conda-forge

The environment lists are attached:

@achimgaedke
Contributor

achimgaedke commented Nov 23, 2021

A week ago I had problems with a pip-installed s3fs. I can't reproduce the error with a minimal pip requirements file similar to the conda environment.

The platform is Ubuntu 20.04 with Python 3.9 (installed with conda).

This is the requirements file:

boto3>=1.17
dvc[s3]>=2.8
s3fs>=2021.0.0

@achimgaedke
Contributor

name: s3test
channels:
  - conda-forge
dependencies:
  - mamba
  - pip
  - python>=3.9,<3.10
  - wheel
  - pip:
    - dvc[s3,ssh]>=2.8
    - s3fs>=2021.0.0

This environment fails as well.

@martindurant
Member

So only ever in conjunction with dvc?

@achimgaedke
Contributor

achimgaedke commented Nov 23, 2021

@martindurant - yes for me. @jmleoni any insight in this?

I believe it is due to a weird and unfortunate combination of libraries and their versions, so I'm trying to help by reporting what I find...

@jmleoni
Contributor Author

jmleoni commented Nov 23, 2021

We are trying to reproduce the issue in isolation from our application, with no luck so far.
What I can confirm is that it does not work with either aiohttp 3.8.0 or 3.8.1, but it works fine with version 3.7.4.post0.
I will keep you posted if we find a setup that can be exported and lets us reproduce the issue consistently.

@TheoJammes

Hello,
I work with @jmleoni and we tried investigating the issue. Unfortunately we could not reproduce it with a minimal Python script that just reads a Parquet file, but it occurs 100% of the time in our Python project, which has a rather large code base.

By attaching a debugger to the Python process running on a batch EC2 instance, we've discovered that the error does not appear when the code has been halted by breakpoints. Because of this, we suspect there is a race condition when AioCredentialsResolver#load_credentials() is called (see screenshot).
When there are no breakpoints it returns None, but if we put a breakpoint after the async call is made, it returns a value.

[Screenshot: debugger paused around AioCredentialsResolver#load_credentials(), showing the resolved credentials value]
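
To illustrate the kind of timing effect we suspect (purely a toy example, not aiobotocore's actual code): if the result of an async credential lookup is read before its coroutine has finished, the caller sees None, while any pause - such as a breakpoint - gives the lookup time to complete.

import asyncio

class ToyCredentialResolver:
    # Toy stand-in for an async credential provider (not aiobotocore code).
    def __init__(self):
        self.credentials = None

    async def load_credentials(self):
        await asyncio.sleep(0.1)  # simulates the metadata-service round trip
        self.credentials = {"key": "...", "secret": "..."}
        return self.credentials

async def main():
    resolver = ToyCredentialResolver()
    task = asyncio.create_task(resolver.load_credentials())
    print("before await:", resolver.credentials)  # still None - the "race"
    await task  # pausing here at a breakpoint has a similar effect
    print("after await:", resolver.credentials)

asyncio.run(main())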

@martindurant
Member

Thanks for the investigation and interesting finding. I'm not certain what to do about it. If it is indeed a timing issue, it suggests we could simply retry on a credentials error, or explicitly wait for the credentials provider to complete before attempting anything else.
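
For anyone who needs an immediate mitigation, a user-side retry wrapper along these lines is one option (just a sketch; the retry count, delay, and wrapped operation are arbitrary choices, not anything s3fs provides):

import time

import botocore.exceptions
import s3fs

def ls_with_retry(fs, path, attempts=5, delay=0.5):
    # Retry a listing if credential resolution loses the suspected race.
    for attempt in range(attempts):
        try:
            return fs.ls(path)
        except botocore.exceptions.NoCredentialsError:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)  # give the async credential lookup time to finish

fs = s3fs.S3FileSystem()
files = ls_with_retry(fs, "my-bucket/some/prefix")  # hypothetical bucket/prefix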

@raybellwaves
Contributor

@jmleoni any idea if this is fixed with the latest s3fs?
I can confirm my issue (reported in #554 (comment)) is fixed. However, I'll confess that when I installed today, it brought in 2021.11.0.

@garibarba

We are trying to reproduce the issue in isolation from our application, with no luck so far. What I can confirm is that it does not work with either aiohttp 3.8.0 or 3.8.1, but it works fine with version 3.7.4.post0. I will keep you posted if we find a setup that can be exported and lets us reproduce the issue consistently.

Indeed pip install "aiohttp<3.8" fixes it.

To reproduce in my case:
Using Python 3.9.7
pip install dvc[s3]

Then:

import s3fs
fs = s3fs.S3FileSystem()
fs.ls('my-bucket')

@achimgaedke
Contributor

achimgaedke commented Dec 5, 2021

Hey @garibarba - thanks for reproducing. I ran the same commands with Python 3.8, 3.9, and 3.10 - and "it works for me" even without "aiohttp<3.8":

python3 -m venv s3test
./s3test/bin/pip install -U pip wheel
./s3test/bin/pip install s3fs dvc[s3]
./s3test/bin/python

and then executed the following commands, replacing my-bucket with an existing bucket I have access permissions for:

import s3fs
fs = s3fs.S3FileSystem()
fs.ls('my-bucket')

Result: I get the bucket listing as expected.

I repeated the setup from #558 (comment) and it works as well. I am not convinced the diff is really helpful, but here it is:

diff s3test_updated_pass.txt s3test_fail.txt 
30c30,31
< charset-normalizer        2.0.8              pyhd8ed1ab_0    conda-forge
---
> chardet                   4.0.0           py310hff52083_2    conda-forge
> charset-normalizer        2.0.0              pyhd8ed1ab_0    conda-forge
34c35
< cryptography              36.0.0          py310h685ca39_0    conda-forge
---
> cryptography              35.0.0          py310h685ca39_2    conda-forge
37c38
< diskcache                 5.3.0              pyhd8ed1ab_0    conda-forge
---
> diskcache                 5.2.1              pyh44b312d_0    conda-forge
53c54
< fonttools                 4.28.3          py310h6acc77f_0    conda-forge
---
> fonttools                 4.28.1          py310h6acc77f_0    conda-forge
71c72
< harfbuzz                  3.1.2                hb4a5f5f_0    conda-forge
---
> harfbuzz                  3.1.1                hb4a5f5f_1    conda-forge
112c113
< mailchecker               4.1.3              pyhd8ed1ab_0    conda-forge
---
> mailchecker               4.1.0              pyhd8ed1ab_0    conda-forge
130c131
< phonenumbers              8.12.38            pyhd8ed1ab_0    conda-forge
---
> phonenumbers              8.12.37            pyhd8ed1ab_0    conda-forge
155,156c156,157
< requests                  2.26.0             pyhd8ed1ab_1    conda-forge
< rich                      10.15.2         py310hff52083_0    conda-forge
---
> requests                  2.26.0             pyhd8ed1ab_0    conda-forge
> rich                      10.14.0         py310hff52083_1    conda-forge
161,162c162,163
< scipy                     1.7.3           py310hea5193d_0    conda-forge
< setuptools                59.4.0          py310hff52083_0    conda-forge
---
> scipy                     1.7.2           py310hea5193d_0    conda-forge
> setuptools                59.2.0          py310hff52083_0    conda-forge
164c165
< shtab                     1.5.2              pyhd8ed1ab_0    conda-forge
---
> shtab                     1.5.0              pyhd8ed1ab_0    conda-forge
167c168
< sqlite                    3.37.0               h9cd32fc_0    conda-forge
---
> sqlite                    3.36.0               h9cd32fc_2    conda-forge

My next step: I will wait for the December s3fs release and test it in the full data-science toolchain.

Thanks s3fs team for hanging in there... what an annoying bug!

@garibarba

@achimgaedke interesting. In my case there is a chance of some non-standard network configuration. I believe my access to S3 is through a VPC endpoint, but there might be other specific configs. I've seen some related 3.8.0 issues, such as aio-libs/aiohttp#6227 and aio-libs/aiohttp#6239, but I cannot put my finger on it.
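
If a non-default endpoint is part of the problem, one way to take that variable out is to point s3fs at the endpoint explicitly (the endpoint URL below is hypothetical; client_kwargs is passed through to the underlying aiobotocore client):

import s3fs

# Hypothetical interface-endpoint URL; replace with the real VPC endpoint.
fs = s3fs.S3FileSystem(
    client_kwargs={"endpoint_url": "https://vpce-0123456789abcdef0.s3.us-east-1.vpce.amazonaws.com"}
)
fs.ls('my-bucket')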

@achimgaedke
Contributor

My next step: I will wait for the December s3fs release and test it in the full data-science toolchain.

It doesn't work with the more complex setup... so something is still wrong.

@martindurant
Member

Since we are likely to bump the requirement, can you try with aiohttp>4?

@garibarba

garibarba commented Dec 13, 2021

Since we are likely to bump the requirement, can you try with aiohttp>4?

I've tried pip install "aiohttp==4.0.0a1" and it still doesn't work (see my message above), but for a different reason:

.../python3.9/site-packages/aiobotocore/httpsession.py in __init__(self, verify, proxies, timeout, max_pool_connections, socket_options, client_cert, proxies_config, connector_args)
     89                     ssl_context.load_verify_locations(ca_certs, None, None)
     90 
---> 91         self._connector = aiohttp.TCPConnector(
     92             limit=max_pool_connections,
     93             verify_ssl=bool(verify),

TypeError: __init__() got an unexpected keyword argument 'verify_ssl'

but deleting line 93 of httpsession.py (the verify_ssl argument) makes it work, so I guess it will work 👌
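
For context, aiohttp 4 drops the long-deprecated verify_ssl= keyword; the same intent is expressed through the ssl= argument, roughly along these lines (a sketch of the aiohttp side only, not of aiobotocore's actual code):

import ssl

import aiohttp

def make_connector(verify: bool, max_pool_connections: int = 10) -> aiohttp.TCPConnector:
    # verify_ssl=True maps to a default SSL context, verify_ssl=False to ssl=False.
    ssl_arg = ssl.create_default_context() if verify else False
    return aiohttp.TCPConnector(limit=max_pool_connections, ssl=ssl_arg)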

@rom1504

rom1504 commented Feb 9, 2022

Thanks for the investigation and interesting finding. I'm not certain what to do about it. If it is indeed a timing issue, it suggests we could simply retry on a credentials error, or explicitly wait for the credentials provider to complete before attempting anything else.

Having such a retry option (maybe even defaulting to something like 5 retries) would be really appreciated. I just hit such an intermittent auth issue, and it's a bit ugly to have to handle that in user code.

@achimgaedke
Contributor

achimgaedke commented May 5, 2022

There has been some more work on this bug. PR aio-libs/aiobotocore#934 looks promising. Consider upgrading requirements to aiobotocore~=2.3.0 when released.
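
Once that release is out, a quick runtime check that an environment actually picked up the fix could look like this (a sketch; 2.3.0 is simply the version referenced above):

import aiobotocore

# Fail loudly if the environment still resolves to a pre-fix aiobotocore.
major, minor = (int(part) for part in aiobotocore.__version__.split(".")[:2])
assert (major, minor) >= (2, 3), f"aiobotocore {aiobotocore.__version__} predates the fix"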

@martindurant
Member

@achimgaedke , would you like to make a PR with that change, when available?

martindurant pushed a commit that referenced this issue May 6, 2022
* aiobotocore to 2.3.0

closes #558

* use conda-forge's aiobotocore 2.3 for CI pipeline
@jmleoni
Contributor Author

jmleoni commented May 20, 2022

@achimgaedke I haven't had the opportunity to test it yet, but many thanks for fixing this issue - you are a lifesaver!

@achimgaedke
Contributor

I waited almost half a year for someone else to dig into it. It took me 4 hours, 30 min to fix, and then another x hours over 3 weeks to push it through the software supply chain... now conda-forge is the last one missing...

Thanks for thanking, @jmleoni .

@achimgaedke
Contributor

See conda-forge/s3fs-feedstock#55
