Boto3 incompatible with python zip import #1770
Comments
Confirmed. Our data loaders can't handle being run from a zip.
Which fails our Marking this as a feature request. |
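A minimal sketch of the failure mode described above, assuming a loader that finds its data files through `os.path` checks on a directory derived from `__file__`. The package `demo_pkg` and the file `endpoints.json` are hypothetical stand-ins built on the fly; this is not botocore's actual code or layout.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway zip holding a hypothetical package "demo_pkg" plus a data file.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "deps.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_pkg/__init__.py", "")
    zf.writestr("demo_pkg/data/endpoints.json", '{"version": 3}')

# Importing the package from the zip works: zipimport handles Python modules.
sys.path.insert(0, archive)
demo_pkg = importlib.import_module("demo_pkg")

# A loader that derives its data directory from __file__ ends up with a path
# that points inside the archive, which ordinary filesystem calls cannot see.
data_dir = os.path.join(os.path.dirname(demo_pkg.__file__), "data")
data_file = os.path.join(data_dir, "endpoints.json")

print(data_dir)                   # a path that runs through deps.zip
print(os.path.isdir(data_dir))    # False
print(os.path.isfile(data_file))  # False
```

The import itself succeeds, so the problem only shows up once the package tries to read its bundled data through the filesystem.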
What are the odds of getting this implemented? It's preventing us from distributing boto3, which makes it very hard to provide a package that depends on it in PySpark. |
https://stackoverflow.com/a/22646702 has a snippet processing a zip. |
Is this issue resolved? I am also stuck loading boto3 from a .zip file when using the --py-files option in spark2-submit. I'd appreciate any help to overcome this situation. |
pytz has a similar issue reading timezone data in the zoneinfo folder from a packaged directory. To get around this, it uses https://setuptools.readthedocs.io/en/latest/pkg_resources.html It adds setuptools as a dependency when distributing as a zip, but at least it works. Would be great to have a fix for this. Workarounds are needlessly ugly. |
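For comparison, a sketch of reading packaged data through the import system rather than the filesystem. It uses the standard library's `pkgutil.get_data` in place of `pkg_resources` (a swapped-in technique, chosen to avoid the setuptools dependency), and the package `demo_res` with its `endpoints.json` file is a hypothetical stand-in, not pytz or botocore data.

```python
import json
import os
import pkgutil
import sys
import tempfile
import zipfile

# Build a throwaway zip holding a hypothetical package "demo_res" plus a data file.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "resources.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_res/__init__.py", "")
    zf.writestr("demo_res/data/endpoints.json", '{"version": 3}')

sys.path.insert(0, archive)

# pkgutil.get_data asks the package's own loader for the bytes, so it works
# whether the package lives in a directory or inside a zip archive.
raw = pkgutil.get_data("demo_res", "data/endpoints.json")
endpoints = json.loads(raw.decode("utf-8"))
print(endpoints["version"])
```

This only covers reading a known file; a loader that also needs to list directories would need more than this sketch shows.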
Is #1008 also a duplicate? |
I submitted PR boto/botocore#1969 |
Using pkg_resources allows for loading modules from a zip. Original author: Gábor Lipták <gliptak@gmail.com> See: - boto/boto3#1770 - boto#1969
@gliptak Hello, I encounter this problem now, can we reopen boto/botocore#1969 and fix this problem? |
@shadowdsp we need a committer's help on that repo to move forward |
Hi @gliptak - I tried your PR as a patch to
and got:
|
@wolfch-elsevier you might try removing that folder from the Python search path or this has a pointer |
Hi, thanks. No but this directory, |
@wolfch-elsevier I need to fix the same problem. Have you found any solution for that? |
@fmxleg Unfortunately, no. I'm really surprised that Amazon hasn't come up with a working example to help promote their EMR ("cloudized" Apache Spark). I mean their example creates an I tried this solution, which uses the It didn't work for me. |
Found a workaround for this. You can pass Spark conf args to have Spark unzip the dependencies and include them in the path, something like this:
Worked with EMR 6.2.0 and Python 3.7.9 |
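The Spark arguments from the comment above are not reproduced here. As a plain-Python illustration of the same idea (unpack the archive first, then import from the unpacked directory), here is a minimal sketch; the package `demo_unzip`, the archive name, and the extraction directory are all hypothetical.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway zip holding a hypothetical package "demo_unzip" plus a data file.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_unzip/__init__.py", "")
    zf.writestr("demo_unzip/data/endpoints.json", '{"version": 3}')

# Unpack the archive and put the extracted directory, not the zip, on sys.path.
extracted = os.path.join(workdir, "unpacked")
with zipfile.ZipFile(archive) as zf:
    zf.extractall(extracted)
sys.path.insert(0, extracted)

demo_unzip = importlib.import_module("demo_unzip")

# Paths derived from __file__ are now real filesystem paths, so os-based
# data loading works again.
data_dir = os.path.join(os.path.dirname(demo_unzip.__file__), "data")
print(os.path.isdir(data_dir))
print(os.listdir(data_dir))
```

The trade-off is that every worker needs a writable location and pays the extraction cost before the first import.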
Some python modules should not be distributed via PySpark for various reasons. For example, Boto doesn't work when distributed as a zip file (which is how the PySpark distribution works), see boto/boto3#1770 for details. This PR adds a set of modules excluded by default, as well as configurability of this feature. It can be turned off entirely, or a different list of exclusions can be provided.
I have made a new PR, boto/botocore#2437, to attempt to resubmit boto/botocore#1969 |
This is also an issue for SaltStack modules:
Salt execution modules are imported using mod = zipimporter(fpath).load_module(name). If one were to create a Zip archive containing botocore, the following error will occur when attempting to execute the module:
This is due to the fact that the botocore loader ( This renders SaltStack modules distributed as Zip modules using botocore useless. |
Hi! I'm experiencing this error when trying to use boto3 in modules within a zip dependency file on EMR. I think this is worth a fix. |
The fix is in boto/botocore#2437, but someone from AWS will have to review, approve and merge it. |
Hey @dsonavane-rgare I'm trying this without success. Can you elaborate a bit more?
This is what I've tried now, based on your example:
and this as well:
Thanks |
Does anyone have a workaround for this? |
The only thing that ever worked for me was to run on systems with boto3 already installed, and exposed to the PYTHONPATH. |
One of Python's useful features is its ability to load modules from a .zip archive (PEP 273), allowing you to package up multiple dependencies into a single file.
Boto breaks when trying to import it from a .zip, throwing:
How to Reproduce:
Tested on Python 3.6.7
boto3 1.9.39
botocore 1.12.39