enable running 'pip check' during sanity check for TensorFlow 2.0.0 #9308

Closed
@boegel wants to merge 7 commits


boegel
Member

@boegel boegel commented Nov 15, 2019

requires easybuilders/easybuild-easyblocks#1853

This will need more work, since pip check fails now with:

== 2019-11-15 20:42:53,223 build_log.py:164 ERROR EasyBuild crashed with an error (at easybuild/base/exceptions.py:124 in __init__): cmd "pip check" exited with exit code 1 and output:
tensorflow 2.0.0 requires google-pasta, which is not installed.
tensorflow 2.0.0 requires opt-einsum, which is not installed.
tensorflow 2.0.0 has requirement gast==0.2.2, but you have gast 0.3.2.
tensorboard 2.0.0 has requirement setuptools>=41.0.0, but you have setuptools 40.8.0.
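
A minimal sketch of how the two kinds of lines in the pip check output above could be separated into missing and conflicting requirements. The function name parse_pip_check and the regular expressions are assumptions made for illustration; they only cover the two message shapes quoted in this comment and are not part of EasyBuild or pip.

```python
import re

# Patterns for the two message shapes seen in the 'pip check' output above.
MISSING_RE = re.compile(r"^(\S+) (\S+) requires (\S+), which is not installed\.$")
CONFLICT_RE = re.compile(r"^(\S+) (\S+) has requirement (\S+), but you have (\S+) (\S+)\.$")


def parse_pip_check(output):
    """Split 'pip check' output into missing and conflicting requirements."""
    missing, conflicts = [], []
    for line in output.splitlines():
        line = line.strip()
        match = MISSING_RE.match(line)
        if match:
            # (package that needs it, requirement that is not installed)
            missing.append((match.group(1), match.group(3)))
            continue
        match = CONFLICT_RE.match(line)
        if match:
            # (package that needs it, required specifier, installed version)
            conflicts.append((match.group(1), match.group(3), match.group(5)))
    return missing, conflicts


SAMPLE = """\
tensorflow 2.0.0 requires google-pasta, which is not installed.
tensorflow 2.0.0 requires opt-einsum, which is not installed.
tensorflow 2.0.0 has requirement gast==0.2.2, but you have gast 0.3.2.
tensorboard 2.0.0 has requirement setuptools>=41.0.0, but you have setuptools 40.8.0.
"""

missing, conflicts = parse_pip_check(SAMPLE)
```

Missing requirements would have to be added as extensions or dependencies, while conflicts call for a version change of something that is already installed.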

see also @Flamefire's bug report #9306

@boegel boegel added the bug fix label Nov 15, 2019
@boegel boegel added this to the next release (4.1.0) milestone Nov 15, 2019
@Flamefire
Contributor

👍 Shall I base my fix on this or rebase this on my fix?

@boegel
Member Author

boegel commented Nov 16, 2019

@Flamefire I started looking into fixing this, see updates in 0faf07a, but we're not there yet...

For the Python 3.7.4 on which TensorFlow-2.0.0-fosscuda-2019b-Python-3.7.4.eb depends, we'll need to add several sphinx-* packages + wcwidth for pytest to make pip check happy:

sphinx 2.2.0 requires sphinxcontrib-applehelp, which is not installed.
sphinx 2.2.0 requires sphinxcontrib-devhelp, which is not installed.
sphinx 2.2.0 requires sphinxcontrib-htmlhelp, which is not installed.
sphinx 2.2.0 requires sphinxcontrib-jsmath, which is not installed.
sphinx 2.2.0 requires sphinxcontrib-qthelp, which is not installed.
sphinx 2.2.0 requires sphinxcontrib-serializinghtml, which is not installed.
pytest 5.1.2 requires wcwidth, which is not installed.

That should be done in a separate PR...

@boegel
Member Author

boegel commented Nov 16, 2019

The changes in TensorFlow-2.0.0-foss-2019a-Python-3.7.2.eb somehow break the installation of the astor extension... How lovely.

== 2019-11-16 17:27:21,974 run.py:200 DEBUG run_cmd: running cmd  pip install --prefix=/software/TensorFlow/2.0.0-foss-2019a-Python-3.7.2  --verbose  --no-deps  --ignore-installed  --no-build-isolation  . (in /tmp/easybuild/build/TensorFlow/2.0.0/foss-2019a-Python-3.7.2/astor/astor-0.8.0)
== 2019-11-16 17:27:21,974 run.py:219 INFO running cmd:  pip install --prefix=/software/TensorFlow/2.0.0-foss-2019a-Python-3.7.2  --verbose  --no-deps  --ignore-installed  --no-build-isolation  .
== 2019-11-16 17:27:22,829 build_log.py:164 ERROR EasyBuild crashed with an error (at easybuild/base/exceptions.py:124 in __init__): cmd " pip install --prefix=/software/TensorFlow/2.0.0-foss-2019a-Python-3.7.2  --verbose  --no-deps  --ignore-installed  --no-build-isolation  ." exited with exit code 1 and output:
Created temporary directory: /tmp/eb-q00p60y1/pip-ephem-wheel-cache-ga_5t6qr
Created temporary directory: /tmp/eb-q00p60y1/pip-req-tracker-rq9f96xf
Created requirements tracker '/tmp/eb-q00p60y1/pip-req-tracker-rq9f96xf'
Created temporary directory: /tmp/eb-q00p60y1/pip-install-i3z9dfog
Processing /tmp/easybuild/build/TensorFlow/2.0.0/foss-2019a-Python-3.7.2/astor/astor-0.8.0
  Created temporary directory: /tmp/eb-q00p60y1/pip-req-build-rbbicmd9
  Added file:///tmp/easybuild/build/TensorFlow/2.0.0/foss-2019a-Python-3.7.2/astor/astor-0.8.0 to build tracker '/tmp/eb-q00p60y1/pip-req-tracker-rq9f96xf'
    Running setup.py (path:/tmp/eb-q00p60y1/pip-req-build-rbbicmd9/setup.py) egg_info for package from file:///tmp/easybuild/build/TensorFlow/2.0.0/foss-2019a-Python-3.7.2/astor/astor-0.8.0
    Running command python setup.py egg_info
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/eb-q00p60y1/pip-req-build-rbbicmd9/setup.py", line 17, in <module>
        setup(**config['options'])
      File "/software/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/setuptools/__init__.py", line 145, in setup
        return distutils.core.setup(**attrs)
      File "/software/Python/3.7.2-GCCcore-8.2.0/lib/python3.7/distutils/core.py", line 108, in setup
        _setup_distribution = dist = klass(attrs)
      File "/software/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/setuptools/dist.py", line 447, in __init__
        k: v for k, v in attrs.items()
      File "/software/Python/3.7.2-GCCcore-8.2.0/lib/python3.7/distutils/dist.py", line 292, in __init__
        self.finalize_options()
      File "/software/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/setuptools/dist.py", line 735, in finalize_options
        ep.load()(self, ep.name, value)
      File "/software/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/setuptools/dist.py", line 291, in check_specifier
        packaging.specifiers.SpecifierSet(value)
      File "/software/TensorFlow/2.0.0-foss-2019a-Python-3.7.2/lib/python3.7/site-packages/setuptools/_vendor/packaging/specifiers.py", line 594, in __init__
        specifiers = [s.strip() for s in specifiers.split(",") if s.strip()]
    AttributeError: 'SpecifierSet' object has no attribute 'split'

@Flamefire
Contributor

This will be fixed by berkerpeksag/astor#163, see berkerpeksag/astor#162. Another option would be to use setuptools <41.4.

I added the PR as a patch now and astor installs.
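
The traceback above ends with an object that is already a SpecifierSet being passed where a string is expected. A minimal sketch of that failure mode follows. FakeSpecifierSet is a hypothetical stand-in written for this illustration, not the real packaging class, and the specifier string is an arbitrary example. The str() round trip is only one possible workaround and not necessarily what the linked astor PR does.

```python
class FakeSpecifierSet:
    """Hypothetical stand-in that mimics the failing line from the traceback."""

    def __init__(self, specifiers=""):
        # Same expression as the last frame of the traceback: it assumes a str.
        self._specs = [s.strip() for s in specifiers.split(",") if s.strip()]

    def __str__(self):
        return ",".join(self._specs)


parsed = FakeSpecifierSet(">=2.7, !=3.0.*")

try:
    # An already-parsed object is passed where a plain string is expected.
    FakeSpecifierSet(parsed)
    error = None
except AttributeError as err:
    error = str(err)

# Possible workaround: hand the constructor a plain string again.
reparsed = FakeSpecifierSet(str(parsed))
```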

BTW: Why did you add the dependencies' dependencies, like werkzeug? I thought EB/pip would do that itself. If not, then there are more. From the configuration right before installing TF2:

absl-py-0.8.1
astor-0.8.0
cachetools-3.1.1
google-auth-1.7.1
google-auth-oauthlib-0.4.1
google-pasta-0.1.8
grpcio-1.25.0
keras-applications-1.0.8
keras-preprocessing-1.1.0
markdown-3.1.1
oauthlib-3.1.0
opt-einsum-3.1.0
protobuf-3.10.0
pyasn1-modules-0.2.7
requests-oauthlib-1.3.0
rsa-4.0
tensorboard-2.0.1
tensorflow-2.0.0
tensorflow-estimator-2.0.1
termcolor-1.1.0
werkzeug-0.16.0
wrapt-1.11.2

Would be fun to figure out in which order to install all this :/
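
One way to approach the ordering question is a topological sort over the requirement graph. The sketch below uses Kahn's algorithm; the function name install_order is made up for this example, and the edges in REQUIRES are assumed for illustration only, not read from the actual package metadata.

```python
from collections import deque


def install_order(requires):
    """Return package names so that each package comes after its requirements."""
    packages = set(requires)
    for deps in requires.values():
        packages.update(deps)

    # Requirements still waiting to be installed, per package.
    pending = {pkg: set(requires.get(pkg, ())) for pkg in packages}
    # Reverse edges: which packages depend on a given package.
    dependents = {pkg: set() for pkg in packages}
    for pkg, deps in pending.items():
        for dep in deps:
            dependents[dep].add(pkg)

    ready = deque(sorted(pkg for pkg, deps in pending.items() if not deps))
    order = []
    while ready:
        pkg = ready.popleft()
        order.append(pkg)
        for dependent in sorted(dependents[pkg]):
            pending[dependent].discard(pkg)
            if not pending[dependent]:
                ready.append(dependent)

    if len(order) != len(packages):
        raise ValueError("dependency cycle detected")
    return order


# Illustrative edges only (assumption), using names from the list above.
REQUIRES = {
    "google-auth": ["cachetools", "pyasn1-modules", "rsa"],
    "requests-oauthlib": ["oauthlib"],
    "google-auth-oauthlib": ["google-auth", "requests-oauthlib"],
    "tensorboard": ["google-auth-oauthlib", "markdown", "werkzeug"],
}

order = install_order(REQUIRES)
```

Listing extensions in such an order matters when each one is installed with --no-deps, as in the pip command shown earlier in this thread, because pip then does not pull in anything that is missing.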

@Flamefire
Contributor

> I thought EB/pip will do that itself?

OK, found it. Why aren't we using use_pip_for_deps?

@boegel
Member Author

boegel commented Nov 19, 2019

> I thought EB/pip will do that itself?
>
> Ok found it. Why aren't we using use_pip_for_deps?

Because then the installation is not reproducible later: pip will likely pick more recent versions of the dependency packages, which may or may not work.

To figure out the correct order of things, I've found this helpful: https://gist.github.com/boegel/fd9a636d652aa5c8e57778088e9c0a21 .
We should probably look into integrating that with EasyBuild somehow...
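
As a rough illustration of where such dependency information can come from (this is not the code from the gist), the requirements declared by an installed distribution can be read with importlib.metadata from the standard library (Python 3.8 or newer). The helper names requirement_name and declared_requirements are hypothetical, and the handling of extras is deliberately simplistic.

```python
import re
from importlib import metadata  # standard library since Python 3.8

# Leading project name of a requirement string such as 'gast==0.2.2'.
NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def requirement_name(requirement):
    """Return the project name at the start of a requirement string."""
    match = NAME_RE.match(requirement)
    if match is None:
        raise ValueError("cannot parse requirement: %r" % requirement)
    return match.group(1)


def declared_requirements(dist_name):
    """Names of the requirements an installed distribution declares.

    Raises importlib.metadata.PackageNotFoundError if dist_name is not installed.
    Requirements that only apply to an optional extra are skipped.
    """
    names = []
    for requirement in metadata.requires(dist_name) or []:
        marker = requirement.partition(";")[2]
        if "extra" in marker:
            continue
        names.append(requirement_name(requirement))
    return names
```

Calling declared_requirements for a distribution that is not installed fails with an exception rather than returning an empty list, which avoids silently working with incomplete information.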

@Flamefire
Contributor

> For the Python 3.7.4 on which TensorFlow-2.0.0-fosscuda-2019b-Python-3.7.4.eb depends, we'll need to add several sphinx-* packages + wcwidth for pytest to make pip check happy:

See #9329

@Flamefire
Contributor

> To figure out the correct order of things, I've found this helpful: https://gist.github.com/boegel/fd9a636d652aa5c8e57778088e9c0a21 .

Thanks for that. I significantly improved it: https://gist.github.com/Flamefire/49426e502cd8983757bd01a08a10ae0d It now handles cases such as google.pasta being imported as pasta. It also checks that each package is actually found and errors out if it isn't, instead of silently returning wrong lists. That happened for Keras-Applications vs keras-applications -.-
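
A minimal sketch of the two pitfalls mentioned here, independent of the gist's actual code: name lookups that differ only in case or separators, and import names that differ from the package name. The normalization rule follows PEP 503; the helpers find_package and import_name and the IMPORT_NAMES table are assumptions for this example, and the fallback import name is only a naive guess.

```python
import re

# Known cases where the import name differs from the package name
# (only the example mentioned in this comment).
IMPORT_NAMES = {"google-pasta": "pasta"}


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_package(name, known):
    """Look up name among known package names, failing loudly if absent."""
    by_normalized = {normalize(candidate): candidate for candidate in known}
    key = normalize(name)
    if key not in by_normalized:
        raise LookupError("package %r not found" % name)
    return by_normalized[key]


def import_name(name):
    """Guess the import name: known exceptions first, then a naive fallback."""
    key = normalize(name)
    return IMPORT_NAMES.get(key, key.replace("-", "_"))
```

With this, Keras-Applications and keras-applications resolve to the same entry, and an unknown name raises an error instead of yielding a wrong list.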

@boegel
Member Author

boegel commented Nov 21, 2019

Test report by @boegel
SUCCESS
Build succeeded for 3 out of 3 (2 easyconfigs in this PR)
node3300.joltik.os - Linux centos linux 7.7.1908, Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz, Python 2.7.5
See https://gist.github.com/fc698712f95f7124f6bf6d423200caf4 for a full test report.

@boegel
Member Author

boegel commented Nov 21, 2019

Test report by @boegel
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
node3124.skitty.os - Linux centos linux 7.7.1908, Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz, Python 3.6.8
See https://gist.github.com/a3c7f4af73a687d6fa4c96f67a58ed2f for a full test report.

@Flamefire
Contributor

Test report by @Flamefire
FAILED
Build succeeded for 1 out of 2 (2 easyconfigs in this PR)
taurusi6216.taurus.hrsk.tu-dresden.de - Linux RHEL 7.4, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz, Python 2.7.5
See https://gist.github.com/4bb148e79c12703bb354f6cdaae0029c for a full test report.

@akesandgren
Contributor

Test report by @akesandgren
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
b-an03.hpc2n.umu.se - Linux ubuntu 16.04, Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, Python 2.7.12
See https://gist.github.com/dbf455c1bffdacf4471ea80c9164f72f for a full test report.

@akesandgren
Contributor

The TensorFlow-1.13.1_lrt-flag.patch is missing from the fosscuda easyconfig in this PR

@akesandgren
Contributor

And checksum for TensorFlow-1.14.0_fix-cuda-build.patch needs to be updated to match the already merged PR #9333
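
For reference, a small sketch of how a checksum for a patch file could be recomputed, assuming the easyconfig lists SHA256 checksums. The helper name sha256_checksum is made up for this example.

```python
import hashlib


def sha256_checksum(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting hex string would replace the outdated entry for the patch in the checksums list of the easyconfig.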

@Flamefire
Contributor

> And checksum for TensorFlow-1.14.0_fix-cuda-build.patch needs to be updated to match the already merged PR #9333

@boegel Just rebase to current develop please

@boegel
Member Author

boegel commented Nov 24, 2019

Test report by @boegel
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
node3309.joltik.os - Linux centos linux 7.7.1908, Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz, Python 2.7.5
See https://gist.github.com/45a61e11e547d28988a04e0308d174fe for a full test report.

@akesandgren
Contributor

Test report by @akesandgren
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
b-an03.hpc2n.umu.se - Linux ubuntu 16.04, Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, Python 2.7.12
See https://gist.github.com/3dba93383eb04d7c571ea93b01d397ee for a full test report.

@boegel
Member Author

boegel commented Nov 24, 2019

superseded by @Flamefire's PR #9338, so closing this one...

@boegel boegel closed this Nov 24, 2019
@boegel boegel deleted the TF200_pip_check branch November 24, 2019 21:04
@surak
Contributor

surak commented Jun 26, 2020

Those things somehow didn't end up in the most recent easyconfig for 2.2...

@boegel
Member Author

boegel commented Jun 27, 2020

@surak Please open an issue on that; following things up in closed PRs is a nightmare.
