
Pinned requirements in the optional dependencies tests #20623

Closed
asmeurer opened this issue Dec 18, 2020 · 24 comments
Labels
GitHub Actions Related to the GitHub Actions CI setup. Do not use for test failures unless they are GHA specific. Travis Related to the Travis CI setup. Do not use for test failures unless they are Travis specific.
Comments

@asmeurer (Member)

The optional dependencies tests are installing from a pinned requirements file in the release/ directory (see https://github.com/sympy/sympy/blob/master/release/aptinstall.sh and https://github.com/sympy/sympy/blob/master/release/requirements.txt).

Is there a reason for this? I don't think it's a good idea to pin the dependencies, unless we have to. Otherwise, we won't catch bugs from new versions of the dependencies.

CC @oscarbenjamin

I'm also surprised that we are using pip instead of conda. If pip works I guess that's fine, but I would expect conda to be necessary for at least some dependencies.

Side note: is Sage being tested on GitHub Actions right now? I don't see it.

@oscarbenjamin (Contributor)

I made those files as a way to ensure it was possible to test all dependencies and build the docs as part of the release script, where I wanted to pin the versions for reproducibility. Since I already had those ready to set up an Ubuntu 20.04 environment, it was quick and easy to reuse them when migrating to GitHub Actions in a hurry.

I think those files should probably be split up into several parts, e.g. some parts are needed for the tests and some for building the docs, some for apt and some for pip, etc. There's also update_requirements.sh, which has the bare pip commands to install the latest versions of the same packages.

The reasons for using conda over pip are not what they used to be, since packaging with wheels etc. has improved a lot over the years.

I don't know about sage. Maybe that got missed out...

@asmeurer asmeurer added the Travis Related to the Travis CI setup. Do not use for test failures unless they are Travis specific. label Dec 18, 2020
@asmeurer (Member, Author)

Well conda packaging has also improved because of conda-forge. Although I've heard that tensorflow is better to install with pip.

For sage, we basically have to use the conda install. It also has to be a separate conda environment, because sage pins a lot of packages itself, and it also breaks mpmath when it is installed. I'm not sure why the Travis install pins a specific conda-forge channel. We might look into whether that is still necessary.

@asmeurer asmeurer added the GitHub Actions Related to the GitHub Actions CI setup. Do not use for test failures unless they are GHA specific. label Dec 18, 2020
@oscarbenjamin (Contributor)

Although I've heard that tensorflow is better to install with pip.

I run the release script in a virtualbox VM. Initially I gave it 1GB of memory, but it kept crashing while installing Tensorflow, so I had to rebuild the VM with 2GB of memory. It turns out that pip install tensorflow briefly uses something like 1.5GB of memory. I don't know what they're putting in that package, or how pip extracts the zip such that that happens.

@asmeurer (Member, Author)

The manylinux wheel for tensorflow is 400 MB (https://pypi.org/project/tensorflow/#files). I downloaded it and it's 900 MB uncompressed (wheels are just zip files). I'm guessing pip decompresses the file in memory, and perhaps stores it in memory twice.
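(As an aside: since wheels are ordinary zip archives, the standard library can report a wheel's compressed vs. uncompressed size directly. A minimal sketch; `wheel_sizes` is a hypothetical helper, not an existing tool:)

```python
import zipfile

def wheel_sizes(path):
    """Sum the compressed and uncompressed member sizes of a wheel.

    Wheels are plain zip archives, so zipfile can read them directly;
    the gap between the two totals is what pip has to decompress.
    """
    with zipfile.ZipFile(path) as zf:
        infos = zf.infolist()
        compressed = sum(i.compress_size for i in infos)
        uncompressed = sum(i.file_size for i in infos)
    return compressed, uncompressed
```

Running this on the tensorflow manylinux wheel is how you can confirm the ~400 MB compressed vs. ~900 MB uncompressed numbers above.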

The reason it is so big is that wheels have to statically link their dependencies, or else include a copy of them to dynamically link against, because wheels can only depend on other pip packages, which have to be Python packages. conda packages can depend on any library; conda works more like a Linux package manager such as apt. I'm not sure why the conda-forge tensorflow package doesn't work (it's just what I've heard). If pip works, we can use it, but if it breaks, I suspect moving back to the conda-forge packages will be better: conda is better at managing compiled dependencies, and some of our dependencies are compilers, which aren't actually pip installable. Conda-forge packages also have much nicer channels to go through when they break, namely the feedstock issue trackers, and the conda-forge community is pretty responsive.

@oscarbenjamin (Contributor)

I've just never used conda, so I would find it difficult to set anything up with it. I've tried to do bits through miniconda and other things, but it seems like I really need to install the whole massive Anaconda distribution to get it working properly, and that has never seemed reasonable to me (although I've advised other people to do it in the past). I think this also contributed to the problems I had with rever.

@asmeurer (Member, Author)

asmeurer commented Feb 1, 2021

Fixing this would also get rid of the annoying "We found potential security vulnerabilities in your dependencies." banner at the top of https://github.com/sympy/sympy/

@oscarbenjamin (Contributor)

I have unpinned the versions of the optional dependencies in #20944. I had to remove tensorflow from the list of installed modules because it was pinning the numpy version to 1.19.x. I'm not sure why tensorflow pinned numpy when installed by pip in Actions but not with conda on Travis.

There were failures with numpy 1.20.x that showed up on Travis. Those would now also show up on Actions except that I have now fixed them (np.complex and np.float are deprecated in numpy 1.20 and were used in a few places in sympy).
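(For reference, the fix for that deprecation is mechanical. The removed numpy names were plain aliases for the Python builtins, so the change is just a rename; a hedged sketch, not the exact sympy diff:)

```python
import numpy as np

values = [1.5, 2.5]

# Old (DeprecationWarning on numpy 1.20, AttributeError on later releases):
#   arr = np.array(values, dtype=np.float)
#   carr = np.array(values, dtype=np.complex)
# Fixed: np.float and np.complex were just aliases for the builtins,
# so use the builtins (or the explicit sized dtypes) instead.
arr = np.array(values, dtype=float)      # equivalently np.float64
carr = np.array(values, dtype=complex)   # equivalently np.complex128
```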

There are a number of remaining todos for Actions:

If anyone wants to work on any of these then please go ahead.

@oscarbenjamin (Contributor)

There are still some stats tests being skipped in the optional dependency build:

sympy/stats/sampling/tests/test_sample_continuous_rv.py[8] ..s.....         [OK]
sympy/stats/sampling/tests/test_sample_discrete_rv.py[4] ..sw               [OK]
sympy/stats/sampling/tests/test_sample_finite_rv.py[5] ...s.                [OK]
sympy/stats/tests/test_compound_rv.py[6] ......                             [OK]
sympy/stats/tests/test_continuous_rv.py[84] ...w.......................w........
....................f...............ww..........                            [OK]
sympy/stats/tests/test_discrete_rv.py[18] ...w..............                [OK]
sympy/stats/tests/test_error_prop.py[2] ..                                  [OK]
sympy/stats/tests/test_finite_rv.py[22] ......................              [OK]
sympy/stats/tests/test_joint_rv.py[17] ............f..s.                    [OK]
sympy/stats/tests/test_matrix_distributions.py[8] ......s.                  [OK]

I guess this is tensorflow:

sympy/utilities/tests/test_lambdify.py[107] ....................................
..............ssssssssss...............................................     [OK]

These doctests are also being skipped:

sympy/integrals/rubi/rubimain.py[2] ss                                      [OK]
sympy/utilities/lambdify.py[4] ...s                                         [OK]

@oscarbenjamin (Contributor)

The requirements are no longer pinned and we now have optional dependency tests running on Python 3.8, 3.9 and 3.10 and pypy 3.8. We also have a bleeding edge 3.11-beta job that runs with the master branches of cython, numpy and scipy:

# -------------------- Optional dependency tests ----------------- #
optional-dependencies:
  needs: code-quality
  runs-on: ubuntu-20.04
  continue-on-error: ${{ matrix.experimental }}
  strategy:
    fail-fast: false
    matrix:
      python-version: ['3.9', '3.10', 'pypy-3.8']
      experimental: [false]
      include:
        - python-version: '3.11.0-alpha - 3.11'
          experimental: true
  steps:
    - uses: actions/checkout@v2
    - uses: actions/setup-python@v3
      with:
        python-version: ${{ matrix.python-version }}
    # Install the non-Python dependencies
    - run: sudo apt install antlr4 libgfortran5 gfortran libmpfr-dev libmpc-dev libopenblas-dev
    - run: python -m pip install --upgrade pip wheel
    # Use numpy and scipy from git for Python 3.11
    - if: ${{ contains(matrix.python-version, '3.11') }}
      run: pip install git+https://github.com/cython/cython@master
    - if: ${{ contains(matrix.python-version, '3.11') }}
      run: pip install git+https://github.com/numpy/numpy@main
    - if: ${{ contains(matrix.python-version, '3.11') }}
      run: pip install git+https://github.com/scipy/scipy@main
    # dependencies to install in all Python versions:
    - run: pip install mpmath matplotlib numpy ipython cython scipy aesara \
           wurlitzer autowrap numexpr 'antlr4-python3-runtime==4.7.*'
    # gmpy2 is not available for pypy
    - if: ${{ matrix.python-version != 'pypy-3.8' }}
      run: pip install gmpy2
    # These are not available for pypy and cannot be installed in 3.11 (yet)
    - if: ${{ matrix.python-version != 'pypy-3.8' && ! contains(matrix.python-version, '3.11') }}
      run: pip install symengine llvmlite numba pymc
    # Test external imports
    - run: bin/test_external_imports.py
    - run: bin/test_submodule_imports.py
    - run: bin/test_executable.py
    # Test modules with specific dependencies
    - run: bin/test_optional_dependencies.py

A few optional dependencies are excluded under 3.11 and should be added when it becomes possible to install them (maybe it already is):

    # These are not available for pypy and cannot be installed in 3.11 (yet)
    - if: ${{ matrix.python-version != 'pypy-3.8' && ! contains(matrix.python-version, '3.11') }}
      run: pip install symengine llvmlite numba pymc

The latest 3.10 optional dependency run shows only 3 doctests skipped: two of those are in rubi and one is in lambdify. I think that's due to tensorflow, which is tested in another job.
https://github.com/sympy/sympy/runs/6982638023?check_suite_focus=true

The main tests for the 3.10 optional dependency job show 112 skipped. That seems to be due to cupy and tensorflow, and I also see pymc3. The other skips just seem to be test_args. (Tensorflow is tested in another job.)

It doesn't look like cupy is tested anywhere. Maybe that should be fixed. It might need its own job like tensorflow (the problem with tensorflow was that it pinned the numpy version).

For pymc3 perhaps just the tests need to be updated after #23650 (CC @oscargus).

@oscarbenjamin (Contributor)

I would like to keep a bleeding edge job and also add master testing for gmpy2 and mpmath there as well.

Another thing that might be good is a bleeding edge sphinx job, because the sphinx build typically breaks with every new sphinx release. Note that sphinx is currently pinned:

Sphinx==4.5.0

The issue there has now been fixed and I think that the fix is in sphinx 5.0.2 so that can probably be unpinned now:
sphinx-doc/sphinx#10539

@asmeurer (Member, Author)

cupy isn't tested because you can't install it on CI (it requires a GPU, which generally isn't available in free CI resources).

@asmeurer (Member, Author)

Let's keep Sphinx unpinned by default, and only pin it when issues crop up.

@oscarbenjamin (Contributor)

I've unpinned the Sphinx version in #23664.

@oscargus I think that installing pymc does not work because all of the code in sympy tries to import pymc3. Should that be changed? Is there a need to support both? I don't understand what the situation with those libraries is.

@asmeurer (Member, Author)

I believe pymc3 was renamed to just pymc.

@oscarbenjamin (Contributor)

I believe pymc3 was renamed to just pymc.

Yes, but what should SymPy do? Should it support both like:

try:
    import pymc
except ImportError:
    import pymc3 as pymc

Do we need to do that?

Currently SymPy is unable to make use of pymc because it is imported under a different name. Simply replacing pymc3 -> pymc in the codebase would mean that the former pymc3 module would not be usable.
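One way to support both names without spelling out the try/except at every call site is a small import shim (a hypothetical sketch; `import_pymc` is not an existing sympy helper):

```python
import importlib

def import_pymc():
    """Return the PyMC module under whichever name is installed.

    Tries pymc (version 4+) first, then falls back to the older
    pymc3 name; returns None if neither is installed.
    """
    for name in ("pymc", "pymc3"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None
```

Call sites would then use the returned module (or skip the relevant tests when it is None) instead of importing pymc3 directly.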

@asmeurer (Member, Author)

I guess. It looks like pymc 4 was released fairly recently: https://www.pymc.io/blog/v4_announcement.html

I have no idea if pymc3 is still supported by the pymc devs, but it's probably as simple as that to support both for now, so we can do that and then remove pymc3 later.

@oscarbenjamin (Contributor)

I think @oscargus knows more about this (#22780, #23650)

@oscargus (Contributor)

This is what I know:

pymc version 3 is imported as pymc3 and relies on theano, which is deprecated.

pymc version 4 is back to pymc as the module name. It relies on aesara, which we do support and which is under active development.

As pymc3 gave deprecation warnings and issues with theano, it was removed from the imports and now that pymc 4 is released it was added back.

However, the SymPy code relies on pymc3 (so it was actually not that useful to add pymc back...). There are currently 126 mentions of pymc3 in the code base, especially in the file sympy/stats/sampling/sample_pymc3.py. Most are probably pretty straightforward to change (I have no idea if something fundamental has changed between pymc 3 and 4), but it is not clear to me whether one should add pymc as a sample method (and deprecate pymc3) or simply replace pymc3 with pymc.

I guess that there may be code that relies on explicitly calling pymc3 somewhere, so at least one should still support feeding that as an argument.

@oscarbenjamin (Contributor)

Maybe @brandonwillard can advise. Is it safe to just do this:

try:
    import pymc
except ImportError:
    import pymc3 as pymc

Then in a few years just change it to a plain import pymc.

@brandonwillard (Contributor)

Maybe @brandonwillard can advise. Is it safe to just do this:

try:
    import pymc
except ImportError:
    import pymc3 as pymc

Then in a few years just change it to a plain import pymc.

That depends on the exact parts of PyMC being tested, of course, but the user-level API should be pretty consistent between pymc3 and pymc (i.e. PyMC version 4), so there's a good chance that such changes would suffice.

From a quick search, it looks like most/all uses of pymc3 only involve the pymc3.Model and pymc3.sample interfaces, which should be consistent across versions, plus the construction of basic distributions, most/all of which are available in pymc.

@oscarbenjamin (Contributor)

Thanks @brandonwillard. I think everything is tested if we just make sure that the tests actually run. They currently don't, simply because they're not even trying to import pymc.

@oscargus (Contributor)

oscargus commented Jun 24, 2022

With #23678 there are 107 skipped tests remaining.

@oscarbenjamin (Contributor)

I think that the only thing pinned now is antlr4-python3-runtime==4.7.*, and as I understand it that pin is necessary because the generated parser files need to be regenerated for each antlr version. I'm going to close this issue as fixed.

@asmeurer (Member, Author)

Hopefully these go away once we tag a release: https://github.com/sympy/sympy/security/dependabot
