
PyPy3 wheels not uploaded to cache -> very slow builds #288

Closed
2 of 5 tasks
hugovk opened this issue Dec 1, 2021 · 4 comments · Fixed by #303
Labels
bug Something isn't working

Comments

@hugovk
Contributor

hugovk commented Dec 1, 2021

Description:

Wheels built by PyPy3 aren't being cached. This makes the PyPy3 build very slow as it rebuilds them every time.

Action version:

v2 = v2.3.1

Platform:

  • Ubuntu
  • macOS
  • Windows

Runner type:

  • Hosted
  • Self-hosted

Tools version:

PyPy 7.3.7 (Python 3.8.12)

(Also Python 3.7-3.10 in the matrix)

Repro steps:

I'm not entirely sure how the current state came about but:

  1. Trigger a build
  2. The existing pip cache is downloaded, but it doesn't contain wheels for pypy-3.8
  3. requirements.txt is installed. Some pure Python wheels are installed from cache, but a number of sdists are downloaded: matplotlib-3.5.0.tar.gz, kiwisolver-1.3.2.tar.gz, Pillow-8.4.0.tar.gz
  4. Wheels are built for these sdists. It takes nearly 9 minutes to install everything.

Expected behavior:

I expect that after the wheels are built the first time, the pip cache is uploaded with actions/cache for next time.

Actual behavior:

The pip cache is never uploaded for PyPy3.

"Post Set up Python pypy-3.8" shows either:

Post job cleanup.
Unable to reserve cache with key setup-python-Linux-pip-4547a7629556c7c4fb4f7b36b381999277204f48e1d116d1dcdc783c98b8aa0c, another job may be creating this cache.

https://github.com/hugovk/drop-python/runs/4384155862?check_suite_focus=true

Or:

Post job cleanup.
Cache hit occurred on the primary key setup-python-Linux-pip-f5b11c7530ca1ca017e33203e65a953e8864e5891c8f500684b94ffcdde1d825, not saving cache.

https://github.com/hugovk/drop-python/runs/4383955419?check_suite_focus=true


I see the cache key is calculated from the runner OS (e.g. Linux), the package manager (e.g. pip or pipenv) and the hash of the dependency path:

const hash = await glob.hashFiles(this.cacheDependencyPath);
const primaryKey = `${this.CACHE_KEY_PREFIX}-${process.env['RUNNER_OS']}-${this.packageManager}-${hash}`;
const restoreKey = `${this.CACHE_KEY_PREFIX}-${process.env['RUNNER_OS']}-${this.packageManager}`;

For a given OS, pip should be able to handle caching different Python versions in the same cache dir. However, there seems to be some sort of race condition here.
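A minimal standalone sketch of why every job in the matrix collides (hypothetical function name and hash values, not the action's actual code): the key is built only from a prefix, the runner OS, the package manager, and the dependency-file hash, so the Python version never enters it.

```typescript
// Hypothetical stand-in for the action's key construction.
const CACHE_KEY_PREFIX = "setup-python";

function primaryKey(runnerOs: string, packageManager: string, hash: string): string {
  return `${CACHE_KEY_PREFIX}-${runnerOs}-${packageManager}-${hash}`;
}

// Two matrix jobs on the same OS installing the same requirements.txt:
const cpythonKey = primaryKey("Linux", "pip", "4547a762");
const pypyKey = primaryKey("Linux", "pip", "4547a762");

// Both jobs compute an identical key, so only the first job to finish
// gets to upload a cache under it; the PyPy job's wheels are never saved.
console.log(cpythonKey === pypyKey); // true
```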

Before using this new caching I used to include the Python version in the cache key:

          key: ${{ matrix.os }}-${{ matrix.python-version }}-v1-${{ hashFiles('**/setup.py') }}
          restore-keys: |
            ${{ matrix.os }}-${{ matrix.python-version }}-v1-

Perhaps this would help here too?

@hugovk added the bug (Something isn't working) and needs triage labels on Dec 1, 2021
@dmitry-shibanov
Contributor

Hello @hugovk. Thank you for your report. If I've understood correctly, you want the Python version added to the primary key, as is done for pipenv.

@hugovk
Contributor Author

hugovk commented Dec 1, 2021

I don't understand why it's failing to upload the updated cache, but yes, I think that's one way to fix it:

const hash = await glob.hashFiles(this.patterns);
const primaryKey = `${this.CACHE_KEY_PREFIX}-${process.env['RUNNER_OS']}-python-${this.pythonVersion}-${this.packageManager}-${hash}`;
const restoreKey = undefined;
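Run standalone with hypothetical inputs, the suggested key shape would come out like this (a sketch of the proposal above, not the merged implementation; the function name and hash are made up for illustration):

```typescript
const CACHE_KEY_PREFIX = "setup-python";

// Suggested key construction: include the Python version so each
// interpreter in the matrix saves and restores its own cache.
function primaryKeyWithPython(
  runnerOs: string,
  pythonVersion: string,
  packageManager: string,
  hash: string
): string {
  return `${CACHE_KEY_PREFIX}-${runnerOs}-python-${pythonVersion}-${packageManager}-${hash}`;
}

// CPython and PyPy jobs now get distinct keys.
console.log(primaryKeyWithPython("Linux", "pypy-3.8", "pip", "4547a762"));
// -> setup-python-Linux-python-pypy-3.8-pip-4547a762
```

Dropping the restore key also stops a job from starting out with another interpreter's incompatible wheels on a partial match.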

@hugovk
Contributor Author

hugovk commented Dec 14, 2021

I'm now hitting this in other repos.
I believe this is happening:

  1. On macOS, NumPy has released wheels for CPython but not for PyPy

  2. CPython job downloads a prebuilt NumPy wheel (fast) and puts it in its pip cache

  3. CPython job uploads the pip cache

  4. PyPy job either has no pip cache or downloads a CPython one with an incompatible NumPy wheel

  5. PyPy job downloads the NumPy sdist and builds it into a wheel (very slow) and puts it in its pip cache

  6. PyPy job does not upload its pip cache because there's already one for this OS

This then repeats for other PyPy jobs and subsequent builds.
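The save-once-per-key behavior in steps 3 and 6 can be modeled with a toy simulation (a sketch of actions/cache semantics as observed in the logs above, not its actual code; names and contents are hypothetical):

```typescript
// Toy model of actions/cache: an entry under a given key is saved only if
// no entry with that key exists yet ("Cache hit occurred on the primary
// key ..., not saving cache").
const remoteCache = new Map<string, string>();

function restore(key: string): string | undefined {
  return remoteCache.get(key);
}

function save(key: string, contents: string): boolean {
  if (remoteCache.has(key)) return false; // hit on primary key: skip save
  remoteCache.set(key, contents);
  return true;
}

const sharedKey = "setup-python-macOS-pip-f5b11c75"; // same for every job

// CPython job: downloads a prebuilt NumPy wheel and saves the cache.
save(sharedKey, "numpy-cp38-wheel");

// PyPy job: restores an incompatible cache, builds NumPy from the sdist,
// but its save is refused because the key is already taken.
console.log(restore(sharedKey)); // "numpy-cp38-wheel" (CPython-only wheel)
console.log(save(sharedKey, "numpy-pypy38-wheel")); // false: wheel discarded
```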


In fact this means only one job is using the cache; jobs for other CPython versions don't find a wheel for their CPython version in the cache, so have to always download a fresh wheel from PyPI.

Whilst this is much faster than building an sdist, it does mean the cache isn't being used and large wheel files are being downloaded every time.

In the example above:

  • One CPython version/job has a wheel in the cache
  • Three CPython versions/jobs download large wheels (~12-27 MB) every time
  • Two PyPy versions/jobs download a large sdist (10 MB) and spend a lot of time building wheels which are discarded

Prior to using cache: pip in actions/setup-python, I included both the OS and Python version in the key, as in the snippet in the issue description above.

@hugovk
Contributor Author

hugovk commented Dec 16, 2021

Please see PR #303.
