
Build against libjpeg-turbo instead of libjpeg #6563

Closed
pmeier opened this issue Sep 12, 2022 · 3 comments · Fixed by #7820
pmeier (Collaborator) commented Sep 12, 2022

torchvision is currently building and testing against libjpeg. From vision/setup.py, line 321 at cac4e22:

    (jpeg_found, jpeg_conda, jpeg_include, jpeg_lib) = find_library("jpeglib", vision_include)

Pillow has been building against libjpeg-turbo on Windows for some time now, and on all platforms since Pillow 9 (Jan 2022).

This has two downsides for us:

  1. We can't use Pillow as a reference for our own decoding and encoding ops, since the underlying JPEG libraries differ.
  2. As the name implies, libjpeg-turbo is a faster implementation of the JPEG standard. Thus, our I/O ops are simply slower than going through Pillow, which hinders adoption.
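As a quick sanity check on the Pillow side, recent Pillow versions expose a `libjpeg_turbo` feature flag via `PIL.features` (the flag exists since Pillow 9.0). A small hedged helper, assuming nothing beyond that flag:

```python
def pillow_uses_turbo():
    """Return True/False if Pillow reports libjpeg-turbo, or None if
    Pillow is not installed. Older Pillow versions that do not know
    the "libjpeg_turbo" feature flag raise ValueError."""
    try:
        from PIL import features
    except ImportError:
        return None  # Pillow not installed
    try:
        return features.check_feature("libjpeg_turbo")
    except ValueError:  # feature flag unknown to this Pillow version
        return False
```

This makes the mismatch easy to demonstrate: Pillow reports turbo while torchvision's binaries link plain libjpeg.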

Recently, @NicolasHug led a push to also use libjpeg-turbo, but hit a few blockers:

  1. Our workflows use the defaults channel from conda. Unfortunately, libjpeg-turbo is only available on defaults for Windows and macOS.
  2. Adding conda-forge to the channels for Linux leads to very long environment solve times (10+ minutes), which ultimately time out the CI. In general, this change should be possible if conda-forge has a lower priority than defaults.
  3. Depending on the experimental libmamba solver does speed up the solve enough for the CI not to time out (it is still a little slower than before). Unfortunately, our CI setup does not work properly with it: a CUDA 11.6 workflow still pulls a PyTorch version built against CUDA 11.3.

From here on I currently see four options:

  1. Only build and test Windows and macOS binaries against libjpeg-turbo. This would mean that arguably most of our users won't see that speed-up.
  2. Find a way to stop the CI from timing out when using conda-forge as extra channel. This can probably be done through the configuration or by emitting more output during the solve.
  3. Fix our CI setup to work with the libmamba solver.
  4. Package libjpeg-turbo for Linux ourselves. We already use the pytorch or pytorch-nightly channels; if it were available there, we wouldn't need to pull it from conda-forge. In Use libjpeg-turbo in CI instead of libjpeg #5941 (comment), @malfet only talks about testing against it, but maybe we can also build against it.
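For reference, option 2 could be sketched as a channel-priority tweak in the CI's conda setup. This is a sketch only, using standard `conda config` commands; it is not taken from the actual workflow files:

```shell
# Append conda-forge so it has LOWER priority than the existing
# defaults channel; strict priority keeps defaults packages winning
# whenever both channels provide them.
conda config --append channels conda-forge
conda config --set channel_priority strict

# Only packages missing from defaults on Linux (e.g. libjpeg-turbo)
# would then come from conda-forge.
conda install -y libjpeg-turbo
```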

cc @seemethere

@vfdev-5 vfdev-5 changed the title Build against libjepg-turbo instead of libjpeg Build against libjpeg-turbo instead of libjpeg Sep 13, 2022
pmeier (Collaborator, Author) commented Oct 13, 2022

#6746 increased the inactivity timeout for the CI from 10 to 30 minutes. If that is acceptable in general now, we might get away with 2. above.

sh-shahrokhi commented:

> #6746 increased the inactivity timeout for the CI from 10 to 30 minutes. If that is acceptable in general now, we might get away with 2. above.

Hello, just wanted to say thanks for this. Maybe you could add libjpeg-turbo to the pytorch channel and remove conda-forge?

sh-shahrokhi commented:

Also, please check this https://www.anaconda.com/blog/conda-is-fast-now

    conda install -n base conda-libmamba-solver
    conda config --set solver libmamba

It speeds up conda's solver without adding conda-forge.
