Exceeding NSIS 2 GB limit ... #591

Closed
stonebig opened this issue Feb 11, 2018 · 18 comments

Comments

@stonebig
Contributor

stonebig commented Feb 11, 2018

The WinPython installer is getting too big for NSIS:

Internal compiler error #12345: error mmapping file (2125247030, 18568856) is out of range.

... need to sacrifice some packages or change the toolchain

What TreeSize says:

  • 542 MB for numpy
  • 250 MB for Jupyter lab
  • 200 MB for PyQt5
  • 180 MB for Tensorflow
  • 43 MB for Pandoc
  • 38 MB for ffmpeg
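
For readers who want a similar per-package breakdown without TreeSize, here is a minimal Python sketch. It assumes the packages sit as top-level folders under one parent directory (for example a `site-packages` folder); the function names `dir_size` and `largest_subdirs` are illustrative and not part of WinPython.

```python
import os

def dir_size(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symbolic links so a link is not counted as its target.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total

def largest_subdirs(parent, top=10):
    """Return (name, size_in_bytes) pairs for the largest immediate subfolders."""
    sizes = []
    with os.scandir(parent) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((entry.name, dir_size(entry.path)))
    # Biggest folders first.
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:top]
```

Calling `largest_subdirs(...)` on the installation's package folder and printing each size divided by `1e6` gives figures comparable to the MB values listed above.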
@stonebig stonebig added this to the 2018-01 Pandas-0.22.0 / Spyder-3.2.6 / Jupyterlab-beta milestone Feb 11, 2018
@stonebig
Contributor Author

I will have to remove Tensorflow, I'm afraid.

@hiccup7

hiccup7 commented Feb 11, 2018

I am using ffmpeg every day, called by Python functions. I don't use the very old version (v3.2.4) in the latest WinPython beta. I am currently using v3.3.3, which has functionality I need. I plan to test a newer version (currently v3.4.1) with my code when I need new functionality or bug fixes. Thus, I need to manage which ffmpeg version is used by my code. Currently, I hard code the path to ffmpeg in my code. If I were to use ffmpeg found in the PATH instead, WinPython puts its old version in the PATH before ffmpeg installed for the system.

Another solution for me is to delete the ffmpeg that comes with each installation of WinPython and change my code to use ffmpeg found in the PATH instead. But my code will break if I install a new WinPython and forget to delete its ffmpeg. Since ffmpeg is 38 MB, I am wondering if including it in WinPython is the best solution. Especially since many users have multiple versions of WinPython installed.

Besides Tensorflow, other Python packages depend on ffmpeg, such as imageio and moviepy. For inexperienced Windows users, would it be better to give them instructions on where to download ffmpeg, how to copy to C:\Program Files\ffmpeg, and add this folder to the PATH (even outside WinPython)? This way, the WinPython project doesn't have to decide which version of ffmpeg is best for the user. Also, the download size is smaller for users who don't need ffmpeg.
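
The PATH-ordering concern above can be illustrated with a short Python sketch. This is only an assumption about how calling code might pick an ffmpeg binary, not how WinPython or any particular package does it; `find_ffmpeg` and its `explicit_path` parameter are hypothetical names.

```python
import os
import shutil

def find_ffmpeg(explicit_path=None):
    """Return the ffmpeg executable to use, or None if none is found.

    explicit_path is a hypothetical, user-chosen location that takes
    priority over whatever ffmpeg comes first on PATH.
    """
    if explicit_path and os.path.isfile(explicit_path):
        return explicit_path
    # shutil.which searches the PATH directories in order, so it returns
    # the first ffmpeg it finds, which may be a bundled copy rather than
    # the system-wide one.
    return shutil.which("ffmpeg")
```

Printing the result of `find_ffmpeg()` from inside a WinPython session shows which copy would actually be picked up.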

@stonebig
Contributor Author

Dear @hiccup7

I agree about ffmpeg:

I checked that removing Tensorflow is a proven solution.
Now I can try removing ffmpeg instead.

@stonebig
Contributor Author

Removing ffmpeg seems sufficient for now.
I need to prepare the next candidate for removal, as I don't expect this to be sufficient for more than 3 months.

@orbitalz

numpy is a required library for important Python packages like pandas and matplotlib. May I vote for keeping it in WinPython?

@stonebig
Contributor Author

for sure I can't remove numpy-mkl.

@stonebig
Contributor Author

Back at the limit... let's try skipping the jupyterlab build.

@stonebig
Contributor Author

stonebig commented Feb 25, 2018

No change... need a bigger cut... next week:

  • nbdime (should suffice, if I guess right)

Not enough; next I will try removing:

  • Theano
  • pymc3
  • lasagne
  • scikit-neuralnetwork

@hiccup7

hiccup7 commented Feb 26, 2018

I have a work colleague who has been trying to use Tensorflow. On the surface, it seems easy to use. However, execution is extremely slow. There are lots of problems using the GPU. He is very frustrated. It is very difficult for each user to figure out exactly which versions of NVidia driver, Cuda driver, Python, etc. are needed to make it work (doesn't work with AMD GPU). If the user does find a combination that works, better not update anything for fear of breaking it. Numba looks better than Tensorflow in terms of being debugged and supported. Since Tensorflow is such a big download, I propose not to include it in WinPython.

@stonebig
Contributor Author

I anticipate I will be forced to drop Tensorflow; I can't imagine JupyterLab not growing further in size.

@orbitalz

orbitalz commented Mar 5, 2018

Dropping Tensorflow might be a good choice, since Tensorflow supports only Nvidia GPUs and, compared with the Jupyterlab package, I think it is less important; Jupyterlab proves useful in many cases.

@stonebig
Contributor Author

stonebig commented Mar 5, 2018

There is nevertheless a huge amount of space taken by each of Numpy+mkl/Jupyterlab/Tensorflow, which feels "more than what they should need" to me.

I wish these three packages would take a look at their size metrics.

If Chainer (from Japan) were smaller, I would take that road, but it doesn't seem to be.

Matplotlib-2.2.0 is slightly smaller than Matplotlib-2.1.2, so size inflation is not inevitable.

@orbitalz

orbitalz commented Mar 6, 2018

These 3 packages are the huge ones, and 2 out of 3 are very important packages for computer science and physical science. I hope they get smaller some day.

@stonebig
Contributor Author

stonebig commented Mar 8, 2018

Maybe Microsoft will solve our problem by including a Machine Learning block in Windows by default: https://arstechnica.com/gadgets/2018/03/microsoft-pushing-machine-learning-as-big-developer-feature-for-next-windows-update/

@stonebig
Contributor Author

stonebig commented Mar 8, 2018

and maybe good news, resolving all issues (too good to be true, I have to check):
https://gitter.im/jupyterlab/jupyterlab?at=5a9c4fe2f3f6d24c683778d8

stonebig @stonebig Mar 04 07:43
Hi, I'm trying to compact my installation. Is there anything in the directory "python-3.6.4.amd64\share\jupyter\lab" that can be cleared without impact? Like python-3.6.4.amd64\share\jupyter\lab\staging\build

Min RK @minrk Mar 04 20:58
@stonebig I believe the whole staging directory can go

So:

  • uncompressed gain = 250 MB
  • compressed gain = 22 MB
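
The cleanup suggested in the Gitter exchange could be scripted roughly as follows. This is a sketch only: it assumes the `share\jupyter\lab\staging` layout quoted above, and `remove_lab_staging` with its `dry_run` flag are hypothetical names, not part of JupyterLab or WinPython.

```python
import os
import shutil

def remove_lab_staging(python_root, dry_run=True):
    """Measure and optionally delete share/jupyter/lab/staging under python_root.

    Returns the number of bytes found in the staging folder. With
    dry_run=True nothing is deleted, so the gain can be checked first.
    """
    staging = os.path.join(python_root, "share", "jupyter", "lab", "staging")
    if not os.path.isdir(staging):
        return 0
    found = 0
    for root, _dirs, files in os.walk(staging):
        for name in files:
            full = os.path.join(root, name)
            # Skip symbolic links so a link is not counted as its target.
            if not os.path.islink(full):
                found += os.path.getsize(full)
    if not dry_run:
        shutil.rmtree(staging)
    return found
```

Running it once with `dry_run=True` on the `python-3.6.4.amd64` folder reports the uncompressed size at stake before anything is removed; the compressed gain in the installer has to be measured separately by rebuilding it.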

@orbitalz

orbitalz commented Mar 9, 2018

Wow, if it works, we would gain 225 MB of additional space!

@stonebig
Contributor Author

stonebig commented Mar 9, 2018

It works, so we can add back something dropped that was important for users:

  • if I bring back pymc3 I need to bring back theano, but I don't see pymc3 moving and theano is dead slow in WinPython (no GCC compiler)
  • if we bring back pandoc, some "save as PDF" functionality may work again.

Unless there is a strong argument, I would keep things as they are, because:

  • we have only gained 1 year of tranquility,
  • jupyterlab may continue to grow, and some new features may need to be added,
  • we should wait and see what Microsoft brings to the table with the Machine Learning included in the next Windows 10 update.

@stonebig
Contributor Author

Tensorflow-1.6 is compiled for AVX processors only... that is a problem.


3 participants