Error for binaries larger than 2Gb #3939
A possible solution would be to change this from signed to unsigned values (this needs to be done in the bootloader, too). Can you please provide a test case we can include in the test suite? Thanks.
The switch from signed to unsigned moves the limit from 2 GB to 4 GB, right? Is there a way to go over the 4 GB limit? Basically, could we go with an 8-byte specifier? As for a test case, do you need a specific format or just a set of steps?
Right.
Basically we could, but this requires more changes. Also, we should keep in mind 32-bit platforms, which might have trouble using 8-byte specifiers.
Well, this should fit into our test suite, which uses pytest. The most problematic point is generating data which surely ends up being > 4 GB when creating the CTOC. I suggest several test cases, which (I assume) would also ease developing the fix:
Sorry, this might be a dumb question, but how do I do it? Change '!iiiiBB' to '!IIIIBB'?
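For illustration, this is how the signed vs. unsigned format characters behave in Python's `struct` module (a minimal sketch; the field layout just mirrors the '!iiiiBB' pattern quoted above, not necessarily PyInstaller's exact header):

```python
import struct

# An offset past the signed 32-bit ceiling (2**31 - 1).
three_gb = 3 * 1024**3

# Signed 'i' fields overflow for values >= 2 GiB...
try:
    struct.pack('!iiiiBB', three_gb, 0, 0, 0, 0, 0)
except struct.error as exc:
    print('signed:', exc)

# ...while unsigned 'I' fields accept anything up to 2**32 - 1 (~4 GiB).
header = struct.pack('!IIIIBB', three_gb, 0, 0, 0, 0, 0)
print('unsigned ok,', len(header), 'bytes')
```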
Pinging
Any news?
@Red-Eyed I've added this to my todo-list, and I'll get this in the 4.1 release, maybe even the 4.0 release depending on when that actually gets released.
@Legorooj is this fixed now?
@Legorooj sorry for the annoying comments, but this issue is important to me: currently I have to ship my installer with additional tarballs rather than having just a single large executable. So, if you don't mind, I just want to ask a few questions in order to get a better understanding of the status of this issue:
Thanks!
Ah right. Let me take a look at this right now; my apologies, I completely forgot about this.
I question the usefulness of this. It'd take a good 30 minutes to pack a 4 GB application, then a good minute or so to unpack it again, followed by however long it takes for your antivirus to give it a good sniff, which would have to happen every time the user opens the application. I already find ~200 MB applications annoyingly slow to start up. I think you'd be better off turning your one-dir application into an installable bundle using NSIS or something equivalent. That way your user only has to unpack it once.
@bwoodsend you described only ONE use case, and yes, for that one it's useless. Also, you do not take into account OSes other than Windows. My use case is actually a cross-platform installer, and I do not want or need any third-party installer, as my self-written install code in Python does the job. Linux distributions do not have problems with filesystems and antiviruses, so it's fast to unpack zstd archives in multithreaded mode. Also, you said that you've done it; could you please share that branch with me?
Idk, it takes me about 2 min to pack ~2 GB. But note, I just pack "data".
I don't have a branch for it. I was just messing about with zlib, which is what PyInstaller uses to pack and unpack.
If your large size is coming from data files, is it possible to just put those into a zip, have your code read directly from said zip, then include the zip in your onefile app?
I just include a tar.xz in my PyInstaller onefile build and then unpack it
my tar.xz is about 1.6 GB (the source size is 6 GB)
I played around with PyInstaller and CArchive, but I haven't made it work. So, if you're not going to create a PR for this issue, I would like to see any kind of investigation work, even if it doesn't meet PR requirements, just to see what you have done. Or did you not change the CArchive logic? Thanks
@bwoodsend if you pack the CUDA/cuDNN libraries for deep learning, the packaged archive easily grows past 2 GB. Please move the limit from 2 GB to 4 GB or larger. We really need this feature. Thanks bro.
Yes, same use case. I ended up putting all model data into a password-protected zip. It would be great if we could go beyond the 2 GB limitation.
Uf, this is going to be all sorts of fun...

Here's an experimental branch that raises the limit from 2 GB to 4 GB by switching from signed integers to unsigned ones: https://github.com/rokm/pyinstaller/tree/large-file-support

I think before even considering the move to 64-bit integers for raising the limit further, we'll need to rework the archive extraction in the bootloader. Because currently, it extracts the whole TOC entry into an allocated buffer and, if compression is enabled, decompresses it in one go. This is done both for internal use and for extraction onto the filesystem during unpacking... The decompression should definitely be done in a streaming manner (i.e., using smaller chunks, so that we can avoid having the whole compressed data in memory at once). And when extraction is performed as a part of ...
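The streaming idea can be illustrated with zlib's incremental API (a sketch in Python; the bootloader itself is C, and the chunk size here is an arbitrary choice):

```python
import zlib

CHUNK = 64 * 1024  # arbitrary chunk size; keeps memory use bounded

def decompress_stream(src, dst):
    """Decompress src (a file-like object) into dst chunk by chunk, so that
    neither the full compressed nor the full decompressed blob sits in memory."""
    d = zlib.decompressobj()
    while True:
        chunk = src.read(CHUNK)
        if not chunk:
            break
        dst.write(d.decompress(chunk))
    dst.write(d.flush())
```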
@rokm You mean it'll currently have all 2-4 GB in RAM during decompression? That'd be horrific on a 4 GB machine.
I'm not sure if the big data entries are actually compressed... But even for uncompressed files, the current implementation (pyinstaller/bootloader/src/pyi_archive.c, lines 145 to 223 in e67a589) reads the whole entry into memory.
(And it's even worse if it is compressed, because then we keep both the whole compressed and the whole uncompressed data blob in memory during decompression.)
@rokm I just tried your branch and faced the same error (in the bootloader, I guess) that I hit when I tried to work on this issue:

~/my_cool_installer
[568688] Cannot open self /home/redeyed/my_cool_installer or archive /home/redeyed/my_cool_installer.pkg
Yes, you need to rebuild the bootloader yourself.
Okay, will do that tomorrow. That would be awesome. Thanks! |
One of the commits adds a test that creates a 3 GB data file with random contents, computes its md5 hash, and then adds this file to a onefile build of a program, which in turn reads the unpacked file from its _MEIPASS dir, computes the md5 hash, and compares it to the one that was computed previously. This test is now passing on my Fedora 33 box and in a Windows 10 VM. (But you need to rebuild the bootloader, because that's where the unpacking actually takes place.)
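For reference, the hashing half of such a round-trip test might look like this (a sketch; chunked reading so even a 3 GB file never has to fit in memory):

```python
import hashlib

def md5_file(path, chunk=1024 * 1024):
    """Compute the MD5 of a file in 1 MiB chunks."""
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for block in iter(lambda: f.read(chunk), b''):
            h.update(block)
    return h.hexdigest()
```

The test would then compare `md5_file(...)` of the original data file against that of the copy the frozen program finds under its _MEIPASS directory.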
Yes! In my case, the ...
Currently I have a 6 GB Python environment (PyTorch, TensorFlow, SciPy, etc.) and pack it into a 1.6 GB tar.xz archive to get under the 2 GB limit. I'd like to use zstandard compression (to speed up decompression), but zstandard compresses it to 2.6 GB, which is above the 2 GB limit.
Note: I just built the bootloader and it works, thank you @rokm!
Here's a further branch, in which 64-bit integers are used: https://github.com/rokm/pyinstaller/tree/large-file-support-v2

So now, in theory, the sky's the limit: you can chuck in all your deep learning frameworks, CUDA libraries, pretrained models, ...

In practice, however, the 5 GB onefile test passes only on Linux (tested only 64-bit for now). Windows (even 64-bit) does not seem to support executables larger than 4 GB. On (64-bit) macOS, the ... So really huge ...
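With 8-byte fields, the headroom on the Python side is effectively unlimited; a minimal sketch (again mirroring the hypothetical '!iiiiBB'-style layout, with 'Q' for unsigned 64-bit):

```python
import struct

five_gb = 5 * 1024**3  # matches the 5 GB onefile test payload mentioned above

# 'Q' (unsigned 64-bit) accepts offsets up to 2**64 - 1.
header = struct.pack('!QQQQBB', five_gb, 0, 0, 0, 0, 0)
fields = struct.unpack('!QQQQBB', header)
print(len(header), fields[0])  # 34-byte header, offset preserved
```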
@Legorooj any progress?
I stopped work on this because by the time I woke up the next morning @rokm had already written everything that needed to be written 😅. Maybe he could submit a PR?
Nice job!
@sshuair the current plan is to have the changes from those experimental branches submitted and merged gradually: first the endian-handling cleanup, then the file extraction cleanup, then the switch to unsigned 32-bit ints, and finally to 64-bit ones. In the meantime, if you require this functionality, you can use either of the experimental branches linked above.
@Red-Eyed Sorry~~ It's my fault, I installed the package from ...
@rokm so how do I solve the error for binaries larger than 4 GB? My OS is CentOS Linux release 7.6.1810 (Core).

struct.error: 'I' format requires 0 <= number <= 4294967295
You cannot. If you wanted to generate an embedded archive that's larger than 4 GB, you would need to switch the types used by the corresponding PyInstaller code (both Python and C) to 64-bit integers. While this would work on Linux, neither Windows nor macOS supports executables larger than 4 GB, so we only extended the max size from 2 GB to 4 GB by switching to unsigned 32-bit integers.
For final binaries larger than 2 GB, the following exception is raised:
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
https://github.com/pyinstaller/pyinstaller/blob/develop/PyInstaller/archive/writers.py#L264