Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

archive_viewer not able to find CArchive #2372

Closed
dshmgh opened this issue Jan 11, 2017 · 3 comments · Fixed by #5554
Closed

archive_viewer not able to find CArchive #2372

dshmgh opened this issue Jan 11, 2017 · 3 comments · Fixed by #5554
Labels
good first issue This is a good issue if you want to start working on PyInstaller @low pull-request wanted Please submit a pull-request for this, maintainers will not actively work on this.

Comments

@dshmgh
Copy link

dshmgh commented Jan 11, 2017

This affects the use of archive_viewer.py

In the 3.2 release of pyinstaller in archive/readers.py at lines 147 and 149 there is a constant of 4096 hardcoded for the range of searching for the "MAGIC" value back from the end of the archive. In my case, I had a certificate of length 7248 bytes appended to the archive and it never found the "MAGIC" value location. Perhaps a better way is to start with 4096 and if MAGIC is not found, keep doubling the "EndBlockLength" and search again until the "EndBlockLength" is greater than the file length (or some algorithm like that).

@htgoebel htgoebel added @low good first issue This is a good issue if you want to start working on PyInstaller labels Jan 12, 2017
@htgoebel
Copy link
Member

This only effects the use of archive_viewer, since CArchiveReader.checkmagic is only used by the archive_viewer. Nevertheless we should be careful changing it.

Beside doubling the size to be read another possibility would be to simply read another 4096 bytes each time. On the other hand, this size does not really matter. So we could simple read 8K, 10K or 16K and add another block of this size. But then again, there is no need to keep all this data in memory, so something like this should work, too:

buf = b'' ; pos = -1
while True:
    seek back 4096 bytes
    buf = read(min(filelen, 4096)) + buf  # TODO: avoid overlapping data
    pos = buf.rfind(self.MAGIC)
    if pos >= 0:
       break
 else:
     # not found
     raise RuntimeError("%s is not a valid %s archive file" %
                       (self.path, self.__class__.__name__))

@dshmgh
Copy link
Author

dshmgh commented Jan 15, 2017

Below is some minimally tested code for this issue. One thing to consider is how far back in the archive should one try to find the cookie? The further one goes, the more chance of a false positive - but likely other things like the unpack of the TOC will fail. Below is diff from the current version of readers.py that I downloaded 3 hours ago. If this code looks OK, I can upload the new file.

[live@localhost for_release]$ diff ../orig/readers.py readers.py
146,150c146,167
<         
<         self.lib.seek(max(0, filelen-4096)) 
<         searchpos = self.lib.tell()
<         buf = self.lib.read(min(filelen, 4096))
<         pos = buf.rfind(self.MAGIC)
---
>  
>         # cant make myself not try to use a symbolic constant ....
>         _BUFSIZE = 4096
>         buf = b'' ; pos = -1
>         seekpos = filelen
>         ## 0th seeksize is the full buffer size
>         seeksize = _BUFSIZE
>         ## next subtract takes care of a cookie that crosses a search area boundary
>         seeksize1toN = _BUFSIZE - (2 * self._cookie_size)
> 
>         ## while not found, then continue back through archive to try and find the cookie
>         while pos == -1:
>             seekpos = max(0, seekpos - seeksize)
>             self.lib.seek(seekpos) 
>             searchpos = self.lib.tell()
>             buf = self.lib.read(min(filelen, _BUFSIZE))
>             pos = buf.rfind(self.MAGIC)
>             ## change next from value 0 to max(0,filelen - (X * _BUFSIZE)) to limit search area
>             if seekpos <= 0:
>                 break
>             seeksize = seeksize1toN
> 
152c169
<             raise RuntimeError("%s is not a valid %s archive file" %
---
>             raise RuntimeError("%s is not a valid %s archive file -- CANNOT find MAGIC cookie" %
158c175
<             raise RuntimeError("%s is not a valid %s archive file" %
---
>             raise RuntimeError("%s is not a valid %s archive file -- bad MAGIC cookie" %
164c181
<                 raise RuntimeError('Problem with embedded archive in %s' %
---
>                 raise RuntimeError('Problem with embedded archive length in %s' %
[live@localhost for_release]$

@htgoebel
Copy link
Member

htgoebel commented Jan 15, 2017

Below is some minimally tested code for this issue.

Please submit a pull-request instead of posting patches. I'm not going to copy-and-paste raw patches out of an issue. This is not how github works and makes a lot of work.

Please also have a look at the Development Guide and mind adding a test-case for this. Thanks.

@htgoebel htgoebel added the pull-request wanted Please submit a pull-request for this, maintainers will not actively work on this. label Sep 25, 2017
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
good first issue This is a good issue if you want to start working on PyInstaller @low pull-request wanted Please submit a pull-request for this, maintainers will not actively work on this.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants