Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More robust ZipImporter cache invalidation #202

Closed
ghost opened this issue May 7, 2014 · 16 comments
Closed

More robust ZipImporter cache invalidation #202

ghost opened this issue May 7, 2014 · 16 comments

Comments

@ghost
Copy link

ghost commented May 7, 2014

Originally reported by: jaraco (Bitbucket: jaraco, GitHub: jaraco)


In #168 and Pull Request 48, a sub-optimal workaround was implemented, but a more robust solution was proposed. Setuptools should consider implementing those suggestions.


@ghost
Copy link
Author

ghost commented May 7, 2014

Original comment by pje (Bitbucket: pje, GitHub: pje):


(copying the notes from the old ticket)

A Way To Fix zipimport's Internal Caches

Suppose that, when we remove the entries from the zip cache, we actually set them aside for a moment, then create a new zipimporter for the file. Then, we clear the old entry, and update it with the new entry. Because the cache entries are shared between zipimporters, updating the old entry would automatically fix every existing zipimporter for that zipfile. Something like:

old_entry = zipdircache.pop(path_to_zip)  # remove from cache
zipimport.zipimporter(path_to_zip)    # create new entry
old_entry.clear()
old_entry.update(zipdircache[path_to_zip])  # force-update older zipimporters

(But done for each path in the cache that is the "same" path as path_to_zip, not just the one entry.)

This approach should fix all the use cases, without needing to track down and replace individual distribution objects in every environment and working set, because the way the cache works, the dictionary representing a given zipfile is shared between all existing zipimporter objects. So rewriting the old entry should fix the old zipimporters.

It would require a bit more elaborate change to the uncaching code, as it would need to first pull out any matching entries for the zipfile, saving them in a list, then it would need to create a zipimporter instance for the path, then pull out the new entry and update the old one(s). All in all, this is probably the "right" way to do it, as it should take care of all updating-an-egg-in-place scenarios properly.

@ghost
Copy link
Author

ghost commented May 7, 2014

Original comment by pje (Bitbucket: pje, GitHub: pje):


By the way, since Pull Request 48 already changed the _uncache() function code to always loop over the whole cache, it just needs the code above added to the processing of each entry when the cache it's processing is the zipdir cache.

Note, however, that this uncaching approach will only work if the uncaching happens after the new file is moved/copied and is in fact a zip file; otherwise creating the zipimporter will either raise an error (if there's no file or it's a directory), or re-cache the old file (if the new one hasn't overwritten it yet).

(Since the last discussion of this issue, I've already forgotten whether the uncaching code is called at the right point for this, so it needs to be rechecked. ;-) )

@ghost
Copy link
Author

ghost commented May 8, 2014

Original comment by jurko (Bitbucket: jurko, GitHub: jurko):


I'll take a look at this now...

@ghost
Copy link
Author

ghost commented May 8, 2014

Original comment by jurko (Bitbucket: jurko, GitHub: jurko):


I've opened pull request #51 for you to review... flame away... 😄

Best regards,
Jurko Gospodnetić

@ghost
Copy link
Author

ghost commented Jun 1, 2014

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


fix issue202 - update existing zipimporters when replacing a zipped egg

When replacing a zipped egg distribution with a different zipped egg, we make
all existing zipimport.zipimporter loaders valid again instead of having to go
hunting them down one by one. This is done by updating their shared zip
directory information cache - zipimport._zip_directory_cache.

Related to the following project issues:
#169 - http://bitbucket.org/pypa/setuptools/issue/168
#202 - http://bitbucket.org/pypa/setuptools/issue/202

@ghost
Copy link
Author

ghost commented Jun 1, 2014

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


I've uploaded Setuptools 4.1b1 to our downloads section with this new implementation for review and testing.

@ghost
Copy link
Author

ghost commented Jun 2, 2014

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Failing tests in Travis suggest this implementation fails on pypy.

@ghost
Copy link
Author

ghost commented Jun 2, 2014

Original comment by pje (Bitbucket: pje, GitHub: pje):


Apparently PyPy uses a custom mapping type for that cache, that doesn't have a .pop method. Looks like you have to use old_entry = cache[p]; del cache[p] instead. The relevant PyPy code is here.

@ghost
Copy link
Author

ghost commented Jun 2, 2014

Original comment by jurko (Bitbucket: jurko, GitHub: jurko):


Added pull request #59 with the fix suggested by @pje .

The same pull request later updated to include additional cached zipimport data clearing updates suggested by @pje .

Best regards,
Jurko Gospodnetić

@ghost
Copy link
Author

ghost commented Jun 14, 2014

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


fix issue202 - update existing zipimporters when replacing a zipped egg

When replacing a zipped egg distribution with a different zipped egg, we make
all existing zipimport.zipimporter loaders valid again instead of having to go
hunting them down one by one. This is done by updating their shared zip
directory information cache - zipimport._zip_directory_cache.

Related to the following project issues:
#169 - http://bitbucket.org/pypa/setuptools/issue/168
#202 - http://bitbucket.org/pypa/setuptools/issue/202

@ghost
Copy link
Author

ghost commented Jun 14, 2014

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


The tests are still failing it Travis.

@ghost
Copy link
Author

ghost commented Jun 14, 2014

Original comment by pje (Bitbucket: pje, GitHub: pje):


Weird. It lets you delete cache entries, but not assign to them. Guess we'll have to skip that additional fix on PyPy, or just use a try/except to ignore the error on PyPy.

@ghost
Copy link
Author

ghost commented Jun 15, 2014

Original comment by jurko (Bitbucket: jurko, GitHub: jurko):


See pull request #65 containing my suggested fix.

Note that BitBucket might get confused by multiple default branch heads, and if that happens (I believe it can happen when you do additional commits to pypa/setuptools) you might need to merge its one commit manually.

Best regards,
Jurko Gospodnetić

@ghost
Copy link
Author

ghost commented Jun 15, 2014

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Excellent. By being in the default branch (and particularly because it was against the issue202 head), I was able to merge it manually and directly, and bitbucket detected it.

Furthermore, tests are passing. I believe this can go into the next release.

Thanks for all the hard work on this one.

@ghost
Copy link
Author

ghost commented Jun 15, 2014

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Merge issue202 changes. Fixes #202.

@ghost
Copy link
Author

ghost commented Jun 15, 2014

Original comment by jurko (Bitbucket: jurko, GitHub: jurko):


Thanks for returning to process the issue and the pull requests!

Here's hoping that this time the resolved status sticks. 😄

Best regards,
Jurko Gospodnetić

@ghost ghost closed this as completed Mar 29, 2016
jaraco added a commit that referenced this issue Feb 6, 2023
Fix MinGW-w64 segmentation fault (v2)
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

0 participants