New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Intermittent failure in offline compression #1099
Comments
Hi @jtremesay-sereema — Can you take a look at #1086 and see if you can work out a reproduce? If we can see it ourselves we've got a chance of fixing it. |
I can try |
I have created a dummy django project that trigger the bug most of the time https://github.com/jtremesay-sereema/django-compressor-issue-1099
|
Super. Thanks @jtremesay-sereema. //cc @cuu508 |
The dummy project is awesome, it reliably reproduces the issue for me. I played around with it for a good bit, some findings: Looking at the project's templates, the contents of the But, each In the offline manifest we get a mapping:
Many different offline manifest keys, all pointing to the same file. If we look at Compressor.output_file(), it:
The One idea for the fix is to remove the "or forced" condition in
This way, if the file already exists in the filesystem, it would not get overwritten. This fixes the exception and speeds up offline compression :-) But I guess the extra condition was added for a reason. I don't know the reason. Here's the commit where it was added: 1908747 |
Great investigation, thanks! From memory (I haven't looked closely at this code in years), I think the problem is that
Without refactoring how
Bottom line, I think a deeper fix is needed - but I'd be happy to be proven wrong! |
Perhaps we can find a way to overwrite existing files, but only once within a single In
Currently, when running |
Yeah. The thing that surprised me is that this is not very different in essence from ef45175 : since what you added in this commit was based on the key being in the manifest, I would have expected it be enough to work around this issue ? |
Thought so too! The catch here is that we have many different offline manifest keys that map to the same generated file:
compressor does not regenerate the same offline manifest key multiple times any more. But, if several different |
Oh right, I wasn't paying enough attention. So yeah, your idea makes sense, I'm not sure how much refactoring is going to be needed to get to that point though. |
Here's an idea for discussion. Here's how def output_file(self, mode, content, forced=False, basename=None):
"""
The output method that saves the content to a file and renders
the appropriate template with the file's URL.
"""
new_filepath = self.get_filepath(content, basename=basename)
if not self.storage.exists(new_filepath) or forced:
self.storage.save(new_filepath, ContentFile(content.encode(self.charset)))
url = mark_safe(self.storage.url(new_filepath))
return self.render_output(mode, {"url": url}) It:
The proposal: if forced=True, then track what filepaths we've already handled, and only save to storage the filepaths that we see for the first time. I added class Compressor:
"""
Base compressor object to be subclassed for content type
depending implementations details.
"""
output_mimetypes = {}
seen_filepaths = set()
seen_filepaths_lock = Lock() And changed the def output_file(self, mode, content, forced=False, basename=None):
"""
The output method that saves the content to a file and renders
the appropriate template with the file's URL.
"""
new_filepath = self.get_filepath(content, basename=basename)
if forced:
# forced=True means this is offline generation (manage.py compress).
# We should overwrite all existing files.
# But, to avoid overwriting a file more than once in a single run,
# keep track of already handled filepaths in `seen_filepaths` class
# variable. If a filepath is already there, *don't* call
# self.storage.save() for it.
with self.seen_filepaths_lock:
skip_save = new_filepath in self.seen_filepaths
self.seen_filepaths.add(new_filepath)
else:
skip_save = self.storage.exists(new_filepath)
if not skip_save:
self.storage.save(new_filepath, ContentFile(content.encode(self.charset)))
url = mark_safe(self.storage.url(new_filepath))
return self.render_output(mode, {"url": url}) Class variables feel iffy to me, as they are basically globals. But the alternative would be to pass |
I created a PR with the above, plus a test case – #1100. I'm not 100% sure this is the right way to fix the issue, but it is more concrete and easier to play around with in a PR form. |
I created another PR with an alternate (and I think a better) fix – #1105 |
@jtremesay-sereema thanks to @cuu508 we have a fix for this in the |
Thx ! |
Excellent, thanks for the feedback, several people have now reported that it fixed the issue for them, I'll try to make a new release this week-end. |
I just released 4.0 with that fix. |
Hi,
We use
django-compressor
(v3.1) in our Django (v4.0.2) application. Our CI randomly fails (something like 1 run in 20) on the compress step with errors like this one:If I read correctly, this is a known issue (#1082) that was fixed in v3.0. But it seems like the fix was not enough.
The text was updated successfully, but these errors were encountered: