You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
A few years ago the AWS_PRELOAD_METADATA setting was removed in this PR: #636
I understand why - this is not something you’d want to turn on when servicing web requests and was probably the source of many complaints. But it was crucial to the performance optimisations that collectfast made. I’ve recently forked collectfast into Collectfasta as collectfast had been archived- but I’ve had to reimplement a lot of the S3Storage methods to add back the preloading functionality in order to maintain the performance improvements. This is not sustainable as the code will need to be kept in sync with each django-storages release.
My proposal: we move this logic back into django-storages (I am happy to contribute) but with some safeguards to ensure it only runs during management commands. Some ideas on this below (not all need to be implemented):
Only allow it for static files storage classes - not other classes. This would be somewhat difficult as all the logic exists in the base classes so it would require a bit of refactoring of the base class to work. It also doesn’t solve it not running when serving requests
Do not allow it to be set through settings at all - within the management command we can manipulate the storage object directly. Kinda feels ick doing this.
Use an ENV var rather than Django settings - management commands can set the env var directly or users can do it when running the command. The reason I suggest this is that you don’t want to have a separate env module just for running collectstatic
Detect the command that is being run somehow. Perhaps by checking sys.argv - I doubt this works with call_command so it probably not my first choice. As far as I know there’s no functionality in Django to tell within an arbitrary piece of code (i.e where you don’t have a request object) if you’re serving requests or running a management command.
If you want to read about why collectstatic performance is so bad I have a write-up on it here - unfortunately the way that the core Django storage API works - in particular ManifestStaticFilesStorage, it makes it very difficult to do a performant collectstatic command with just the methods available in the Storage interface if your storage class deals with remote storage.
The text was updated successfully, but these errors were encountered:
jasongi
changed the title
Reintroduction of Preload metadata functionality
Reintroduction of AWS_PRELOAD_METADATA functionality
Apr 6, 2024
I will say it was an oversight of mine not to search GH to see how many consumers of the preload metadata API existed before I removed it.
And similarly, you understand why it was removed.
That being said I am happy to work on a solution that improves this for people since it seems there is still significant demand. In general, back when I was working with Django, I had switched to Whitenoise, but I do understand that doesn't work for everyone.
Can you propose a new API that would meet all the needs? It's possible we could upstream it once it was developed out and since this library has BaseStorage it might be okay to do something general.
A few years ago the AWS_PRELOAD_METADATA setting was removed in this PR:
#636
I understand why - this is not something you’d want to turn on when servicing web requests and was probably the source of many complaints. But it was crucial to the performance optimisations that collectfast made. I’ve recently forked collectfast into Collectfasta as collectfast had been archived- but I’ve had to reimplement a lot of the S3Storage methods to add back the preloading functionality in order to maintain the performance improvements. This is not sustainable as the code will need to be kept in sync with each django-storages release.
My proposal: we move this logic back into django-storages (I am happy to contribute) but with some safeguards to ensure it only runs during management commands. Some ideas on this below (not all need to be implemented):
If you want to read about why collectstatic performance is so bad I have a write-up on it here - unfortunately the way that the core Django storage API works - in particular ManifestStaticFilesStorage, it makes it very difficult to do a performant collectstatic command with just the methods available in the Storage interface if your storage class deals with remote storage.
The text was updated successfully, but these errors were encountered: