Allow setting the chunk_size in Blob creation via the GS_BLOB_CHUNK_SIZE #401
Conversation
When uploading a large file to Google Cloud Storage, the generated request must fit into memory and can easily exhaust it. This can be avoided by specifying the chunk_size at Blob creation, which triggers a resumable upload that sends the file in chunks and does not load the complete file into memory. See: https://github.com/GoogleCloudPlatform/google-cloud-python/blob/d3ef455b797b6960ed58f563d4b43d9dcc4c7364/storage/google/cloud/storage/blob.py#L770
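For illustration, a minimal sketch of the upstream behavior described above, using the public google-cloud-storage API; the bucket and file names are placeholders:

```python
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-bucket")  # placeholder bucket name

# With chunk_size set, upload_from_file() takes the resumable upload
# path and streams the file in chunks instead of building one request
# body in memory. chunk_size must be a multiple of 256 KB.
blob = bucket.blob("backups/huge-dump.sql", chunk_size=5 * 1024 * 1024)

with open("/tmp/huge-dump.sql", "rb") as fh:
    blob.upload_from_file(fh)
```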
Codecov Report
@@            Coverage Diff             @@
##           master     #401      +/-   ##
==========================================
- Coverage    76.1%   76.05%   -0.06%
==========================================
  Files          11       11
  Lines        1578     1566      -12
==========================================
- Hits         1201     1191      -10
+ Misses        377      375       -2
I think this could be done very easily with simple Python inheritance. I approached the same problem by subclassing in my own storage class. Sooner or later you will end up overriding the class to add your own customizations (setting metadata, changing the default ACL, bucket name, uploading files directly without saving them locally, etc.), so it is better done now.
@myimages I am already subclassing storages.backends.gcloud.GoogleCloudStorage to make additional methods from the underlying Google Storage client available. However, chunked uploads are, in my opinion, often required (Google also suggests this method for all Blobs bigger than 5 MB) and should therefore be available out of the box.
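A hedged sketch of the subclassing workaround described in these two comments. It assumes the backend exposes a `bucket` property (recent django-storages versions do); `ChunkedGoogleCloudStorage` and `blob_chunk_size` are illustrative names, and the simplified `_save()` skips the name cleaning the stock backend performs:

```python
from storages.backends.gcloud import GoogleCloudStorage


class ChunkedGoogleCloudStorage(GoogleCloudStorage):
    # Hypothetical attribute; chunk_size must be a multiple of 256 KB.
    blob_chunk_size = 256 * 1024 * 20  # 5 MB

    def _save(self, name, content):
        # Create the Blob with an explicit chunk_size so the upload is
        # resumable and streamed in chunks. Simplified: the real backend
        # also normalizes and encodes the name first.
        blob = self.bucket.blob(name, chunk_size=self.blob_chunk_size)
        blob.upload_from_file(content, rewind=True, size=content.size)
        return name
```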
I think this should definitely be included in this lib instead of requiring everyone to subclass.
@joahim Can you add documentation for this setting?
Added docs in #757. |
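For completeness, a sketch of how the resulting setting would be used in a Django project, assuming the name from the PR title and that the value, like Blob.chunk_size itself, must be a multiple of 256 KB:

```python
# settings.py
DEFAULT_FILE_STORAGE = "storages.backends.gcloud.GoogleCloudStorage"
GS_BUCKET_NAME = "my-bucket"  # placeholder

# Upload blobs in 5 MB chunks via resumable uploads instead of one
# in-memory request; must be a multiple of 256 KB (262144 bytes).
GS_BLOB_CHUNK_SIZE = 1024 * 1024 * 5
```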