When the refresh_metadata plugin is used, having UploadedFile#download
delegate to Storage#download hurts performance whenever
UploadedFile#download is called multiple times during metadata
extraction, because the same file then gets downloaded from the storage
multiple times:
    add_metadata :foo do |io, context|
      io.download { ... } # downloaded from the storage (1st time)
    end

    add_metadata :bar do |io, context|
      io.download { ... } # downloaded from the storage (2nd time)
    end
Moreover, the download that `UploadedFile#refresh_metadata!` starts at
the beginning of metadata extraction isn't always reused. The
determine_mime_type and store_dimensions plugins do reuse it, but any
custom metadata block that calls UploadedFile#download (either directly
or implicitly via Shrine.with_file) starts a whole new download.
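The repeated-download behaviour can be sketched with a self-contained
simulation (StubStorage and the extractor lambdas are invented for
illustration; they stand in for a remote storage and for two
add_metadata blocks, and are not Shrine's actual API):

```ruby
require "tempfile"

# Invented stand-in for a remote storage: counts how many times
# its content is downloaded.
class StubStorage
  attr_reader :downloads

  def initialize(content)
    @content   = content
    @downloads = 0
  end

  # Each call simulates a fresh download from the remote storage.
  def download
    @downloads += 1
    tempfile = Tempfile.new("stub")
    tempfile.write(@content)
    tempfile.rewind
    tempfile
  end
end

storage = StubStorage.new("file body")

# Two metadata extractors, each calling #download independently --
# analogous to two add_metadata blocks each calling io.download.
extractors = [
  ->(io) { io.size },          # :foo
  ->(io) { io.read.bytesize }, # :bar
]

metadata = extractors.map { |extractor| extractor.call(storage.download) }

puts storage.downloads # the same content was fetched twice
```

Nothing shares the downloaded copy between the two extractors, so the
storage is hit once per metadata block rather than once per file.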
Storage#download existed for two reasons:
1) If Storage#open uses Down::ChunkedIO, calling UploadedFile#download
the first time on an UploadedFile object would copy the remote
content to two Tempfiles: the Tempfile that's returned and
Down::ChunkedIO's internal Tempfile. Storage#download allows storage
implementers to avoid the additional copy.
2) Some SDKs such as aws-sdk-s3 implement automatic retries only if the
download destination is a Tempfile (like in S3#download), but not if
it's a block (like in S3#open).
The 1st issue can be addressed by explicitly passing `rewindable:
false` for storages that use Down::ChunkedIO and support the
:rewindable option. It's not really a big deal anyway.
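To illustrate why `rewindable: false` avoids the extra copy, here is a
deliberately simplified stand-in for Down::ChunkedIO (the ChunkedIO
class and download helper below are invented for this sketch, not
Down's real implementation): a rewindable chunked IO caches everything
it reads into an internal Tempfile so it can be rewound later, which
means downloading it to another Tempfile writes the bytes twice.

```ruby
require "tempfile"

# Invented, simplified model of Down::ChunkedIO: when rewindable, it
# mirrors every chunk into an internal Tempfile cache; with
# rewindable: false it skips that cache entirely.
class ChunkedIO
  attr_reader :cache

  def initialize(chunks, rewindable: true)
    @chunks = chunks
    @cache  = rewindable ? Tempfile.new("chunked_io") : nil
  end

  def each_chunk
    @chunks.each do |chunk|
      @cache&.write(chunk) # internal copy, only if rewindable
      yield chunk
    end
  end
end

# Downloading copies the content into the returned Tempfile...
def download(io)
  tempfile = Tempfile.new("download")
  io.each_chunk { |chunk| tempfile.write(chunk) }
  tempfile.rewind
  tempfile
end

chunks = ["abc", "def"]

# ...so with the default rewindable IO the bytes land in *two*
# Tempfiles: the internal cache and the returned one.
rewindable = ChunkedIO.new(chunks)
download(rewindable)
puts rewindable.cache.size

# With rewindable: false there is no internal Tempfile copy.
streaming = ChunkedIO.new(chunks, rewindable: false)
tempfile  = download(streaming)
puts streaming.cache.nil?
puts tempfile.size
```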
The 2nd issue should be fixed in the SDK. google-api-client already
supports resumable downloads for any download destination, and a fix
for aws-sdk-s3 is in the works.
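The reason a Tempfile destination is retryable while a block
destination isn't can be shown with a small simulation (FlakySource
and both download helpers are invented here; they model the general
pattern, not aws-sdk-s3's actual retry code): a file can be truncated
and rewritten from scratch, but chunks already yielded to a block
can't be taken back, so a blind retry would duplicate data.

```ruby
require "tempfile"

# Invented flaky source: the first attempt fails partway through.
class FlakySource
  def initialize(chunks)
    @chunks   = chunks
    @attempts = 0
  end

  def each_chunk
    @attempts += 1
    @chunks.each_with_index do |chunk, i|
      raise IOError, "connection reset" if @attempts == 1 && i == 1
      yield chunk
    end
  end
end

# With a Tempfile destination the download can be retried from
# scratch: discard the partial copy and start over.
def download_to_tempfile(source, retries: 1)
  tempfile = Tempfile.new("download")
  begin
    source.each_chunk { |chunk| tempfile.write(chunk) }
  rescue IOError
    raise if (retries -= 1) < 0
    tempfile.truncate(0)
    tempfile.rewind
    retry
  end
  tempfile.rewind
  tempfile
end

# With a block destination the partial chunks were already handed to
# the caller, so the error can only be propagated.
def download_to_block(source, &block)
  source.each_chunk(&block)
end

body = download_to_tempfile(FlakySource.new(["abc", "def"])).read
puts body # retried transparently, full content

received = []
begin
  download_to_block(FlakySource.new(["abc", "def"])) { |chunk| received << chunk }
rescue IOError
  puts "failed after receiving #{received.inspect}"
end
```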
I no longer believe Storage#download provides enough benefits to keep
it around. The performance issues with the refresh_metadata plugin and
custom metadata are a good enough reason to throw it away. An added
benefit is that storages now have one less method to implement.