I'm having an intermittent issue where:

- The `Content-Length` header of an image URL reports the correct size in bytes of the image
- MiniMagick sometimes reads the image from the same URI and returns a value for `#size` that does not match the header's value
My theory is that this mismatch is what causes MiniMagick to write corrupted images.
For example:
```ruby
require "faraday"
require "mini_magick"
require "uri"

image_uri = URI("https://via.placeholder.com/350x150.jpg")
expected_image_size_bytes = Faraday.head(image_uri).headers["content-length"].to_i
image = MiniMagick::Image.open(image_uri)
raise "Expected size and actual size mismatch" if image.size != expected_image_size_bytes
```
This occurs rarely, but reliably, and is not specific to the image host the image is being served from. Every instance of this problem can be solved by re-running the job this code executes in, and things work as expected. The size of the image as reported by the Content-Length header is always the "correct" size of the image.
What's even stranger is that, in every case so far, the value returned from `#size` is larger than the `Content-Length` header value, up to nearly double the size.
I don't want to blame MiniMagick here, but it happens with both IM and GM backends, latest IM 6.x and GM versions. My only theory is some problem with the underlying call to IO.copy_stream which is somehow wigging out in the middle of the stream copy, only to rewind and start copying the file from the start again.
Let me know if any of this sounds plausible. I'm continuing to investigate, but this one is hard to pin down.
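One way to test the rewind theory is to compare an oversized file against a known-good copy of the image. The sketch below assumes the restart pattern described above (some leading bytes of the image, followed by the complete image). `restart_offset` is a hypothetical helper, not part of MiniMagick, and the `expected` and `actual` byte strings are assumed to come from a fresh download and from the file MiniMagick wrote (for example via `File.binread(image.path)`).

```ruby
# Hypothetical diagnostic helper, not part of MiniMagick.
# Returns the number of bytes copied before the suspected restart, or nil
# if the data does not fit the "partial copy, then full copy" pattern.
def restart_offset(expected, actual)
  # Compare as raw bytes so string encodings cannot affect equality.
  expected = expected.b
  actual = actual.b

  extra = actual.bytesize - expected.bytesize
  return nil unless extra.positive? && extra <= expected.bytesize

  partial = actual.byteslice(0, extra)               # bytes before the restart
  full = actual.byteslice(extra, expected.bytesize)  # bytes after the restart

  return nil unless partial == expected.byteslice(0, extra) && full == expected

  extra
end
```

A non-nil result would be consistent with the copy starting over partway through. A `nil` result means the extra bytes do not fit that pattern and came from somewhere else.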
taylorthurlow changed the title from "Image#size and Content-Length header mismatch" to "Image#size and Content-Length header mismatch, intermittent" on Sep 20, 2022
Further testing confirms that the initial HEAD request matches Content-Length with a subsequent GET request which includes the full image in the body. The Image#size result still occasionally fails to match with the header value. That test took place with a separate GET request for the header, and allowed MiniMagick to submit a separate request to the same image as it saw fit (given the same input URI). Next change is to use the body from the explicit GET request as an input to Image.read and wait for another size mismatch.
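The planned change could look roughly like the sketch below: check the body of the explicit GET against its own `Content-Length` header before handing it to MiniMagick. `verify_body_length!` and `SizeMismatch` are hypothetical names introduced for illustration, and the usage in the trailing comment assumes the `faraday` and `mini_magick` gems plus network access.

```ruby
# Hypothetical error class and helper, introduced only for this sketch.
SizeMismatch = Class.new(StandardError)

# Returns the body unchanged when its byte size matches the Content-Length
# header, and raises SizeMismatch otherwise.
def verify_body_length!(headers, body)
  expected = Integer(headers.fetch("content-length"))
  actual = body.bytesize
  return body if actual == expected

  raise SizeMismatch, "expected #{expected} bytes, got #{actual}"
end

# Usage (assumes the faraday and mini_magick gems and network access):
#   response = Faraday.get("https://via.placeholder.com/350x150.jpg")
#   body = verify_body_length!(response.headers, response.body)
#   image = MiniMagick::Image.read(body)
#   raise "MiniMagick copy differs" if image.size != body.bytesize
```

If the body passes this check but `image.size` still differs, the extra bytes would have to be introduced after the download, which would narrow down where the problem is.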
It's likely this is a bug in open-uri, which MiniMagick uses to download images. The issue is most certainly not in IO.copy_stream, because that doesn't touch the network.
Have you tried downloading images with a different HTTP library?
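As a minimal sketch of that suggestion, the image could be downloaded with Ruby's standard `Net::HTTP` instead of open-uri and then passed to MiniMagick as bytes. `fetch_image_bytes` is a hypothetical helper name, and the usage in the trailing comment assumes the `mini_magick` gem and network access.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: downloads the resource at +uri+ with Net::HTTP
# (bypassing open-uri) and returns the response body as a string.
def fetch_image_bytes(uri)
  response = Net::HTTP.get_response(uri)
  unless response.is_a?(Net::HTTPSuccess)
    raise "unexpected HTTP status #{response.code} for #{uri}"
  end

  response.body
end

# Usage (assumes the mini_magick gem and network access):
#   bytes = fetch_image_bytes(URI("https://via.placeholder.com/350x150.jpg"))
#   image = MiniMagick::Image.read(bytes)
```

Note that this sketch does not follow redirects. If the mismatch stops appearing with this download path, that would point toward open-uri rather than MiniMagick's own file handling.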
MiniMagick version: `4.11.0`