Add length method to InputStream #277

zacstewart · 2016-02-25T17:04:01Z

To make this thing act more realistically like an IO, it needs a length method.

One useful example is being able to iterate through a zip of images and upload them to an HTTP server using UploadIO. In order to multi-part such a POST body, the length of the IO must be known.

Fixes #276

To make this thing act more realistically like an IO, it needs a length method.

mnaberez · 2016-02-25T17:16:45Z

The omission of the length was likely intentional because there is not a way to know the size of the decompressed data without decompressing it. The size reported by a zip entry should be considered advisory only and can't be relied on in this way. The entry is easily manipulated and this is how zip bombs work; the entry reports a small size but the data decompresses much larger.

If you are creating the ZIP files yourself and know that the entry size is reliable, using the entry size in this way might be acceptable. However, I don't support this as a general addition to RubyZip because it can't be used reliably on ZIP files from other sources.

zacstewart · 2016-02-25T17:24:01Z

Are there any risks that outweigh the benefits? From my perspective, a zip with inaccurately reported entry sizes is just invalid, and with invalid data you can expect invalid behavior.

Unless there are security implications, it seems useful to me to be able to treat the stream like an IO, even if it requires some implicit trust in the archive being truthful about the decompressed size of its entries

mnaberez · 2016-02-25T17:45:06Z

Are there any risks that outweigh the benefits? From my perspective, a zip with inaccurately reported entry sizes is just invalid, and with invalid data you can expect invalid behavior.

RubyZip is sometimes used in web applications, and the ZIP files being decompressed may come from user uploads. Zip bombs are well known and commonly used as a denial of service attack. The ZIP bomb works by exploiting the possible inconsistency between the size reported in the entry and the actual size the data will decompress to. A very tiny ZIP file may expand to many gigabytes. If the consumer of the decompressed data is holding it in RAM, it may consume all available RAM and swap. If the consumer of the decompressed data is storing it on disk, it may consume all space on the disk. Web developers can't ignore these kinds of ZIP files as simply "invalid" or else they will be vulnerable to these attacks.

hainesr · 2016-02-26T14:03:45Z

Playing Devil's Advocate a bit here, but is it the responsibility of RubyZip to save badly designed Web applications from themselves? Or, to put it another way: is it reasonable to omit a useful, sensible and (maybe) expected feature just because people may abuse it?

I suppose a third way of looking at it might also be: There's no length attribute now, so how would a Web application make decisions on how to unpack things as they stand? At least if there was an advertised length then one could stop reading when one hit that length.

mnaberez · 2016-02-26T16:47:42Z

I suppose a third way of looking at it might also be: There's no length attribute now, so how would a Web application make decisions on how to unpack things as they stand?

The expected size of the decompressed data is already available from ZipEntry#size. The application can call entry.size to get the expected size before calling entry.get_input_stream to get the stream.

This pull request is about adding a length method to the stream itself, and having it return entry.size. I think the reason length was not added to the stream originally is because the real size of the decompressed data can't be known without decompressing. I don't support adding length for that reason and because I think it makes it easier for users to implement a bad practice. The size reported by the entry should not be trusted blindly.

simonoff · 2017-11-08T18:39:52Z

I don't agree a reasons for it. If you need such method you can wrap it in your code.

Add length method to InputStream

da4067c

To make this thing act more realistically like an IO, it needs a length method.

simonoff closed this Nov 8, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add length method to InputStream #277

Add length method to InputStream #277

zacstewart commented Feb 25, 2016

mnaberez commented Feb 25, 2016

zacstewart commented Feb 25, 2016

mnaberez commented Feb 25, 2016

hainesr commented Feb 26, 2016

mnaberez commented Feb 26, 2016

simonoff commented Nov 8, 2017 •

edited

Add length method to InputStream #277

Add length method to InputStream #277

Conversation

zacstewart commented Feb 25, 2016

mnaberez commented Feb 25, 2016

zacstewart commented Feb 25, 2016

mnaberez commented Feb 25, 2016

hainesr commented Feb 26, 2016

mnaberez commented Feb 26, 2016

simonoff commented Nov 8, 2017 • edited

simonoff commented Nov 8, 2017 •

edited