
Add ability to limit output size for zip unarchiver #115

Closed · wants to merge 3 commits

Conversation

@satamas (Contributor) commented Jul 15, 2019:

This is needed as a protection against zip bombs.
(Recently found bomb example).

@@ -35,6 +27,11 @@
import org.codehaus.plexus.util.IOUtil;
import org.codehaus.plexus.util.StringUtils;

import java.io.*;
Member:

We usually do not use star imports (and please avoid changing the order of imports).

@plamentotev (Member):
Hi,

Thanks for contributing. The zip bomb method you shared is quite clever and interesting. I just wonder if Apache Commons Compress is a more suitable place to address this issue? It has knowledge of the ZIP file internals, so it is probably better equipped to deal with zip bombs (and other archive bombs, for that matter). And solving it there would protect a wider audience (Plexus Archiver included).

@satamas (Contributor, Author) commented Jul 15, 2019:

Protection against a concrete way of creating a zip bomb is certainly a task for Apache Commons Compress, but the only way to protect against all kinds of such bombs is to limit the produced output size.
The same idea was mentioned by the author of the article about the new zip bomb:

Detecting the specific class of zip bomb we have developed in this article is easy: just look for overlapping files. Mark Adler has written a patch for Info-ZIP UnZip that does just that. In general, though, rejecting overlapping files does not protect against all classes of zip bomb. It is difficult to predict in advance whether a zip file is a bomb or not, unless you have precise knowledge of the internals of the parsers that will be used to parse it. Peeking into the headers and summing the "uncompressed size" fields of all files does not work, in general, because the value stored in the headers may not agree with the actual uncompressed size. (See the "permits too-short file size" row in the compatibility table.) Robust protection against zip bombs involves placing time, memory, and disk space limits on the zip parser while it operates. Handle zip parsing, as any complex operation on untrusted data, with caution.

And the zip stream from Apache Commons implements InputStreamStatistics for the same purpose. I didn't use it, in order to make the method reusable by other unarchivers in the future.
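The approach described above — counting actual bytes written rather than trusting the sizes declared in archive headers — can be sketched as a stream wrapper. This is a minimal illustration, not Plexus Archiver's actual implementation; the class name `SizeLimitedInputStream` is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch: fail once more than maxBytes have been read,
// regardless of what the archive headers claim (a zip bomb can lie).
class SizeLimitedInputStream extends FilterInputStream {
    private long remaining;

    SizeLimitedInputStream(InputStream in, long maxBytes) {
        super(in);
        this.remaining = maxBytes;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1 && --remaining < 0) {
            throw new IOException("Maximum output size exceeded");
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            remaining -= n;
            if (remaining < 0) {
                throw new IOException("Maximum output size exceeded");
            }
        }
        return n;
    }
}

public class Demo {
    public static void main(String[] args) throws IOException {
        // 16 bytes of "decompressed" data, but an 8-byte output limit.
        InputStream in = new SizeLimitedInputStream(
                new ByteArrayInputStream(new byte[16]), 8);
        byte[] buf = new byte[4];
        in.read(buf); // 4 bytes read, within the limit
        in.read(buf); // 8 bytes total, still within the limit
        boolean limited = false;
        try {
            in.read(buf); // would exceed the 8-byte limit
        } catch (IOException e) {
            limited = true;
        }
        System.out.println(limited); // prints "true"
    }
}
```

Because the check runs on the bytes actually produced, it works the same for any archive format, which matches the stated goal of keeping the method reusable by other unarchivers.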

@@ -44,6 +43,9 @@

private String encoding = "UTF8";

@Nullable
private Long maxOutputSize;
Member:
Would it make sense to use long with -1 meaning unlimited in order to avoid NPEs?

Member:
Would it make sense to have it for other archive formats as well (such as tar)?

@plamentotev (Member) left a comment:

In general it looks good. I have a couple of comments. But I think it would really help if you did the refactoring (using try-with-resources, inlining extractFileIfIncluded, etc.) in a separate pull request. I'll merge it and you can rebase your changes on top of it. That would make this pull request a lot easier to read.

Also, there are some places where spaces are missing around braces. Not a biggie; I can fix them myself before merging.

org.apache.commons.compress.archivers.zip.ZipFile zf = null;
InputStream in = null;
try
try(ZipFile zipFile = new ZipFile( getSourceFile(), encoding, true ))
Member:

Would you please keep the diff minimal? I think this change makes the diff difficult to read. Using try-with-resources is better, but it would be best if this refactoring were made in a separate merge request, to make the commit easier to understand.

final Enumeration e = zf.getEntriesInPhysicalOrder();
getLogger().debug( "Expanding: " + getSourceFile() + " into " + getDestDirectory() );
Long remainingSpace = maxOutputSize;
final Enumeration e = zipFile.getEntriesInPhysicalOrder();
Member:

The same here. Initially I thought the order in which the entries are iterated had changed :) Refactoring unrelated to the change (like renaming variables, changing the order of lines, etc.) is best left to a separate merge request.


in.close();
in = null;
try (InputStream in = zipFile.getInputStream(ze)) {
Member:

Same here.

protected void extractFile( final File srcF, final File dir, final InputStream compressedInputStream,
String entryName, final Date entryDate, final boolean isDirectory,
final Integer mode, String symlinkDestination, final FileMapper[] fileMappers )
protected File extractFile(final File srcF, final File dir, final InputStream compressedInputStream,
Member:

Please follow the coding style. There are spaces around the braces.

Please correct me if I'm wrong, but I think that would break existing code (if somebody has overridden the protected method). Would it make sense to keep this method void? If you don't have a limit set, then you don't care about the file anyway.
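One way to address the compatibility concern raised here is to keep the existing void signature and delegate to a new method that returns the target. A minimal sketch with simplified, hypothetical signatures (`extractFileReturningTarget` is not a real Plexus Archiver method):

```java
import java.io.File;

// Hypothetical sketch: subclasses overriding the old void method keep
// compiling, while new callers can observe the extracted target file.
class Unarchiver {
    // Existing API, signature unchanged for backward compatibility.
    protected void extractFile(File dir, String entryName) {
        extractFileReturningTarget(dir, entryName);
    }

    // New API: returns the target so callers can track output size.
    protected File extractFileReturningTarget(File dir, String entryName) {
        File target = new File(dir, entryName);
        // ... actual extraction elided ...
        return target;
    }
}

public class Demo {
    public static void main(String[] args) {
        Unarchiver u = new Unarchiver();
        File f = u.extractFileReturningTarget(new File("out"), "a.txt");
        System.out.println(f.getName()); // prints "a.txt"
    }
}
```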

}

protected File extractFile(final File srcF, final File dir, final InputStream compressedInputStream,
Long remainingSpace, String entryName, final Date entryDate, final boolean isDirectory,
Member:

I think sizeLimit or something along those lines makes more sense, as that is what this variable is used for. It is the remaining size from the caller's point of view; for the method itself, it is the size limit it should impose on the output stream.

Member:

Also, would it make sense to give this method a different name (as the other extractFile does not return a File) to avoid confusion? I'm not saying it is confusing; I'm just wondering.


finally
}

private String resolveSymlink( ZipFile zf, ZipArchiveEntry ze )
Member:

Is resolveSymlink changed?

@@ -345,7 +354,7 @@ protected void extractFile( final File srcF, final File dir, final InputStream c
{
if ( !isOverwrite() && f.exists() && ( f.lastModified() >= entryDate.getTime() ) )
{
return;
return f;
Member:

The file is not actually extracted in that case, so should the file still be returned?

resolveSymlink(zipFile, ze), getFileMappers());
if (file != null)
{
remainingSpace -= file.length();
Member:

Is file.length() a reliable way to get the file size? What I mean is (quoting the File class API docs):

Where it is required to distinguish an I/O exception from the case that 0L is returned, or where several attributes of the same file are required at the same time, then the java.nio.file.Files#readAttributes method may be used.
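The alternative the quoted docs point to can be sketched as follows: unlike File.length(), which returns 0L both for an empty file and on failure, Files.readAttributes throws an IOException on error, so the two cases are distinguishable.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Sketch: read the file size via BasicFileAttributes, which fails
// loudly (IOException) instead of silently returning 0L.
public class Demo {
    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("size-demo", ".bin");
        Files.write(p, new byte[1234]);
        long size = Files.readAttributes(p, BasicFileAttributes.class).size();
        System.out.println(size); // prints "1234"
        Files.delete(p);
    }
}
```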

@plamentotev (Member):

I've asked because Apache Commons Compress already uses BoundedInputStream to limit the ZIP entry size, but the limit is set to the ZIP entry's declared size. I just wonder if it makes sense to be able to set a global limit there. That way not only Plexus Archiver users would benefit from your change. Anyway, doing it here also makes sense.

@plamentotev plamentotev added this to the 4.1.1 milestone Jul 27, 2019
@plamentotev (Member):

Hi @satamas, I've merged your try-with-resources changes into master. Just wanted to ping you to check if there is anything else I can do to help move this merge request forward.

@satamas (Contributor, Author) commented Aug 12, 2019:

I'm on a long vacation now. I've asked @serejke to take care of this PR.

@serejke commented Aug 13, 2019:

I've reworked this pull request quite completely, so to simplify reviewing I opened a new pull request, #117.
This pull request will be dismissed. Please refer to #117.

@plamentotev (Member):

Closing because it was implemented in #117.

@plamentotev plamentotev removed this from the 4.2.0 milestone Aug 23, 2019