Fix high memory usage due to Deflater buffering
When support for ZipCrypto was added, an internal StringIO buffer was
added to Deflater in order to fix a decryption bug. While this worked,
it caused unbounded memory growth when compressing large files. The
proper fix is to write the encryption header in init_next_entry instead
of finalize_current_entry, so the header is written before any
encrypted data. With that in place we can remove the buffering in
Deflater, which keeps memory usage low and allows compressed data to be
streamed while it is written.

This should fix issue #233.
felixbuenemann committed Oct 17, 2015
1 parent a3ca219 commit 7a4b8bb
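
For context, this is the kind of streaming write the fix enables; a minimal, hypothetical sketch (the archive name, entry name, and chunk size are illustrative, not from the commit):

```ruby
require 'zip'

# With the internal buffering removed, deflated bytes reach the underlying
# IO as zlib emits them, instead of accumulating per entry in memory.
Zip::OutputStream.open('large.zip') do |zos|
  zos.put_next_entry('big_file.bin')
  File.open('big_file.bin', 'rb') do |f|
    # Stream in bounded chunks; memory stays flat regardless of file size.
    while (chunk = f.read(64 * 1024))
      zos << chunk
    end
  end
end
```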
Showing 2 changed files with 7 additions and 4 deletions.
lib/zip/deflater.rb (6 additions, 3 deletions)
@@ -7,18 +7,21 @@ def initialize(output_stream, level = Zip.default_compression, encrypter = NullE
       @size = 0
       @crc = ::Zlib.crc32
       @encrypter = encrypter
-      @buffer_stream = ::StringIO.new('')
     end

     def <<(data)
       val = data.to_s
       @crc = Zlib.crc32(val, @crc)
       @size += val.bytesize
-      @buffer_stream << @zlib_deflater.deflate(data)
+      buffer = @zlib_deflater.deflate(data)
+      if buffer.empty?
+        @output_stream
+      else
+        @output_stream << @encrypter.encrypt(buffer)
+      end
     end

     def finish
-      @output_stream << @encrypter.encrypt(@buffer_stream.string)
       @output_stream << @encrypter.encrypt(@zlib_deflater.finish) until @zlib_deflater.finished?
     end

lib/zip/output_stream.rb (1 addition, 1 deletion)
@@ -123,7 +123,6 @@ def copy_raw_entry(entry)

     def finalize_current_entry
       return unless @current_entry
-      @output_stream << @encrypter.header(@current_entry.mtime)
       finish
       @current_entry.compressed_size = @output_stream.tell - @current_entry.local_header_offset - @current_entry.calculate_local_header_size
       @current_entry.size = @compressor.size
@@ -139,6 +138,7 @@ def init_next_entry(entry, level = Zip.default_compression)
       @entry_set << entry
       entry.write_local_entry(@output_stream)
       @encrypter.reset!
+      @output_stream << @encrypter.header(entry.mtime)
       @compressor = get_compressor(entry, level)
     end


6 comments on commit 7a4b8bb

@jjb commented on 7a4b8bb, Oct 28, 2015

Thanks for this fix!

  1. What do you mean by "large files"? Would it also cause memory growth when compressing any files over time?
  2. Seems like a change like this should come with a test? (I'm not a maintainer, just whining from the sidelines...)

@jjb commented on 7a4b8bb, Oct 28, 2015

Whoops, I just saw the linked ticket... ignore question 1.

@felixbuenemann (Contributor, Author) commented

  1. The previous code buffered the complete deflated bytes for a single archive entry in an internal StringIO, which means nothing was flushed to the zip output stream until the entry was finished.
  2. It didn't change the public API, which is why I didn't touch any existing tests. Adding a test to assert that zlib is flushing to the output stream is a bit tricky, because it's up to the deflate implementation in zlib to decide when it's a good idea to flush (see the sketch below). There are some tunables that can be used to force zlib to flush more predictably, but they are not exposed by the current rubyzip API.
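
That flush behavior can be observed directly with Ruby's Zlib bindings; a minimal sketch (chunk contents and sizes are illustrative), using the same raw-deflate window bits that Deflater uses:

```ruby
require 'zlib'

# zlib decides internally when to emit compressed bytes, so deflate may
# return an empty string while input is still held in zlib's own buffer.
# This is why the new Deflater#<< skips writing empty chunks.
deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)

10.times do |i|
  out = deflater.deflate('a' * 1024)
  puts "chunk #{i}: 1024 bytes in, #{out.bytesize} bytes out"
end

# finish drains whatever zlib still holds.
puts "finish: #{deflater.finish.bytesize} bytes"
```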

@jjb commented on 7a4b8bb, Oct 28, 2015

true!

Thanks again for the fix! I was just being vaguely paranoid about untested code going in, since this was already something that has potentially caused me problems for months or years.

@felixbuenemann (Contributor, Author) commented

I did test it manually, both with and without encryption, with about a gigabyte of generated data (roughly the pattern sketched below), and with a real use case while working on a streaming xlsx writer. Still, an automated test would be preferable.
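
A manual check along those lines might look like the following; a hypothetical sketch (the archive name, entry name, and data volume are illustrative), run while watching the process's resident memory:

```ruby
require 'zip'

# Writes ~1 GiB of generated data into a single entry. Before the fix, the
# entire deflated entry accumulated in a StringIO; after it, memory stays
# roughly flat for the whole write.
Zip::OutputStream.open('memtest.zip') do |zos|
  zos.put_next_entry('generated.bin')
  chunk = Random.new.bytes(1024 * 1024) # 1 MiB of incompressible data
  1024.times { zos << chunk }
end
```

The encrypted case exercises the same path with a Zip::TraditionalEncrypter supplied to the output stream.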

@jjb commented on 7a4b8bb, Oct 29, 2015

🆒
