Merge pull request #340 from alsor/force-entry-names-encoding-option
add option to force entry names encoding
simonoff committed Nov 8, 2017
2 parents fc83680 + deb6616 commit 1039b28
Showing 4 changed files with 35 additions and 1 deletion.
8 changes: 8 additions & 0 deletions README.md
@@ -254,6 +254,14 @@ Zip.default_compression = Zlib::DEFAULT_COMPRESSION
```
It defaults to `Zlib::DEFAULT_COMPRESSION`. Possible values are `Zlib::BEST_COMPRESSION`, `Zlib::DEFAULT_COMPRESSION` and `Zlib::NO_COMPRESSION`

File names inside a zip archive sometimes contain non-ASCII characters. If you know (or can safely assume) which encoding was used for those names and want to look such entries up with `find_entry`, you can force that encoding like so:

```ruby
Zip.force_entry_names_encoding = 'UTF-8'
```

Allowed encoding names are the same as those accepted by `String#force_encoding`.
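
As a rough illustration of why forcing the encoding matters for lookups, here is a plain-Ruby sketch (not rubyzip code; the file name and variable names are made up for the example). It assumes the name arrives as raw bytes tagged as binary, which is how `IO#read` with a length returns data. Ruby does not consider a binary string with non-ASCII bytes equal to a UTF-8 string holding the same bytes, and `String#force_encoding` only relabels the bytes without converting them.

```ruby
# Hypothetical entry name; \u escapes keep the literal UTF-8 regardless of source encoding.
utf8_name = "r\u00E9sum\u00E9.txt"

# Stand-in for a name read from an archive as raw bytes: same bytes, tagged ASCII-8BIT.
raw_name = utf8_name.b

# Same bytes, different encodings, non-ASCII content: the strings do not compare equal.
equal_before = (raw_name == utf8_name)

# force_encoding relabels the bytes in place; no transcoding happens.
relabelled = raw_name.dup.force_encoding('UTF-8')
equal_after = (relabelled == utf8_name)

# A wrong guess is not rejected; it simply yields a string with an invalid encoding.
latin1_bytes = "r\xE9sum\xE9.txt".b
wrong_guess = latin1_bytes.dup.force_encoding('UTF-8')

puts "before: #{equal_before}, after: #{equal_after}, " \
     "wrong guess valid: #{wrong_guess.valid_encoding?}"
```

Because nothing is converted or validated, it may be worth checking `valid_encoding?` on entry names when the archive's real encoding is uncertain.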

You can set multiple settings at the same time by using a block:

```ruby
10 changes: 9 additions & 1 deletion lib/zip.rb
@@ -34,7 +34,15 @@

 module Zip
   extend self
-  attr_accessor :unicode_names, :on_exists_proc, :continue_on_exists_proc, :sort_entries, :default_compression, :write_zip64_support, :warn_invalid_date, :case_insensitive_match
+  attr_accessor :unicode_names,
+                :on_exists_proc,
+                :continue_on_exists_proc,
+                :sort_entries,
+                :default_compression,
+                :write_zip64_support,
+                :warn_invalid_date,
+                :case_insensitive_match,
+                :force_entry_names_encoding

   def reset!
     @_ran_once = false
6 changes: 6 additions & 0 deletions lib/zip/entry.rb
@@ -240,6 +240,9 @@ def read_local_entry(io) #:nodoc:all
      extra = io.read(@extra_length)

      @name.tr!('\\', '/')
      if ::Zip.force_entry_names_encoding
        @name.force_encoding(::Zip.force_entry_names_encoding)
      end

      if extra && extra.bytesize != @extra_length
        raise ::Zip::Error, 'Truncated local zip entry header'
@@ -364,6 +367,9 @@ def read_c_dir_entry(io) #:nodoc:all
      check_c_dir_entry_signature
      set_time(@last_mod_date, @last_mod_time)
      @name = io.read(@name_length)
      if ::Zip.force_entry_names_encoding
        @name.force_encoding(::Zip.force_entry_names_encoding)
      end
      read_c_dir_extra_field(io)
      @comment = io.read(@comment_length)
      check_c_dir_entry_comment_size
12 changes: 12 additions & 0 deletions test/unicode_file_names_and_comments_test.rb
@@ -33,6 +33,18 @@ def test_unicode_file_name
        assert(filepath == entry_name)
      end
    end

    ::Zip.force_entry_names_encoding = 'UTF-8'
    ::Zip::File.open(FILENAME) do |zip|
      file_entrys.each do |filename|
        refute_nil(zip.find_entry(filename))
      end
      directory_entrys.each do |filepath|
        refute_nil(zip.find_entry(filepath))
      end
    end
    ::Zip.force_entry_names_encoding = nil

    ::File.unlink(FILENAME)
  end


0 comments on commit 1039b28
