Possible regression from 1.2.0 to 1.2.1 when ZIP entry's filename contains backslash #324
Comments
It would seem that creating a zip in Windows via |
This is hitting us as well, definitely a regression between 1.2.0 and 1.2.1. You can see the commit that caused it here: #308 I'm not an expert in character encodings, but it seems like the issue it was trying to address may only be an issue in older ruby versions? I have both ruby 2.2 and 2.4 installed locally (on macOS), and in both cases @dogatana can you comment at all? |
Hi, guys. The following irb session log explains the issue caused by tr. Zip file doesn't have any encoding information about the file names in it. Rubyzip provides us the way to handle file names in zip file, right? Regards,
@sofie-c and I were recently bit by this issue and were hoping to make an attempt at a fix. @dogatana, above you mentioned:
Can you clarify what you meant here? Are you suggesting we apply the Either way, I think we need some failing tests to demonstrate both the issue #308 was meant to solve and this issue. Perhaps that's where I'll start. If anyone has solution ideas, please share; we're new to this codebase. |
I modified the rubyzip test suite so that when it generates test zip files, it includes a file named The first failure I'm examining is against
I'm trying to decide how to interpret this. Is this failure telling me that the application code is wrong, or that my test assertion is wrong? |
There are tests with Unicode filenames that work though, aren't there? Do you have the Unicode option on when doing this? |
@bbuchalter any joy? One thing I have learned about character encoding over the years is that sometimes you just need to know what character encoding you have, and there's no way to guess/infer it. |
Maybe we could reinstate the Line 371 in 2f80da6
Having played around with just about every combination of characters and encodings that I can think of, I think the solution to this is to reinstate the
If you have non-ASCII characters in your filenames I think the solution is that you need to set
@bbuchalter you are seeing
Some working from irb:
> '表'.tr('\\', '/')
=> "表"
> '表\表'.tr('\\', '/')
=> "表/表"
> '表\表'.encode('windows-31j')
=> "\x{955C}\\\x{955C}"
> '表\表'.encode('windows-31j').tr('\\', '/')
=> "\x{955C}/\x{955C}"
> '表\表'.encode('windows-31j').tr('\\', '/').encode('utf-8')
=> "表/表" |
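The irb session above can be extended into a small sketch of why the order of operations matters. It assumes the raw name arrives as untagged bytes and that the caller knows it is Windows-31J; `normalize_separators` is a hypothetical helper for illustration, not part of rubyzip.

```ruby
# Hypothetical helper (not rubyzip API): tag the raw bytes with the encoding
# the caller knows they use, then translate backslashes character-wise.
def normalize_separators(raw_name, encoding)
  name = raw_name.dup.force_encoding(encoding)
  name.tr('\\', '/')
end

# "\u8868" is 表, which Windows-31J encodes as the bytes 0x95 0x5C.
# The second byte is the same value as an ASCII backslash.
raw = "\u8868\\\u8868".encode('Windows-31J').b # untagged bytes, as read from an archive

# Byte-wise tr on the untagged string also rewrites the trailing byte of 表.
broken = raw.tr('\\', '/')

# Setting the encoding first keeps the multibyte character intact.
fixed = normalize_separators(raw, 'Windows-31J').encode('UTF-8')
```

Here `broken` is no longer valid Windows-31J, while `fixed` contains the two original characters separated by a forward slash.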
But only do it after we have set filename encoding appropriately to avoid breaking multibyte characters with `\`s in them. Fixes rubyzip#324.
Another thought on this: the zip spec says nothing about handling encodings other than IBM Code Page 437 and UTF-8. See APPENDIX D in the spec. Obviously people are using other encodings out there in the wild, which is presumably why we have |
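As a rough illustration of the two encodings the spec does cover, the sketch below picks between them using bit 11 of the general purpose bit flag, which the spec uses to mark UTF-8 names. `EFS_FLAG` and `decode_entry_name` are names invented for this example and are not rubyzip API; names in any other encoding would still need the caller to say what they are.

```ruby
# Bit 11 of the general purpose bit flag marks the name as UTF-8.
EFS_FLAG = 1 << 11

# Hypothetical helper: interpret raw name bytes as UTF-8 when the flag is
# set, otherwise as IBM Code Page 437, and return a UTF-8 string.
def decode_entry_name(raw_bytes, gp_flags)
  source = (gp_flags & EFS_FLAG).zero? ? Encoding::IBM437 : Encoding::UTF_8
  raw_bytes.dup.force_encoding(source).encode(Encoding::UTF_8)
end

cp437_name = decode_entry_name("caf\x82".b, 0)          # 0x82 is é in CP437
utf8_name  = decode_entry_name("caf\u00e9".b, EFS_FLAG) # already UTF-8 bytes
```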
I experienced an issue with rubyzip last fall where a customer-supplied ZIP had backslashes instead of forward slashes for all the directory separators; I resolved it at the time by updating from rubyzip 0.9.9 to 1.2.0. It translated the backslashes to forward slashes, i.e. directory separators on OS X and Linux. (0.9.9 behaved "correctly" on OS X but not on Linux.)
The recent update to 1.2.1 has changed behavior on Linux such that the backslash is considered part of the filename again. OS X continues to exhibit what I would consider to be the desired behavior, treating backslashes as directory separators. See below.
Is this change in behavior from 1.2.0 to 1.2.1 intentional? Would you consider it a bug that the behavior differs between Linux and OS X?
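To make the difference concrete, here is a minimal sketch in plain Ruby (no rubyzip calls) of how an entry name breaks into path components with and without the backslash translation. The entry name is made up for illustration.

```ruby
# A made-up entry name as a Windows tool might store it.
entry_name = 'docs\\readme.txt'

# Without translation the whole string is one component, so the backslash
# ends up inside the filename.
untranslated_parts = entry_name.split('/')

# With translation the name splits into a directory and a file.
translated_parts = entry_name.tr('\\', '/').split('/')
```

`untranslated_parts` holds a single element containing the backslash, whereas `translated_parts` holds `docs` and `readme.txt` as separate components.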