Fix use of a mismatching unicode path extra field in zip unarchiving #167
+143
−14
The zip spec (§4.6.9) says that a unicode path extra field is to be used only when the CRC it stores matches the CRC of the current file name in the header. Using it unconditionally, as introduced in a080e0a, is wrong. Commons Compress already does this correctly internally, so plexus-archiver does not need to do anything.
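The CRC rule can be illustrated with a minimal sketch. This is not the code from these commits or from Commons Compress; the class `UnicodePathCheck`, the method `effectiveName` and its parameters are hypothetical names chosen for illustration, and the sketch assumes the extra field has already been parsed into its stored CRC and its UTF-8 name bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical helper, for illustration only (not part of plexus-archiver).
final class UnicodePathCheck {

    /**
     * Returns the name to use for an entry.
     *
     * @param headerNameBytes  raw file name bytes from the zip header
     * @param headerCharset    charset assumed for decoding the header name
     * @param storedNameCrc    CRC-32 recorded in the unicode path extra field
     * @param unicodeNameBytes UTF-8 name bytes from the unicode path extra field
     */
    static String effectiveName(byte[] headerNameBytes, Charset headerCharset,
                                long storedNameCrc, byte[] unicodeNameBytes) {
        CRC32 crc = new CRC32();
        crc.update(headerNameBytes);
        // The extra field is valid only if it was computed for the current header name.
        if (crc.getValue() == storedNameCrc) {
            return new String(unicodeNameBytes, StandardCharsets.UTF_8);
        }
        // Stale extra field: ignore it and fall back to the header name.
        return new String(headerNameBytes, headerCharset);
    }
}
```

If an entry is renamed in the header without updating the extra field, the stored CRC no longer matches, so the sketch falls back to the header name instead of the stale unicode path.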
These commits add some tests for the misbehavior using a specially crafted zip file (two variants supplied for completeness, although only one is useful for the test, see commit message) as well as a fix.
Originally the bug affected archive extraction, causing files to be extracted at wrong paths. This changed, probably unintentionally, with 2aec2ba (released in 4.2.0); since then only direct use of FileInfo has been affected, e.g. when using file selectors. The tests cover both situations.
For a real-world example where this caused problems, see https://repo1.maven.org/maven2/net/java/dev/jna/jna/5.6.0/jna-5.6.0.jar: It includes such a stale unicode path extra field for the file com/sun/jna/linux-x86-64/libjnidispatch.so, with contents libjnidispatch.so, meaning that plexus-archiver, unlike standards-compliant unarchivers, would extract that file in the root of the archive instead of the proper subfolder. This then caused https://download.eclipse.org/releases/2021-03/202103171000/plugins/com.sun.jna_5.6.0.v20200716-0148.jar (shipped with Eclipse 2021-03, built here) to include it in that incorrect location, rendering it inoperable on Linux x86-64.