Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tar writes header names differently if the file to add is a child of the current directory. #337

Open
iUnknwn opened this issue Apr 13, 2019 · 4 comments · May be fixed by #392 or #769
Open

Tar writes header names differently if the file to add is a child of the current directory. #337

iUnknwn opened this issue Apr 13, 2019 · 4 comments · May be fixed by #392 or #769
Labels
tar Related to TAR file format

Comments

@iUnknwn
Copy link
Contributor

iUnknwn commented Apr 13, 2019

Steps to reproduce

  1. Create an output tar archive.
  2. Set the root path to something in the same directory as the exe (so if the exe was C:\tools\thingWithTar.exe, set the root to C:\tools\dirToSetAsRoot)
  3. Attempt to add a file with the path C:\tools\dirToSetAsRoot\fileToSend.txt

Because the file name of the file to add has the current directory as the substring, TarEntry.CreateEntryFromFile, when it calls TarEntry.GetTarFileHeader, will set the name to

dirToSetAsRoot\fileToSend.txt

(this occurs on line 384 of TarEntry.cs)

Because the root path is no longer part of the name, it is not compared, and the entry added to the tar archive remains darToSetAsRoot\fileToSend.txt.

However, if the file was not a child of the current directory:

  1. Set root to C:\notCurDir\dirToSetAsRoot
  2. Attempt to add file C:\notCurDir\dirToSetAsRoot\fileToSend.txt

Then, TarEntry.CreateEntryFromFile names the entry: C:\notCurDir\dirToSetAsRoot\fileToSend.txt

Then, when written to the tar archive, the root is compared with the entry from string, which leads to the file being written as:
fileToSend.txt

Expected behavior

Behavior of adding files, relative to root path, should be the same regardless if the file being added is a child of the current working directory.

Actual behavior

The entry name differs if the file to add is a child of the current directory.

Version of SharpZipLib

1.1.0

Obtained from (only keep the relevant lines)

  • Package installed using NuGet
iUnknwn added a commit to iUnknwn/SharpZipLib that referenced this issue Apr 13, 2019
For icsharpcode#337: removes the changes to path based on current working
directory. This prevents issues where files that are below the
current working directly will behave differently than those that
aren't.

For icsharpcode#338: Simply removes the code that would remove leading forward
slashes from paths. This should unbreak absolute POSIX paths, and avoid
the issue where RootPath wouldn't work, though it does allow
unc paths (which were previously blocked) as a potential side effect.

Part of the issue with icsharpcode#338 is any changes to the path, as part of
the TarEntry's GetFileTarHeader have the potential to break RootPath,
so you can't simply check for UNC path names there and modify.
@Numpsy
Copy link
Contributor

Numpsy commented Jun 16, 2019

Would it be better if the GetCurrentDirectory() checks were removed from TarEntry, and TarArchive instead used the current directory as the default root folder?

@piksel piksel added the tar Related to TAR file format label Jun 18, 2019
iUnknwn added a commit to iUnknwn/SharpZipLib that referenced this issue Oct 31, 2019
For icsharpcode#337: removes the changes to path based on current working
directory. This prevents issues where files that are below the
current working directly will behave differently than those that
aren't.

For icsharpcode#338: Simply removes the code that would remove leading forward
slashes from paths. This should unbreak absolute POSIX paths, and avoid
the issue where RootPath wouldn't work, though it does allow
unc paths (which were previously blocked) as a potential side effect.

Part of the issue with icsharpcode#338 is any changes to the path, as part of
the TarEntry's GetFileTarHeader have the potential to break RootPath,
so you can't simply check for UNC path names there and modify.
@iUnknwn iUnknwn linked a pull request Oct 31, 2019 that will close this issue
@iUnknwn
Copy link
Contributor Author

iUnknwn commented Oct 31, 2019

I realized I never actually submitted a PR when I wrote a patch for this earlier.

I removed the GetCurrentDirectly check in TarHeader, which ensures the behavior is consistent regardless of the assembly location. Currently, this means there isn't a default rootPath.

However, if it's preferable, I can adopt the suggestion by @Numpsy and set the default rootPath to the assembly location.

@piksel
Copy link
Member

piksel commented Feb 1, 2020

Yeah, this has to do with relative vs. absolute file paths. When an absolute path is specified, it's of course unexpected for it to be handled as a relative path.
The relative path handling should probably be toggled using property, like FromRelativeFilePath which in turn is only set when the Entry was added using a relative path. This way we can keep the behaviour for those entries that rely on it.

@iUnknwn
Copy link
Contributor Author

iUnknwn commented Feb 4, 2020

I'm happy to modify my pull request, but I want to make sure I'm aligned on the desired behavior. Based on the comment by @Numpsy, would this be acceptable?

  • If a root path is set:
    • Ensure the root path is an absolute path (convert as appropriate)
    • All added items are transformed to absolute paths, and calculated against the root path (this can be done using the URI class, if needed, or by basic substrings).
  • If a root path is not set:
    • Set the root path to the absolute path of the current working directory.
    • Added files are added relative to the root path as described above.

Is that an acceptable policy? That way the behavior is always consistent if root path is set, and current directory is only used for cases where people are not using root path.

Chris-Johnston added a commit to Chris-Johnston/SharpZipLib that referenced this issue Aug 26, 2022
…ing dir

WIP. This change updates the behavior of adding files under the current working dir when the RootPath is set

If the rootpath is unset, the behavior is unchanged.

If the rootpath is a relative directory under the current directory, the current directory + root path will be removed from the start of the entry filepath

If the rootpath is an absolute directory under the current directory, the root path will be removed from the start of the entry filepath
Chris-Johnston added a commit to Chris-Johnston/SharpZipLib that referenced this issue Aug 26, 2022
…spect RootPath

This changes the behavior of TarArchive when files are added under the
CurrentDirectory.

When a TarEntry is created from a file which is under the
CurrentDirectory, the path is made relative to the CurrentDirectory.
This behavior affects how TarArchive removes a prefix from files set by
the RootPath directory.

The expectation for RootPath is that a matching substring will be
removed from the prefix of entries before they are added to the
TarArchive. Because the TarEntry for files under CurrentDirectory are
modified, RootPath will not work if it is an absolute path under
CurrentDirectory.

This change fixes this behavior. If RootPath is a directory under
CurrentDirectory, it will apply to entres added to TarArchive.

An attempt was made to implement this change without additional string
manipulation. This could be done with a modified IndexOf, that enables
searching for a substring of the pattern within the target substring.
This may improve performance.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tar Related to TAR file format
Projects
None yet
3 participants