Tar entries with unrecognized entry type codes are silently ignored #697

Open
RobLinux opened this issue Dec 9, 2021 · 4 comments
Labels
enhancement: Feature request or other improvements of existing functionality
tar: Related to TAR file format

Comments

RobLinux commented Dec 9, 2021

Steps to reproduce

  1. Open the tar file: TarInputStream ts = new TarInputStream(fs, Encoding.UTF8)
  2. Read the first entry: while(ts.Position != ts.Length && (tarEntry = ts.GetNextEntry()) != null)
  3. Try to get the buffer for the first and only file
  4. The file is attached
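
The steps above can be put together as a minimal sketch. The class name `Issue697Repro` and the method `ListEntries` are made up for illustration, and the stream passed in is assumed to be the tar file from the attachment; only the `TarInputStream` calls come from SharpZipLib.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using ICSharpCode.SharpZipLib.Tar;

// Hypothetical helper for illustration; not part of SharpZipLib.
public static class Issue697Repro
{
    // Returns the names of the entries that TarInputStream reports.
    public static List<string> ListEntries(Stream fs)
    {
        var names = new List<string>();

        // Step 1: open the tar stream with UTF-8 entry names.
        using (var ts = new TarInputStream(fs, Encoding.UTF8))
        {
            TarEntry tarEntry;

            // Step 2: same loop condition as in the report.
            while (ts.Position != ts.Length && (tarEntry = ts.GetNextEntry()) != null)
            {
                // Step 3: copy the entry's data into a buffer.
                using (var buffer = new MemoryStream())
                {
                    ts.CopyEntryContents(buffer);
                    names.Add(tarEntry.Name);
                    Console.WriteLine($"{tarEntry.Name}: {buffer.Length} bytes");
                }
            }
        }

        return names;
    }
}
```

Calling `Issue697Repro.ListEntries(File.OpenRead(path))`, where `path` is a placeholder for the attached tar, should list one name if the entry is read as expected.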

Expected behavior

Get the name and buffer of the first and only file

Actual behavior

The first entry is not read correctly: I cannot get its name, and the tar is immediately read through to the end

Version of SharpZipLib

1.3.3

Obtained from

  • Package installed using NuGet

Media_1 - Copy_2.zip

RobLinux changed the title from "Tar files not being extracting" to "Tar files not being read correctly" on Dec 9, 2021

RobLinux commented Dec 9, 2021

Seems like tar-cs and SharpCompress can read it.


piksel commented Dec 10, 2021

I just looked at the file briefly, and it contains invalid data at the end. That shouldn't matter, but it indicates that the file has been corrupted in some way. To be clear, are you saying that it doesn't find the first entry in the file? Or do you get an exception?

piksel changed the title from "Tar files not being read correctly" to "Tar entries with unrecognized entry type codes are silently ignored" on Dec 11, 2021

piksel commented Dec 11, 2021

All tar entries have a typeflag that specifies what type of entry it is. For regular files, this should be set to 48 (the ASCII value of '0'). But in the supplied sample file, the only entry has a typeflag of 87 (ASCII 'W'), which SharpZipLib has no idea how to handle. In fact, I can find no tar docs that specify what kind of entry 'W' would be. This is the output from GNU tar:

❯ tar -tvf repro/samples/issue-697.tar
?rw-r--r-- mobile/users 104704 1970-01-01 01:00 isinV6rtZPprvhi24LF4maLLZsPYO-gcxbqPiVOc1Jk= unknown file type ‘W’

Several vendor types are defined in the star man page, including 'V' and 'X', but no 'W':
https://www.systutorials.com/docs/linux/man/5-star/
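
To check the typeflag of an entry without going through SharpZipLib, the raw header block can be inspected directly. The sketch below assumes the POSIX ustar layout (512-byte header, typeflag at byte offset 156). The class `TarTypeFlagProbe` and its methods are hypothetical names, and the table only covers the base POSIX values, so pax and GNU extension flags are reported as unrecognized here even though real tar libraries understand them.

```csharp
using System;

// Hypothetical helper for illustration; not part of SharpZipLib.
public static class TarTypeFlagProbe
{
    const int HeaderSize = 512;      // one tar header block
    const int TypeFlagOffset = 156;  // ustar typeflag field (1 byte)

    // Returns the raw typeflag byte of a 512-byte tar header block.
    public static byte ReadTypeFlag(byte[] header)
    {
        if (header == null || header.Length < HeaderSize)
            throw new ArgumentException("A tar header block is 512 bytes.", nameof(header));
        return header[TypeFlagOffset];
    }

    // Maps the base POSIX ustar typeflag values to a description.
    // Anything else (vendor extensions, or 'W' as in this issue) is "unrecognized".
    public static string Describe(byte typeFlag)
    {
        switch (typeFlag)
        {
            case 0:                 // NUL marks a regular file in older archives
            case (byte)'0': return "regular file";
            case (byte)'1': return "hard link";
            case (byte)'2': return "symbolic link";
            case (byte)'3': return "character device";
            case (byte)'4': return "block device";
            case (byte)'5': return "directory";
            case (byte)'6': return "FIFO";
            case (byte)'7': return "contiguous file";
            default: return "unrecognized";
        }
    }
}
```

Reading the first 512 bytes of the file into a buffer and passing it to `ReadTypeFlag` gives the byte to feed into `Describe`; for a typeflag of 87 ('W'), `Describe` returns "unrecognized".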

This is why SharpZipLib just skips the entry. Ideally, we should provide a way to iterate over all entries, treating them as normal files (giving the consumer a chance to handle them). At the very least, there should be some kind of indication that unknown entries are being encountered.
That would be harder to do in TarInputStream, but for TarArchive it should just be a matter of emitting a progress event.
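
As a rough illustration of iterating over all entries regardless of typeflag, here is a standalone sketch that walks the raw tar blocks and hands every entry to the caller as if it were a normal file. This is not SharpZipLib's API or a proposed implementation: `RawTarEntry` and `RawTarWalker` are hypothetical names, and the code assumes plain ustar headers (no GNU long names, pax headers, name prefix field, or base-256 sizes) and entries small enough to buffer in memory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical types for illustration; not part of SharpZipLib.
public sealed class RawTarEntry
{
    public string Name { get; set; }
    public char TypeFlag { get; set; }
    public long Size { get; set; }
    public byte[] Data { get; set; }
}

public static class RawTarWalker
{
    const int BlockSize = 512;

    // Yields every entry in the archive, whatever its typeflag is.
    public static IEnumerable<RawTarEntry> ReadAll(Stream input)
    {
        var header = new byte[BlockSize];
        while (ReadFully(input, header, BlockSize) == BlockSize)
        {
            if (IsAllZero(header)) yield break;  // end-of-archive marker

            long size = ParseOctal(header, 124, 12);  // size field, octal ASCII
            var data = new byte[size];
            if (ReadFully(input, data, data.Length) != data.Length)
                throw new EndOfStreamException("Truncated entry data.");

            // Entry data is padded to a multiple of 512 bytes; skip the padding.
            int padding = (int)((BlockSize - size % BlockSize) % BlockSize);
            ReadFully(input, new byte[padding], padding);

            yield return new RawTarEntry
            {
                Name = ReadString(header, 0, 100),
                TypeFlag = (char)header[156],
                Size = size,
                Data = data
            };
        }
    }

    // Reads until count bytes are read or the stream ends; returns bytes read.
    static int ReadFully(Stream s, byte[] buf, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = s.Read(buf, total, count - total);
            if (n <= 0) break;
            total += n;
        }
        return total;
    }

    static bool IsAllZero(byte[] buf)
    {
        foreach (byte b in buf) if (b != 0) return false;
        return true;
    }

    // Parses an octal ASCII field, stopping at the first non-octal byte.
    static long ParseOctal(byte[] buf, int offset, int length)
    {
        long value = 0;
        int i = offset, end = offset + length;
        while (i < end && buf[i] == (byte)' ') i++;  // skip leading spaces
        for (; i < end && buf[i] >= (byte)'0' && buf[i] <= (byte)'7'; i++)
            value = value * 8 + (buf[i] - '0');
        return value;
    }

    // Decodes a NUL-terminated (or full-width) name field as UTF-8.
    static string ReadString(byte[] buf, int offset, int length)
    {
        int end = Array.IndexOf(buf, (byte)0, offset, length);
        int count = end < 0 ? length : end - offset;
        return Encoding.UTF8.GetString(buf, offset, count);
    }
}
```

A consumer can then decide per entry what to do, for example by checking `entry.TypeFlag` and logging or extracting anything it does not recognize instead of having it silently dropped.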

piksel added the enhancement and tar labels on Dec 11, 2021