Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking of upstream bug in tar #177

Closed
NobodyXu opened this issue Jun 11, 2022 · 17 comments
Closed

Tracking of upstream bug in tar #177

NobodyXu opened this issue Jun 11, 2022 · 17 comments
Labels
Blocked: upstream Fix or feature is needed to be implemented upstream (in a dependency) Report: bug Something isn't working

Comments

@NobodyXu
Copy link
Member

alexcrichton/tar-rs#295

tar failed to handle GNUSparseFile correctly.

@passcod passcod added Report: bug Something isn't working Blocked: upstream Fix or feature is needed to be implemented upstream (in a dependency) labels Jun 11, 2022
@NobodyXu
Copy link
Member Author

It seems that it would take quite some time for the upstream to fix it.

Since this bug is originally found when downloading cbindgen from cargo-quickinstall, @passcod shall we ask devs of cargo-quickinstall to stop using the gnu sparse file extension for it is not portable?

@passcod
Copy link
Member

passcod commented Jun 13, 2022

We can try, but is this something particular to qi or could we get that for someone else?

More out there: Are there other tar crates we could evaluate? Or even, how hard would it be to fix ourselves upstream?

@NobodyXu
Copy link
Member Author

We can try, but is this something particular to qi or could we get that for someone else?

cargo-binstall would not be able to extract any tarball that uses the gnu sparse file extension.

More out there: Are there other tar crates we could evaluate?

Not sure about that as the feature requires --sparse(-S) flag to be passed to tar to enable and most binaries are unlikely to have large enough holes to be a sparse file.

Besides, this problem is limited to the fetchers only.

fetch_crate_cratesio would never have this problem, since most crates are published using cargo-publish and cargo also uses tar.

Or even, how hard would it be to fix ourselves upstream?

tar seems to have some degree of gnu sparse file extension support:

  • It can recognize a gnu sparse file
  • It can parse the gnu sparse file header

But to extract a sparse file, it still need post-processing logic in xsparse and it also need to fix the filename, e.g. extracting the cbindgen tarball from qi using tar gives GNUSparseFile.0/cbindgen, which the file recognizes as a data file, not elf/executable.

@NobodyXu
Copy link
Member Author

NobodyXu commented Jun 13, 2022

Another workaround that we could add is fallback to cargo-install when we detect the presence of GNUSparseFile.*, but that one is really a hack and I'm not sure it will work reliably.

@passcod
Copy link
Member

passcod commented Jun 13, 2022

It can recognize a gnu sparse file

Can we detect that in code? The main issue I see is that we can do it for when we have a filter (entry.header().as_gnu().sparse) but not for the simple unpack. But overcoming that, it would be a reliable to detect the situation and initiate some kind of fallback.

Beyond failing the entire fetch and falling back to source, we could also try to fallback to using the tar cli tool if present. Then we could have the error say "install tar" instead of / in addition to "tell upstream to not do sparse packing".

@passcod
Copy link
Member

passcod commented Jun 13, 2022

Also it seems that quick-install is only using normal tar -czf, not the --sparse option? So I'm unsure what caused this package, unless the tar they're using has it on by default or turns it on based on some heuristic.

@NobodyXu
Copy link
Member Author

Can we detect that in code? The main issue I see is that we can do it for when we have a filter (entry.header().as_gnu().sparse) but not for the simple unpack. But overcoming that, it would be a reliable to detect the situation and initiate some kind of fallback.

That's doable with #180 , since it provides download_tar_based_and_visit, which enables you to pass a TarEntriesVisitor, but then we have to extract each file by hand.

@NobodyXu
Copy link
Member Author

Also it seems that quick-install is only using normal tar -czf, not the --sparse option? So I'm unsure what caused this package, unless the tar they're using has it on by default or turns it on based on some heuristic.

I am genuinely not sure what is happening then.

I tried with both the MacOS 12 finder and bsdtar 3.5.1 - libarchive 3.5.1 zlib/1.2.11 liblzma/5.0.5 bz2lib/1.0.8 and they are ok.

Only tar gives me the error and it is reproducible.

And I haven't yet noticed another tarball that gives me the same error.

@NobodyXu
Copy link
Member Author

NobodyXu commented Aug 19, 2022

Uhhh that just happened again cargo-bins/cargo-quickinstall#87 , this was discovered in #301 .

@alsuren
Copy link

alsuren commented Aug 19, 2022

Hi there. I'm as surprised as everyone else. This certainly wasn't on purpose.

I suspect that fixing it in the quickinstall archives is going to be a bit involved (and would result in holes in the package repo rather than packages that binstall would understand), but I had a crazy idea of how to fix it on the binstall side (if fixing it in the tar crate is too involved):

The tarballs are typically tiny, with a very small number of files, so you could just shell out to tar -xOzf - $bin_name multiple times (see https://www.gnu.org/software/tar/manual/html_node/Writing-to-Standard-Output.html), passing the tarball into stdin and reading the extracted file from stdout.

It's a hack, but it feels like the kind of hack that can be cleanly encapsulated behind a nice API, and it should be portable to all platforms that cargo-quickinstall supports.

@NobodyXu
Copy link
Member Author

The tarballs are typically tiny, with a very small number of files, so you could just shell out to tar -xOzf - $bin_name multiple times (see https://www.gnu.org/software/tar/manual/html_node/Writing-to-Standard-Output.html), passing the tarball into stdin and reading the extracted file from stdout.

IMHO I don't like this solution, because it adds runtime external dependency to cargo-binstall, which we are striving to avoid.

I would prefer upstream to fix this issue, and it looks like there's already a PR for this alexcrichton/tar-rs#298

@NobodyXu
Copy link
Member Author

NobodyXu commented Sep 9, 2022

@passcod It seems that the upstream is especially slow when it comes to PR review.
Perhaps we can maintain a fork that contains the fix alexcrichton/tar-rs#298 as a temporary solution.

@alsuren
Copy link

alsuren commented Sep 9, 2022

@NobodyXu have you verified that the upstream PR actually fixes the issue?

@NobodyXu
Copy link
Member Author

NobodyXu commented Sep 9, 2022

@NobodyXu have you verified that the upstream PR actually fixes the issue?

Not yet, I would do a test right now.

@NobodyXu
Copy link
Member Author

NobodyXu commented Sep 9, 2022

@alsuren I have tested it here and it indeed fixed the bug.

@NobodyXu
Copy link
Member Author

I've created a fork https://github.com/cargo-bins/tar-rs and would create a PR for this.

@passcod
Copy link
Member

passcod commented Sep 10, 2022

We can close this now that we've got it forked, I think. We'll want to change back once the upstream merges, but it's no longer a bug here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Blocked: upstream Fix or feature is needed to be implemented upstream (in a dependency) Report: bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants