Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to unpack archive downloaded when using "http". #150

Open
GrahamDumpleton opened this issue Apr 13, 2022 · 13 comments
Open

Add option to unpack archive downloaded when using "http". #150

GrahamDumpleton opened this issue Apr 13, 2022 · 13 comments
Labels
carvel-accepted This issue should be considered for future work and that the triage process has been completed enhancement This issue is a feature request priority/unprioritized-backlog Higher priority than priority/awaiting-more-evidence but not planned. Contributions are welcome.

Comments

@GrahamDumpleton
Copy link

Describe the problem/challenge you have

When downloading archive file using http I need the archive file to be automatically unpacked.

Describe the solution you'd like

The githubRelease option has unpackArchive for something similar although in that case you need to designate the file to unpack. In the case of http, there is only the one asset file so could get away with a single boolean flag to indicate whether download should be unpacked.

apiVersion: vendir.k14s.io/v1alpha1
kind: Config
directories:
- contents:
  path: src/wrapt
  - http:
      url: https://github.com/GrahamDumpleton/wrapt/archive/refs/heads/develop.tar.gz
      unpackArchive: true
    path: .

Anything else you would like to add:

I am assuming that once unpacked you can still use newRootPath so as to indicate you want the final result to only include files which existed in a subdirectory of the archive.

For example, in the archive above the top level directory it unpacks into is wrapt-develop. I am hoping one can do:

apiVersion: vendir.k14s.io/v1alpha1
kind: Config
directories:
- contents:
  path: src/wrapt
  - http:
      url: https://github.com/GrahamDumpleton/wrapt/archive/refs/heads/develop.tar.gz
      unpackArchive: true
    path: .
  newRootPath: wrapt-develop

so that top level wrapt-develop directory in the archive is eliminated and you get just the files in it.


Vote on this request

This is an invitation to the community to vote on issues, to help us prioritize our backlog. Use the "smiley face" up to the right of this comment to vote.

👍 "I would like to see this addressed as soon as possible"
👎 "There are other more important things to focus on right now"

We are also happy to receive and review Pull Requests if you want to help working on this issue.

@GrahamDumpleton GrahamDumpleton added carvel-triage This issue has not yet been reviewed for validity enhancement This issue is a feature request labels Apr 13, 2022
@GrahamDumpleton
Copy link
Author

GrahamDumpleton commented Apr 13, 2022

Note, it would be required that when unpacking the archive that the option exists to preserve execute bits.

Of note, when unpacking an archive from a githubRelease this doesn't happen, although there is a similar requirement for it as well to allow execute bits to be preserved.

@joe-kimmel-vmw joe-kimmel-vmw added priority/unprioritized-backlog Higher priority than priority/awaiting-more-evidence but not planned. Contributions are welcome. and removed carvel-triage This issue has not yet been reviewed for validity labels Apr 13, 2022
@joe-kimmel-vmw
Copy link
Contributor

Thanks @GrahamDumpleton ! it sounds like preserving the execute mode bits could present some stumbling blocks but we'll leave that to the initial implementation exploration.

@knechtionscoding
Copy link

@GrahamDumpleton Am I missing something with http? I think this is already supported?

    # fetches asset over HTTP (optional)
    http:
      # asset URL (required)
      url: 
      # verification checksum (optional)
      sha256: ""
      # specifies name of a secret with basic auth details;
      # secret may include 'username', 'password' keys (optional)
      secretRef:
        # (required)
        name: my-http-auth
      # skip unpacking tar, tgz, and zip files; by default files are unpacked (optional)
      disableUnpack: false

from: https://carvel.dev/vendir/docs/v0.32.0/vendir-spec/

Specifically:

      # skip unpacking tar, tgz, and zip files; by default files are unpacked (optional)
      disableUnpack: false

@knechtionscoding
Copy link

It works with zip files. I think the problem is it doesn't work with .tar.gz files?

@GrahamDumpleton
Copy link
Author

I suspect unpacking for http didn't exist back then and was since added but this issue was never updated. Or perhaps disableUnpack didn't exist and so wasn't obvious that things were unpacked at all. Either way, it does need to handle *.tar.gz if it doesn't already.

@knechtionscoding
Copy link

Understandable, just wanted to make you aware if you still needed it. I'll look and see if I can figure out why it doesn't support .tar.gz files. And maybe submit a PR for it.

@neil-hickey neil-hickey added the carvel-accepted This issue should be considered for future work and that the triage process has been completed label Feb 22, 2023
@reegnz
Copy link
Contributor

reegnz commented Aug 8, 2023

If it helps anyone, this is where the logic is:

switch header.Typeflag {
case tar.TypeDir:
// TODO should we make empty directories?
continue
case tar.TypeReg:
err = t.writeIntoFile(tarReader, dstPath, header.Name)
if err != nil {
return true, err
}
default:
return false, fmt.Errorf("Unknown file '%s' (%d)", header.Name, header.Typeflag)
}

And I get this error for my 'tar.gz' file:

2023/08/08 17:13:16 Unknown file 'pax_global_header' (103)

That is tar.TypeXGlobalHeader, which I guess is safe to skip.

I guess the problem is that the custom logic looking into the tar.gz file is not aware there are multiple formats, like USTAR, PAX and GNU.

Looking at some golang internal code: https://pkg.go.dev/golang.org/x/build/internal/untar

@reegnz
Copy link
Contributor

reegnz commented Aug 8, 2023

While looking into it I realized that the fix is so trivial that I opened a PR.

@praveenrewar
Copy link
Member

Thank you so much for looking into this @reegnz and creating a PR. Do you always get the same error while unpacking tar.gz files or this happens only in certain cases? Also, do you mind creating a new issue with the details, as @knechtionscoding rightly pointed out, the feature to unpack archive files has been there for a long time and it works by default, so we can close this issue.

@GrahamDumpleton
Copy link
Author

If this issue is going to be closed, then a new issue also needs to be created for the fact that execute permissions are not preserved when archives are unpacked. This was also mentioned above in this issue.

This preservation of execute permissions is also lacking with archives unpacked from GitHub releases. To round things out, there should also be a way to set execute permissions when provided files using inline. Preservation of execute permissions was added for OCI images, but it needs to be done for the other cases as well to be consistent and avoid hacks to work around the limitation. Vendir is used for more than just YAML files these days.

@praveenrewar
Copy link
Member

Thank you for pointing this out @GrahamDumpleton 🙇🏻 it had almost skipped my mind 😅
Would you like to create a separate issue for preserving execute permissions? (Or let me know and I can create one based on the info you have provided)

@reegnz
Copy link
Contributor

reegnz commented Aug 10, 2023

@praveenrewar opened a separate issue for the fix and linked the PR to that one. I'm strictly focused on the extraction issue, the permissions issue I'll let someone else pick up instead.

@praveenrewar
Copy link
Member

Thank you so much @reegnz!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
carvel-accepted This issue should be considered for future work and that the triage process has been completed enhancement This issue is a feature request priority/unprioritized-backlog Higher priority than priority/awaiting-more-evidence but not planned. Contributions are welcome.
Projects
Status: Unprioritized
Development

No branches or pull requests

6 participants