Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: csv syntax for git repos #4905

Open
fenollp opened this issue May 7, 2024 · 3 comments
Open

Proposal: csv syntax for git repos #4905

fenollp opened this issue May 7, 2024 · 3 comments

Comments

@fenollp
Copy link
Contributor

fenollp commented May 7, 2024

Today's way of expressing remote Git repos as contexts

https://github.com/user/repo.git
https://github.com/user/repo.git#
https://github.com/user/repo.git#:
https://github.com/user/repo.git#:dir
https://github.com/user/repo.git#branch_or_tag_or_commit
https://github.com/user/repo.git#branch_or_tag_or_commit:dir

does not allow

  • specify both a commit hash and e.g. a branch (as image digests do: $IMAGE_URL:mytag@sha256:cafebabe..)
    • Note importantly: tools such as renovate provide a way to upgrade digests given mytag
  • differentiating between a tag or a branch (and a non-full commit)

To permit this I propose to extend this syntax in a somewhat backwards compatible way:

https://github.com/user/repo.git##tag=mytag,branch=main,commit=cafebab,source=/dir
  • tag and branch cannot be used together
  • commit can be a short commit hash instead of a full one
  • note the leading / for source: to keep with uniformity with other csv args
  • ## usage (instead of just #) may already exist in the wild (but "unlikely" combined with csv)
  • renovate can work with branch or tag (git tags are as mutable as branches) and commit, as with images
@tianon
Copy link
Member

tianon commented May 9, 2024

differentiating between a tag or a branch (and a non-full commit)

FWIW, the simpler answer to this is to use full/explicit refs instead of abbreviated ones (refs/tags/xxx vs refs/heads/xxx or even GitHub's fancy refs/pull/xxx/head refs for test-building pull requests, for example).

(IMO [not a buildkit maintainer], it would be weird to invent buildkit-specific syntax sugar for these refs such that tag= gets translated to refs/tags/ and branch= to refs/heads/)

@tonistiigi
Copy link
Member

tag and branch cannot be used together

As @tianon said this can already be done with setting full ref. Additionally, I think builds from refs like refs/pull/xxx/head are very useful, so if we added a new syntax that capability would need to be kept. Even with CSV I would prefer ref=.. instead of branch=,tag=.

commit can be a short commit hash instead of a full one

This looks like too much sugar to me. You can't do docker pull alpine@sha256:partialhex.

renovate can work with branch or tag (git tags are as mutable as branches) and commit, as with images

Don't exactly get what case you have in mind here.

Downsides:

  • CSV is quite cumbersome to read and write. ## looks weird as well.

Potential opportunities with CSV format:

  • Ref and commit can't be used together atm in a URL. That is important use case for replaying a historic build from a git ref, with a pin to a specific commit (this info is provided by provenance attestation). Another question is what ref=,commit= combination actually means, and I guess it should just ignore/trust the ref in that case and use the commit. Ref might still be visible as current branch in .git. This is similar to how tag is ignored on container image refs if digest is present.
  • Keeping/discarding .git directory could be controlled by a property that is more flexible solution than current BUILDKIT_CONTEXT_KEEP_GIT_DIR build arg.
  • If keeping .git then another property could be used to control the depth of the cloned commit chain.
  • In git: allow sparse checkouts #4646 , subdir component can potentially split into subdir + filter components.

@fenollp
Copy link
Contributor Author

fenollp commented May 9, 2024

Agreeing on all points.
Note about renovate (and the majority of my motivation here): is to have a ref=..,commit=.. pair for git repos as we have $IMG:..@sha256:.. for images. That is, a constraint + a lock. Tooling can use this to auto-upgrade code.


I guess it should just ignore/trust the ref in that case

I understand this is today's behavior WRT image tags. I would argue that some kind of check that ensures that the given digest exists in/among/as the given constraint (at least once, some time in the past). e.g. make sure $commit exists on $branch.

I remember a beforetime when GitHub would be fine showing commits from some fork as commits of the upstream, trusted repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants