-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.
Already on GitHub? Sign in to your account
contenthash: implement proper Linux symlink semantics #4896
Commits on Jun 6, 2024
-
contenthash: switch recursive rootPath implementation to be iterative
This is based on the github.com/cyphar/filepath-securejoin implementation. Since we need to resolve all components when doing lookups, doing it recursively serves little purpose (other than complicating the implementation). There was also a bug in how symlinks containing "/../" in the target name were handled, though this is a more specific issue than the general issues with ".." that the rest of the contenthash code has. Ref: 9bf5431 ("contenthash: fix issues on symlinks on parent paths") Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Configuration menu - View commit details
-
Copy full SHA for 531f71a - Browse repository at this point
Copy the full SHA 531f71aView commit details -
contenthash: implement proper Linux symlink semantics for getFollowLinks
This patch is part of a series which fixes the symlink resolution semantics within BuildKit. You cannot implement symlink resolution on Linux naively using path.Join. A correct implementation requires tracking the current path and applying each new component individually. This implementation is loosely based on github.com/cyphar/filepath-securejoin. Things to note: * The previous implementation of getFollowLinks actually only resolved symlinks in parent components of the path, leading to some callers to have to implement resolution manually (and incorrectly) by calling getFollowLinks several times. In addition to being incorrect and somewhat difficult to follow, it also lead to the ELOOP limit being much higher than 255 (while some callers used getFollowLinksWalk, most used getFollowLinks which reset the limit for each iteration). So, add getFollowParentLinks to allow for callers to decide which behaviour they need. getFollowLinks now follows all links (correctly). * The trailing-slash-is-significant behaviour in the cache (dir vs dir header) needs to be handled specially because on Linux there is no distinction between "a/" and "a" (assuming a is not a symlink, that is) and so filepath-securejoin's implementation didn't care about trailing slashes. The previous implementation hid the trailing path behaviour purely in the splitKey() implementation, making the need for this quite subtle. * The previous implementation was recursive, which in theory would allow you to find some paths slightly more quickly (if you find a valid ancestor you don't need to check above it) at the cost of making some lookups more expensive (a path with an invalid ancestor very early on in the path). However, implementing the correct lookup algorithm recursively proved to be quite difficult. It is possible to implement a similar optimisation (try to find the first non-symlink parent component and iterate from there), this complicates the implementation a fair amount and it doesn't seem clear that the performance tradeoff is a benefit in general. Ultimately, cache lookups are quite fast and so there probably isn't a practical performance difference between approaches. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Configuration menu - View commit details
-
Copy full SHA for 2659945 - Browse repository at this point
Copy the full SHA 2659945View commit details -
contenthash: implement proper Linux symlink semantics for needsScan
This patch is part of a series which fixes the symlink resolution semantics within BuildKit. Since we now have a working implementation of symlink resolution in getFollowLinks, we can add a callback after each component is resolved so that we can track which components were and were not found during the resolution. Because getFollowLinks resolves the components in-order, we can use this to tell needsScan whether the resolver found a real (non-symlink) ancestor of the requested path in the cache. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Configuration menu - View commit details
-
Copy full SHA for f724d6f - Browse repository at this point
Copy the full SHA f724d6fView commit details -
contenthash: unify "follow" and trailing-symlink handling for Checksum
This patch is part of a series which fixes the symlink resolution semantics within BuildKit. Previously, the concept of the follow flag had different meanings in various parts of the checksum codepath. FollowLinks is effectively O_NOFOLLOW, but the implementation in getFollowLinks was actually more like RESOLVE_NO_SYMLINKS. This was masked by the fact that checksumFollow would implement the O_NOFOLLOW behaviour (incorrectly), but checksumFollow would call checksumNoFollow (which would follow symlinks in path components by setting follow=true for getFollowLinks). It is much easier to simply remove these layers of indirection and unify the meaning of FollowLinks across all of the code. This means that the old follow flag is no longer needed. This also means that we can now remove the incorrect symlink resolution logic in (*cacheContext).checksumFollow() and move the followTrailing logic to (*cacheContext).checksum(), as well as removing getFollowParentLinks(). Since this removes some redundant re-checksum loops, we need to add followTrailing logic to scanPath() so that final symlink components result in the correct directory being scanned properly. The only user of (*cacheContext).checksum(follow=false) was (*cacheContext).includedPaths() which appeared to be simply using this as an optimisation (since the path being walked already had its parent path resolved). Having two easily-confused boolean flags for an optimisation that is probably not necessary (getFollowLinks already does a fast check to see if the original path is in the cache) seemed unnecessary, so just keep followTrailing. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Configuration menu - View commit details
-
Copy full SHA for e5171cd - Browse repository at this point
Copy the full SHA e5171cdView commit details -
contenthash: add tests for non-lexical symlinks
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Configuration menu - View commit details
-
Copy full SHA for ee43730 - Browse repository at this point
Copy the full SHA ee43730View commit details