Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

include/exclude directives can lead to directories that are not writable at build time #1419

Open
BenjaminSchubert opened this issue Apr 12, 2022 · 8 comments · May be fixed by #1439
Open
Labels
status/ready Issue ready to be worked on. type/bug Issue that reports an unexpected behaviour.

Comments

@BenjaminSchubert
Copy link

When building an application using pack and specifying include and/or exclude, the permissions are not set correctly in some cases, and some directories might end up owned by root, which, in some cases, makes the build fail.

This is due to how docker works and can be reproduced by using docker solely.

I will focus on the include part of the equation here, but it is possible to trigger the same bug with exclude.

Reproduction using pack

Setup

In a directory, create a project:

mkdir -p testdir/testfile

cat << EOF > project.toml
[build]
# This triggers the bug
include = ["testdir/testfile"]

[[build.buildpacks]]
id = "testing/workspace-info"

  [build.buildpacks.script]
  api = "0.7"
  # The build will always fail, but we care about the output here
  inline = "ls -lahR && exit 1"
EOF

This technically means that testdir is ignored, and will not be copied, however, you want testdir/testfile, and thus, something will have to create
this testdir, to keep your hierarchy.

Triggering the bug

Now, running pack build --builder ${WHATEVER_BUILDER:paketobuildpacks/builder:tiny} test, you will see the following:

===> BUILDING
.:
total 12K
drwsrwsrwt 3 cnb  cnb  4.0K Jan  1  1970 .
drwxr-xr-x 1 root root 4.0K Apr 12 21:16 ..
drwxr-sr-x 2 root root 4.0K Apr 12 21:16 testdir

./testdir:
total 8.0K
drwxr-sr-x 2 root root 4.0K Apr 12 21:16 .
drwsrwsrwt 3 cnb  cnb  4.0K Jan  1  1970 ..
-rw-r--r-- 1 cnb  cnb     0 Apr 12 21:16 testfile

What did I expect?

testdir should be owned by cnb:cnb

Triggering using Docker

This is not solely a problem with pack, it is rather how docker cp works.

Again, using an empty directory, we can do:

# Create the base setup
mkdir -p testdir
touch testdir/testfile

# Create a tar archive, including only testfile. This is the same as what pack does under the hood, when setting up the workspace
tar -czvf example.tar.gz testdir/testfile

# The archive will not contain `testdir`
tar tvf example.tar.gz

# Now let's copy it into docker, using docker cp
# Start a busybox instance in the background
docker run --rm -d --name test busybox sleep INFINITY

# And docker cp it
cat example.tar.gz | docker cp - test:/root

# We can see the same behavior as before
docker exec -it test ls -lahR /root/

# Finally, cleanup the leftovers
docker stop test

Alternatives

In order not to hit this problem, this is what needs to be done:

include = ["testdir", "!testdir/*", "testdir/testfile"]

This is error prone, hard to debug when it goes wrong, and ultimately not intuitive for users.

Fixing it

I am happy to provide a fix if we agree that this is a bug. However, I am not sure what is the best way of achieving this so I would need a bit of guidance.

My current idea would be to modify WriteDirToTar, and the corresponding entry for zip, to walk the path a first time, gather a list of everything needing copying, and then go over it a second time, inserting missing directories as needed.

Does that sound sound? A better way I would be missing?

Thanks!

@BenjaminSchubert BenjaminSchubert added status/triage Issue or PR that requires contributor attention. type/bug Issue that reports an unexpected behaviour. labels Apr 12, 2022
@jromero
Copy link
Member

jromero commented Apr 21, 2022

First off, I agree that this is unintentional and thereby a bug.

I do wonder if an alternative solution could be to update the filtering function to not filter out directories that contain files that would ultimately be included. It looks like we inherit some gitignore functionality from a library so it may not be as trivial as I make it sound but it may be worth exploring. The main benefit being that there's no post-processing necessary.

@jromero jromero added status/ready Issue ready to be worked on. and removed status/triage Issue or PR that requires contributor attention. labels Apr 21, 2022
@imnitishng
Copy link
Contributor

I'd like to work on this if nobody else isn't.
I just needed to clear up a few things here -
I understand the issue that happens with include part of the bug, but the exclude part is still unclear.
From my understanding if the same project.toml definition is used with exclude = ["testdir/testfile"] then the testfile will not be copied but contrary to the issue pointed out above the directory testdir is owned by cnb instead of root.

===> BUILDING
.:
total 16K
drwsrwsrwt 3 cnb  cnb  4.0K Jan  1  1970 .
drwxr-xr-x 1 root root 4.0K May  8 12:54 ..
-rw-r--r-- 1 cnb  cnb   299 May  8 12:54 project.toml
drwxr-xr-x 2 cnb  cnb  4.0K May  8 12:53 testdir

./testdir:
total 8.0K
drwxr-xr-x 2 cnb cnb 4.0K May  8 12:53 .
drwsrwsrwt 3 cnb cnb 4.0K Jan  1  1970 ..

I wanted to confirm if this is how the bug is triggered for exclude, or let me know if I am missing something here.

For the include part of the bug I agree that we can update the filtering function to include all the parent directories of files that are specified inside include

# If include contains
include = ["testdir/abc/testfile", "dir2/file.txt"]

# Get parent dirs
parent_dirs = ["testdir", "abc", "dir2"]

# Add their respective gitignore entries to `fileFilter`

I can start working on the include part if the approach seems fine, moreover exclude can be worked on after clarifications.

@jromero
Copy link
Member

jromero commented May 9, 2022

@imnitishng It makes sense that your example of excludes works as expected because you are excluding the file and not the directory so therefore the directory wouldn't match the "ignore". @BenjaminSchubert if you could provide some examples on how this could be recreated via exclude that would be appreciated otherwise will move forward with fixing include.

For include, the pseudo strategy feels a little incomplete. That may just be my misunderstanding on how it's written. I'm concerned about partial includes for parent directories (ie. abc in the example above). If we only put the directory name of abc then any other directory with that name (elsewhere) may be included.

So in pseudo terms it should really be more like this:

# If include contains
include = ["testdir/abc/testfile", "dir2/file.txt"]

# Get parent dirs
parent_dirs = [
  "./testdir/", 
  "./testdir/abc/", 
  "./dir2/",
]

Let me know if you agree or if I'm missing something.

@imnitishng
Copy link
Contributor

I'm concerned about partial includes for parent directories (ie. abc in the example above). If we only put the directory name of abc then any other directory with that name (elsewhere) may be included.

Yea, the example I provided was quite vague, but I had the same idea in my mind. I will start working on this and create a draft PR for include and extend that once we get exclude clear.
Will make sure tests are robust enough to deal with any edge cases that may happen.

@BenjaminSchubert
Copy link
Author

Hey @jromero, sure, something like:

[build]
# This triggers the bug
exclude = ["testdir", "!testdir/testfile"]

[[build.buildpacks]]
id = "testing/workspace-info"

  [build.buildpacks.script]
  api = "0.7"
  # The build will always fail, but we care about the output here
  inline = "ls -lahR && exit 1"

should work though I can't test it right now

@jromero
Copy link
Member

jromero commented May 11, 2022

@BenjaminSchubert ah, that's an interesting edge case. I feel conflicted by it. Is it considered a misconfiguration given that the nested file obviously needs the parent directories?

Some options that come to mind:

  1. Do nothing - this feels like an edge case that can be easily explained when configuration is looked at.
  2. "Fix" for the user - we could add some logic to remove any parent directories from excludes if children files/directory are being included. This feels like it could be very error prone if their intentions were different.
  3. Warning only - if we detect that a parent directory is being excluded while a children file/directory is being included.

I'm currently leaning towards 1) because the added complexity doesn't feel justified.

If the intent of the user is to exclude everything in the testdir except for a single file then it should probably be: [testdir/*, !testdir/file]

Reference: https://stackoverflow.com/a/29932318

@imnitishng imnitishng linked a pull request May 11, 2022 that will close this issue
2 tasks
@BenjaminSchubert
Copy link
Author

BenjaminSchubert commented Jun 21, 2022

@jromero sorry for the delay in responding.

@BenjaminSchubert ah, that's an interesting edge case. I feel conflicted by it. Is it considered a misconfiguration given that the nested file obviously needs the parent directories?

git does work correctly with such patterns, and the spec mentions that it follow the gitignore pattern here. You can for example, with git do:

mysubdir/
!mysubdir/myfile

and that does work (well, technically git doesn't really care about directories at all).

So if we wanted to consider it a misconfiguration, we should document and catch this case as the build goes, with a nice error message instead. I don't think "just letting it go through" is a nice user experience.

I would more lean towards 2, or a potential number 4, which would be:

  1. Error if such a pattern is found. This is not going to give the expected results anyways

@BenjaminSchubert
Copy link
Author

And I was wrong.

I've played more with git, and it does look like

mysubdir/
!mysubdir/file

does ignore mysubdir/file, so number 2 would actually deviate from the gitignore way of doing things.

Which means, it's handled however incorrectly in pack... as it does add the file.

At that point, I think an error with message might be the best bet

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status/ready Issue ready to be worked on. type/bug Issue that reports an unexpected behaviour.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants