Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes 4108; remove hidden files in kustomize edit command to correctly mimic shell globbing behaviour #4170

Merged
merged 4 commits into from Nov 10, 2021

Conversation

m-Bilal
Copy link
Member

@m-Bilal m-Bilal commented Sep 5, 2021

Fixes #4108 which pointed out that kustomize edit was including hidden files in directory.

APPROACH:
Once the expanded paths are available, I loop through all the paths and check if any hidden files or directories have been included (by checking for /. substring in the strings). If a path has that substring, it is ignored.

cc: @KnVerey

@k8s-ci-robot k8s-ci-robot added do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 5, 2021
@k8s-ci-robot
Copy link
Contributor

Welcome @m-Bilal!

It looks like this is your first PR to kubernetes-sigs/kustomize 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/kustomize has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Sep 5, 2021
@k8s-ci-robot
Copy link
Contributor

Hi @m-Bilal. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Sep 5, 2021
@m-Bilal m-Bilal changed the title Fixes #4108; remove hidden files in kustomize edit command to correctly mimic shell globbing behaviour Fixes 4108; remove hidden files in kustomize edit command to correctly mimic shell globbing behaviour Sep 5, 2021
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Sep 5, 2021
@natasha41575
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 7, 2021
}
var result []string
for _, path := range paths {
if strings.Contains(path, "/.") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few questions:

  1. Will this still work on windows?

  2. What about hidden files in the current directory, e.g. if someone does kustomize edit add resource * and there is a hidden file in my current directory, e.g. .github, in which case "/." doesn't appear?

Suggestion: Change this check to strings.HasPrefix(filepath.Base(path), ".")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@natasha41575
I'm unable to understand one thing. If I have a directory which has the structure as follows:

dir/some-file-1
dir/some-file-2
dir/dir2/some-file3

GlobPatterns returns only some-file-1, some-file-2 for the pattern dir/*. It does not go into nested directories, but the command ls dir/* does return the file in dir2 on bash if I replicate the same directories on storage. Is the behavior of GlobPatterns correct? Should it not reference the nested directories (and hidden nested directories)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deep down in the implementation of GlobPatterns, kustomize does the following:

// Glob returns the list of matching files
func (fsOnDisk) Glob(pattern string) ([]string, error) {
return filepath.Glob(pattern)
}

IMO I would think that filepath.Glob does the "correct" globbing behavior, because it is a builtin Go package. In any case, it makes sense to me for kustomize to follow whatever filepath.Glob does rather than attempting to define its own correct behavior. WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that makes sense

Copy link
Member Author

@m-Bilal m-Bilal Sep 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a second thought @natasha41575, I think correcting the globbing behavior of filepath.Glob is exactly what we're trying to do here (by explicitly removing files that filepath.Glob returns). I guess it really comes down to what would you expect if you had the above mentioned directory structure and used dir/* as the glob pattern (or, even just *). If you expect the file in dir2 to be included and it isn't, it will be reported as a bug in the future, even though that is the standard behaviour of filepath.Glob (like #4108). I don't use globbing enough to justify if that is correct or incorrect, but maybe someone who knows better (like yourself, or @KnVerey or @khrisrichardson) could decide if this should be a separate issue that needs to be fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a second thought @natasha41575, I think correcting the globbing behavior of filepath.Glob is exactly what we're trying to do here (by explicitly removing files that filepath.Glob returns). I guess it really comes down to what would you expect if you had the above mentioned directory structure and used dir/* as the glob pattern (or, even just *).

That's a fair point, and worth some more investigation. I also don't use globbing enough to be able to confidently state what is and what isn't correct behavior, but FWIW some google search results (https://stackoverflow.com/questions/1690809/what-expands-to-all-files-in-current-directory-recursively and https://unix.stackexchange.com/questions/49913/recursive-glob) seem to suggest that you need to do something special (**) for recursive globbing.

@khrisrichardson
Copy link
Contributor

I could be mistaken, but it does not appear to me that this implementation takes into account intentional globbing of hidden files (e.g. - .*)

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 8, 2021
@m-Bilal
Copy link
Member Author

m-Bilal commented Sep 14, 2021

@natasha41575 is this PR complete or should recursive globbing be a part of this PR? (I tested a little, seems like recursive globbing is not currently supported)

@natasha41575
Copy link
Contributor

natasha41575 commented Sep 17, 2021

I've just pulled and built your branch, and with the following directory structure:

.
├── kustomization.yaml
└── moo
    └── .cow

And I run kustomize edit add resource moo/*, it incorrectly adds the file moo/.cow to my kustomization's resources field. Futhermore, some spinkled print statements suggest that when running the edit add resource command, the functionRemoveHiddenFiles is never reached.

It seems that this PR fixes some of the kustomize edit add commands, but not all of them go through this path. For example, edit add resource calls fSys.Glob here:

files, err := fSys.Glob(pattern)
if err != nil {
return nil, err
}
if len(files) == 0 {

I would suggest moving this code that you've written:

if util.IsHiddenFilePath(patterns[0]) {
	result = allFilePaths
} else {
	result = util.RemoveHiddenFiles(allFilePaths)
}

directly into fSys.Glob, where the corrected behavior can be shared among all commands that use it. We have two implementations of fSys, and their Glob methods can be found at

// Glob returns the list of matching files
func (fsOnDisk) Glob(pattern string) ([]string, error) {
return filepath.Glob(pattern)
}

and

// Glob implements FileSystem.
// Glob returns the list of file paths matching
// per filepath.Match semantics, i.e. unlike RegExpGlob,
// Match("foo/a*") will not match sub-sub directories of foo.
// This is how /bin/ls behaves.
func (n *fsNode) Glob(pattern string) ([]string, error) {

My suggestion is to filter out the hidden files at the end of these functions, as a last step. You can also move your helper functions and tests into the files kyaml/filesys/util.go and kyaml/filesys/util_test.go.

I would also appreciate some feedback from @monopole, @KnVerey, or @yuwenma regarding this approach.

Copy link
Contributor

@KnVerey KnVerey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this PR complete or should recursive globbing be a part of this PR? (I tested a little, seems like recursive globbing is not currently supported)

Let's stick to the dotfile fix. IMO we want to exclude hidden files because it's not likely users running kustomize edit actually want them, not just to mimic shell behavior on principle. I'm not sure at this point we want to enable recursive globbing, and if someone were to convince me, I'd still say it should go in its own PR.

paths := []string{
"dir/fa1",
"dir/fa2",
"dir/.fa3",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: for completeness include top-level files (hidden & not) and files with extensions (hidden & not) to show those cases are handled correctly too.

@@ -95,6 +95,47 @@ func TestGlobPatternsWithLoaderRemoteFile(t *testing.T) {
}
}

func TestIsHiddenFilePath(t *testing.T) {
hiddenFilePaths := []string{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: in addition to the glob expressions, include test data that looks like full file paths, since we use this function with both types of input. It might make sense to use a table test instead of multiple loops, e.g.:

tests := map[string]struct {
		paths        []string
		expectHidden bool
	}{
		"hidden globs": {
			expectHidden: true,
			paths: []string{
				".*",
				"/.*",
				"dir/.*"},
		},
               // more categories of examples
	}
	for name, test := range tests {
		test := test
		t.Run(name, func(tt *testing.T) {
			for _, path := range test.paths {
				require.Equal(tt, test.expectHidden, IsHiddenFilePath(path), "path %s not correctly identified", path)
			}
		})
	}
}

func TestExpandFileSourceWithHiddenFiles(t *testing.T) {
fSys := filesys.MakeEmptyDirInMemory()
_, err := fSys.Create("dir/file-1.xtn")
require.NoError(t, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: instead of repeating the create calls and error checking, use a loop over a list of files you want to create:

	files := []string{
		"dir/file-1.xtn",
		"dir/file-2",
		"dir/.file-3",
		"dir/dir-2/.file-4",
		"dir/dir-2/file-5.xtn",
		"dir/dir-2/.file-6.xtn",
		"root-1.xtn",
		".root-2.xtn",
	}
	for _, file := range files {
		_, err := fSys.Create(file)
		require.NoError(t, err)
	}

@k8s-ci-robot
Copy link
Contributor

@m-Bilal: This PR has multiple commits, and the default merge method is: merge.
You can request commits to be squashed using the label: tide/merge-method-squash

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@m-Bilal
Copy link
Member Author

m-Bilal commented Oct 5, 2021

Sorry for updating this so late, I had gotten stuck on this issue.
I have made all the suggested changes in, and here are a few observations from testing and I wanted to cross check if these are the intended behaviors.

  1. Running kustomize edit add resource moo/* with the following directory structure
.
├── kustomization.yaml
└── moo
    └── .cow

now results in kustomize printing moo/* has no match and nothing is added to kustomization.yaml
However, running kustomize edit add resource moo/.* on the above directory structure results in the following kustomization file (the result is same on master branch as well):

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- moo/.
- moo/..
- moo/.cow
  1. Glob("*") in fsnode.go returns directories and files present in the directory (not just the files).
  2. Adding a hidden file to bunchOfFiles in fsnode_test.go breaks TestRegExpGlob(t *Testing.t). I haven't added the hidden file filtering in RegExpGlob(pattern string) because it accepts a regular expression and I think that * as a regular expression would return all files, including the hidden ones. So, for now, I have added the hidden file to the list of expected files in TestRegExpGlob(t *Testing.t). Please let me know if this is incorrect and RegExpGlob(pattern string) should also have a check for hidden files, I'll add that and update the test.

I'm pushing this code so that it can be reviewed as I'm not sure about the things that I've listed above, I'll squash and update the commits once code has been reviewed.

@natasha41575
Copy link
Contributor

Running kustomize edit add resource moo/* with the following directory structure
.
├── kustomization.yaml
└── moo
└── .cow
now results in kustomize printing moo/* has no match and nothing is added to kustomization.yaml

Makes sense to me.

However, running kustomize edit add resource moo/.* on the above directory structure results in the following kustomization file (the result is same on master branch as well):

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:

  • moo/.
  • moo/..
  • moo/.cow

This also seems correct to me, especially if this is what's happening on the master branch.

Glob("*") in fsnode.go returns directories and files present in the directory (not just the files).

This is also what currently happens with the master branch, and I don't see a reason to change it in this PR.

Adding a hidden file to bunchOfFiles in fsnode_test.go breaks TestRegExpGlob(t *Testing.t). I haven't added the hidden file filtering in RegExpGlob(pattern string) because it accepts a regular expression and I think that * as a regular expression would return all files, including the hidden ones. So, for now, I have added the hidden file to the list of expected files in TestRegExpGlob(t *Testing.t). Please let me know if this is incorrect and RegExpGlob(pattern string) should also have a check for hidden files, I'll add that and update the test.

I think it's fine if we don't touch RegExpGlob to exclude hidden files. I did a quick check and this method isn't even used anywhere so I'd say we don't have to worry about it.

/lgtm I will wait on letting this merge for a bit to see if there are any comments from others.

Thanks for working on this!

@natasha41575
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 10, 2021
@KnVerey
Copy link
Contributor

KnVerey commented Nov 10, 2021

/label tide/merge-method-squash

@KnVerey KnVerey added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Nov 10, 2021
@KnVerey
Copy link
Contributor

KnVerey commented Nov 10, 2021

Re: . and .. being included, it's plausible to me we'll get a feature request to change that, which I'd be open to. But it definitely doesn't need to be part of this PR, considering it happens on master today.

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: KnVerey, m-Bilal

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 10, 2021
@k8s-ci-robot k8s-ci-robot merged commit cb1cbbe into kubernetes-sigs:master Nov 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kustomize globbing incorrectly includes dot files
5 participants