Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

testscript: Add check for duplicate and out-of-order files #184

Closed
twpayne opened this issue Sep 25, 2022 · 14 comments · Fixed by #185
Closed

testscript: Add check for duplicate and out-of-order files #184

twpayne opened this issue Sep 25, 2022 · 14 comments · Fixed by #185

Comments

@twpayne
Copy link
Contributor

twpayne commented Sep 25, 2022

I was just reviewing my testscripts and noticed that one of them had a duplicate file:

# test ...
exec command

-- filename --
# contents of filename
-- some/other/file --
# contents of some/other/file
-- filename --
# different contents of filename --

testscript did not warn about the duplication, and the duplication was not easy to spot as the files were out of order. If the files had been in order the duplicate files would have been easier to spot. Long story short, I ended up writing a very simple txtar linter and formater in twpayne/chezmoi#2387.

Rather than adding a formatter and linter to testscript, I propose to add a mechanism similar to #164 and add two extra parameters:

RequireUniqueFilenames bool, default false, which will return an error if there are any duplicate filenames in the txtar archive.

RequiredSortedFilenames bool, default false, which will return an error if the files in the txtar archive are not in order.

This should maintain backward compatibility while also effectively adding optional opt-in linting capabilities.

Is this of interest? Would you accept a PR that implements this?

@mvdan
Copy link
Collaborator

mvdan commented Sep 25, 2022

I've made the same mistake with duplicate files once, and it also confused me for a good ten minutes. I definitely support some way to avoid that. I would even argue that we should always reject duplicates, because they literally cannot serve any purpose in a testscript file. The vast majority of them should be mistakes that need to be fixed.

Whether or not that qualifies as a breaking change is perhaps up for debate. I am willing to agree with Go's compatibility promise, which is that it is OK to break a user if their program was always incorrect. And I think we can agree that duplicate txtar files in a testscript are incorrect.

Perhaps @rogpeppe or @myitcv have thoughts.

RequiredSortedFilenames bool, default false, which will return an error if the files in the txtar archive are not in order.

This one I agree that we shouldn't do by default, primarily because of the amount of noise it would cause. I'd support adding the option, in any case.

@bitfield
Copy link
Contributor

This point came up while we were discussing #181 on the BIT Slack— @myitcv asked what happens if you use txtar-c to append duplicate files to an existing txtar archive.

The answer is nothing: when the archive is unpacked, later files silently replace earlier ones with the same path. That makes sense to me, and it's actually useful in the context of txtar-c -script, which is used to execute a given test script in the context of several different file sets. For example, you could overwrite the script's main.go with a different main.go each time.

I agree that in most cases, though, duplicate files won't be what the user intended, and while the txtar format itself allows them:

There are no possible syntax errors in a txtar archive.
Word of God

it would be useful to have a way to detect them. I'm not sure it should be exclusive to testscript—this feels more like a txtar thing to me, since txtar files are more general. I'm also rather fond of the ruthless simplicity of testscript, and wouldn't want to pollute that even with this quite worthwhile feature. Perhaps a txtar-lint tool would be a nice add-on here instead?

@mvdan
Copy link
Collaborator

mvdan commented Sep 26, 2022

My only objection to a linter is that I want this (rejecting duplicates) to happen automatically when I use testscript, and I argue it should be the default as well.

I don't think we can discuss changes to the txtar format here, because effectively we are a compatible fork of https://pkg.go.dev/golang.org/x/tools/txtar (that public package did not exist yet when go-internal was created). So I think it's a bad idea to make any changes to the format that would make our fork incompatible with upstream.

@twpayne
Copy link
Contributor Author

twpayne commented Sep 26, 2022

My only objection to a linter is that I want this (rejecting duplicates) to happen automatically when I use testscript, and I argue it should be the default as well.

Seconded. To be honest, I wasn't even aware of the txtar and testscript CLIs. I have always created my .txtar files by hand and only executed them through testscript.RunMain.

There are no possible syntax errors in a txtar archive.
Word of God

It is true that there are no possible syntax errors in a txtar archive, but I can think of at least three logical errors:

  • Duplicate filenames, where Archive.Write silently overwrites the earlier contents the later contents.
  • Absolute filenames or filenames that escape the destination directory, which cause Archive.Write to return an error later.
  • Empty filenames, which I think (but have not verified) cause Archive.Write to fail when it tries to overwrite the destination directory with a file.

I think its worth catching all of these logical errors as early as possible. All these logical errors are cheap to detect when you have your parsed txtar.Archive.

I did write a simple txtar-lint that catches the duplicate filename and empty filename errors, but I have to remember to run this external tool and include it in my CI. Having the checks built into testscript would catch the errors much earlier and with less effort.

@bitfield
Copy link
Contributor

I think #185 would break the way I use testscript, for the reasons I mentioned. Duplicate filenames are not only not an error in that case, but actually expected and required for some scenarios.

(I appreciate that's probably an unusual way to use testscript, and it may not bother anybody else. I only mention it as a data point.)

@mvdan
Copy link
Collaborator

mvdan commented Sep 27, 2022

Sure that duplicate filenames in a testscript file can pass, but are they really working as intended? Every scenario I can think of is a human error which went unnoticed, so I think making those fail would be fine. It is an easy fix, too.

@twpayne
Copy link
Contributor Author

twpayne commented Sep 27, 2022

I've updated #185 to make the unique filename requirement opt-in.

@bitfield
Copy link
Contributor

are they really working as intended?

Yes. I should have explained this in a bit more detail, but I didn't want to weary those who'd already read the Slack discussion. Essentially, my workflow involves appending arbitrary files to an existing test script (which may also contain files). A trivial example might be:

exec go run main.go

-- main.go --
// some code here

You can imagine using this script to check a whole bunch of different projects or folders, some of which may themselves contain a main.go. In that case, it's the project's main.go that I want to run, not the one built in to the test script.

So not only is the duplicate file not an error, it's what I want to happen when running the testscript in the context of this particular directory. (You can certainly argue that I shouldn't want that. And you may well be right.)

@myitcv
Copy link
Collaborator

myitcv commented Sep 29, 2022

RequireUniqueFilenames (default true)

This seems to make sense to me (note the default). Perhaps, @mvdan, this can be brought up with Bryan et al in that discussion?

@bitfield:

So not only is the duplicate file not an error, it's what I want to happen when running the testscript in the context of this particular directory.

Could your process replace the filename in the txtar, instead of appending?

At one time I think I wrote a little utility program to update a file within a txtar for not dissimilar reasons.

RequiredSortedFilenames

This feels like more of a preference. Perhaps an analogy here is that you might want your tests in a _test.go file sorted by name. We don't have an option to go test to enforce that, nor do I think we should have one here for testscript.

FWIW I place the most important files at the top of the txtar for example, to draw attention to them in reproductions.

So perhaps a linter suffices here?

@bitfield
Copy link
Contributor

Could your process replace the filename in the txtar, instead of appending?

Not easily, because #181 appends the given script file to the txtar's Comment field.

@twpayne
Copy link
Contributor Author

twpayne commented Sep 29, 2022

There's always an XKCD: https://xkcd.com/1172/

@mvdan
Copy link
Collaborator

mvdan commented Sep 29, 2022

This seems to make sense to me (note the default). Perhaps, @mvdan, this can be brought up with Bryan et al in that discussion?

Noted, I'll bring this issue up with Bryan.

I'm not opposed to adding an option as long as we make the default be what the vast majority of users will want. I think @bitfield's use case is somewhat special, so I don't think it would be weird to opt into "allow duplicates". We should probably flip the boolean to make the default be the zero value though, like AllowDuplicateFilenames bool.

@mvdan
Copy link
Collaborator

mvdan commented Sep 29, 2022

Thinking outloud: we should probably not simply reject duplicate filenames, but reject duplicate files when extracting. For example:

  • foo.txt and FOO.TXT should produce an error when extracted on a case-insensitive filesystem
  • foo and foo/bar should error even though one of the two is a directory

@twpayne
Copy link
Contributor Author

twpayne commented Dec 13, 2022

I've updated #185 to check for unique names, as suggested by @mvdan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants