Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improved filtering through tags #529

Closed
brainchild0 opened this issue Dec 27, 2021 · 8 comments · Fixed by #642
Closed

improved filtering through tags #529

brainchild0 opened this issue Dec 27, 2021 · 8 comments · Fixed by #642
Labels
Component: Bash Code Everything regarding the bash code Component: CLI Command line flags, exit code handling, ... Component: Docs Priority: Medium Wrong or misleading documentation, broken behavior with workaround Size: Large Changes across several files Type: Enhancement
Milestone

Comments

@brainchild0
Copy link

Currently, filtering is supported only by regular expressions on test descriptions, which are intended to be human-readable, and not necessarily designed for pattern filters. In fact, even changes to capitalization breaks a filter in current use.

A proposed tagging feature would support assignment of simple key words to each test, to support more robust and predictable filtering.

For example, keywords might follow the test description, designed by some leading character, such as +, as follows:

@test "Test integration with a public network service" +non-critical +external-deps {
    curl https://someplace/
}

A filter then might be defined to exclude tests with the tag external-deps in environments in which the external web resource is unlikely to be available, or tests which have both that tag as well as non-critical.

@martin-schulze-vireso
Copy link
Member

I have been thinking about labelling for some time and must say that the status quo is close to working. You could already do filtering for one tag.

What needs to be done from my point of view:

  • enhance filtering to allow combining multiple filters (operations: and, or, not)
  • optional:
    • case insensitivity
    • fixed string filtering
    • define a canonical tag format

@brainchild0
Copy link
Author

brainchild0 commented Dec 28, 2021

You could already do filtering for one tag.

I don't understand this comment.

In my view, two observations are prominent:

  • Matching the human-readable labels is not necessary, or even particularly useful, as long as tags would be available to support any needed filter condition.
  • With filter conditions given according to simple tags, it is easy to define as well as to implement a query syntax, as each simple test is represented just by the tag name, and only the three standard boolean operators, the ones you mention, are needed for compound tests (e.g. --tag-filter '!non-critical & !external-deps').

@martin-schulze-vireso
Copy link
Member

You could already do filtering for one tag.
I don't understand this comment.

You could put tags in the description of the test and use --filter:

@test "Test integration with a public network service +non-critical +external-deps" {
    curl https://someplace/
}
bats --filter '\+non-critical' test.bats

@martin-schulze-vireso martin-schulze-vireso added Component: CLI Command line flags, exit code handling, ... Component: Bash Code Everything regarding the bash code Component: Docs labels Dec 28, 2021
@brainchild0
Copy link
Author

Yes. It would be nice to keep the display string separate.

@martin-schulze-vireso
Copy link
Member

Yes. It would be nice to keep the display string separate.

Could you elaborate why? I am trying to strike a balance between adding functionality without increasing complexity/redundancy too much. The filtering mechanism as already close to what is being needed and would profit from the boolean operations even when not using tags.

These are the dis-/advantages I see with putting tags into the description instead of an additional location:

pro:

  • code sharing with the --filter mechanic
  • existing external tools (shellcheck, shfmt, ...) don't need to update their Bats "parser"

con:

  • slightly more verbose, as you have to specify the tag marker too (however, a convenience flag that internally maps to --filter would still work)
  • accidental matching when a test description contains something that looks like a tag already (we might be restrictive like demanding tags are appended to the end of the test description)
  • TAP stream test names contain tags too (but they could be filtered out?)

@brainchild0
Copy link
Author

brainchild0 commented Dec 28, 2021

The display strings would be seen by anyone running the tests or reviewing the results, whether or not familiar with the idea about embedding tags into these string. The framework is intended to display either pretty output, or informative, parse-able output, depending on the selected output mode. While it may have some utility given the currently-available features, embedding tags into display strings violates the basic principle of the string being natural and straightforward for a human to read, regardless of familiarity with any particular system.

There are a variety of vulnerabilities to embedding tags into the display string. One is a partial matching (e.g. +part matches +part-of-whole). A workaround is possible, of adding additional match characters at the end of the pattern, but this solution cumbersome and error prone. A perhaps better solution involves surrounding the entire tag with delimiters (e.g. [part]), though it too is awkward and vulnerable.

As I suggested, if the ultimate intention is to support compound boolean conditions through operators, the semantics required for such support are far easier if the simple conditions (i.e. not compounded by boolean operators) are just literal tag names. Once such features would become available, then pattern matching on the display string would be largely legacy, as it seems to have no particular necessary use case, compared to using tags.

Ultimately, my cursory view (which of course follows from certain assumptions about the project structure overall), is that supporting tags would lead to a cleaner development path as well as a cleaner experience for users.

@martin-schulze-vireso
Copy link
Member

The display strings would be seen by anyone running the tests or reviewing the results, whether or not familiar with the idea about embedding tags into these string. The framework is intended to display either pretty output, or informative, parse-able output, depending on the selected output mode. While it may have some utility given the currently-available features, embedding tags into display strings violates the basic principle of the string being natural and straightforward for a human to read, regardless of familiarity with any particular system.

Well, appending the tags to the string makes them hardly unreadable. At most they may become too long, but as I said: they could be filtered away. What I find harder is that the test name would depend on the labels, which is not ideal.

There are a variety of vulnerabilities to embedding tags into the display string. One is a partial matching (e.g. +part matches +part-of-whole). A workaround is possible, of adding additional match characters at the end of the pattern, but this solution cumbersome and error prone. A perhaps better solution involves surrounding the entire tag with delimiters (e.g. [part]), though it too is awkward and vulnerable.

I think delimiting with [] would work, although that definitely requires a dedicated interface to avoid cumbersome escaping in the --filter regex.

As I suggested, if the ultimate intention is to support compound boolean conditions through operators, the semantics required for such support are far easier if the simple conditions (i.e. not compounded by boolean operators) are just literal tag names. Once such features would become available, then pattern matching on the display string would be largely legacy, as it seems to have no particular necessary use case, compared to using tags.

I did not discount operators at all. I just wanted to say --filter would profit from them as well.

Ultimately, my cursory view (which of course follows from certain assumptions about the project structure overall), is that supporting tags would lead to a cleaner development path as well as a cleaner experience for users.

I find that less obvious. For one I don't know how many users rely on --filter and how easy it is to migrate them to other mechanics. To me it is mainly an interactive tool but there might be user that rely on it in committed code.

I'll have to think a bit about the use cases that are supported by --filter and your suggestion. My gut feeling says, that --filter is indeed "misused" in many cases to achieve something like the tagging you are suggesting here. I am just looking for a way to achieve both without too much new rigging. There are several features that deal with selecting subsets of tests (--filter, #483, #223) and we should avoid duplicating their usecases.

@martin-schulze-vireso martin-schulze-vireso added the Priority: Medium Wrong or misleading documentation, broken behavior with workaround label Dec 29, 2021
@brainchild0
Copy link
Author

Well, appending the tags to the string makes them hardly unreadable. At most they may become too long, but as I said: they could be filtered away. What I find harder is that the test name would depend on the labels, which is not ideal.

If you know what they are, you may not mind them, but seeing them in a field intended as a human-readable string may be confusing, and also unwieldy if the tags are very numerous.

I think delimiting with [] would work, although that definitely requires a dedicated interface to avoid cumbersome escaping in the --filter regex.

Is that easier than adding extra fields to the syntax (e.g. @tags)?

I did not discount operators at all. I just wanted to say --filter would profit from them as well.

The core of my suggestion was that designing and implementing syntax and semantics for boolean operations on filters is vastly easier if the simple tests (the primitive operands) are simply tag names, versus regular expressions.

Compare the following two examples:

  • ---filter '\[non\-critical\] --and --filter '\[external\-deps\]'
  • --tags 'non-critical & external-deps'

Augmenting the existing semantics for boolean operators will be very combersome, and even worse than the demonstration above when including grouping symbols. Meanwhile, parsing a basic string, the one shown as the argument to --tags, from a simple language that has only a few types of tokens, would be much simpler and elegant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Bash Code Everything regarding the bash code Component: CLI Command line flags, exit code handling, ... Component: Docs Priority: Medium Wrong or misleading documentation, broken behavior with workaround Size: Large Changes across several files Type: Enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants