improved filtering through tags #529

brainchild0 · 2021-12-27T23:45:16Z

Currently, filtering is supported only by regular expressions on test descriptions, which are intended to be human-readable, and not necessarily designed for pattern filters. In fact, even changes to capitalization breaks a filter in current use.

A proposed tagging feature would support assignment of simple key words to each test, to support more robust and predictable filtering.

For example, keywords might follow the test description, designed by some leading character, such as +, as follows:

@test "Test integration with a public network service" +non-critical +external-deps {
    curl https://someplace/
}

A filter then might be defined to exclude tests with the tag external-deps in environments in which the external web resource is unlikely to be available, or tests which have both that tag as well as non-critical.

The text was updated successfully, but these errors were encountered:

martin-schulze-vireso · 2021-12-28T06:48:02Z

I have been thinking about labelling for some time and must say that the status quo is close to working. You could already do filtering for one tag.

What needs to be done from my point of view:

enhance filtering to allow combining multiple filters (operations: and, or, not)
optional:
- case insensitivity
- fixed string filtering
- define a canonical tag format

brainchild0 · 2021-12-28T07:00:59Z

You could already do filtering for one tag.

I don't understand this comment.

In my view, two observations are prominent:

Matching the human-readable labels is not necessary, or even particularly useful, as long as tags would be available to support any needed filter condition.
With filter conditions given according to simple tags, it is easy to define as well as to implement a query syntax, as each simple test is represented just by the tag name, and only the three standard boolean operators, the ones you mention, are needed for compound tests (e.g. --tag-filter '!non-critical & !external-deps').

martin-schulze-vireso · 2021-12-28T07:18:37Z

You could already do filtering for one tag.
I don't understand this comment.

You could put tags in the description of the test and use --filter:

@test "Test integration with a public network service +non-critical +external-deps" {
    curl https://someplace/
}

bats --filter '\+non-critical' test.bats

brainchild0 · 2021-12-28T07:35:33Z

Yes. It would be nice to keep the display string separate.

martin-schulze-vireso · 2021-12-28T09:15:35Z

Yes. It would be nice to keep the display string separate.

Could you elaborate why? I am trying to strike a balance between adding functionality without increasing complexity/redundancy too much. The filtering mechanism as already close to what is being needed and would profit from the boolean operations even when not using tags.

These are the dis-/advantages I see with putting tags into the description instead of an additional location:

pro:

code sharing with the --filter mechanic
existing external tools (shellcheck, shfmt, ...) don't need to update their Bats "parser"

con:

slightly more verbose, as you have to specify the tag marker too (however, a convenience flag that internally maps to --filter would still work)
accidental matching when a test description contains something that looks like a tag already (we might be restrictive like demanding tags are appended to the end of the test description)
TAP stream test names contain tags too (but they could be filtered out?)

brainchild0 · 2021-12-28T20:05:19Z

The display strings would be seen by anyone running the tests or reviewing the results, whether or not familiar with the idea about embedding tags into these string. The framework is intended to display either pretty output, or informative, parse-able output, depending on the selected output mode. While it may have some utility given the currently-available features, embedding tags into display strings violates the basic principle of the string being natural and straightforward for a human to read, regardless of familiarity with any particular system.

There are a variety of vulnerabilities to embedding tags into the display string. One is a partial matching (e.g. +part matches +part-of-whole). A workaround is possible, of adding additional match characters at the end of the pattern, but this solution cumbersome and error prone. A perhaps better solution involves surrounding the entire tag with delimiters (e.g. [part]), though it too is awkward and vulnerable.

As I suggested, if the ultimate intention is to support compound boolean conditions through operators, the semantics required for such support are far easier if the simple conditions (i.e. not compounded by boolean operators) are just literal tag names. Once such features would become available, then pattern matching on the display string would be largely legacy, as it seems to have no particular necessary use case, compared to using tags.

Ultimately, my cursory view (which of course follows from certain assumptions about the project structure overall), is that supporting tags would lead to a cleaner development path as well as a cleaner experience for users.

martin-schulze-vireso · 2021-12-29T23:52:32Z

The display strings would be seen by anyone running the tests or reviewing the results, whether or not familiar with the idea about embedding tags into these string. The framework is intended to display either pretty output, or informative, parse-able output, depending on the selected output mode. While it may have some utility given the currently-available features, embedding tags into display strings violates the basic principle of the string being natural and straightforward for a human to read, regardless of familiarity with any particular system.

Well, appending the tags to the string makes them hardly unreadable. At most they may become too long, but as I said: they could be filtered away. What I find harder is that the test name would depend on the labels, which is not ideal.

There are a variety of vulnerabilities to embedding tags into the display string. One is a partial matching (e.g. +part matches +part-of-whole). A workaround is possible, of adding additional match characters at the end of the pattern, but this solution cumbersome and error prone. A perhaps better solution involves surrounding the entire tag with delimiters (e.g. [part]), though it too is awkward and vulnerable.

I think delimiting with [] would work, although that definitely requires a dedicated interface to avoid cumbersome escaping in the --filter regex.

As I suggested, if the ultimate intention is to support compound boolean conditions through operators, the semantics required for such support are far easier if the simple conditions (i.e. not compounded by boolean operators) are just literal tag names. Once such features would become available, then pattern matching on the display string would be largely legacy, as it seems to have no particular necessary use case, compared to using tags.

I did not discount operators at all. I just wanted to say --filter would profit from them as well.

Ultimately, my cursory view (which of course follows from certain assumptions about the project structure overall), is that supporting tags would lead to a cleaner development path as well as a cleaner experience for users.

I find that less obvious. For one I don't know how many users rely on --filter and how easy it is to migrate them to other mechanics. To me it is mainly an interactive tool but there might be user that rely on it in committed code.

I'll have to think a bit about the use cases that are supported by --filter and your suggestion. My gut feeling says, that --filter is indeed "misused" in many cases to achieve something like the tagging you are suggesting here. I am just looking for a way to achieve both without too much new rigging. There are several features that deal with selecting subsets of tests (--filter, #483, #223) and we should avoid duplicating their usecases.

brainchild0 · 2021-12-30T06:47:38Z

Well, appending the tags to the string makes them hardly unreadable. At most they may become too long, but as I said: they could be filtered away. What I find harder is that the test name would depend on the labels, which is not ideal.

If you know what they are, you may not mind them, but seeing them in a field intended as a human-readable string may be confusing, and also unwieldy if the tags are very numerous.

I think delimiting with [] would work, although that definitely requires a dedicated interface to avoid cumbersome escaping in the --filter regex.

Is that easier than adding extra fields to the syntax (e.g. @tags)?

I did not discount operators at all. I just wanted to say --filter would profit from them as well.

The core of my suggestion was that designing and implementing syntax and semantics for boolean operations on filters is vastly easier if the simple tests (the primitive operands) are simply tag names, versus regular expressions.

Compare the following two examples:

---filter '\[non\-critical\] --and --filter '\[external\-deps\]'
--tags 'non-critical & external-deps'

Augmenting the existing semantics for boolean operators will be very combersome, and even worse than the demonstration above when including grouping symbols. Meanwhile, parsing a basic string, the one shown as the argument to --tags, from a simple language that has only a few types of tokens, would be much simpler and elegant.

brainchild0 added the Type: Enhancement label Dec 27, 2021

martin-schulze-vireso added Component: CLI Command line flags, exit code handling, ... Component: Bash Code Everything regarding the bash code Component: Docs labels Dec 28, 2021

martin-schulze-vireso added the Priority: Medium Wrong or misleading documentation, broken behavior with workaround label Dec 29, 2021

martin-schulze-vireso added the Size: Large Changes across several files label Feb 21, 2022

martin-schulze-vireso mentioned this issue Mar 23, 2022

Focusing tests #223

Closed

martin-schulze-vireso mentioned this issue Sep 4, 2022

Tagging tests #642

Merged

1 task

martin-schulze-vireso added this to the 1.8.0 milestone Sep 4, 2022

martin-schulze-vireso closed this as completed in #642 Sep 4, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

improved filtering through tags #529

improved filtering through tags #529

brainchild0 commented Dec 27, 2021

martin-schulze-vireso commented Dec 28, 2021

brainchild0 commented Dec 28, 2021 •

edited

martin-schulze-vireso commented Dec 28, 2021

brainchild0 commented Dec 28, 2021

martin-schulze-vireso commented Dec 28, 2021

brainchild0 commented Dec 28, 2021 •

edited

martin-schulze-vireso commented Dec 29, 2021

brainchild0 commented Dec 30, 2021

improved filtering through tags #529

improved filtering through tags #529

Comments

brainchild0 commented Dec 27, 2021

martin-schulze-vireso commented Dec 28, 2021

brainchild0 commented Dec 28, 2021 • edited

martin-schulze-vireso commented Dec 28, 2021

brainchild0 commented Dec 28, 2021

martin-schulze-vireso commented Dec 28, 2021

brainchild0 commented Dec 28, 2021 • edited

martin-schulze-vireso commented Dec 29, 2021

brainchild0 commented Dec 30, 2021

brainchild0 commented Dec 28, 2021 •

edited

brainchild0 commented Dec 28, 2021 •

edited