Add --filter-status failed for running only failed tests from the last completed run #483

martin-schulze-vireso · 2021-08-18T10:10:17Z

Often only a small subset of tests is failing. While working on fixing them, running the other tests just takes up time. The existing filtering function can narrow a run down to a small number of tests, but selecting all failed tests that way becomes cumbersome. This PR adds a flag to Bats that does this selection automatically. What this PR does:

- check for .bats/run-logs/ in the test directory (suite dir or containing dir of first test file)
  - if this directory exists, a log of all passed/failed tests will be written
  - if a test run is aborted via SIGINT/CTRL+C, no log will be stored
- when --filter-status is activated, it picks the most recent log in .bats/run-logs
  - if no log exists, we assume a first run and run all tests (subject to filtering/file selection)
  - if a log exists, run all tests that had the specified status in the last run (plus missed tests: tests that were not encountered in the last run, e.g. due to renaming/adding them)
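The log lookup described above could look roughly like the following. This is only a sketch, not the code from this PR: the function name `latest_run_log` is hypothetical, and it assumes that log file names sort chronologically (for example because they are timestamps).

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the newest log in <test-dir>/.bats/run-logs.
# Assumes log names sort chronologically (e.g. timestamp-based names).
# Returns 1 if the directory or a log is missing, in which case the caller
# would treat this as a first run and execute all tests.
latest_run_log() {
  local log_dir="$1/.bats/run-logs"
  local f latest=""
  [ -d "$log_dir" ] || return 1
  for f in "$log_dir"/*.log; do
    # An unmatched glob stays literal, so check that the file really exists.
    if [ -e "$f" ]; then
      latest="$f" # glob results are sorted, so the last match wins
    fi
  done
  [ -n "$latest" ] || return 1
  printf '%s\n' "$latest"
}
```

A caller would fall back to running everything when the function returns non-zero, e.g. `log="$(latest_run_log "$suite_dir")" || run_all=1`.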

TODO:

- documentation
  - man page
  - --help text
- I have reviewed the Contributor Guidelines.
- I have reviewed the Code of Conduct and agree to abide by it

martin-schulze-vireso · 2021-11-17T22:14:52Z

I dislike two things about the current solution:

1. If we abort the recording mid run, we won't catch all failing tests, which will omit them from reruns.
2. We might want to lock in a set of failing tests and only run this subset without narrowing down further. This can happen when we change something that affects multiple tests.

I've got the feeling that there is a more minimal/modular solution lurking under the surface: Splitting the recording and "replay" functionality into separate flags would allow for 2.

Similarly, it might be a good idea to record all tests that ran, and save their state as failing/passing, then a replay run can skip all passed tests and run those that failed or were not run.
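To make the replay idea concrete, here is a rough sketch. The log format (one `<status> <test name>` pair per line) and the function name `select_replay_tests` are assumptions for illustration only, not what this PR implements.

```shell
#!/usr/bin/env bash

# Assumed log format (hypothetical): one test per line,
# "<status> <test name>", where status is "passed" or "failed".

# Print every candidate test that is not recorded as passed, i.e. tests
# that failed or were not run at all in the recorded run.
# Usage: select_replay_tests <log-file> <test name>...
select_replay_tests() {
  local log_file="$1"
  shift
  local name
  for name in "$@"; do
    # -x matches the whole line, -F treats the test name as a literal string.
    if ! grep -qxF -- "passed $name" "$log_file"; then
      printf '%s\n' "$name"
    fi
  done
}
```

With a log containing `passed test a` and `failed test b`, the candidates `test a`, `test b` and `test c` would be reduced to `test b` and `test c`.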

cyphar · 2022-03-17T00:50:27Z

I think it might be a good idea to implement it as a list of test names with the pass and fail state. That would be much simpler than splitting the feature into two options and would solve the problem of the test run being cancelled (or the problem of new tests being added between --rerun-failed runs).

However, the file creation logic and lifetime is a little bit worrisome. You can't use --rerun-failed without having to specify it the first time (which is not a great UX), but at the same time this means if you wanted to start from scratch you need to know to delete the .bats-* file because in order to record the file you need to pass the option but if you pass the option and the file exists it won't write a new file from scratch. The fact it's a hidden file (and one that most users would add to .gitignore) makes this even more of an issue.

Ideally we would always record the test results and then when --rerun-failed is used, we read that file. It might also be nice for --rerun-failed to update the test results file but I'm not sure if that's always something that a user would want to do (maybe they want to re-run a set of flaky tests multiple times). I'm also a little worried about creating such a file in the test directory -- if it accidentally gets checked into version control then that's just asking for trouble. But I guess we can't put it in ~/.cache/bats because then we'd have to figure out a way to represent the path in a nice way.

Also IMHO the option name is somewhat confusing (are we retrying a test immediately after it failed?). --run-last-failed or something might be better, but I'm just spitballing.

martin-schulze-vireso · 2022-03-23T23:03:06Z

> I think it might be a good idea to implement it as a list of test names with the pass and fail state. That would be much simpler than splitting the feature into two options and would solve the problem of the test run being cancelled (or the problem of new tests being added between --rerun-failed runs).

Yes, I was thinking about this too. The current state was just an implementation shortcut because the format already works with the internal test list. We would also have to make sure that the test names are parsable from one line.

> However, the file creation logic and lifetime is a little bit worrisome. You can't use --rerun-failed without having to specify it the first time (which is not a great UX), but at the same time this means if you wanted to start from scratch you need to know to delete the .bats-* file because in order to record the file you need to pass the option but if you pass the option and the file exists it won't write a new file from scratch. The fact it's a hidden file (and one that most users would add to .gitignore) makes this even more of an issue.

No, I'd actually like to update the file.

> Ideally we would always record the test results and then when --rerun-failed is used, we read that file. It might also be nice for --rerun-failed to update the test results file but I'm not sure if that's always something that a user would want to do (maybe they want to re-run a set of flaky tests multiple times). I'm also a little worried about creating such a file in the test directory -- if it accidentally gets checked into version control then that's just asking for trouble. But I guess we can't put it in ~/.cache/bats because then we'd have to figure out a way to represent the path in a nice way.

I was torn on where to put it too. I am not sure if ~/.cache/ is available on all supported systems and as you said, once you centralize you need to pick it apart again. I looked at how other test frameworks deal with this and found the containing folder to be a common solution. I chose to use .bats/ to allow for reusing it later on for other files. The hidden nature should actually encourage people to ignore it. We might be even clearer with something like .bats-vcs-ignore?

> Also IMHO the option name is somewhat confusing (are we retrying a test immediately after it failed?). --run-last-failed or something might be better, but I'm just spitballing.

I tried to avoid creating the failed test log unconditionally, so as not to force it onto unsuspecting users with the new version. A middle ground might be to require explicit approval on first run. I am not sure if this is a feature of interest for CI systems. I could envision exporting the failed test log as an artifact for download, so you can streamline your local error search or later runs.

Anyways, my idea of --rerun-failed semantics would be to narrow down the list of failing tests incrementally, so each new complete run would capture more detail. I think a solution that only overwrites the log with a new capture if the test suite completed would be good enough for now. This can be improved later on. However, the interface should not change.
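One way to get "only overwrite on a completed run" is to write to a temporary file and publish it only once all planned tests have a recorded result. The sketch below is illustrative and not taken from the PR; the function name `publish_run_log`, the one-line-per-test log format, and detecting completion by comparing against the planned test count are all assumptions.

```shell
#!/usr/bin/env bash

# Hypothetical helper: replace <final-log> with <tmp-log> only if the
# temporary log holds a result for every planned test (one line per test
# is assumed). An incomplete capture is discarded and the old log is kept.
# Usage: publish_run_log <tmp-log> <final-log> <planned-test-count>
publish_run_log() {
  local tmp_log="$1" final_log="$2" planned="$3"
  local recorded
  # Arithmetic expansion strips the padding some wc implementations emit.
  recorded=$(( $(wc -l < "$tmp_log") ))
  if [ "$recorded" -eq "$planned" ]; then
    mv -f -- "$tmp_log" "$final_log"
  else
    rm -f -- "$tmp_log"
    return 1
  fi
}
```

The planned count could come from the TAP plan line (`1..N`); an aborted run records fewer lines and therefore leaves the previous log untouched.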

martin-schulze-vireso · 2022-04-05T22:41:34Z

@cyphar Some more ideas:

We could require the user to create the .bats/ directory to avoid confusion about whether to put it into .gitignore. If the directory does not exist, the files will be gathered in a central place, e.g. /tmp, and can be moved over once the user decides to rerun failed tests. That way, we don't lose the last run. If there are multiple test suites on the same machine, they will overwrite each other's results, but I think that should happen seldom enough not to be a real problem, and it is trivially fixable by creating .bats/ locally.

martin-schulze-vireso · 2022-04-07T23:00:02Z

@cyphar After some experimenting and reading over your comment again I have the following suggestion:

- rename to --filter-status failed; this leaves open the possibility to filter for other results like passed or skipped or whatever
- only capture when there is a .bats/run-logs directory in the tests folder, which the user has to create actively
  - this means if we use --filter-status without .bats/run-logs existing, the user is instructed to create the folder and .gitignore it. They may lose the previous run, but I think the complexity of catching that corner case is not worth the effort and the problems with automatically generating the folder
- when this directory exists, each non-aborted run will capture the status of executed tests
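From the user's side, the opt-in could look something like this. The helper name `enable_run_logs` is hypothetical, and keeping the ignore entry in a `.gitignore` inside the tests folder is an assumption; only the `.bats/run-logs` directory and the `--filter-status failed` flag come from the proposal above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: opt a tests folder into run logging by creating
# .bats/run-logs and ignoring .bats/ in that folder's .gitignore.
# Safe to call repeatedly; the ignore entry is only added once.
enable_run_logs() {
  local tests_dir="$1"
  local ignore_file="$tests_dir/.gitignore"
  mkdir -p "$tests_dir/.bats/run-logs" || return 1
  if ! grep -qxF -- '.bats/' "$ignore_file" 2>/dev/null; then
    printf '%s\n' '.bats/' >> "$ignore_file"
  fi
}

# Intended workflow afterwards (the test/ path is just an example):
#   enable_run_logs test
#   bats test/                          # complete run, status gets captured
#   bats --filter-status failed test/   # rerun only what failed last time
```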

martin-schulze-vireso · 2022-06-15T22:52:01Z

@cyphar Please have a look at the UX again (the code is obviously broken, since the tests are failing). I rewrote it with the following changes:

- renamed the flag to --filter-status failed (there are also missed and passed)
- tests are run when they have the matching status or were not encountered in the last run
- logs are created under .bats/run-logs/<timestamp>.log; if the folder does not exist, no log is created
- each test line in the log is either passed, failed or status-filtered (to show which tests were already filtered in the last run; they should be filtered again if the filter reason was the same)
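For the `<timestamp>.log` naming, a sketch of how a fresh log path could be chosen is shown below. It is not the PR's implementation: the function name `new_run_log_path`, the exact timestamp format, and the numeric suffix used to avoid collisions are assumptions. It uses an explicit format string because `date -I` is not available everywhere (see the macOS fix in the commit list).

```shell
#!/usr/bin/env bash

# Hypothetical helper: print a not yet existing log path in <run-logs-dir>.
# The timestamp format and the "-<n>" collision suffix are assumptions.
new_run_log_path() {
  local log_dir="$1"
  local stamp candidate n=0
  # Explicit format string instead of `date -I` for portability.
  stamp="$(date +%Y-%m-%dT%H:%M:%S)"
  candidate="$log_dir/$stamp.log"
  # Two runs within the same second would collide, so append a counter.
  while [ -e "$candidate" ]; do
    n=$((n + 1))
    candidate="$log_dir/$stamp-$n.log"
  done
  printf '%s\n' "$candidate"
}
```

A run would write its results to the returned path only when `.bats/run-logs` exists, e.g. `[ -d .bats/run-logs ] && log="$(new_run_log_path .bats/run-logs)"`.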

You can also have a look at the tests to see how it is intended to be used.

martin-schulze-vireso requested a review from a team as a code owner August 18, 2021 10:10

martin-schulze-vireso added the Component: Bash Code and Type: Enhancement labels Aug 18, 2021

martin-schulze-vireso added this to the 1.6.0 milestone Nov 10, 2021

martin-schulze-vireso modified the milestones: 1.6.0, 1.7.0 Dec 25, 2021

martin-schulze-vireso mentioned this pull request Dec 29, 2021

improved filtering through tags #529

Closed

martin-schulze-vireso mentioned this pull request Feb 14, 2022

Capability to rerun failed tests cases (Retry Mechanism) #546

Closed

martin-schulze-vireso force-pushed the feature/rerun_failed branch from 4ed0e91 to 454ac25 Compare February 24, 2022 23:52

martin-schulze-vireso modified the milestones: 1.7.0, 1.8.0 May 7, 2022

martin-schulze-vireso force-pushed the feature/rerun_failed branch from 0ae9776 to deced7c Compare June 8, 2022 10:27

martin-schulze-vireso force-pushed the feature/rerun_failed branch 2 times, most recently from 0f4a408 to 6447096 Compare June 15, 2022 22:51

martin-schulze-vireso added 10 commits June 24, 2022 22:41

Use --dummy-flag to avoid switch for empty $flags

65846d8

which was required Bash pre 4.3

Add --rerun-failed

7dfbeb6

Remove unused flags

9400c0c

they are only useful in exec-suite

Bring internal var in line with naming scheme

dcb8b60

Rename --rerun-failed to --filter-status failed

117da02

Don't pick up .bats/ for shellcheck

b5c0f15

Fix invalid var reference

c1ddf29

Fix runlog collision avoidance

0c573c7

Fix shellcheck

d899847

Improve debug output

d5b0830

martin-schulze-vireso force-pushed the feature/rerun_failed branch from 6447096 to d5b0830 Compare June 24, 2022 21:16

martin-schulze-vireso changed the title ~~Add --rerun-failed for recording and subsequently running only failed tests~~ Add --filter-status failed for recording and subsequently running only failed tests Jul 3, 2022

martin-schulze-vireso changed the title ~~Add --filter-status failed for recording and subsequently running only failed tests~~ Add --filter-status failed for running only failed tests from the last complete run Jul 3, 2022

martin-schulze-vireso changed the title ~~Add --filter-status failed for running only failed tests from the last complete run~~ Add --filter-status failed for running only failed tests from the last completed run Jul 3, 2022

Fix to much whitespace to match carried over tests

6e7ce71

martin-schulze-vireso force-pushed the feature/rerun_failed branch from 9cc7ddb to ab0614f Compare July 3, 2022 22:26

martin-schulze-vireso added 4 commits July 5, 2022 23:40

Fix error with date flag -I on MacOS

e555c8b

specify format that is understood by all target systems

docs for --filter-status <status>

0e310ff

Add changelog entry for bats-core#473

f29a88d

Allow for comments and remove todo

97c8df4

martin-schulze-vireso force-pushed the feature/rerun_failed branch from ab0614f to 97c8df4 Compare July 5, 2022 21:42

martin-schulze-vireso merged commit 9d4222b into bats-core:master Jul 5, 2022
