Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --filter-status failed for running only failed tests from the last completed run #483

Conversation

martin-schulze-vireso
Copy link
Member

@martin-schulze-vireso martin-schulze-vireso commented Aug 18, 2021

Often only a little subset of tests is failing. While working on fixing them, running other tests just takes up time. While the filtering function can be used to filter down to a small number of tests, it can become cumbersome to select all tests that failed. This flag to Bats introduces that feature. What this PR does:

  • check for .bats/run-logs/ in the test directory (suite dir or containing dir of first test file)
    • if this directory exists, a log of all passed/failed tests will be written
    • if a test run is aborted via SIGINT/CTRL+C no log will be stored
  • when --filter-status is activated it picks the most recent log in .bats/run-logs
    • if no log does exist, we assume a first run and will run all tests (subject to filtering/file selection)
    • if the log exist, run all tests of the specified status in the last run (+missed tests: tests that were not encountered in the last run e.g., due to renaming/adding them)

TODO:

@martin-schulze-vireso martin-schulze-vireso requested a review from a team as a code owner August 18, 2021 10:10
@martin-schulze-vireso martin-schulze-vireso added this to the 1.6.0 milestone Nov 10, 2021
@martin-schulze-vireso
Copy link
Member Author

I dislike two things about the current solution:

  1. if we abort the recording mid run, we won't catch all failing tests, which will omit them from reruns.
  2. we might want to lock in a set of failing tests and only run this subset without narrowing down further. This can happen when we change something that affects multiple tests.

I've got the feeling that there is a more minimal/modular solution lurking under the surface: Splitting the recording and "replay" functionality into separate flags would allow for 2.

Similarly, it might be a good idea to record all tests that ran, and save their state as failing/passing, then a replay run can skip all passed tests and run those that failed or were not run.

@cyphar
Copy link
Contributor

cyphar commented Mar 17, 2022

I think it might be a good idea to implement it as a list of test names with the pass and fail state. That would be much simpler than splitting the feature into two options and would solve the problem of the test run being cancelled (or the problem of new tests being added between --rerun-failed runs).

However, the file creation logic and lifetime is a little bit worrisome. You can't use --rerun-failed without having to specify it the first time (which is not a great UX), but at the same time this means if you wanted to start from scratch you need to know to delete the .bats-* file because in order to record the file you need to pass the option but if you pass the option and the file exists it won't write a new file from scratch. The fact it's a hidden file (and one that most users would add to .gitignore) makes this even more of an issue.

Ideally we would always record the test results and then when --rerun-failed is used, we read that file. It might also be nice for --rerun-failed to update the test results file but I'm not sure if that's always something that a user would want to do (maybe they want to re-run a set of flaky tests multiple times). I'm also a little worried about creating such a file in the test directory -- if it accidentally gets checked into version control then that's just asking for trouble. But I guess we can't put it in ~/.cache/bats because then we'd have to figure out a way to represent the path in a nice way.

Also IMHO the option name is somewhat confusing (are we retrying a test immediately after it failed?). --run-last-failed or something might be better, but I'm just spitballing.

@martin-schulze-vireso
Copy link
Member Author

I think it might be a good idea to implement it as a list of test names with the pass and fail state. That would be much simpler than splitting the feature into two options and would solve the problem of the test run being cancelled (or the problem of new tests being added between --rerun-failed runs).

Yes, I was thinking about this too. The current state just was an implementation shortcut because the format already works with the internal test list. We also would have to make sure that the test names are parsable from on line.

However, the file creation logic and lifetime is a little bit worrisome. You can't use --rerun-failed without having to specify it the first time (which is not a great UX), but at the same time this means if you wanted to start from scratch you need to know to delete the .bats-* file because in order to record the file you need to pass the option but if you pass the option and the file exists it won't write a new file from scratch. The fact it's a hidden file (and one that most users would add to .gitignore) makes this even more of an issue.

No, I'd actually like to update the file.

Ideally we would always record the test results and then when --rerun-failed is used, we read that file. It might also be nice for --rerun-failed to update the test results file but I'm not sure if that's always something that a user would want to do (maybe they want to re-run a set of flaky tests multiple times). I'm also a little worried about creating such a file in the test directory -- if it accidentally gets checked into version control then that's just asking for trouble. But I guess we can't put it in ~/.cache/bats because then we'd have to figure out a way to represent the path in a nice way.

I was torn on where to put it too. I am not sure if ~/.cache/ is available on all supported systems and as you said, once you centralize you need to pick it apart again. I looked at how other test frameworks deal with this and found the containing folder to be a common solution. I chose to use .bats/ to allow for reusing it later on for other files. The hidden nature should actually encourage people to ignore it. We might be even clearer with something like .bats-vcs-ignore?

Also IMHO the option name is somewhat confusing (are we retrying a test immediately after it failed?). --run-last-failed or something might be better, but I'm just spitballing.

I tried to avoid creating the failed test log unconditionally to not force it unto unsuspecting users with the new version. A middle ground might be to require explicit approval on first run. I am not sure if this is a feature of interest for CI systems. I could envision exporting the failed test log as an artifact for download, so you can streamline your local error search or later runs.

Anyways, my idea of --rerun-failed semantics would be to narrow down the list of failing tests incrementally, so each new complete run would capture more detail. I think solution that only overwrites with a new capture if the test suite completed would be good enough for now. This can be improved later on. However, the interface should not change.

@martin-schulze-vireso
Copy link
Member Author

martin-schulze-vireso commented Apr 5, 2022

@cyphar Some more ideas:

We could require the user to create the .bats/ to avoid confusion about whether to put it into .gitignore. If the directory does not exist, the files will be gathered in a central place, e.g. /tmp and can be moved over, once the user decides to rerun failed tests. That way, we don't lose the last run. If there are multiple test suites on the same machine, they will overwrite their results but I think that should happen seldom enough to be a real problem and it is trivially fixable by creatign .bats/ locally.

@martin-schulze-vireso
Copy link
Member Author

@cyphar After some experimenting and reading over your comment again I have the following suggestion:

  1. rename to --filter-status failed, this leaves open the possibility to filter for other results like passed or skipped or whatever
  2. only capture when there is a .bats/run-logs directory in the tests folder, which the user has to create actively
  • this means if we use --filter-status without .bats/run-logs existing, the user is instructed to create the folder and .gitignore it. They may lose the previous run but I think the complexity of catching that corner case is not worth the effort and problems with automatically generating the folder
  1. when this directory exists, each non-aborted run will capture the status of executed tests

@martin-schulze-vireso
Copy link
Member Author

martin-schulze-vireso commented Jun 15, 2022

@cyphar Please have a look on the UX again (code is obviously broken, since the tests are failing). I rewrote it with the following changes:

  • renamed flag to --filter-status failed (there is also missed and passed)
  • tests are run when they have the according status or were not encountered in the last run
  • logs are created under .bats/run-logs/<timestamp>.log, if the folder does not exist, no log is created
  • each test line in the log is either passed, failed or status-filtered (to show which tests were already filtered in the last run; they should be filtered again if the filter reason was the same)

You can also have a look at the tests to see how it is intended to be used.

@martin-schulze-vireso martin-schulze-vireso changed the title Add --rerun-failed for recording and subsequently running only failed tests Add --filter-status failed for recording and subsequently running only failed tests Jul 3, 2022
@martin-schulze-vireso martin-schulze-vireso changed the title Add --filter-status failed for recording and subsequently running only failed tests Add --filter-status failed for running only failed tests from the last complete run Jul 3, 2022
@martin-schulze-vireso martin-schulze-vireso changed the title Add --filter-status failed for running only failed tests from the last complete run Add --filter-status failed for running only failed tests from the last completed run Jul 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Bash Code Everything regarding the bash code Type: Enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants