"Tests failed" but none listed-- child process dying? #819

OxSon · 2021-05-18T20:03:38Z

Intermittently (about 20-30% of the time), my test pipeline fails and says Tests Failed at the end of the job, despite earlier results summary saying 0 failures. It looks like not all the tests are being run as in the summary text: "x tests, 0 failed ..." x is 4700 for successful pipelines and less for ghost-failure pipelines.

Example output for a successful pipeline:

4700 examples, 0 failures, 11 pendings
Took 149 seconds (2:29)

And a ghost-failure pipeline:

3878 examples, 0 failures, 9 pendings
Tests have failed for a parallel_test group. Use the following command to run the group again:
bundle exec rspec spec/ <SNIP: a bunch of *_spec.rb files> --seed 36611
Took 217 seconds (3:37)
Tests Failed

Sometimes more than one bundle exec rspec spec/...... group is shown at the end, and in the X examples, 0 failures section, X is even smaller.

The relevant section of our CI looks like this (this is Gitlab CI):

test:
  stage: test
  except:
    - tags
  script:
    - export PARALLEL_TEST_PROCESSORS=6
    - export PARALLEL_TEST_FIRST_IS_1=true
    - bundle exec rake parallel:setup
    - JRUBY_OPTS=--debug bundle exec parallel_rspec -n 6 ./spec --verbose

My .rspec looks like this:

--order rand
--tty
--color
--format progress
--format ParallelTests::RSpec::SummaryLogger --out tmp/spec_summary.log

I've tried using the Gitlab parallel feature but it has screwed up my code-coverage results and so I've abandoned that route. I've tried various fixes listed in other similar, closed issues as applicable with no luck. No backtrace is reported by the failing child processes.

I noticed in other issues you often suggested adding something to the runners that prints something like "im alive!" every minute... I'm not sure how to do that effectively for Gitlab CI.

I would appreciate any troubleshooting help you can provide! Thank you :)

The text was updated successfully, but these errors were encountered:

grosser · 2021-05-23T18:08:52Z

afaik what happens is that one of the processes dies
so that could be a rouge exit or abort somewhere in the tests/code or maybe it gets oomkilled
it prints the group that failed, so it must be happening in there somewhere
rerunning the failed group did not help right ?

katanaeta mentioned this issue Jul 9, 2021

require 'parallel_tests/test/runtime_logger' if ENV['RECORD_RUNTIME'] #822

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

"Tests failed" but none listed-- child process dying? #819

"Tests failed" but none listed-- child process dying? #819

OxSon commented May 18, 2021 •

edited

grosser commented May 23, 2021

"Tests failed" but none listed-- child process dying? #819

"Tests failed" but none listed-- child process dying? #819

Comments

OxSon commented May 18, 2021 • edited

grosser commented May 23, 2021

OxSon commented May 18, 2021 •

edited