Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix test runner --fail-fast test flake #51783

Merged
merged 1 commit into from May 15, 2024

Conversation

JoeDupuis
Copy link
Contributor

@JoeDupuis JoeDupuis commented May 11, 2024

I am working on a couple of flakes that have been bugging me for a while.
This one is testing that --fail-fast forces an early return and doesn't run all the tests when running tests in parallel.

Example of a failed test run

I've ran the test in a loop to test the variability.
Running the current version 50 times, I got these results:
{"3"=>1, "5"=>36, "4"=>8, "6"=>5}

The keys are the number of tests that ran and the value is the number of times we got that result in the outer loop (I ran the test 50 times). I've inserted the script I've used to get these numbers at the end.

A short sleep eliminates all variability:
{"4"=>50}

Sleeping in a test is not ideal. It is slow and it is not a true fix, but in this particular case I believe it could make sense:
1- The test is already not perfect. It seems prone to race condition / flake
2- Benchmark show negligible impact on test performance
3- A true fix would be quite complex
4- It's not a new pattern in the code base (Rails)

Since we eliminated variability, we can lower down the number of tests from 10 to 4. Making it, in some instances, even faster than the original test!

I ran a benchmark with 4 versions:
1- "Current test" (10 tests with no sleep)
2- "With sleep" (10 tests with sleep)
3- "Short" (4 tests)
4- "Short with sleep" (4 tests with sleep)

Here are the results:

ruby 3.2.0 (2022-12-25 revision a528908271) [x86_64-linux]
Warming up --------------------------------------
             Current     1.000 i/100ms
          with sleep     1.000 i/100ms
               short     1.000 i/100ms
    short with sleep     1.000 i/100ms
Calculating -------------------------------------
             Current      1.032 (± 0.0%) i/s -      6.000 in   5.812659s
          with sleep      1.018 (± 0.0%) i/s -      6.000 in   5.897112s
               short      1.033 (± 0.0%) i/s -      6.000 in   5.807960s
    short with sleep      1.031 (± 0.0%) i/s -      6.000 in   5.822803s

Comparison:
               short:        1.0 i/s
             Current:        1.0 i/s - 1.00x  slower
    short with sleep:        1.0 i/s - 1.00x  slower
          with sleep:        1.0 i/s - 1.01x  slower

Here are the results on variability:

Current test 50 times
Time:
  0.018374   0.016918  43.072536 ( 49.452285)
count:
{"3"=>1, "5"=>36, "4"=>8, "6"=>5}
------------------------------------------------------------
With sleep 50 times
Time:
  0.024689   0.018993  43.850807 ( 53.411969)
count:
{"4"=>50}
------------------------------------------------------------
Short 50 times
Time:
  0.015890   0.014887  42.154732 ( 48.472620)
count:
{"3"=>47, "2"=>3}
------------------------------------------------------------
Short with sleep 50 times
Time:
  0.021356   0.013760  41.751833 ( 48.164752)
count:
{"2"=>50}

Benchmark Script

Here's the (rough) benchmark script I used. It's a test in the same file as the original test.

diff --git a/railties/test/application/test_runner_test.rb b/railties/test/application/test_runner_test.rb
index f30bdb2890..80565899a2 100644
--- a/railties/test/application/test_runner_test.rb
+++ b/railties/test/application/test_runner_test.rb
@@ -802,6 +802,133 @@ def test_verify_fail_fast
       assert_operator matches[1].to_i, :<, 11
     end
 
+
+    def test_flake_fix
+      require 'benchmark/ips'
+      exercise_parallelization_regardless_of_machine_core_count(with: :processes)
+
+      app_file "test/unit/parallel_test_current.rb", <<-RUBY
+        require "test_helper"
+
+        class ParallelTest < ActiveSupport::TestCase
+          def test_verify_fail_fast
+            assert false
+          end
+
+          10.times do |n|
+            define_method("test_verify_fail_fast_\#{n}") do
+              assert true
+            end
+          end
+        end
+      RUBY
+
+      app_file "test/unit/parallel_test_sleep.rb", <<-RUBY
+        require "test_helper"
+
+        class ParallelTest < ActiveSupport::TestCase
+          def test_verify_fail_fast
+            assert false
+          end
+
+          10.times do |n|
+            define_method("test_verify_fail_fast_\#{n}") do
+              sleep 0.1
+              assert true
+            end
+          end
+        end
+      RUBY
+
+      app_file "test/unit/parallel_test_sleep_short.rb", <<-RUBY
+        require "test_helper"
+
+        class ParallelTest < ActiveSupport::TestCase
+          def test_verify_fail_fast
+            assert false
+          end
+
+          4.times do |n|
+            define_method("test_verify_fail_fast_\#{n}") do
+              sleep 0.1
+              assert true
+            end
+          end
+        end
+      RUBY
+
+      app_file "test/unit/parallel_test_short.rb", <<-RUBY
+        require "test_helper"
+
+        class ParallelTest < ActiveSupport::TestCase
+          def test_verify_fail_fast
+            assert false
+          end
+
+          4.times do |n|
+            define_method("test_verify_fail_fast_\#{n}") do
+              assert true
+            end
+          end
+        end
+      RUBY
+
+      sample_size = 50
+      count = {}
+      test_output = ""
+
+
+      run = ->(filename, sample_size) do
+        count = {}
+        test_output = ""
+        Benchmark.measure {
+          sample_size.times do
+            capture(:stderr) do
+              test_output = run_test_command("#{filename} --fail-fast --seed 31992", stderr: true)
+            end
+            matches = test_output.match(/(\d+) runs, (\d+) assertions, (\d+) failures/)
+            count[matches[1]] = count[matches[1]].to_i + 1
+          end
+        }
+      end
+
+      puts "Current test #{sample_size} times"
+      time = run.call("test/unit/parallel_test_current.rb", sample_size)
+      puts "Time:", time
+      puts "count:", count
+
+      puts '------------------------------------------------------------'
+
+      puts "With sleep #{sample_size} times"
+      time = run.call("test/unit/parallel_test_sleep.rb", sample_size)
+      puts "Time:", time
+      puts "count:", count
+
+      puts '------------------------------------------------------------'
+
+      puts "Short #{sample_size} times"
+      time = run.call("test/unit/parallel_test_short.rb", sample_size)
+      puts "Time:", time
+      puts "count:", count
+
+      puts '------------------------------------------------------------'
+
+      puts "Short with sleep #{sample_size} times"
+      time = run.call("test/unit/parallel_test_sleep_short.rb", sample_size)
+      puts "Time:", time
+      puts "count:", count
+
+
+      Benchmark.ips do |x|
+        x.report("Current") { run.call("test/unit/parallel_test_current.rb", 1) }
+        x.report("with sleep") { run.call("test/unit/parallel_test_current.rb", 1) }
+        x.report("short") { run.call("test/unit/parallel_test_short.rb", 1)  }
+        x.report("short with sleep") { run.call("test/unit/parallel_test_sleep_short.rb", 1)  }
+        x.compare!
+      end
+    end
+
+
     def test_run_in_parallel_with_processes
       exercise_parallelization_regardless_of_machine_core_count(with: :processes)
 

Checklist

Before submitting the PR make sure the following are checked:

  • This Pull Request is related to one change. Unrelated changes should be opened in separate PRs.
  • Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
  • Tests are added or updated if you fix a bug or add a feature.
  • CHANGELOG files are updated for the changed libraries if there is a behavior change or additional feature. Minor bug fixes and documentation changes should not be included.

Without the sleep, the test has high variability in the amount of test
ran before stopping. With the sleep, the test is pretty stable with
negligeable impact on run time.
@rails-bot rails-bot bot added the railties label May 11, 2024
@rafaelfranca rafaelfranca merged commit b56ee17 into rails:main May 15, 2024
4 checks passed
rafaelfranca added a commit that referenced this pull request May 15, 2024
Fix test runner --fail-fast test flake
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants