Performance. RSpec/NestedGroups #950

andrykonchin · 2020-06-28T23:25:27Z

Optimized performance of RSpec/NestedGroups and #on_top_level_describe callback.

Changes

Avoid excessive recursive iteration over nested example groups
Fix an error in calculating the nesting count

Performance measurements

The share of time spent in RSpec/NestedGroups dropped from 14.3% to 3.3% (measured for the RuboCop::RSpec::TopLevelDescribe#on_send callback).

Before

stackprof tmp/stackprof-cpu-gitlab.master.with-rubocop-rspec.NestedGroups.dump --method 'RuboCop::RSpec::TopLevelDescribe#on_send'
RuboCop::RSpec::TopLevelDescribe#on_send (/Users/andrykonchin/projects/rubocop-rspec/lib/rubocop/rspec/top_level_describe.rb:9)
  samples:   130 self (0.2%)  /   8732 total (16.6%)
  callers:
    8732  (  100.0%)  RuboCop::Cop::Commissioner#trigger_responding_cops
  callees (8602 total):
    8446  (   98.2%)  RuboCop::Cop::RSpec::NestedGroups#on_top_level_describe
     155  (    1.8%)  RuboCop::RSpec::TopLevelDescribe#top_level_describe?
       1  (    0.0%)  RuboCop::AST::MethodDispatchNode#arguments
  code:
                                  |     9  |       def on_send(node)
  127    (0.2%) /   127   (0.2%)  |    10  |         return unless respond_to?(:on_top_level_describe)
  158    (0.3%) /     3   (0.0%)  |    11  |         return unless top_level_describe?(node)
                                  |    12  |
 8447   (16.1%)                   |    13  |         on_top_level_describe(node, node.arguments)
                                  |    14  |       end

After

stackprof tmp/stackprof-cpu-gitlab.master.with-rubocop-rspec.NestedGroups.4.dump --method 'RuboCop::RSpec::TopLevelDescribe#on_send'
RuboCop::RSpec::TopLevelDescribe#on_send (/Users/andrykonchin/projects/rubocop-rspec/lib/rubocop/rspec/top_level_describe.rb:9)
  samples:    49 self (0.2%)  /    715 total (3.1%)
  callers:
     715  (  100.0%)  RuboCop::Cop::Commissioner#trigger_responding_cops
  callees (666 total):
     593  (   89.0%)  RuboCop::Cop::RSpec::NestedGroups#on_top_level_describe
      73  (   11.0%)  RuboCop::RSpec::TopLevelDescribe#top_level_describe?
  code:
                                  |     9  |       def on_send(node)
   49    (0.2%) /    49   (0.2%)  |    10  |         return unless respond_to?(:on_top_level_describe)
   73    (0.3%)                   |    11  |         return unless top_level_describe?(node)
                                  |    12  |
  593    (2.6%)                   |    13  |         on_top_level_describe(node, node.arguments)
                                  |    14  |       end
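
The callee percentages in the two dumps follow directly from the sample counts. A small Ruby sketch reproduces them; the helper name `callee_share` is made up for illustration and is not part of stackprof:

```ruby
# Hypothetical helper, not part of stackprof: a callee's share of all
# callee samples, as a percentage rounded to one decimal place.
def callee_share(samples, callees_total)
  (samples * 100.0 / callees_total).round(1)
end

# Sample counts copied from the "Before" and "After" dumps above.
before = callee_share(8446, 8602) # NestedGroups#on_top_level_describe, before
after  = callee_share(593, 666)   # NestedGroups#on_top_level_describe, after

puts "before: #{before}%, after: #{after}%"
```

The two dumps come from separate runs with different total sample counts, so the raw sample numbers are best compared through their percentages rather than directly.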

Measurements approach

Used the stackprof profiler to measure the cop's share of the total run time while running RuboCop on the GitLab project specs.

Ran only one cop, without caching and with the project config skipped, using the command:

bundle exec exe/rubocop --cache false --out gitlab-specs.out --force-default-config  --require rubocop-rspec --only RSpec/NestedGroups ../rubocop-profiling-examples/gitlabhq/spec

Before submitting the PR make sure the following are checked:

Feature branch is up-to-date with master (if not - rebase it).
Squashed related commits together.
Added tests.
Updated documentation.
Added an entry to the CHANGELOG.md if the new code introduces user-observable changes.
The build (bundle exec rake) passes (be sure to run this locally, since it may produce updated documentation that you will need to commit).

pirj

Nice! TopLevelGroup is a turbo drive.

andrykonchin · 2020-06-29T09:48:05Z

Fixed all the issues

Darhazer

Thank you
After you finish, you need to add a Changelog entry listing all your optimizations as well 🚀

Darhazer · 2020-06-29T16:06:11Z

lib/rubocop/cop/rspec/nested_groups.rb


        def on_top_level_describe(node, _args)
-          find_nested_contexts(node.parent) do |context, nesting|
+          find_nested_example_groups(node.parent) do |example_group, nesting|
            self.max = nesting


I wonder if this works correctly, e.g. if there are two example groups on the first level, and the first one has a larger number of nested groups, would the max be set to that number, or to the nesting level of the last offending group?

It should work correctly, because we don't track the nesting level but calculate it every time:

    def nesting_count(node)
      count = node.each_ancestor(:block).count { |n| example_group?(n) }
      count + 1
    end

Maybe I don't understand you well. What in your opinion may work incorrectly?

Hmm, I'm not sure how this self.max value is used, but it looks like setting a value smaller than the previous one is handled correctly:

    def max=(value)
      cfg = config_to_allow_offenses
      cfg[:exclude_limit] ||= {}
      current_max = cfg[:exclude_limit][max_parameter_name]
      value = [current_max, value].max if current_max # <====
      cfg[:exclude_limit][max_parameter_name] = value
    end

https://github.com/rubocop-hq/rubocop/blob/master/lib/rubocop/cop/mixin/configurable_max.rb#L10-L16

Yes, it's used when you run --auto-gen-config
Thank you for checking it though. I was going to verify myself, as it's not in the scope of the PR but I just noticed while reviewing the code
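
To make the answer to the question above concrete, here is a minimal sketch of the "limit only grows" behaviour. `ToyCop` and its `exclude_limit` attribute are hypothetical stand-ins for the quoted `max=` logic, not RuboCop's actual classes:

```ruby
# Toy stand-in for the max= logic quoted above: the stored limit is only
# ever replaced by a larger value. ToyCop is a hypothetical name.
class ToyCop
  attr_reader :exclude_limit

  def max=(value)
    current_max = @exclude_limit
    value = [current_max, value].max if current_max
    @exclude_limit = value
  end
end

cop = ToyCop.new
# The first top-level group nests 5 deep, the second only 4 deep.
[5, 4].each { |nesting| cop.max = nesting }
puts cop.exclude_limit # the larger value is kept
```

Under these assumptions, a later offending group with a smaller nesting level does not overwrite the earlier, larger one.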

pirj

Looks great!
It's a nice surprise that it's not only a performance improvement, but a nice code cleanup as well.

pirj · 2020-07-01T18:11:30Z

lib/rubocop/cop/rspec/nested_groups.rb

+        def find_nested_example_groups(node, nesting: 1, &block)
+          yield node, nesting if example_group?(node) && nesting > max_nesting
+
+          next_nesting = example_group?(node) ? nesting + 1 : nesting


We seem to call example_group?(node) twice. WDYT of:

    if example_group?(node)
      yield node, nesting if nesting > max_nesting
      nesting = nesting + 1
    end

    node.each_child_node do |child|
      find_nested_example_groups(child, nesting: nesting, &block)
    end
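
The suggested shape can be tried outside RuboCop on a toy tree. In this sketch `ToyNode`, `MAX_NESTING`, and the simplified `example_group?` are assumptions standing in for the real AST nodes, the cop's `max_nesting`, and the real predicate:

```ruby
# Toy node type standing in for RuboCop AST nodes (an assumption for this sketch).
ToyNode = Struct.new(:name, :group, :children) do
  def each_child_node(&block)
    children.each(&block)
  end
end

MAX_NESTING = 2 # hypothetical limit, playing the role of max_nesting

def example_group?(node)
  node.group
end

# Same shape as the suggestion above: example_group? is evaluated once per node.
def find_nested_example_groups(node, nesting: 1, &block)
  if example_group?(node)
    yield node, nesting if nesting > MAX_NESTING
    nesting += 1
  end

  node.each_child_node do |child|
    find_nested_example_groups(child, nesting: nesting, &block)
  end
end

deep = ToyNode.new('context c', true, [])
mid  = ToyNode.new('context b', true, [deep])
body = ToyNode.new('begin', false, [mid]) # non-group wrapper, nesting unchanged
top  = ToyNode.new('describe a', true, [body])

offenses = []
find_nested_example_groups(top) { |node, nesting| offenses << [node.name, nesting] }
p offenses
```

Only the third-level group is reported, because the non-group wrapper node passes the current nesting through unchanged.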

pirj · 2020-07-01T18:18:45Z

lib/rubocop/cop/rspec/nested_groups.rb

-            nested_context.each_child_node do |child|
-              find_nested_contexts(child, nesting: nesting + 1, &block)
-            end
+          node.each_child_node do |child|


It might be possible to squeeze more out of this with node.each_descendant(:block).
Which one would perform better?

Filtering by node type looks reasonable (but each_descendant isn't suitable here, because it is recursive and we need only direct children).

There is one more issue, though. We should handle not only block nodes but at least begin nodes as well: if there are several nested contexts, they are wrapped in a begin node. TBH I am not sure whether there are other edge cases (e.g. kwbegin) or whether we could just filter block and begin nodes:

node.each_child_node(:block, :begin) do |child|

It has an impressive impact: it decreases the cop's share from 3.3% to 1.8%.

There is no difference between the offenses generated with and without filtering (checked on GitLab specs), but I still hesitate.

I'm a bit lost. What change reduced the share from 3.3% to 1.8%?

I meant that the additional filtering of block and begin nodes (what you proposed) decreased the cop's time from 3.3% to 1.8%.

ping @Darhazer @bquorning What do you think? Is it correct to filter nodes here?

Yeah, I think using node.each_child_node(:block, :begin) is fine. If we are missing some edge cases, we can fix them as bugs. Provided of course that anyone will discover them.

Thank you. Done.
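
The reason begin nodes have to be included can be shown with a toy tree. `TypedNode` and its type-filtering `each_child_node` are only loosely modelled on the call discussed above and are not the rubocop-ast implementation:

```ruby
# Toy node with a type-filtering each_child_node. This is an illustration
# under stated assumptions, not the rubocop-ast implementation.
TypedNode = Struct.new(:type, :children) do
  def each_child_node(*types)
    children.each do |child|
      yield child if types.empty? || types.include?(child.type)
    end
  end
end

# Body of a group holding two nested contexts: the sibling statements
# share one :begin wrapper, as described in the thread.
wrapper = TypedNode.new(:begin, [
  TypedNode.new(:block, []), # context 'one' do ... end
  TypedNode.new(:block, [])  # context 'two' do ... end
])
# A block node's children here: the method call, its arguments, the body.
group = TypedNode.new(:block, [TypedNode.new(:send, []), TypedNode.new(:args, []), wrapper])

with_begin = []
group.each_child_node(:block, :begin) { |child| with_begin << child.type }

only_blocks = []
group.each_child_node(:block) { |child| only_blocks << child.type }

p with_begin
p only_blocks
```

In this sketch, filtering by :block alone never reaches the wrapper, so the two nested contexts inside it would not be visited.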

pirj

For me, it's more than good enough.

Trust you completely with what you decide to settle with.
Thanks again for the thorough and thoughtful work!

Darhazer

You can squash the commits

andrykonchin · 2020-07-02T18:33:36Z

Done

bquorning · 2020-07-03T07:31:01Z

Hey @andrykonchin. I just wanted to let you know that I tried running

time bundle exec rubocop --cache false --only RSpec/DescribeClass,FactoryBot/AttributeDefinedStatically,RSpec/InstanceVariable,RSpec/LeakyConstantDeclaration,RSpec/LetSetup,RSpec/NestedGroups,RSpec/ReturnFromStub,RSpec/SubjectStub -- spec

on the ~1400 spec files in my main work repository (--only running the cops that have changed between v1.40 and master branch).

Time spent using rubocop-rspec v1.40.0: 1:20.71 total
Time spent using rubocop-rspec master branch (6e1d698): 19.561 total

The time improvement is absolutely incredible, much better than I had hoped for. Thank you so much ❤️

bquorning · 2020-07-03T07:34:22Z

Ahh, sorry. I was measuring the difference between 1.39.0 and master, not 1.40.0. But the speed improvement (probably in SubjectStub) is still your work.

The runtime using v1.40.0 is “26.552 total” on my 1400 files. Still a significant improvement to master branch though.
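
For reference, the speedups implied by the three totals can be worked out as below. The variable names are illustrative, and each figure comes from a single `time` run quoted in this thread, so they are rough indications rather than benchmarks:

```ruby
# Wall-clock totals quoted above, converted to seconds.
v1_39  = 1 * 60 + 20.71 # "1:20.71 total", later corrected to be v1.39.0
v1_40  = 26.552         # "26.552 total"
master = 19.561         # "19.561 total" at 6e1d698

speedup_from_1_39 = (v1_39 / master).round(2)
speedup_from_1_40 = (v1_40 / master).round(2)

puts "1.39.0 -> master: #{speedup_from_1_39}x"
puts "1.40.0 -> master: #{speedup_from_1_40}x"
```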

andrykonchin · 2020-07-03T22:51:38Z

Yeah, the SubjectStub cop optimization was definitely a big win. All the other recent optimizations should have fixed most of the inefficient RSpec cops. So now every RSpec cop takes no more than 1%, at least for the GitLab test suite:

stackprof tmp/stackprof-cpu-gitlab.master.with-rspec.test.fix.dump  --method 'RuboCop::Cop::Commissioner#trigger_responding_cops' | grep RSpec
    1737  (    1.9%)  RuboCop::RSpec::TopLevelDescribe#on_send
    1120  (    1.2%)  RuboCop::RSpec::TopLevelGroup#on_block
     907  (    1.0%)  RuboCop::Cop::RSpec::DescribedClass#on_block
     805  (    0.9%)  RuboCop::Cop::RSpec::MultipleSubjects#on_block
     790  (    0.9%)  RuboCop::Cop::RSpec::RepeatedDescription#on_block
     721  (    0.8%)  RuboCop::Cop::RSpec::RepeatedExample#on_block
     699  (    0.8%)  RuboCop::Cop::RSpec::MultipleExpectations#on_block
     658  (    0.7%)  RuboCop::RSpec::TopLevelGroup#on_block
     592  (    0.6%)  RuboCop::Cop::RSpec::ScatteredSetup#on_block
...

pirj · 2020-07-03T23:02:39Z

I might be reading it wrong, does it appear twice?

    1120  (    1.2%)  RuboCop::RSpec::TopLevelGroup#on_block
     658  (    0.7%)  RuboCop::RSpec::TopLevelGroup#on_block

andrykonchin · 2020-07-03T23:08:51Z

That's how stackprof shows different callers. TopLevelGroup is used in two cops, InstanceVariable and SubjectStub:

$ stackprof tmp/stackprof-cpu-gitlab.master.with-rspec.test.fix.dump  --method 'RuboCop::RSpec::TopLevelGroup#on_block'

RuboCop::RSpec::TopLevelGroup#on_block (/Users/andrykonchin/projects/rubocop-rspec/lib/rubocop/rspec/top_level_group.rb:13)
  samples:    42 self (0.0%)  /    658 total (0.4%)
  callers:
     658  (  100.0%)  RuboCop::Cop::Commissioner#trigger_responding_cops
  callees (616 total):
     446  (   72.4%)  RuboCop::Cop::RSpec::InstanceVariable#on_top_level_group
     170  (   27.6%)  RuboCop::RSpec::TopLevelGroup#top_level_group?
  code:
                                  |    13  |       def on_block(node)
   40    (0.0%) /    40   (0.0%)  |    14  |         return unless respond_to?(:on_top_level_group)
  172    (0.1%) /     2   (0.0%)  |    15  |         return unless top_level_group?(node)
                                  |    16  |
  446    (0.3%)                   |    17  |         on_top_level_group(node)
                                  |    18  |       end

RuboCop::RSpec::TopLevelGroup#on_block (/Users/andrykonchin/projects/rubocop-rspec/lib/rubocop/rspec/top_level_group.rb:13)
  samples:    30 self (0.0%)  /   1120 total (0.7%)
  callers:
    1120  (  100.0%)  RuboCop::Cop::Commissioner#trigger_responding_cops
  callees (1090 total):
     968  (   88.8%)  RuboCop::Cop::RSpec::SubjectStub#on_top_level_group
     122  (   11.2%)  RuboCop::RSpec::TopLevelGroup#top_level_group?
  code:
                                  |    13  |       def on_block(node)
   27    (0.0%) /    27   (0.0%)  |    14  |         return unless respond_to?(:on_top_level_group)
  125    (0.1%) /     3   (0.0%)  |    15  |         return unless top_level_group?(node)
                                  |    16  |
  968    (0.6%)                   |    17  |         on_top_level_group(node)
                                  |    18  |       end

andrykonchin · 2020-07-03T23:12:22Z

Hmm... So it looks like this statistic isn't aggregated the way I thought 😓 :

    1737  (    1.9%)  RuboCop::RSpec::TopLevelDescribe#on_send
    1120  (    1.2%)  RuboCop::RSpec::TopLevelGroup#on_block
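
Since the report lists a shared mixin method once per cop, the per-method totals have to be summed by hand. A minimal sketch of that aggregation over a few of the lines quoted earlier (the parsing regex is an assumption about this particular output layout, not a stackprof feature):

```ruby
# Lines copied from the grep output quoted earlier in the thread.
report = <<~OUT
  1737  (    1.9%)  RuboCop::RSpec::TopLevelDescribe#on_send
  1120  (    1.2%)  RuboCop::RSpec::TopLevelGroup#on_block
   907  (    1.0%)  RuboCop::Cop::RSpec::DescribedClass#on_block
   658  (    0.7%)  RuboCop::RSpec::TopLevelGroup#on_block
OUT

# Sum sample counts per method name; lines that do not match are skipped.
totals = Hash.new(0)
report.each_line do |line|
  samples, method_name = line.match(/\A\s*(\d+)\s+\(.*?\)\s+(\S+)/)&.captures
  next unless samples

  totals[method_name] += samples.to_i
end

totals.sort_by { |_, samples| -samples }.each do |name, samples|
  puts format('%6d  %s', samples, name)
end
```

With the two TopLevelGroup#on_block entries combined (1120 + 658 = 1778 samples), that mixin ends up slightly ahead of TopLevelDescribe#on_send in this excerpt.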

andrykonchin force-pushed the optimize-performance-nested-groups branch from 07b8610 to 12a45e7 Compare June 28, 2020 23:44

andrykonchin changed the title ~~Performance. RSpec/NestedGroups Optimize #on_top_level_describe callback~~ Performance. RSpec/NestedGroups Jun 29, 2020

pirj approved these changes Jun 29, 2020

pirj assigned bquorning and Darhazer and unassigned bquorning and Darhazer Jun 29, 2020

pirj requested review from bquorning and Darhazer June 29, 2020 07:33

andrykonchin force-pushed the optimize-performance-nested-groups branch from 8a2a365 to 397ecb3 Compare June 29, 2020 12:28

andrykonchin requested a review from pirj June 29, 2020 14:39

Darhazer approved these changes Jun 29, 2020

Darhazer assigned bquorning Jun 30, 2020

pirj approved these changes Jul 1, 2020

andrykonchin requested review from Darhazer and pirj July 1, 2020 17:54

pirj reviewed Jul 1, 2020

andrykonchin requested a review from pirj July 1, 2020 18:45

pirj approved these changes Jul 2, 2020

bquorning reviewed Jul 2, 2020

bquorning approved these changes Jul 2, 2020

Darhazer approved these changes Jul 2, 2020

RSpec/NestedGroups Optimize #on_top_level_describe callback

85b0399

andrykonchin force-pushed the optimize-performance-nested-groups branch from 2af9a29 to 85b0399 Compare July 2, 2020 18:29

bquorning merged commit 5551988 into rubocop:master Jul 2, 2020

andrykonchin deleted the optimize-performance-nested-groups branch July 2, 2020 19:22
