Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Minor optimizations #8105

Closed
wants to merge 15 commits into from

Conversation

andrykonchin
Copy link
Contributor

Addresses the issue #8022

NOTE I didn't analyze the optimizations deeply so some of them can be occasional and will be dropped. Some code can be too quick and straightforward so will be polished and rewritten later.

Profiled Rubocop with stackprof and ran against following Ruby projects:

  • Rails
  • Chef
  • Puppet
  • aws-ruby-sdk
  • Rubocop

Rubocop was run with disabled cache and ignored rubocop.yml config of selected Ruby projects

exe/rubocop --force-default-config --cache false --out rubocop.out <DIR>

Current result

Despite the largest bottlenecks were optimized the general effect is not so big and total time of rubocop command is decreased only by 5-10%.


Before submitting the PR make sure the following are checked:

  • Wrote good commit messages.
  • Commit message starts with [Fix #issue-number] (if the related issue exists).
  • Feature branch is up-to-date with master (if not - rebase it).
  • Squashed related commits together.
  • Added tests.
  • Added an entry to the Changelog if the new code introduces user-observable changes. See changelog entry format.
  • The PR relates to only one subject with a clear title and description in grammatically correct, complete sentences.
  • Run bundle exec rake default. It executes all tests and RuboCop for itself, and generates the documentation.

@andrykonchin andrykonchin marked this pull request as draft June 6, 2020 22:57
@andrykonchin andrykonchin changed the title [DRAFT][WIP] Minor optimizations [WIP] Minor optimizations Jun 6, 2020
line_numbers.uniq
end

def _each_descendant(node, *types, &block)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are this and the next method used at all?

@@ -142,10 +148,15 @@ def find_location(node, loc)
loc.is_a?(Symbol) ? node.loc.public_send(loc) : loc
end

# TODO: remove - it's unused
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cop::Cop is being heavily refactored, all these changes will conflict, but thanks for reminding me to optimize duplicate_location? as Source::Range can now be used as hash keys / set elements

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcandre This optimization can significantly decrease execution time in some cases.

There is a large rubocop.todo.yml in one of my projects. If run Rubocop without config (so there are thousands of errors) this optimization decrease the time from 20 min to 5.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this was a bad implementation (O(n^2) where n is the number of offenses).
You can try with def duplicate_location?(range); !@current_offense_locations.add?(range); end and adding @current_offense_locations = Set.new in the constructor, if you want, you'll also be in the 5 minutes range... This is the implementation I'm using in my branch #7868

@@ -213,7 +213,7 @@ def excess_range(uri_range, line, line_index)
end

def max
cop_config['Max']
@max ||= cop_config['Max']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I highly doubt this makes much of a difference, as cop_config is already cached (as it should) and the rest is just a hash lookup with a frozen string literal which is highly optimized.

@@ -0,0 +1,17 @@
module RuboCop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, building a tokens index should make a good difference. I'm curious if we really need to use tokens though. If we do, the best would be to share the cache with a Force, even though they haven't really be designed for this.

@bbatsov
Copy link
Collaborator

bbatsov commented Jun 7, 2020

Despite the largest bottlenecks were optimized the general effect is not so big and total time of rubocop command is decreased only by 5-10%.

5-10% are not little. :-)

@andrykonchin
Copy link
Contributor Author

@marcandre Thank you for your review. Will come back to this PR soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants