New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Minor optimizations #8105
Conversation
line_numbers.uniq | ||
end | ||
|
||
def _each_descendant(node, *types, &block) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are this and the next method used at all?
@@ -142,10 +148,15 @@ def find_location(node, loc) | |||
loc.is_a?(Symbol) ? node.loc.public_send(loc) : loc | |||
end | |||
|
|||
# TODO: remove - it's unused |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cop::Cop
is being heavily refactored, all these changes will conflict, but thanks for reminding me to optimize duplicate_location?
as Source::Range
can now be used as hash keys / set elements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcandre This optimization can significantly decrease execution time in some cases.
There is a large rubocop.todo.yml
in one of my projects. If run Rubocop without config (so there are thousands of errors) this optimization decrease the time from 20 min to 5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this was a bad implementation (O(n^2)
where n
is the number of offenses).
You can try with def duplicate_location?(range); !@current_offense_locations.add?(range); end
and adding @current_offense_locations = Set.new
in the constructor, if you want, you'll also be in the 5 minutes range... This is the implementation I'm using in my branch #7868
@@ -213,7 +213,7 @@ def excess_range(uri_range, line, line_index) | |||
end | |||
|
|||
def max | |||
cop_config['Max'] | |||
@max ||= cop_config['Max'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I highly doubt this makes much of a difference, as cop_config
is already cached (as it should) and the rest is just a hash lookup with a frozen string literal which is highly optimized.
@@ -0,0 +1,17 @@ | |||
module RuboCop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, building a tokens index should make a good difference. I'm curious if we really need to use tokens though. If we do, the best would be to share the cache with a Force
, even though they haven't really be designed for this.
5-10% are not little. :-) |
@marcandre Thank you for your review. Will come back to this PR soon. |
Addresses the issue #8022
NOTE I didn't analyze the optimizations deeply so some of them can be occasional and will be dropped. Some code can be too quick and straightforward so will be polished and rewritten later.
Profiled Rubocop with stackprof and ran against following Ruby projects:
Rubocop was run with disabled cache and ignored
rubocop.yml
config of selected Ruby projectsCurrent result
Despite the largest bottlenecks were optimized the general effect is not so big and total time of
rubocop
command is decreased only by 5-10%.Before submitting the PR make sure the following are checked:
[Fix #issue-number]
(if the related issue exists).master
(if not - rebase it).bundle exec rake default
. It executes all tests and RuboCop for itself, and generates the documentation.