New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improving auto correction's API #7868
Conversation
The new PS: I think I have an even cleaner proposal. Will reopen when updated. |
Sounds good to me! Sorry for the radio silence, but I didn't find the time to take a careful look at your proposal. Looking forward to the improved version! |
Updated the code and tweaked the API. I removed the syntax sugar part as it conflicts with the strict rules of RuboCop (no multiple block chaining). I believe it's not actually necessary and both the corrections and the code-to-execute-only-if-enabled can be in the same block. The revised API is thus simpler. I never spelled out the downside of my proposal: it's potentially slower. For example, if RuboCop is run without the |
I'll try to find some time soon to review your proposal more carefully. In the mean time it'd be great if you ran some benchmarks regarding the performance impact. |
Did you have anything in mind? I probably should not have mentioned the theoretical performance impact, because this PR affects only the correction part, which is typically negligible to the inspection part. I think the performance impact will be undetectable: I ran the test suite before and after the changes and got similar speeds (1m12 before vs 1m7 after, no I don't think it's noticeably faster either) |
Question remaining to be answered: Is compatiblity with |
Actually 1.0 is coming up real soon, which is great timing. I propose to include my I will try to finalize this PR (although the proposed API will remain unchanged) and think it would make a great addition to the 1.0 release. |
I don't think so. I can't even remember why we introduced this. Maybe @koic or @jonas054 do. I went over your proposal more carefully and I like it quite a lot overall. I guess we'll gradually be shifting the auto-corrections to the new API as times goes on. The problem with big codebases is that after some point in time it gets really hard to keep the uniform - e.g. we never migrated all the code to the NodePattern, after it was introduced, we never adopted everywhere all the node extensions and so on. |
Well, the It's going to be fun de refactor the existing cops to use the new API 😄; I'll focus first on cleaning up the |
I've made some progress on this. What I've done is modified I have not yet changed the When I'm satisfied with this, the last step would be to have those use the modern API and add warning to the internal legacy API (e.g. There's basically one spec that fails right now that has to do with error handling. Currently, if an error occurs when a Cop analyses a file, it is handled differently than if an error occurs when a Cop autocorrects it. In the former case, it logs the error or reraises it depending on the Does anyone know the reason for this difference in behavior? Is there a potential issue in having the same behavior (i.e. raise only if the |
Likely that's just some oversight.
Probably we can simplify things, as I don't see much reason for having different approaches. We'll also have to document all of this in the development docs, which are pretty light on such details right now. That's all we have on implementing auto-correction https://docs.rubocop.org/en/latest/development/#auto-correct I guess everyone just learns the ropes from other cops. :-) |
Great, thanks for the answer.
Right. I'll update and improve the docs.
Or trying things out to see if it passes or not, like this cop passing |
@kittiphophum18 your comment appears to be a quote of my entire message (I removed it for brevity) |
bce0844
to
6be48c1
Compare
Follow rubocop/rubocop#7868. This PR uses `Cop::Base` API for `Minitest` department's cops.
Follow rubocop/rubocop#7868. The legacy `Cop::Cop` API is soft deprecated and this PR use new `Cop::Base` API instead. > maintain any RuboCop extensions, as the legacy API will be removed in > RuboCop 2.0. https://metaredux.com/posts/2020/10/21/rubocop-1-0.html This new API requires RuboCop 0.87 or higher. https://github.com/rubocop/rubocop/blob/master/CHANGELOG.md#0870-2020-07-06 Denpendency version of `s.add_dependency "rubocop"` does not need to be updated in the gemspec if github#86 has been merged first.
Follow rubocop#7868. This PR uses `Cop::Base` API for `Layout/RedundantLineBreak`.
Follow #7868. This PR uses `Cop::Base` API for `Layout/RedundantLineBreak`.
Follow rubocop/rubocop#7868. The legacy `Cop::Cop` API is soft deprecated and this PR use new `Cop::Base` API instead. > maintain any RuboCop extensions, as the legacy API will be removed in RuboCop 2.0. https://metaredux.com/posts/2020/10/21/rubocop-1-0.html This new API requires RuboCop 0.87 or higher. https://github.com/rubocop/rubocop/blob/master/CHANGELOG.md#0870-2020-07-06 That dependent version update will be resolved in rubocop#7, so there are no updates in this PR.
Follow rubocop/rubocop#7868. The legacy `Cop::Cop` API is soft deprecated and this PR use new `Cop::Base` API instead. > maintain any RuboCop extensions, as the legacy API will be removed in RuboCop 2.0. https://metaredux.com/posts/2020/10/21/rubocop-1-0.html This new API requires RuboCop 0.87 or higher. https://github.com/rubocop/rubocop/blob/master/CHANGELOG.md#0870-2020-07-06 That dependent version update will be resolved in rubocop#7, so there are no updates in this PR.
Follow rubocop/rubocop#7868. The legacy `Cop::Cop` API is soft deprecated and this PR use new `Cop::Base` API instead. > maintain any RuboCop extensions, as the legacy API will be removed in > RuboCop 2.0. https://metaredux.com/posts/2020/10/21/rubocop-1-0.html This new API requires RuboCop 0.87 or higher. https://github.com/rubocop/rubocop/blob/master/CHANGELOG.md#0870-2020-07-06 Denpendency version of `s.add_dependency "rubocop"` does not need to be updated in the gemspec if github#86 has been merged first.
This PR fixes the following build error. ```console Failures: 1) RuboCop::Cop::Performance::ChainArrayAllocation Methods that require an argument first Failure/Error: expect(cop.messages.empty?).to be(false) expected false got true # ./spec/rubocop/cop/performance/chain_array_allocation_spec.rb:39:in `block (3 levels) in <top (required)>' ``` https://circleci.com/gh/rubocop-hq/rubocop-performance/1775 rubocop/rubocop#7868 has detected this wrong test case and this PR fixed it.
Follow rubocop/rubocop#7868. This PR uses `Cop::Base` API for `Performance` department's cops.
This PR fixes the following build error. ```console Failures: 1) RuboCop::Cop::Performance::ChainArrayAllocation Methods that require an argument first Failure/Error: expect(cop.messages.empty?).to be(false) expected false got true # ./spec/rubocop/cop/performance/chain_array_allocation_spec.rb:39:in `block (3 levels) in <top (required)>' ``` https://circleci.com/gh/rubocop-hq/rubocop-performance/1775 rubocop/rubocop#7868 has detected this wrong test case and this PR fixed it.
Follow rubocop/rubocop#7868. This PR uses `Cop::Base` API for `Performance` department's cops.
The autocorrection API was changed in Rubocop v0.87.0 (pull request rubocop/rubocop#7868). Here, I change the superclass of `RuboCop::Cop::RSpec::Cop` from `::RuboCop::Cop::Cop` to `::RuboCop::Cop::Base`. This has a few consequences: - Our `#message` methods get called with a different argument than before. It *can* be customized by defining a `#callback_argument` method, but I think it is clearer to avoid callbacks altogether. - Our `#autocorrect` methods don't get called anymore. Instead, we extend `Autocorrector`, and the `corrector` is being yielded when calling `#add_offense`, so the code is mostly moved in there. For some cases, this means that some code can be removed, which is nice. For some cases, it means that the methods get too long, or the code complexity gets too high, and in those cases I chose to just call out to an `#autocorrect` method anyway, but of course passing the `corrector` and any usable local variables along. This also means we bump the dependency of RuboCop quite a bit, from '>= 0.68.1' to '>= 0.87.0'.
Follow rubocop/rubocop#7868. The legacy `Cop::Cop` API is soft deprecated and this PR use new `Cop::Base` API instead. > maintain any RuboCop extensions, as the legacy API will be removed in > RuboCop 2.0. https://metaredux.com/posts/2020/10/21/rubocop-1-0.html
Follow rubocop/rubocop#7868. The legacy `RuboCop::Cop::Cop` API is soft deprecated and this PR makes custom cop inherit `RuboCop::Cop::Base` API instead. > maintain any RuboCop extensions, as the legacy API > will be removed in RuboCop 2.0. https://metaredux.com/posts/2020/10/21/rubocop-1-0.html I've confirmed that the following behavior is compatible: ```console $ bundle exec util/rubocop -r ./util/cops/deprecations --only Rubygems/Deprecations ```
Follow rubocop/rubocop#7868. The legacy `Cop::Cop` API is soft deprecated and this PR use new `Cop::Base` API instead. > maintain any RuboCop extensions, as the legacy API will be removed in > RuboCop 2.0. https://metaredux.com/posts/2020/10/21/rubocop-1-0.html
The autocorrection API was changed in Rubocop v0.87.0 (pull request rubocop/rubocop#7868). Here, I change the superclass of `RuboCop::Cop::RSpec::Cop` from `::RuboCop::Cop::Cop` to `::RuboCop::Cop::Base`. This has a few consequences: - Our `#message` methods get called with a different argument than before. It *can* be customized by defining a `#callback_argument` method, but I think it is clearer to avoid callbacks altogether. - Our `#autocorrect` methods don't get called anymore. Instead, we extend `Autocorrector`, and the `corrector` is being yielded when calling `#add_offense`, so the code is mostly moved in there. For some cases, this means that some code can be removed, which is nice. For some cases, it means that the methods get too long, or the code complexity gets too high, and in those cases I chose to just call out to an `#autocorrect` method anyway, but of course passing the `corrector` and any usable local variables along. This also means we bump the dependency of RuboCop quite a bit, from '>= 0.68.1' to '>= 0.87.0'.
This PR fixes the following build error. ```console Failures: 1) RuboCop::Cop::Performance::ChainArrayAllocation Methods that require an argument first Failure/Error: expect(cop.messages.empty?).to be(false) expected false got true # ./spec/rubocop/cop/performance/chain_array_allocation_spec.rb:39:in `block (3 levels) in <top (required)>' ``` https://circleci.com/gh/rubocop-hq/rubocop-performance/1775 rubocop/rubocop#7868 has detected this wrong test case and this PR fixed it.
Follow rubocop/rubocop#7868. This PR uses `Cop::Base` API for `Performance` department's cops.
This PR fixes the following build error. ```console Failures: 1) RuboCop::Cop::Performance::ChainArrayAllocation Methods that require an argument first Failure/Error: expect(cop.messages.empty?).to be(false) expected false got true # ./spec/rubocop/cop/performance/chain_array_allocation_spec.rb:39:in `block (3 levels) in <top (required)>' ``` https://circleci.com/gh/rubocop-hq/rubocop-performance/1775 rubocop/rubocop#7868 has detected this wrong test case and this PR fixed it.
Follow rubocop/rubocop#7868. This PR uses `Cop::Base` API for `Performance` department's cops.
The autocorrection API was changed in Rubocop v0.87.0 (pull request rubocop/rubocop#7868). Here, I change the superclass of `RuboCop::Cop::RSpec::Cop` from `::RuboCop::Cop::Cop` to `::RuboCop::Cop::Base`. This has a few consequences: - Our `#message` methods get called with a different argument than before. It *can* be customized by defining a `#callback_argument` method, but I think it is clearer to avoid callbacks altogether. - Our `#autocorrect` methods don't get called anymore. Instead, we extend `Autocorrector`, and the `corrector` is being yielded when calling `#add_offense`, so the code is mostly moved in there. For some cases, this means that some code can be removed, which is nice. For some cases, it means that the methods get too long, or the code complexity gets too high, and in those cases I chose to just call out to an `#autocorrect` method anyway, but of course passing the `corrector` and any usable local variables along. This also means we bump the dependency of RuboCop quite a bit, from '>= 0.68.1' to '>= 0.87.0'.
The autocorrection API was changed in Rubocop v0.87.0 (pull request rubocop/rubocop#7868). Here, I change the superclass of `RuboCop::Cop::RSpec::Cop` from `::RuboCop::Cop::Cop` to `::RuboCop::Cop::Base`. This has a few consequences: - Our `#message` methods get called with a different argument than before. It *can* be customized by defining a `#callback_argument` method, but I think it is clearer to avoid callbacks altogether. - Our `#autocorrect` methods don't get called anymore. Instead, we extend `Autocorrector`, and the `corrector` is being yielded when calling `#add_offense`, so the code is mostly moved in there. For some cases, this means that some code can be removed, which is nice. For some cases, it means that the methods get too long, or the code complexity gets too high, and in those cases I chose to just call out to an `#autocorrect` method anyway, but of course passing the `corrector` and any usable local variables along. This also means we bump the dependency of RuboCop quite a bit, from '>= 0.68.1' to '>= 0.87.0'.
I am writing my first cop and I feel the current
autocorrect
API could be improved.This PR is just a demo and not meant for merging. A few cops were refactored using the proposed API.
When to specify corrections
My main observation is that the very best moment to do an autocorrection is when the offense is detected. Currently the autocorrection and the offense detection are (officially) decoupled. When an offense is detected, we know exactly what it is (location, type, etc.) yet we are unable to make the correction there and then.
When we are later asked to make the autocorrection, we are passed the offending "node". All the state that we might have created to analyze the offense has to be rebuilt at that point, and if we detect different style of offenses, we even have to deduct again which offense this is and correct accordingly.
Current workarounds
Some cops simply redo the detection work twice
Some cops resort to "cheats" for the current state, like AccessModifierIndentation temporarily storing
@column_delta
. This works only provided that eachautocorrect
calls are made after eachadd_offense
. Also, a naive approach wouldn't work reliably:Overly complex de-dupping
The processing currently has two de-duping being done.
Firstly, extra offenses on the same range are ignored. I can see how this could be useful for certain cops.
Secondly, if offenses are on different ranges but the same node, the autocorrector is called only for the first one. The only builtin cop that uses this one is
SpaceInsideReferenceBrackets
: it will add one or two offenses on the opening'['
and/or closing']'
, and autocorrect will correct the node (detecting again which of the opening or closing brackets need to be corrected) and relies on the dedupping not to double-correct if both the opening and closing brackets were wrong. Clearly, this is an example where this de-dupping isn't needed but instead the correction should be specifiable when detected...That dedupping is the only reason I can see why the API of
add_offense
officially asks for aNode
and optionally alocation
. There's no good reason why an offenses would be attached to a node rather than a range. Indeed a few pass a range instead (like EmptyLineAfterMagicComment) and cheat by repeatinglocation: <same location>
.I propose that the newer API simply asks for a range (or node/comment and uses its
loc.expression
) and dedupes only on the range.Alternative to returning lambdas / false
Finally I personally dislike returning lambdas. Ruby makes it trivial to capture blocks which feels to me a much more natural way to proceed.
It also wasn't clear at first that
autocorrect
can also returnfalse
when no correction can be made. This complicates processing a bit... Before you want to autocorrect, you have to check that you can autocorrect.Moreover, there is also redundancy and it permits a cop writer to return a lambda that nevertheless wouldn't make any correction.
This trouble should be avoided and deduced automatically. Technically the issue is that we would need a way to build the corrections, check if there are any, and only later merge them.
In this "demo" I use a hack. I simply run the block twice, first on a fake corrector that simply records if it is called and the usual processing will call it later on Parser's
TreeRewriter
. The plan would be not to use a hack but I would instead add the merging functionality intoTreeRewriter
(I am its author btw). Before I spend time doing it though, I'd like to see if an agreement can reached on theadd_offense
API.Proposed API - simple case
I would propose an improved (and fully compatible) API.
Post processing
One complication is that
add_offense
already yields (unless the offense is disabled) for some rare cases where post processing is needed. I noticed that 5% of the builtin cops use this post processing vs 45% that do autocorrections; most with post processing also have autocorrection. There are many solutions to this, but the simple case is simply to use the same block:Compatiblity
What I'm proposing should be compatible with existing cops.
The new API is opt-in for cops that decide to
include Autocorrector
. I'm proposing that thelocation:
key isn't supported in the new API (pass the range directly instead of the node in those case).It's not clear to what extent compatibility is required, in particular with
Cop#corrections
array. I wrote aCorrectionsProxy
so thatcorrections
could still be used. Actually the rest of the code still uses it for now (Team
, ...), this seemed like a good way to test compatiblity. I intend on cleaning things up though so that the core doesn't useCop#corrections
...