Add autocorrect functionality #944

devonestes · 2022-01-31T18:30:55Z

This issue is to discuss how we can add autocorrect functionality to Credo.

What is it?

Autocorrect would enable Credo to programatically "correct" issues found in a given file
instead of only telling the user what can be improved and how to improve it. One example
would be that it would automatically change foo |> bar(baz) to bar(foo, baz) without
the user having to manually make that change if the single pipe check is enabled.

How it might work

Autocorrect would be disabled on all checks by default, so mix credo would run exactly
as it does today. Autocorrect would be implemented on a per-check basis, since some
checks (like cyclomatic complexity) cannot be autocorrected.

If a user wanted to run autocorrect on all checks that support it, they can run
mix credo --autocorrect and that will run Credo and autocorrect everything that can
be autocorrected.

A user could also enable autocorrect for certain checks by passing an argument to
that flag like mix credo --autocorrect Credo.Check.Refactor.AppendSingleItem, or
by adding autocorrect: true to their config file like so:
{Credo.Check.Consistency.ParameterPatternMatching, autocorrect: true}.

Implementation ideas

My thinking for a general sketch of an implementation would be something along
the lines of this:

We add a new autocorrect/1 callback to the Credo.Check behavior. This
callback would either accept an AST and return an AST, or accept a String and
return a String (depending on the check) which corresponds to the AST node
or fragment of the file which triggered the issue. The autocorrect/1 function
will return the "corrected" string or AST for the given check. The default for this
callback will be to simply return what it is given (a noop which does no
autocorrecting).
For all issues, this autocorrect function is called and the return values of
these functions are used as replacements for the original values in the file
or AST which triggered it.
Somehow we find a way to stitch these replacements together in the original
file 😄

Workflow ideas

To validate this approach, I'd think we would maybe want to pick 2-3 checks as a
starting point (ideally ones that can happen multiple times per file) and implement
just these 2-3. Ideally we'd pick one that is fairly straightforward, like
Credo.Check.Readability.SinglePipe, Credo.Check.Readability.StringSigils, or
Credo.Check.Readability.TrailingBlankLine, and ideally one that uses an AST
for the autocorrect and one that uses a string as the autocorrect, so we can try
both of those.

The text was updated successfully, but these errors were encountered:

rrrene · 2022-01-31T22:07:13Z

First, thank you very much for putting this together. As I wrote on Twitter, it would be amazing to have this! 🎉

I hope neither you nor anyone else reading this is discouraged by the rest of this response. Take this wall-of-text as a compliment for taking on a great endeavour.

Why has nobody done this, yet?

The biggest question from a maintainer side: Why hasn't this been developed as a plugin? Credo has the necessary plugin APIs since version 1.1 (add an --autofix switch from a plugin and add a PerformAutoFixes task to the execution pipeline).

Several people, myself included, have since tried to come up with an experimental plugin to do this. I am guessing the others were as successful as I was in bringing it to fruition.

Some people have pointed out that we weren't able to implement something like Autofix before Elixir 1.13 because we did not have Code.Fragment, but this is not the case: Credo's analysis has never worked with incomplete code thus all issues ever produced by Credo were from files with complete code. So the ability to "analyse" incomplete code does not change anything (but I might be wrong, please correct me if I misunderstood the significance of Elixir 1.13 regarding a potential Credo Autofix feature).

So what do we need?

Here's a quick rundown of the things we need to consider (off the top of my head):

Tranformer API: We need to be able to transform the AST surrounding an issue (and define what "surrounding" means). We need a declarative way to define these code transformations, i.e. to compose queries/changesets reminiscent of those in Ecto (e.g. "switch the contents of the do and else blocks and change the unless to an if.").
Elixir compatibility: We need to think about upwards/downwards compatibility for this feature/plugin. It is perfectly fine if this only works for Elixir > 1.13, but it also has to work for all newer versions, which means that the Transformer APIs have to support multiple Elixir versions in just the same way Credo.Code.Token supports different Elixir versions (where :elixir_tokenizer returns different things, but Credo.Code.Token works with all of them). Also: should we consider cases, where Credo runs with another Elixir version than the code it is analysing (in an Editor plugin for example)?
Extensibility: What are scenarios, where we could autofix something. if we knew what the user wanted? How would we control and define these cases (e.g. "I know @moduledoc is missing in these 12 modules, just put @moduledoc false everywhere").
Autofix Race Conditions: Multiple checks do their auto-fixing subsequently, which means that Check2 autofixes code after Check1, which can result in a location change for issues resulting from Check2. Re-running the complete analysis after every fix is probably not a satisfying solution here.
Issue Invalidation: One check's autofix might even fix/change another check's issue (e.g. an autofixer renaming functions/variables might fix a MaxLineLength issue).

(This list might very well be incomplete as it is a bit late and I have an early day tomorrow. Please add your thoughts in the comments below!)

Where do we go from here?

My biggest concern as a steward of this project is that this feature is a very shiny, high profile thing to work on for a weekend or a month, but if we do not nail the APIs, corner cases, upwards compatibility support etc. it will be a pain in the ?%$ to support for the coming years.

I would suggest that we work on this as a proof-of-concept plugin. If there are any features we need from Credo for this to work, we can probably add them without compromising Credo's core (e.g. if the Credo.SourceFile struct would need another field).

Just like .heex templates in Phoenix came from another, complementary project (Surface), before they were integrated into Phoenix's core, we should take the chance to validate this idea on its own, without the burden of implementing a perfect, universally workable solution on the first try.

Let me close by saying: I love this community and I would be delighted if we could work on this together. ✌️

NickNeck · 2022-02-01T20:18:09Z

I have started to playing around with the package doorgan/sourceror to reformat code that would raise issues in credo. Maybe that could give some inspiration. The repo can be found at hrzndhrn/recode. So far recode can fix single-pipes, expand aliases and ads brackets to one arity functions in pipes. The package includes also a very early approach of an credo-plugin (my first try to write a plugin for credo, starting an hour ago).

There is also an approach for a refactoring task to rename functions. But this is much more experimental as the other stuff.

devonestes · 2022-02-02T09:08:32Z

Thanks for the really good rundown of what we'll need for this. I had thought of a few of these before as well, but not all of them, so thanks for bringing this up. Here's what I have so far:

Tranformer API - I think this is well covered by the autocorrect callback in the Credo.Check behaviour. This function would be responsible for determining how a given AST node or string should be corrected based on the given data. I'm not sure we'd need to be able to compose them in a changeset-like functionality (although I must admit, that would be very cool). That idea reminds me of the delta-elixir library, but that seems to me like more than is needed for this problem since race conditions aren't a necessary problem to solve with autofixing like it is with real-time collaborative file editing (more on that later 😄).
Elixir compatibility - This seems (to me at least) to be an implementation detail to be dealt with in each autocorrect callback and not project-wide. If the implementation of a given autocorrect callback for a specific check were to depend on something that is only available in Elixir 1.13, then that would affect its ability to run. I'd imagine that there may be cases where the implementation of autocorrect would contain something like if function_exported?(m, f, a) do, and in those cases the else would just be a noop if the necessary core functions from Elixir weren't available. Ideally we would implement these in such a way that they could be used on the widest range of Elixir versions possible, but for some you really do need the information about comments that previously wasn't included in the AST if you want to make significant changes to a file without potentially losing any comments around the code being changed. Maybe it can just be clear with a message in the console or something that autocorrect behavior for certain callbacks is only available in certain versions of Elixir if someone is running autocorrect on an Elixir version that wouldn't support a given check's autocorrect function?
Extensibility - I hadn't thought about this, but I think allowing users to configure their own autocorrect functions that override the default for a given check in their .credo.exs files should be sufficient. It would still take and return the same arguments, and the ability to override the default by user configuration is already well baked into Credo.
Autofix Race Conditions - This is part of why I think this needs to be part of Credo core - autocorrect would likely require a different executor/runner for Credo. We can safely autofix any number of files in parallel, but for each file we would need to run checks sequentially and correct them as each issue is found. This is of course very different from how Credo executes at the moment, but I can't see a reasonable way of safely and accurately implementing autocorrect without running checks on a given file in series instead of in parallel. I'm sure given enough time and effort it could be done, but it would increase the complexity by at least an order of magnitude and I'm not sure would offer any real benefits to users as files would still be autocorrected in parallel. Also, since consistency checks require looking at the project as a whole, those would likely need to be run separately and wouldn't be autocorrected, or at least not at first. I can think of ways to make that work, but I'm not sure it would offer a whole ton of benefit given the complexity that solving that problem would involve. It would really only benefit folks who are adding Credo to an existing application and had dozens of places where the consistency check would find an issue, and that's a fairly rare case.
Issue Invalidation - If we do autofixing for a given file in series, then this isn't a problem as the issue wouldn't be raised if it had previously been resolved.

rrrene · 2022-02-13T11:45:13Z

I still have difficulties wrapping my head around the question "What is the next actionable step?" here.

Here's a Gist with standard checks that could be auto-fixable (including those fixed by Elixir's formatter): https://gist.github.com/rrrene/487a4362873db1fe93fe55b8df27fb28

rrrene added Discussion Level: Advanced Status: help wanted labels Jan 31, 2022

devonestes mentioned this issue Jul 27, 2022

Initial spike on Autocorrect #985

Open

4 tasks

esambo mentioned this issue Aug 28, 2023

More context adobe/elixir-styler#68

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add autocorrect functionality #944

Add autocorrect functionality #944

devonestes commented Jan 31, 2022

rrrene commented Jan 31, 2022 •

edited

NickNeck commented Feb 1, 2022

devonestes commented Feb 2, 2022

rrrene commented Feb 13, 2022

Add autocorrect functionality #944

Add autocorrect functionality #944

Comments

devonestes commented Jan 31, 2022

What is it?

How it might work

Implementation ideas

Workflow ideas

rrrene commented Jan 31, 2022 • edited

Why has nobody done this, yet?

So what do we need?

Where do we go from here?

NickNeck commented Feb 1, 2022

devonestes commented Feb 2, 2022

rrrene commented Feb 13, 2022

rrrene commented Jan 31, 2022 •

edited