Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add autocorrect functionality #944

Open
devonestes opened this issue Jan 31, 2022 · 4 comments
Open

Add autocorrect functionality #944

devonestes opened this issue Jan 31, 2022 · 4 comments

Comments

@devonestes
Copy link
Contributor

This issue is to discuss how we can add autocorrect functionality to Credo.

What is it?

Autocorrect would enable Credo to programatically "correct" issues found in a given file
instead of only telling the user what can be improved and how to improve it. One example
would be that it would automatically change foo |> bar(baz) to bar(foo, baz) without
the user having to manually make that change if the single pipe check is enabled.

How it might work

Autocorrect would be disabled on all checks by default, so mix credo would run exactly
as it does today. Autocorrect would be implemented on a per-check basis, since some
checks (like cyclomatic complexity) cannot be autocorrected.

If a user wanted to run autocorrect on all checks that support it, they can run
mix credo --autocorrect and that will run Credo and autocorrect everything that can
be autocorrected.

A user could also enable autocorrect for certain checks by passing an argument to
that flag like mix credo --autocorrect Credo.Check.Refactor.AppendSingleItem, or
by adding autocorrect: true to their config file like so:
{Credo.Check.Consistency.ParameterPatternMatching, autocorrect: true}.

Implementation ideas

My thinking for a general sketch of an implementation would be something along
the lines of this:

  1. We add a new autocorrect/1 callback to the Credo.Check behavior. This
    callback would either accept an AST and return an AST, or accept a String and
    return a String (depending on the check) which corresponds to the AST node
    or fragment of the file which triggered the issue. The autocorrect/1 function
    will return the "corrected" string or AST for the given check. The default for this
    callback will be to simply return what it is given (a noop which does no
    autocorrecting).
  2. For all issues, this autocorrect function is called and the return values of
    these functions are used as replacements for the original values in the file
    or AST which triggered it.
  3. Somehow we find a way to stitch these replacements together in the original
    file 😄

Workflow ideas

To validate this approach, I'd think we would maybe want to pick 2-3 checks as a
starting point (ideally ones that can happen multiple times per file) and implement
just these 2-3. Ideally we'd pick one that is fairly straightforward, like
Credo.Check.Readability.SinglePipe, Credo.Check.Readability.StringSigils, or
Credo.Check.Readability.TrailingBlankLine, and ideally one that uses an AST
for the autocorrect and one that uses a string as the autocorrect, so we can try
both of those.

@rrrene
Copy link
Owner

rrrene commented Jan 31, 2022

First, thank you very much for putting this together. As I wrote on Twitter, it would be amazing to have this! 🎉

I hope neither you nor anyone else reading this is discouraged by the rest of this response. Take this wall-of-text as a compliment for taking on a great endeavour.

Why has nobody done this, yet?

The biggest question from a maintainer side: Why hasn't this been developed as a plugin? Credo has the necessary plugin APIs since version 1.1 (add an --autofix switch from a plugin and add a PerformAutoFixes task to the execution pipeline).

Several people, myself included, have since tried to come up with an experimental plugin to do this. I am guessing the others were as successful as I was in bringing it to fruition.

Some people have pointed out that we weren't able to implement something like Autofix before Elixir 1.13 because we did not have Code.Fragment, but this is not the case: Credo's analysis has never worked with incomplete code thus all issues ever produced by Credo were from files with complete code. So the ability to "analyse" incomplete code does not change anything (but I might be wrong, please correct me if I misunderstood the significance of Elixir 1.13 regarding a potential Credo Autofix feature).

So what do we need?

Here's a quick rundown of the things we need to consider (off the top of my head):

  • Tranformer API: We need to be able to transform the AST surrounding an issue (and define what "surrounding" means). We need a declarative way to define these code transformations, i.e. to compose queries/changesets reminiscent of those in Ecto (e.g. "switch the contents of the do and else blocks and change the unless to an if.").
  • Elixir compatibility: We need to think about upwards/downwards compatibility for this feature/plugin. It is perfectly fine if this only works for Elixir > 1.13, but it also has to work for all newer versions, which means that the Transformer APIs have to support multiple Elixir versions in just the same way Credo.Code.Token supports different Elixir versions (where :elixir_tokenizer returns different things, but Credo.Code.Token works with all of them). Also: should we consider cases, where Credo runs with another Elixir version than the code it is analysing (in an Editor plugin for example)?
  • Extensibility: What are scenarios, where we could autofix something. if we knew what the user wanted? How would we control and define these cases (e.g. "I know @moduledoc is missing in these 12 modules, just put @moduledoc false everywhere").
  • Autofix Race Conditions: Multiple checks do their auto-fixing subsequently, which means that Check2 autofixes code after Check1, which can result in a location change for issues resulting from Check2. Re-running the complete analysis after every fix is probably not a satisfying solution here.
  • Issue Invalidation: One check's autofix might even fix/change another check's issue (e.g. an autofixer renaming functions/variables might fix a MaxLineLength issue).

(This list might very well be incomplete as it is a bit late and I have an early day tomorrow. Please add your thoughts in the comments below!)

Where do we go from here?

My biggest concern as a steward of this project is that this feature is a very shiny, high profile thing to work on for a weekend or a month, but if we do not nail the APIs, corner cases, upwards compatibility support etc. it will be a pain in the ?%$ to support for the coming years.

I would suggest that we work on this as a proof-of-concept plugin. If there are any features we need from Credo for this to work, we can probably add them without compromising Credo's core (e.g. if the Credo.SourceFile struct would need another field).

Just like .heex templates in Phoenix came from another, complementary project (Surface), before they were integrated into Phoenix's core, we should take the chance to validate this idea on its own, without the burden of implementing a perfect, universally workable solution on the first try.

Let me close by saying: I love this community and I would be delighted if we could work on this together. ✌️

@NickNeck
Copy link
Contributor

NickNeck commented Feb 1, 2022

I have started to playing around with the package doorgan/sourceror to reformat code that would raise issues in credo. Maybe that could give some inspiration. The repo can be found at hrzndhrn/recode. So far recode can fix single-pipes, expand aliases and ads brackets to one arity functions in pipes. The package includes also a very early approach of an credo-plugin (my first try to write a plugin for credo, starting an hour ago).

There is also an approach for a refactoring task to rename functions. But this is much more experimental as the other stuff.

@devonestes
Copy link
Contributor Author

Thanks for the really good rundown of what we'll need for this. I had thought of a few of these before as well, but not all of them, so thanks for bringing this up. Here's what I have so far:

  • Tranformer API - I think this is well covered by the autocorrect callback in the Credo.Check behaviour. This function would be responsible for determining how a given AST node or string should be corrected based on the given data. I'm not sure we'd need to be able to compose them in a changeset-like functionality (although I must admit, that would be very cool). That idea reminds me of the delta-elixir library, but that seems to me like more than is needed for this problem since race conditions aren't a necessary problem to solve with autofixing like it is with real-time collaborative file editing (more on that later 😄).
  • Elixir compatibility - This seems (to me at least) to be an implementation detail to be dealt with in each autocorrect callback and not project-wide. If the implementation of a given autocorrect callback for a specific check were to depend on something that is only available in Elixir 1.13, then that would affect its ability to run. I'd imagine that there may be cases where the implementation of autocorrect would contain something like if function_exported?(m, f, a) do, and in those cases the else would just be a noop if the necessary core functions from Elixir weren't available. Ideally we would implement these in such a way that they could be used on the widest range of Elixir versions possible, but for some you really do need the information about comments that previously wasn't included in the AST if you want to make significant changes to a file without potentially losing any comments around the code being changed. Maybe it can just be clear with a message in the console or something that autocorrect behavior for certain callbacks is only available in certain versions of Elixir if someone is running autocorrect on an Elixir version that wouldn't support a given check's autocorrect function?
  • Extensibility - I hadn't thought about this, but I think allowing users to configure their own autocorrect functions that override the default for a given check in their .credo.exs files should be sufficient. It would still take and return the same arguments, and the ability to override the default by user configuration is already well baked into Credo.
  • Autofix Race Conditions - This is part of why I think this needs to be part of Credo core - autocorrect would likely require a different executor/runner for Credo. We can safely autofix any number of files in parallel, but for each file we would need to run checks sequentially and correct them as each issue is found. This is of course very different from how Credo executes at the moment, but I can't see a reasonable way of safely and accurately implementing autocorrect without running checks on a given file in series instead of in parallel. I'm sure given enough time and effort it could be done, but it would increase the complexity by at least an order of magnitude and I'm not sure would offer any real benefits to users as files would still be autocorrected in parallel. Also, since consistency checks require looking at the project as a whole, those would likely need to be run separately and wouldn't be autocorrected, or at least not at first. I can think of ways to make that work, but I'm not sure it would offer a whole ton of benefit given the complexity that solving that problem would involve. It would really only benefit folks who are adding Credo to an existing application and had dozens of places where the consistency check would find an issue, and that's a fairly rare case.
  • Issue Invalidation - If we do autofixing for a given file in series, then this isn't a problem as the issue wouldn't be raised if it had previously been resolved.

@rrrene
Copy link
Owner

rrrene commented Feb 13, 2022

I still have difficulties wrapping my head around the question "What is the next actionable step?" here.

Here's a Gist with standard checks that could be auto-fixable (including those fixed by Elixir's formatter): https://gist.github.com/rrrene/487a4362873db1fe93fe55b8df27fb28

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants