Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perf: Optimize ignoreModule calls in no-cycle #2612

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

DHedgecock
Copy link

@DHedgecock DHedgecock commented Dec 12, 2022

Hello 👋 - Thanks for taking the time to review this PR

Summary

This PR adds a few performance optimizations to the no-cycle rule to make calls to ignoreModule faster.

Details

Lots of details because digging into this was really interesting

I was recently profiling our ESLint suite and the no-cycle rule was responsible for 79% of the total lint time. I found #2348 and attempted to speed up the rule by opting IN to the ignoreExternal option for the rule, but surprisingly that increased our total lint time from ~5min to ~9min which seemed odd 🤔

This comment had a very helpful hint that lead to the issue: #2076 (comment)

When the ignoreExternal option is enabled the ignoreModule will call resolve for each module to determine if it's an external module. I added some counters and found that our repo, which has ~2,000 modules, ended up calling this check ~215,000 times (that number wasn't surprising since each file does a breadth first search of the file's entire dependency tree) so attempting to ignore external modules can actually add a lot of work calling resolve

Optimization

This update optimizes this codepath by adding:

  1. A new early return check for relative path imports
  2. A simple object cache of evaluated modules to short circuit subsequent evaluations

Results

For our medium sized codebase this update:

  1. Reduced total lint time from ~5min to ~2min 15s when we opted in to ignoreExternal
  2. Reduced relative time spent in no-cycle from ~80% to ~42%

Note for these comparisons I disabled all the other rules that use ExportMap to try and ensure rule order and building the export map wasn't variable.

Comment on lines +13 to 23
/**
* Cache of resolved paths for faster subsequent lookups
* @type {{ [importPath: string]: string }}
*/
const resolvedPaths = {};
/**
* Cache of external module evaluations for faster subsequent lookups
* @type {{ [resolvedPath: string]: boolean }}
*/
const externalModules = {};

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these caches need to be cleared in the same place traversed is

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Program:exit hook is called at the exit of each file, so clearing the caches there cut the number of cache hits in our repo by about 50% (although it only added ~45s to our repo test case)

The high level attempt at this update was to quick filter down to "maybe external" modules like 'react', '@reduxjs/toolkit' or '@/some/alias/path' - where those resolved paths can be safely cached across files 🤔

Is that possible to accomplish with any of the existing data structures?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, hmm - you're right, it's less about clearing it after the file and more about clearing it after the "run" - since in editors, eslint can be long-running.

I think that in every file, there could be a different resolver, so the cache would have to be on the resolver and the resolver config and the fully normalized file path, i think?

// import paths are set in resolvedPaths, eg "./index.js", or "../file.js"
// should not be stored because there could be duplicates that resolve to
// different paths
if (!options.ignoreExternal || name[0] === '.') return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't a resolver choose to configure a relative path as external?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I suppose so, there's probably some monorepo setup out there like that...

I think removing this assumption means that we can't confidently cache the resolved module path we looked up because then there could be duplicate import paths that resolve to different files (eg import paths of './index.js').

Trying to approach this from a different angle: The current time to determine if a module is "external" is the performance bottleneck and this PR is attempting to short circuit that whenever possible.

Do you have suggestions on an approach?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the no-cycle rule option ignoreExternal could be a boolean or a regex, and if it's a regex it's used for this short circuit?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the best configuration fields are, but doing a quick test with a regex short circuit has about the same performance improvement as the cache implementation.

Alternative approach tested:

    const ignoreModule = (name) => {
      if (!options.ignoreExternal) return false;

      if (options.ignoreExternalRegex) {
        return options.ignoreExternalRegex.test(name);
      }

      const path = resolve(name, context);

      let isExternal = externalModules[path];
      if (isExternal === undefined) {
        isExternal = isExternalModule(name, path, context);
        externalModules[path] = isExternal;
      }
      
      return isExternal;
    };

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the better question is, why is isExternalModule slow?

Perhaps the node and webpack resolvers can add a cache there, rather than inside the main plugin?

src/rules/no-cycle.js Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Dec 12, 2022

Codecov Report

Base: 95.29% // Head: 95.31% // Increases project coverage by +0.01% 🎉

Coverage data is based on head (f1feb78) compared to base (5a37196).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2612      +/-   ##
==========================================
+ Coverage   95.29%   95.31%   +0.01%     
==========================================
  Files          68       68              
  Lines        2955     2967      +12     
  Branches     1002     1004       +2     
==========================================
+ Hits         2816     2828      +12     
  Misses        139      139              
Impacted Files Coverage Δ
src/rules/no-cycle.js 98.36% <100.00%> (+0.40%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Co-authored-by: Jordan Harband <ljharb@gmail.com>
@ljharb ljharb marked this pull request as draft January 11, 2023 22:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants