Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store information for files with lint warnings and errors in cache #9948

Closed
galvarez421 opened this issue Feb 6, 2018 · 9 comments · Fixed by #10571
Closed

Store information for files with lint warnings and errors in cache #9948

galvarez421 opened this issue Feb 6, 2018 · 9 comments · Fixed by #10571
Labels
accepted There is consensus among the team that this change meets the criteria for inclusion archived due to age This issue has been archived; please open a new issue for any further discussion cli Relates to ESLint's command-line interface enhancement This change enhances an existing feature of ESLint

Comments

@galvarez421
Copy link

The version of ESLint you are using.
4.15.0

The problem you want to solve.
Make cache feature (and its corresponding documentation) more useful and intuitive. Currently, the cache feature does not seem as useful or intuitive as it might be in that only information for files that pass linting with no warnings or errors is saved in the cache. I wonder if the cache feature could offer even more of a performance benefit if the cache were to store information for files with lint warnings or errors as well such that they don't have to run through a linting process if they haven't been changed since they were last linted with the cache feature enabled?

Your take on the correct solution to problem.
Perhaps we can extend the usage of the cache such that information for files with lint warnings or errors is also saved to the cache? @IanVS, following up on this comment

@eslint-deprecated eslint-deprecated bot added the triage An ESLint team member will look at this issue soon label Feb 6, 2018
@ilyavolodin ilyavolodin added enhancement This change enhances an existing feature of ESLint core Relates to ESLint's core APIs and features evaluating The team will evaluate this issue to decide whether it meets the criteria for inclusion and removed triage An ESLint team member will look at this issue soon labels Feb 6, 2018
@ilyavolodin
Copy link
Member

That sounds like a reasonable enhancement. I would be a bit worried about files with lining errors/warning with --fix flag and caching, since I think caching will happen before autofixes are applied.

@not-an-aardvark
Copy link
Member

I'm concerned that this would make cache files quite large, because formatters need to be provided with the original source text of a file. If a cache file needed to maintain that information, it could effectively be the same size as the all of the javascript files combined. (However, maybe this could be mitigated by omitting the source text of a file from the results object, since we can just reread it from the filesystem if we know it hasn't changed).

@platinumazure
Copy link
Member

However, maybe this could be mitigated by omitting the source text of a file from the results object, since we can just reread it from the filesystem if we know it hasn't changed

Agreed. We can still get a huge win here if we replace the full linting process with a read-and-report process for unchanged files with errors.

Full lint process (done on any file with errors even after linting once):

  1. Read file from filesystem
  2. Parse file contents
  3. Prepare to lint (build selector events, set up scope analysis and visitor keys, etc.)
  4. Traverse file and invoke visitors to lint file
  5. Calculate output by applying fixes
  6. Write output to filesystem
  7. Cache results
  8. Call formatter

Reading cached file results and reporting errors (assuming file mtime is earlier than cache write time so no need to re-lint):

  1. Read cache to see that errors exist
  2. Read file from filesystem
  3. Call formatter with cached errors

My feeling is that this could be a massive benefit to someone doing a large refactor or introducing a new linting rule into their codebase. And it would be a small benefit to people who lint the codebase frequently even if they're only changing one file at a time.

@galvarez421
Copy link
Author

@ilyavolodin If auto fix occurs after results for a file have been cached, wouldn't that cause the file to be considered newer than the cache during the next lint and thereby cause the cache to be ignored for that file? In other words, wouldn't the cache process still work as expected (although maybe the timing of the autofix relative to the caching wouldn't)? Or are you concerned not about caching not working as expected but rather about the caching not being as effective as it could be due to the timing of autofixes? Based on the description by @platinumazure of the lint process, it seems that autofixes are applied before the results are outputted and cached, so maybe this is a non-issue.

@platinumazure
Copy link
Member

platinumazure commented Feb 6, 2018 via email

@platinumazure
Copy link
Member

I'm finally digging into this. At this point, it does look like we write to the cache after the file is linted and autofixed, so I think we should be able to cache files which failed linting (though we may need to remove the full source text from the file entry cache, as discussed above).

@platinumazure platinumazure added cli Relates to ESLint's command-line interface and removed core Relates to ESLint's core APIs and features labels Jul 6, 2018
@platinumazure
Copy link
Member

Relabeling as "cli" based on previous CLIEngine issues (which don't have to do with linting/fixing) being labeled as "cli".

platinumazure added a commit that referenced this issue Jul 11, 2018
* Chore: Extract current cache logic into lint-result-cache module

* Chore: Moved config hash validation to LintResultCache

* Chore: Added tests for lint-result-cache

* Chore: Small cleanup

* Chore: Removing unnecessary comments

* Chore: Remove unnecessary test fixture file

* Chore: options.cache to this.options.cache
@platinumazure platinumazure added the tsc agenda This issue will be discussed by ESLint's TSC at the next meeting label Jul 12, 2018
@platinumazure
Copy link
Member

platinumazure commented Jul 12, 2018

TSC Summary:

eslint --cache currently only caches files that passed linting. This is useful for users who have fully integrated ESLint into their codebase, so most of their files should be passing at any given time. However, for users that have not yet fully integrated ESLint (i.e., users who are still cleaning up lint errors), the cache becomes useless because most files are not passing at any given time, and we currently don't cache files which failed linting. It was suggested that files which failed linting could be cached, as long as we check that they haven't been modified between the cache time and the next lint run (in other words, following the same logic we follow for successfully linted files in the cache).

One challenge is that we currently include the entire source code of files with linting problems in the lint results object. This could result in the cache file growing significantly if there are a lot of files which failed linting. A possible approach to avoid this is to remove the source property when caching the lint results, and then when retrieving results from the cache, simply read the file from the filesystem again. This will slow the overall cache performance for files that failed linting, but still be much faster than reading, parsing, and linting the file again. (PR #10571 follows this approach.)

TSC Question:

Should we add files which failed linting to the lint results cache?

@platinumazure platinumazure added accepted There is consensus among the team that this change meets the criteria for inclusion and removed evaluating The team will evaluate this issue to decide whether it meets the criteria for inclusion tsc agenda This issue will be discussed by ESLint's TSC at the next meeting labels Jul 19, 2018
@platinumazure
Copy link
Member

This issue was accepted in today's TSC meeting.

not-an-aardvark pushed a commit that referenced this issue Jul 21, 2018
* Chore: Extract current cache logic into lint-result-cache module

* Chore: Moved config hash validation to LintResultCache

* Chore: Added tests for lint-result-cache

* Chore: Small cleanup

* Update: Cache files that failed linting (fixes #9948)

* Chore: Remove unused "removeEntry" API from LintResultCache

* Ensure empty source is handled correctly

* Remove unnecessary test fixture

* Don't cache files with output property
@eslint-deprecated eslint-deprecated bot locked and limited conversation to collaborators Jan 18, 2019
@eslint-deprecated eslint-deprecated bot added the archived due to age This issue has been archived; please open a new issue for any further discussion label Jan 18, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
accepted There is consensus among the team that this change meets the criteria for inclusion archived due to age This issue has been archived; please open a new issue for any further discussion cli Relates to ESLint's command-line interface enhancement This change enhances an existing feature of ESLint
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants