New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce memory allocation in rules and utils #2976
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be easier to review if this was split up into a separate commit for each conceptual kind of change - since some of them are no-brainers (like the Object.keys -> values) and others are more of a tradeoff between clarity and performance.
args.filter((arg) => arg.type === 'ObjectExpression').forEach((object) => validatePropNaming(node, object.properties)); | ||
args.forEach((object) => object.type === 'ObjectExpression' | ||
&& validatePropNaming(node, object.properties) | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how meaningful is this specific change? the engine should be able to optimize this down to the equivalent for loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question.
I'm not convinced v8 can optimise a filter and forEach into an equivalent for loop - some (admittedly slapdash and micro) benchmarking with node v14.15.3 iterating many times over an array shows a singular forEach with a built-in conditional as reliably faster than a filter, forEach chain:
$ ITERATIONS=100 SIZE=10000 node arraytest-filter-for-each.js
filter, forEach: time taken 58.374 ms
$ ITERATIONS=100 SIZE=10000 node arraytest-just-for-each.js
just forEach: time taken 43.283 ms
$ ITERATIONS=100 SIZE=10000 node arraytest-for-of-loop.js
for of loop: time taken 41.101 ms
The singular forEach is consistently faster - depending on the arrangement of the code seems to be around 20 -> 40% faster.
Going with more iterations but a smaller array shows similar results too, just in case:
$ ITERATIONS=10000 SIZE=100 node arraytest-filter-for-each.js
filter, forEach: time taken 45.790 ms
$ ITERATIONS=10000 SIZE=100 node arraytest-just-for-each.js
just forEach: time taken 40.437 ms
$ ITERATIONS=10000 SIZE=100 node arraytest-for-of-loop.js
for of loop: time taken 36.737 ms
Here's the gist:
https://gist.github.com/willheslam/d45a1d183d5d778d82c5113ffb8a4ab8
(using a naked for loop is faster than all three, but that's probably going too far)
Whether this specific change is meaningful, I don't honestly have concrete answer - it depends on the frequency the function is called, the average size of args
(and how many of them fulfil the condition)
and also how much the surrounding environment is generating memory pressure.
Benchmarking each individual change is difficult, given the noise and iteration time and how some effects of memory allocation are only obvious in concert, so as a blanket rule this PR finds any quick, localised opportunities to reduce memory allocation, where the change didn't massively impact the flow of the code.
@@ -93,11 +94,13 @@ module.exports = { | |||
const list = components.list(); | |||
|
|||
// If no defaultProps could be found, we don't report anything. | |||
Object.keys(list).filter((component) => list[component].defaultProps).forEach((component) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same with this .filter change to an if check inside forEach?
- Avoid creating new strings by replacing string splits and trims with RegExps - Avoid dynamically generating RegExps when possible - Compare some arrays by identity first, before checking join('') string identity
…ing some closures
- Avoid allocations where possible (e.g. concating with an empty list) - Promote some pure functions to module scope - Avoid some wasteful array allocations (filter & length => reduce / some) - Move some calculations behind early returns / guards
Codecov Report
@@ Coverage Diff @@
## master #2976 +/- ##
=======================================
Coverage 97.59% 97.59%
=======================================
Files 110 110
Lines 7269 7368 +99
Branches 2651 2695 +44
=======================================
+ Hits 7094 7191 +97
- Misses 175 177 +2
Continue to review full report at Codecov.
|
- Cache deprecated config - Precompute versions instead of re-derive it for each check - Avoid reallocating some functions - Avoid normalizing already normalized parts
- Combine filter -> forEach chains - Extract deeply nested functions to higher scopes - Promote shared or mostly-constant values and conditionals to higher scopes - Swap compound boolean expressions to exploit short circuiting - Push expensive operations below guards / early returns - Push expensive operations into conditionals to exploit short circuiting - Swap filters for finds when possible - Swap some array function orders, e.g. Filter before reversing arrays
Thanks for suggesting this - I've now split this PR into 13 commits, roughly across the kinds of changes made, or, if hard to disentangle, across the individual modules that have been modified. |
DEFAULT_LINK_COMPONENTS.concat(settings.linkComponents || []) | ||
settings.linkComponents | ||
? DEFAULT_LINK_COMPONENTS.concat(settings.linkComponents) | ||
: DEFAULT_LINK_COMPONENTS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
repeating DEFAULT_LINK_COMPONENTS
seems unfortunate; is creating an empty array really that costly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If settings.linkComponents
is undefined, then not only would we be creating an empty array, but we'd be creating a copy of DEFAULT_LINK_COMPONENTS
too.
On the project I'm testing with, it does this once for each file, or about 500 times.
The docs describe setting custom link components via settings.linkComponents
https://github.com/yannickcr/eslint-plugin-react/blob/master/docs/rules/jsx-no-target-blank.md#custom-link-components
but I'm working under the assumption this will usually be undefined
, thus avoiding two array allocations per file given that's the default case.
Creating 1000 extra arrays is not incredibly slow, all considered, but if 10 or more rules have this behaviour, this can start to add up and start putting some GC pressure on the rest of the program. How much, specifically in this case, is hard to say.
Looking back at this code again, I've realised that context.settings.linkComponents will be the same across all files, and because jsx-no-target-blank never mutates the resulting Map, it can be created just once for all files!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating a copy of an array should be basically free.
return name === func || func.property === name; | ||
}); | ||
const propWrapperFunctions = context.settings.propWrapperFunctions; | ||
if (propWrapperFunctions && propWrapperFunctions.length > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this check needed? calling [].some()
immediately returns false
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is part of the changes designed to avoid allocating new strings where possible - in this case, avoiding splitting name
.
Because we now only conditionally create splitName
, we need to conditionally run some
too - but we get the added bonus of not having to allocate a closure at all.
isPropWrapperFunction
then becomes nearly allocationless, which is nice because it's called a fair amount in a lot of different rules.
Looking at this again, I've realised the predicate only references splitName in some very specific circumstances, so we can sometimes avoid splitting it in the first place and produce a more specialised predicate in the other cases.
(Because context.settings.propWrapperFunctions
is meant to be relatively static, we could also run the equivalent of some
potentially without allocating a closure at all.)
} else if (useBracket) { | ||
line.isUsingOperator = false; | ||
} else { | ||
const useBracket = /^([ ]|[\t])*[<]/.test(src); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we make this regex at module level, rather than nesting it deeper? (same with the useOperator regexes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch - this actually resulted in my learning that, on top of regex literals being faster than dynamic (new RegExp(string)
) ones, in ES5, regex literal instances have their own identity, which means caching them at the module level is faster yet again:
https://stackoverflow.com/questions/9750338/dynamic-vs-inline-regexp-performance-in-javascript/32523333
The storedRegExp is about 5 - 20% percent faster across browsers than inlineRegExp
Always create your immutable regexps with literal syntax and cache it if it's to be re-used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm relatively sure that's not unique to ES5, that's eternally been the semantics in JS.
let matches; | ||
if (byLastLine) { | ||
src = lines[lines.length - 1]; | ||
matches = src.match(/.*\n(.*)$/); | ||
} else { | ||
src = lines[0]; | ||
matches = src.match(/^(.*)\n/); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let matches; | |
if (byLastLine) { | |
src = lines[lines.length - 1]; | |
matches = src.match(/.*\n(.*)$/); | |
} else { | |
src = lines[0]; | |
matches = src.match(/^(.*)\n/); | |
} | |
const matches = byLastLine ? src.match(/.*\n(.*)$/) : src.match(/^(.*)\n/); |
return text.trim().length === 0; | ||
return !/\S/.test(text); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these aren't necessarily always the same thing; trim
can be polyfilled to be more correct, but regexes can't be.
Additionally, is this better than /^[^\S]*$/.test(text)
, which might be clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point - it's hard to know for sure if this regex is truly equivalent to trim - a cursory glance at the spec says it removes "WhiteSpace and LineTerminator.".
The nice thing is it doesn't involve creating a new string, and will finish early the moment it finds a non-whitespace character, rather than continuing on trying to remove all the whitespace.
In terms of clarity I'm not sure - I'm happy to go with your suggestion here, but is /^\s*$/
equivalent, i.e. "Match only on whitespace from start to end"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I'm not convinced that .trim()
is meaningfully slower than using a custom regex. Do you have data to support that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about the late reply!
I was really convinced a regex would be faster because there'd be less work (stop the moment non-whitespace is found, rather than doing the work of constructing a new string, then checking its length), but after constructing some basic micro benchmarks to check it, they're basically equivalent in speed, and in Chrome 91 + latest V8 sometimes trim was faster.
Thanks for pushing back on this, I'm glad it was double checked - will revert this change.
@@ -20,4 +20,11 @@ describe('isFirstLetterCapitalized', () => { | |||
assert.equal(isFirstLetterCapitalized('IsCapitalized'), true); | |||
assert.equal(isFirstLetterCapitalized('UPPERCASE'), true); | |||
}); | |||
|
|||
it('should return false for non-letters', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you haven't checked non-english language characters. see https://stackoverflow.com/a/38123020/632724
@@ -69,7 +83,7 @@ module.exports = { | |||
} | |||
|
|||
function checkValidPropType(node) { | |||
if (node.name && !PROP_TYPES.some((propTypeName) => propTypeName === node.name)) { | |||
if (node.name && !(node.name in PROP_TYPES)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not use Sets here rather than in
? also, using in
here ensures we'll get false positives for things like toString
, or anything else on Object.prototype.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch on false positives with in
, I definitely hadn't considered that.
Set looks like a good fit here!
/** | ||
* Checks if a passed method is unsafe | ||
* @param {string} method Life cycle method | ||
* @returns {boolean} Returns true for unsafe methods, otherwise returns false | ||
*/ | ||
function isUnsafe(method) { | ||
const unsafeMethods = getUnsafeMethods(); | ||
return unsafeMethods.indexOf(method) !== -1; | ||
return (method in unsafe); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
@@ -141,8 +157,10 @@ module.exports = { | |||
return moduleName; | |||
} | |||
|
|||
const matchName = (name) => name === node.init.name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might as well hoist this to module level
function isLessThan(a, b) { | ||
const higherMajor = a[0] < b[0]; | ||
if (higherMajor) { | ||
return true; | ||
} | ||
const higherMinor = a[0] === b[0] && a[1] < b[1]; | ||
if (higherMinor) { | ||
return true; | ||
} | ||
const higherOrEqualPatch = a[0] === b[0] | ||
&& a[1] === b[1] | ||
&& a[2] <= b[2]; | ||
return higherOrEqualPatch; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic doesn't seem like it will work for prerelease versions. i think we need to use the semver
package for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting - it's equivalent (or should be - will double check this) to what was already here - note the removed lines from function test
below.
Happy to consider using semver
, but I was hoping to keep this PR functionally equivalent and just a pure performance boost - what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, we could add semver
prior to this PR to correct the behavior, if you prefer :-p
@@ -35,7 +35,7 @@ function generateErrorMessageWithParentName(parentName) { | |||
* @returns {Boolean} | |||
*/ | |||
function startsWithRender(text) { | |||
return (text || '').startsWith('render'); | |||
return text !== undefined && text.startsWith('render'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm specifically interested in changes related to this rule. Thanks for testing the performance @willheslam!
While this improvement is definitely welcome I'm having a hard time believing it actually speeds up this rule by ~40%.
Is it really that slow to create new Strings? This method is not even accessed that often.
Why shouldn't other Strings also be declared once in module scope, e.g. render
, CallExpression
?
Rule Time (ms) Relative react/no-unstable-nested-components 172.198 2.7% to
Rule Time (ms) Relative react/no-unstable-nested-components 95.287 2.3%
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point - the performance gains for no-unstabled-nested-components
don't come from the change here - it's from the changes in lib/util/Components.js.
As an example, running soley no-unstable-nested-components
on a large-ish project results in isES6Component
being invoked many times, and each time (without the change) a regular expression is created dynamically via a string, used once, then discarded.
There are lots of other optimisations inside lib/util/Components.js that are contributing to this speed up too - to prove this is the case, I tried running no-unstable-nested-components
with just the lib/util/Components.js changes applied, and got a similar speed up as you note above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that makes perfect sense. Those utilities are perfect spot for optimization as they are being used widely across the code base.
That's impressive improvement! 💯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Responded to comments - thanks for these, very good points.
Will fix what has been pointed out and do some more specific benchmarking to double check some changes.
@@ -35,7 +35,7 @@ function generateErrorMessageWithParentName(parentName) { | |||
* @returns {Boolean} | |||
*/ | |||
function startsWithRender(text) { | |||
return (text || '').startsWith('render'); | |||
return text !== undefined && text.startsWith('render'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point - the performance gains for no-unstabled-nested-components
don't come from the change here - it's from the changes in lib/util/Components.js.
As an example, running soley no-unstable-nested-components
on a large-ish project results in isES6Component
being invoked many times, and each time (without the change) a regular expression is created dynamically via a string, used once, then discarded.
There are lots of other optimisations inside lib/util/Components.js that are contributing to this speed up too - to prove this is the case, I tried running no-unstable-nested-components
with just the lib/util/Components.js changes applied, and got a similar speed up as you note above.
DEFAULT_LINK_COMPONENTS.concat(settings.linkComponents || []) | ||
settings.linkComponents | ||
? DEFAULT_LINK_COMPONENTS.concat(settings.linkComponents) | ||
: DEFAULT_LINK_COMPONENTS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If settings.linkComponents
is undefined, then not only would we be creating an empty array, but we'd be creating a copy of DEFAULT_LINK_COMPONENTS
too.
On the project I'm testing with, it does this once for each file, or about 500 times.
The docs describe setting custom link components via settings.linkComponents
https://github.com/yannickcr/eslint-plugin-react/blob/master/docs/rules/jsx-no-target-blank.md#custom-link-components
but I'm working under the assumption this will usually be undefined
, thus avoiding two array allocations per file given that's the default case.
Creating 1000 extra arrays is not incredibly slow, all considered, but if 10 or more rules have this behaviour, this can start to add up and start putting some GC pressure on the rest of the program. How much, specifically in this case, is hard to say.
Looking back at this code again, I've realised that context.settings.linkComponents will be the same across all files, and because jsx-no-target-blank never mutates the resulting Map, it can be created just once for all files!
return name === func || func.property === name; | ||
}); | ||
const propWrapperFunctions = context.settings.propWrapperFunctions; | ||
if (propWrapperFunctions && propWrapperFunctions.length > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is part of the changes designed to avoid allocating new strings where possible - in this case, avoiding splitting name
.
Because we now only conditionally create splitName
, we need to conditionally run some
too - but we get the added bonus of not having to allocate a closure at all.
isPropWrapperFunction
then becomes nearly allocationless, which is nice because it's called a fair amount in a lot of different rules.
Looking at this again, I've realised the predicate only references splitName in some very specific circumstances, so we can sometimes avoid splitting it in the first place and produce a more specialised predicate in the other cases.
(Because context.settings.propWrapperFunctions
is meant to be relatively static, we could also run the equivalent of some
potentially without allocating a closure at all.)
} else if (useBracket) { | ||
line.isUsingOperator = false; | ||
} else { | ||
const useBracket = /^([ ]|[\t])*[<]/.test(src); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch - this actually resulted in my learning that, on top of regex literals being faster than dynamic (new RegExp(string)
) ones, in ES5, regex literal instances have their own identity, which means caching them at the module level is faster yet again:
https://stackoverflow.com/questions/9750338/dynamic-vs-inline-regexp-performance-in-javascript/32523333
The storedRegExp is about 5 - 20% percent faster across browsers than inlineRegExp
Always create your immutable regexps with literal syntax and cache it if it's to be re-used.
return text.trim().length === 0; | ||
return !/\S/.test(text); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point - it's hard to know for sure if this regex is truly equivalent to trim - a cursory glance at the spec says it removes "WhiteSpace and LineTerminator.".
The nice thing is it doesn't involve creating a new string, and will finish early the moment it finds a non-whitespace character, rather than continuing on trying to remove all the whitespace.
In terms of clarity I'm not sure - I'm happy to go with your suggestion here, but is /^\s*$/
equivalent, i.e. "Match only on whitespace from start to end"?
) { | ||
return true; | ||
pragmaLowerCase = pragmaLowerCase || pragma.toLocaleLowerCase(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I'll fix this.
@@ -9,8 +9,7 @@ function isFirstLetterCapitalized(word) { | |||
if (!word) { | |||
return false; | |||
} | |||
const firstLetter = word.charAt(0); | |||
return firstLetter.toUpperCase() === firstLetter; | |||
return /^[A-Z]/.test(word); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch - I was too dependent on the existing unit tests for this function.
It'd be great if we could avoid doing some string manipulations here, so it could be that we can use the regex as a fast path, but preserve the chatAt & toUpperCase as a slow path.
@@ -69,7 +83,7 @@ module.exports = { | |||
} | |||
|
|||
function checkValidPropType(node) { | |||
if (node.name && !PROP_TYPES.some((propTypeName) => propTypeName === node.name)) { | |||
if (node.name && !(node.name in PROP_TYPES)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch on false positives with in
, I definitely hadn't considered that.
Set looks like a good fit here!
function isLessThan(a, b) { | ||
const higherMajor = a[0] < b[0]; | ||
if (higherMajor) { | ||
return true; | ||
} | ||
const higherMinor = a[0] === b[0] && a[1] < b[1]; | ||
if (higherMinor) { | ||
return true; | ||
} | ||
const higherOrEqualPatch = a[0] === b[0] | ||
&& a[1] === b[1] | ||
&& a[2] <= b[2]; | ||
return higherOrEqualPatch; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting - it's equivalent (or should be - will double check this) to what was already here - note the removed lines from function test
below.
Happy to consider using semver
, but I was hoping to keep this PR functionally equivalent and just a pure performance boost - what do you think?
@willheslam are you interested in continuing this effort? It seems like there's a number of comments to address, and a few potential PRs to land separately, plus this needs a rebase. |
@ljharb Sorry for disappearing on this, work + life got in the way and I ran out of time. Yeah - I've got some time this week so I'll take a look at the changes I've made to address the comments, changes pending, and what the rebase will entail. I appreciate your following up on this! |
47b82ea
to
58a3c8b
Compare
59af733
to
865ed16
Compare
069314a
to
181c68f
Compare
Hello!
eslint-plugin-react is an amazing plugin - it catches so many problems that I consider it indispensable :)
Recently on a large-ish project I noticed some of the rules running relatively slowly, so I decided to see if I could speed them up.
In the end I was able to find quite a few opportunities to do less work, mainly by avoiding memory allocations.
I'd really appreciate any feedback on better ways to benchmark the plugin, and how safe these optimisations are.
They pass all the tests, though I appreciate it's not always possible to be 100% confident changes are correct, especially given the large number of changes here - luckily most of them are relatively localised.
I've tried to avoid changing the existing logic as much as possible, though sometimes speed has been picked over readability - I'd appreciate any thoughts on that too.
I've tried to make sure the changes will work on node 4, but that probably needs more thorough testing.
There's a list of the changes summarised at the bottom of this message. Thanks!
Benchmark results:
The project I benchmarked this against using
cloc
registers as about 550 files, with about 50k lines of code in .tsx files all containing React components.The changes knock off about 70 - 100ms off the slowest rules, roughly 30% faster, with a sample result here:
to
I used the following .eslintrc.js:
with the versions:
across the .tsx files in the project, 200 times by running:
for both
7.23.2
and then with thelib
directory swapped in from this branch.I then averaged all 200 runs in this google sheet:
https://docs.google.com/spreadsheets/d/1Y5QZ_tsAyXrzgw5Rre21Gmzl95jT9irZmxfwSeF1oI0/edit?usp=sharing
Optimisations: