perf: Optimize proxify and stringDistance #1098

sophiebits · 2017-12-02T22:52:56Z

Fill 2D array with ints upfront to reduce property access cost
Change from recursive to simpler iterative (DP) solution
Add cap parameter to stringDistanceCapped to limit computation
Make candidate generation use a simple loop to avoid allocating unnecessary arrays and to call stringDistance only once on each pair of strings instead of every time in the sort callback

This improves chai perf by about 13% on @bmeurer's https://github.com/v8/web-tooling-benchmark.

@bmeurer

- Fill 2D array with ints upfront to reduce property access cost - Change from recursive to simpler iterative (DP) solution - Add cap parameter to stringDistanceCapped to limit computation - Make candidate generation use a simple loop to avoid allocating unnecessary arrays and to call stringDistance only once on each pair of strings instead of every time in the sort callback This improves chai perf by about 13% on @bmeurer's https://github.com/v8/web-tooling-benchmark.

codecov · 2017-12-02T22:54:14Z

Codecov Report

Merging #1098 into master will increase coverage by 0.03%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1098      +/-   ##
==========================================
+ Coverage   93.61%   93.64%   +0.03%     
==========================================
  Files          32       32              
  Lines        1628     1637       +9     
  Branches      393      393              
==========================================
+ Hits         1524     1533       +9     
  Misses        104      104

Impacted Files	Coverage Δ
lib/chai/utils/proxify.js	`100% <100%> (ø)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update eae99d1...5467f07. Read the comment docs.

bmeurer

This looks promising! I was actually thinking of using this algorithm instead which requires fewer temporary arrays. Have you considered/tested that as well?

sophiebits · 2017-12-03T04:55:54Z

I didn't try that but it makes sense. I did try hoisting the 2D array as a static 41x41 array (and bailing out on names longer than 40 chars) and it didn't seem to help much. Not sure where the bottleneck is now.

mathiasbynens · 2017-12-03T05:30:40Z

lib/chai/utils/proxify.js

+  // memo[i][j] is the distance between strA.slice(0, i) and
+  // strB.slice(0, j).
+  for (var i = 0; i <= strA.length; i++) {
+    memo[i] = Array(strB.length + 1).fill(0);


Consider using new Array(…) instead of Array(…). The [[Call]] syntax lacks elements kind tracking in V8 and is therefore less efficient.

bmeurer · 2017-12-03T05:37:39Z

In this case it doesn't matter, since the arrays contain only small integers.

lucasfcosta · 2018-01-03T22:15:02Z

lib/chai/utils/proxify.js

-      memo[i] = [];
-    }
+function stringDistanceCapped(strA, strB, cap) {
+  if (Math.abs(strA.length - strB.length) >= cap) {


This was a good catch 👍

lucasfcosta · 2018-01-03T22:18:07Z

Whoah! I'm really glad that we have so many awesome contributors that are also performance specialists.

I've taken a look through the code and the logic LGTM but since I don't feel like I could give any other valuable insights when it comes to performance I'll be happy to review this again when you are happy with it.

Thanks everyone for your collaboration 😄

sophiebits · 2018-01-03T22:27:42Z

I don't think I'll get a chance to look back at this so feel free to merge! Making it 2 x n and replacing 1 - i for i - 1 should work though if you want to test that yourself.

meeber

@sophiebits Thanks for contributing! This LGTM. Let's merge. Any future perf improvements can come in follow-up PRs.

bmeurer · 2018-01-08T07:45:57Z

Thanks a lot @sophiebits and @keithamus! Excited to see this happening. Anyone interested to explore the memory efficient variant of the Levenshtein algorithm? That should give another nice boost. 👍

not-an-aardvark · 2018-01-08T08:14:26Z

This improvement looks great, although I'm confused about why it's affecting performance so much -- ostensibly the stringDistance function only gets called when an invalid property has been accessed, which would usually be a result of programmer error. Is there a reason that stringDistance ends up getting called in the benchmark?

bmeurer · 2018-01-08T08:33:44Z

The chai test in the web-tooling-benchmark was derived from the chai test suite itself. So it tests the chai library using it's own assert/expect machinery. See the chai-benchmark.js file for the exact test cases.

meeber · 2018-01-08T13:00:45Z

stringDistance is being called in the benchmark due to the "proxify" tests. Search "pizza" in the benchmark file for all occurrences of this. As definitely-an-aardvark suggested, those tests are centered around programmer error, most likely hunger-induced. Although delicious, the pizza tests have the side-effect of causing the benchmarks to overstate the impact of perf improvements in the stringDistance function on a typical Chai workload. I don't normally advocate for the removal of pizza, but it may be warranted in the case of benchmarks.

bmeurer · 2018-01-08T14:04:52Z

@meeber But those are only 7 tests?

meeber · 2018-01-09T02:27:24Z

@bmeurer Yup, just verified those are the only tests calling stringDistance. Based on an admittedly small sample size, removing those tests seems to have a noticeable impact on the benchmark in Chai 4.1.2, but not very noticeable in the Chai master branch (which suggests that these perf improvements did their job!) Either way, those tests involve a code path that's only hit when a developer writes a non-existent assertion (e.g., expect(true).to.be.ture), so they're not really meaningful in regard to benchmarking actual Chai workloads. On the other hand, faster is better in every code path, and it's valuable to know if some change to the JavaScript engine negatively impacts performance in even a rarely hit code path.

bmeurer · 2018-01-09T10:32:30Z

@meeber Thanks for the explanation. Interesting to see that it made such a big difference. I'll think we still keep the 7 tests following your reasoning, especially since in the near future when we roll a new version of Chai, the impact will be less noticeable.

keithamus · 2018-01-09T16:11:53Z

Yes I suppose it is worth noting - on a typical users test suite, passing tests won't hit these code paths. This likely makes most of a difference to a users development machine, where they'll see this paths hit from typos.

Recently I've been considering how we can improve test times for passing tests, so that tests in typical workloads (e.g. CI) run as quick as possible. One example of slowdown that is instric to the architecture of chai at current - is that we format error messages even for passing tests.

bmeurer · 2018-01-09T18:22:29Z

@keithamus This is a good point. We've observed that error message formatting is indeed quite significant for chai.

sophiebits requested a review from a team as a code owner December 2, 2017 22:52

sophiebits mentioned this pull request Dec 2, 2017

fix: Avoid repeated String#slice calls in stringDistance #1095

Merged

bmeurer reviewed Dec 3, 2017

View reviewed changes

mathiasbynens reviewed Dec 3, 2017

View reviewed changes

lucasfcosta reviewed Jan 3, 2018

View reviewed changes

meeber approved these changes Jan 7, 2018

View reviewed changes

keithamus merged commit c02f64b into chaijs:master Jan 7, 2018

meeber mentioned this pull request Aug 13, 2018

Finding property with minimal string distance is slow #1183

Closed

snyk-bot mentioned this pull request Jan 29, 2020

[Snyk] Upgrade chai from 4.1.2 to 4.2.0 davemenninger/socket.io-chat-fargate#3

Open

snyk-bot mentioned this pull request Mar 15, 2020

[Snyk] Upgrade chai from 4.0.2 to 4.2.0 postmanlabs/sails-env-switch#15

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: Optimize proxify and stringDistance #1098

perf: Optimize proxify and stringDistance #1098

sophiebits commented Dec 2, 2017

codecov bot commented Dec 2, 2017 •

edited

bmeurer left a comment

sophiebits commented Dec 3, 2017

mathiasbynens Dec 3, 2017

bmeurer commented Dec 3, 2017

lucasfcosta Jan 3, 2018

lucasfcosta commented Jan 3, 2018 •

edited

sophiebits commented Jan 3, 2018

meeber left a comment

bmeurer commented Jan 8, 2018

not-an-aardvark commented Jan 8, 2018

bmeurer commented Jan 8, 2018

meeber commented Jan 8, 2018

bmeurer commented Jan 8, 2018

meeber commented Jan 9, 2018

bmeurer commented Jan 9, 2018

keithamus commented Jan 9, 2018

bmeurer commented Jan 9, 2018

perf: Optimize proxify and stringDistance #1098

perf: Optimize proxify and stringDistance #1098

Conversation

sophiebits commented Dec 2, 2017

codecov bot commented Dec 2, 2017 • edited

Codecov Report

bmeurer left a comment

Choose a reason for hiding this comment

sophiebits commented Dec 3, 2017

mathiasbynens Dec 3, 2017

Choose a reason for hiding this comment

bmeurer commented Dec 3, 2017

lucasfcosta Jan 3, 2018

Choose a reason for hiding this comment

lucasfcosta commented Jan 3, 2018 • edited

sophiebits commented Jan 3, 2018

meeber left a comment

Choose a reason for hiding this comment

bmeurer commented Jan 8, 2018

not-an-aardvark commented Jan 8, 2018

bmeurer commented Jan 8, 2018

meeber commented Jan 8, 2018

bmeurer commented Jan 8, 2018

meeber commented Jan 9, 2018

bmeurer commented Jan 9, 2018

keithamus commented Jan 9, 2018

bmeurer commented Jan 9, 2018

codecov bot commented Dec 2, 2017 •

edited

lucasfcosta commented Jan 3, 2018 •

edited