100% CPU stuck in levenshtein calculation #1087

atombender · 2018-10-25T16:57:28Z

Description

Browser gets stuck in 100% CPU for a long time whenever reloading after a code change, via Webpack.

Expected behavior

Shouldn't do that.

Actual behavior

Gets stuck in haveTextSimilarity calling levenshtein.get() on two strings that are 96227 characters long.

Suffice to say, the Levenshtein implementation in fast-levenshtein might be "fast", but it's not fast. If I modify haveTextSimilarity to return false on huge strings, it no longer gets bogged down.

Why are the strings being compared so big? I don't know; I'm not using Webpack in any unusual way. My components are importing CSS such as ionicons and which end up containing Base64-encoded source maps, which certainly contributes to the size.

Environment

React Hot Loader version: 4.3.11

Operating system: macOS
Browser and version: Chrome 69.0.3497.100

The text was updated successfully, but these errors were encountered:

theKashey · 2018-10-25T22:59:45Z

Levenshtein should be O(n) on the similar texts, and could increase complexity, as difference grew. But it should be a super small on hot update.

I could not disable it, as long it would lead to false positive replaces, but you can mitigate the problem by "registering" component - babel plugin does it for "top level" variables.

Is your component defined on a top level?

atombender · 2018-10-25T23:18:21Z

Not sure what you mean by registering. The component that made me discover this issue isn't the top level one, but it's embedded two levels down or so from the root.

theKashey · 2018-10-25T23:58:57Z

That's not about module structure, only about you ability to "export" some component from a file.

//anyFile.js

// module top level variable
const iAmOnTopLevel = () => <div>aha</div>;

// component would be created inside HOC, ie not "top level". 
const likeAHOC = (Component) => () => <div>and this one is hidden inside</div>

Levenshtein tests is applied only for that "hidden" inside another functions components. Should be called veeeery rarely.

atombender · 2018-10-26T00:25:43Z

I see. The component in question is within two HOCs:

import React from 'react';
import PropTypes from 'prop-types';
import {connect} from 'react-redux';
import {withRouter} from 'react-router';

export default withRouter(
  connect(state => ({
    loginSessionState: state.loginSession,
    feeds: state.feeds.models
  }))(
    class GlobalNavigation extends React.Component {
      // ...
    }));

theKashey · 2018-10-26T00:34:52Z

As long it is not created inside HOC, but would be a given some props by HOC - just extract class declaration outside of connect, and call it a day

// a top level variable!
class GlobalNavigation extends React.Component {
      // ...
 }

export default withRouter(
  connect(state => ({
    loginSessionState: state.loginSession,
    feeds: state.feeds.models
  }))(GlobalNavigation));

VinSpee · 2018-10-26T00:41:59Z

Hi! I'm seeing some 100% CPU issues as well when hot-reloading.

meinders · 2018-12-21T20:16:05Z

I'm experiencing something similar here (react-hot-loader@4.3.12), but on a component that shouldn't be 'hidden' as far as I can tell, since it's a default export:

export default class PropertyPage extends React.Component
{
	render()
	{
		// about 52k of code (after babel/webpack)
	}
}

When reloading, haveTextSimilarity calculates the Levenshtein distance on the old and new code of the render function (why?). Both strings are about 52k characters long (non-minified).

From what I read the Levenshtein distance is not O(n) but O(n^2). I did a quick benchmark with two variations of a random string, 52k characters long, with a Levenshtein distance of 1. This takes 45 seconds, run from Node.

It doesn't seem like a good idea to use such a costly function when hot reloading large pieces of code.

atombender · 2018-12-21T20:38:09Z

Indeed. I haven't had time to go back to this issue, but it's bit me several times recently, and I'd like to see it fixed.

A stupid but very effective fix would be to revert to a string equality check if the combined size of the strings exceeds a certain number. The danger of false positives is moot when you can't even get a result.

Levenshtein has different complexities depending on what algorithm is used. I doubt the one used here is O(n), not just because that's difficult to achieve, but also because clearly the performance drop we are seeing isn't linear.

The matrix method, which requires n * m bytes of space to hold the matrix, is the cheapest, as good as O(n + m), I believe.

But if you're only interested in checking if the distance exceeds a specific number, and you only need an approximation, there is a much faster algorithm, which should be O(n^(1+d)) in time, where d is the maximum distance. No idea if anyone has implemented in JS.

kromit · 2019-01-10T14:02:06Z

I can confirm that it is not linear. I have a large form that takes 6.5 seconds to check with haveTextSimilarity. Any half of the same form is almost instantly re-rendered.

Thinking about about removing RHL entirely and go the webpack HMR way.

theKashey · 2019-01-17T08:46:12Z

Sound like it's time to reconsider this moment. I'll try to use another algo, as long as I dont need a real levenshtein here - just some (short) distance metric

petesantos · 2019-08-21T17:40:51Z

I am not sure if this is an issue with the algorithm or an issue with the browser. I am not seeing this behavior in Firefox. I did some profiling and it looks like Chrome is churning at this line of code in fast-levenshtein:

strCmp = str1.charCodeAt(i) === str2Char[j];

I created some snippets that reproduce the issue, but the reproduction is highly dependent on the contents of the strings being compared. I can reproduce this issue with a long string of source code, but cannot reproduce it with randomly generated strings. I also took the source strings in question and randomly re-arranged the the lines of code and the issue disappeared. I also appended random junk to the end of the strings and the issue disappeared. I made arbitrary changes to the strings (removing all instances of a given character) and the issue disappears.

One change I made that I think would be harmless is to remove all the leading whitespace on every line, then run the comparison. This resolves the issue on chrome, but I am not sure why this is. It looks like a bug in V8, but i need to better isolate the repro steps.

If you could put a workaround in for this issue, it would be appreciated.
This workaround works for me:

var haveTextSimilarity = function haveTextSimilarity(a, b) {
    return (
        // equal or slight changed
        a === b || levenshtein.get(a.replace(/\n(\s+)/g, '\n'), b.replace(/\n(\s+)/g, '\n')) < a.length * 0.2
     );
};

theKashey · 2019-08-21T21:32:50Z

That make a sense! String management in Chrome is a bit special, especially for the big strings (and script is a big string) - even substring are big - https://bugs.chromium.org/p/v8/issues/detail?id=2869

Your hack is doing a small string from a big one, thus fixing the issue.

petesantos · 2019-08-22T13:04:08Z

Thanks @theKashey!

Will the hack I proposed or something similar be added to the code to enable faster reloads in chrome?

theKashey · 2019-08-22T22:50:47Z

Sure. I read some stuff around the problem, as found the best way(in terms of speed first, and then in simplicity) to solve the problem -

function clearStringFast(str) {
  return str.length < 12 ? str : (' ' + str).slice(1);
}

//

a === b || levenshtein.get(clearStringFast(a), clearStringFast(b)) < a.length * 0.2

@petesantos - I would greatly appreciate if confirm that this code solves your problem as well. In that case I'll release a new version just in an hour.

petesantos · 2019-08-23T01:22:16Z

@theKashey Your fix does resolve the issue - thanks again!
Would you mind posting links to some of the resources you read that identify the problem?

theKashey · 2019-08-23T02:58:59Z

My main source https://habr.com/ru/post/449368/ - not in English, but it has a long and detailed discussion.
Adventures in the land of substrings and RegExps. guevara/read-it-later#2353 - (in English) and with pictures :)
Chrome/FireFox(it's also affected) bug tracker.

theKashey · 2019-08-27T09:58:09Z

Released within 4.12.12

theKashey added a commit that referenced this issue Aug 27, 2019

fix: speedup levenshtein comparison, fixes #1087

28b8992

theKashey closed this as completed in b81dcb0 Aug 27, 2019

snyk-bot mentioned this issue Jan 30, 2020

[Snyk] Upgrade react-hot-loader from 4.6.0 to 4.12.18 preset-io/superset#7

Closed

snyk-bot mentioned this issue Apr 17, 2020

[Snyk] Upgrade react-hot-loader from 4.3.12 to 4.12.20 novalina26/anchor#1

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

100% CPU stuck in levenshtein calculation #1087

100% CPU stuck in levenshtein calculation #1087

atombender commented Oct 25, 2018

theKashey commented Oct 25, 2018

atombender commented Oct 25, 2018

theKashey commented Oct 25, 2018

atombender commented Oct 26, 2018 •

edited

theKashey commented Oct 26, 2018

VinSpee commented Oct 26, 2018

meinders commented Dec 21, 2018 •

edited

atombender commented Dec 21, 2018 •

edited

kromit commented Jan 10, 2019 •

edited

theKashey commented Jan 17, 2019

petesantos commented Aug 21, 2019 •

edited

theKashey commented Aug 21, 2019 •

edited

petesantos commented Aug 22, 2019

theKashey commented Aug 22, 2019

petesantos commented Aug 23, 2019

theKashey commented Aug 23, 2019

theKashey commented Aug 27, 2019

100% CPU stuck in levenshtein calculation #1087

100% CPU stuck in levenshtein calculation #1087

Comments

atombender commented Oct 25, 2018

Description

Expected behavior

Actual behavior

Environment

theKashey commented Oct 25, 2018

atombender commented Oct 25, 2018

theKashey commented Oct 25, 2018

atombender commented Oct 26, 2018 • edited

theKashey commented Oct 26, 2018

VinSpee commented Oct 26, 2018

meinders commented Dec 21, 2018 • edited

atombender commented Dec 21, 2018 • edited

kromit commented Jan 10, 2019 • edited

theKashey commented Jan 17, 2019

petesantos commented Aug 21, 2019 • edited

theKashey commented Aug 21, 2019 • edited

petesantos commented Aug 22, 2019

theKashey commented Aug 22, 2019

petesantos commented Aug 23, 2019

theKashey commented Aug 23, 2019

theKashey commented Aug 27, 2019

atombender commented Oct 26, 2018 •

edited

meinders commented Dec 21, 2018 •

edited

atombender commented Dec 21, 2018 •

edited

kromit commented Jan 10, 2019 •

edited

petesantos commented Aug 21, 2019 •

edited

theKashey commented Aug 21, 2019 •

edited