Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow parsing when blanks are continuous #7675

Closed
ota-meshi opened this issue Jul 11, 2022 · 6 comments · Fixed by #7706
Closed

Slow parsing when blanks are continuous #7675

ota-meshi opened this issue Jul 11, 2022 · 6 comments · Fixed by #7706
Labels

Comments

@ota-meshi
Copy link
Member

ota-meshi commented Jul 11, 2022

Describe the bug

If there are many consecutive blanks, it will be very slow to parse.

I run the following code and found that it took 9000ms.

import { parse } from "svelte/compiler";

const fixture = `<div>${" ".repeat(100000)}</div>`;

const start = Date.now();
parse(fixture);
console.log(Date.now() - start);

Reproduction

Running the following code that uses the svelte parser takes 9000ms.

import { parse } from "svelte/compiler";

const fixture = `<div>${" ".repeat(100000)}</div>`;

const start = Date.now();
parse(fixture);
console.log(Date.now() - start);

However, the code for the following template, which has more characters than before, finishes in 50ms.

import { parse } from "svelte/compiler";

const fixture = `<div>${" A".repeat(100000)}</div>`;

const start = Date.now();
parse(fixture);
console.log(Date.now() - start);

I think most of the time for template parses is spent on the next regex.

this.template = template.replace(/\s+$/, '');

Logs

No response

System Info

System:
    OS: macOS 12.4
    CPU: (4) x64 Intel(R) Core(TM) i5-6360U CPU @ 2.00GHz
    Memory: 50.58 MB / 8.00 GB
    Shell: 5.8.1 - /bin/zsh
  Binaries:
    Node: 18.4.0 - /usr/local/bin/node
    Yarn: 1.22.11 - /usr/local/bin/yarn
    npm: 8.12.1 - /usr/local/bin/npm
  Browsers:
    Chrome: 103.0.5060.114
    Firefox: 79.0
    Safari: 15.5
  npmPackages:
    svelte: ^3.46.1 => 3.48.0

Severity

annoyance

@rmunn
Copy link
Contributor

rmunn commented Jul 11, 2022

Looks like this might be an issue with catastrophic backtracking, where the regex engine tries to backtrack every single one of those 100k spaces to see if it might match the rest of the regex. So that instead of following all 100k spaces, seeing that they don't match the end of the line (because they have a </div> after them) and backing off the whole 100k, it then tries 100k times to find an end-of-line marker.

I can think of a couple ways to work around the catastrophic backtracking in this case. First, if template was expected to be a single-line string, trimEnd would likely be more performant as it wouldn't backtrack at all. String.trimEnd is not supported on IE, so the code might need to become something like:

if ('trimEnd' in String.prototype) {
  this.template = template.trimEnd();
} else {
  this.template = template.replace(/\s+$/, '');
}

(Unless Svelte plans to drop IE support entirely, in which case the if isn't necessary).

Another approach might be to use the technique described in https://instanceof.me/post/52245507631/regex-emulate-atomic-grouping-with-lookahead to prevent the regex engine from backtracking. Rewrite the /\s+$/ regex as /(?=(\s+))\1$/ and it should (I haven't tested this yet) perform the same in non-pathological cases, yet avoid backtracking in pathological cases. (Because the entire series of spaces will be captured by the parentheses inside the lookahead expression, then when the parser reaches the \1$ marker it's not able to backtrack through the \s+ expression). However, this would need to be tested, as it could possibly cause slowdown in non-pathological cases by creating capture groups for every space found in the template, then throwing those groups away later when it's discovered they weren't needed. (Which would result in GC pressure).

BTW, I traced the origin of the template.replace(/\s+$/, '') line back to bac0248 (from #153), meaning that that line has been in the parser for a long, long time.

@dummdidumm
Copy link
Member

We had to work around this issue as well in language tools by inserting non-whitespacs characters at the end of each line to prevent this from slowing down.

@ota-meshi
Copy link
Member Author

ota-meshi commented Jul 11, 2022

This is a parser issue, so if we can run it on Node.js version 8, the minimum supported version of Node.js, I think it's okay if we can't run it in IE.

However, Node.js v8 does not support trimEnd. But Node.js v8 seems to be able to use trimRight.
trimRight is currently an alias for trimEnd and is deprecated, so people don't like it very much.
Do you think we can use trimRight?

https://node.green/#ES2019-features-string-trimming
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/trimEnd

@baseballyama
Copy link
Member

baseballyama commented Jul 17, 2022

I agree to use trimRight. It seems to be available in the latest Node 18.6.0 also.
(So trimRight can use from our minimum supported version Node 8 to the latest version Node18)
Also, we can notice by CI if some environment droptrimRight in the future.

@dummdidumm
Copy link
Member

We only can use built in functions which are part of the node versions Svelte currently supports

@Conduitry
Copy link
Member

The performance should be improved in 3.50.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants