Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON: String with many escaped quotes break highlighting in Safari #2689

Closed
petergam opened this issue Jan 2, 2021 · 13 comments · Fixed by #2691
Closed

JSON: String with many escaped quotes break highlighting in Safari #2689

petergam opened this issue Jan 2, 2021 · 13 comments · Fixed by #2691

Comments

@petergam
Copy link

petergam commented Jan 2, 2021

Information

  • Language: JSON
  • Plugins: none

Description
Prism fails to tokenise a json text correctly in Safari if one of the values contains a large number (998 or more to be exact) of escaped quotes ("). This is a problem for me since I'm trying to highlight some JSON where the values contain HTML that has a lot of escaped quotes.

Code snippet

The snippet on the test page in the link below show the issue. Notice how things after the large string is highlighted incorrectly. If a single escaped quote is removed then the text is highlighted correctly.
Test page

The code being highlighted incorrectly.
{
  "1": "\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"",
  "2": "123"
}
@RunDevelopment
Copy link
Member

What is highlighted incorrectly? I don't see anything wrong with your example.

image

@petergam
Copy link
Author

petergam commented Jan 2, 2021

Looks like it only happens in Safari.

Screenshot 2021-01-02 at 7 32 17 PM

@petergam petergam changed the title JSON: String with many escaped quotes break highlighting JSON: String with many escaped quotes break highlighting in Safari Jan 2, 2021
@RunDevelopment
Copy link
Member

RunDevelopment commented Jan 2, 2021

I think I know what the problem is. I don't have access to a machine with Safari, so could you please paste the following into the dev console on the test page (with your example + JSON language selected):

Prism.languages.json = {
	'property': {
		pattern: /(^|[^\\])"(?:\\.|[^\\"\r\n])*"(?=\s*:)/,
		lookbehind: true,
		greedy: true
	},
	'string': {
		pattern: /(^|[^\\])"(?:\\.|[^\\"\r\n])*"(?!\s*:)/,
		lookbehind: true,
		greedy: true
	},
	'comment': {
		pattern: /\/\/.*|\/\*[\s\S]*?(?:\*\/|$)/,
		greedy: true
	},
	'number': /-?\b\d+(?:\.\d+)?(?:e[+-]?\d+)?\b/i,
	'punctuation': /[{}[\],]/,
	'operator': /:/,
	'boolean': /\b(?:true|false)\b/,
	'null': {
		pattern: /\bnull\b/,
		alias: 'keyword'
	}
};

After that, slightly change the text of your example to update the highlighted version.

@petergam
Copy link
Author

petergam commented Jan 2, 2021

You are a genius! That fixes it! Impressive that you were able to come up with a solution without a way to test it.

@RunDevelopment
Copy link
Member

Seems like my guess was correct. Safari's regex engine will stop searching for regex matches after 1million backtracking steps. My guess was that this problem is because our regex to detect properties takes O(n^2) many backtracking steps to reject strings of the form somePrefix + '\\"'.repeat(n) + someSuffix. This is why highlighting breaks after repeating \" a little less than 1000 times.

Thanks for the compliment but I wouldn't have been able to deduce that without your excellent error report, so - thank you.

This problem probably isn't unique to JSON. I'll look through our languages and fix this type of problem.

@petergam
Copy link
Author

petergam commented Jan 2, 2021

Thanks @RunDevelopment. I appreciate your work

@joshgoebel
Copy link

Seems like my guess was correct. Safari's regex engine will stop searching for regex matches after 1million backtracking steps

From what I recall from our prior testing Safari and Edge both have this protection while Chrome still just keeps going until it finishes, no matter how long it takes.

@RunDevelopment
Copy link
Member

@joshgoebel Indeed... kinda... Our pattern doesn't backtrack. The regex describes an effective DFA, there is no way for it to backtrack. The problem is that the regex engine moves the pattern across the string and that Safari actually counts the number of disjunctions it encounters and not backtracking steps. I will make an issue about this problem tomorrow and explain it in more detail.

@joshgoebel
Copy link

I loosely understand what you mean. Probably good that Safari also includes this protection as well (at least it's being consistent). I wonder what Edge's behavior is here.

@RunDevelopment
Copy link
Member

New Edge is built on Chromium, so the same as Chrome and Firefox.

@joshgoebel
Copy link

joshgoebel commented Jan 2, 2021

New Edge is built on Chromium, so the same as Chrome and Firefox.

No. It's back-tracking behavior is definitely different than Chrome. It cuts short after so many excessive iterations, just like Safari does. Didn't we already have this discussion? I think there are constants in the regex engine that can be tweaked at compile-time if not runtime and some builds tweak them differently.

So I was wondering if it's protection also covered " moves the pattern across the string".

@RunDevelopment
Copy link
Member

New Edge has the same behaviour as other Chromium-based browsers on my machine.

image

Didn't we already have this discussion?

We did. I just assumed that you were right but I did some testing afterward and found that it wasn't the case.

What did you do to get different behaviour?


Note:
IE11 behaves as you described (backtracking limit) and since old Edge is built on the same engine, it probably does too (haven't tested; restoring old Edge is a pain). But as you previously said:

[...] we don't care about the former [Edge] [pre Chrome engine] (as much).

and I don't either.

@joshgoebel
Copy link

What did you do to get different behaviour?

I believe was using one of the first exponential regex examples (I may glance at our earlier thread and it may jog my memory that it was one of those). I think I was using exec rather than test and I wasn't using performance or a closure... I just ran the variants in the console and noted how long it took to spit out the answer. No matter how high the repeat got Safari and Edge would always return fast, where-as Chrome would scale as you're seeing in the snap above.

Perhaps the behavior has changed - I was using the most recent Edge at the time (Chrome based), not an early version. I'll try it again later locally and see if I get different results than you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants