You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I've submitted a previous PR (#233) with some improvements for string parsing. My proposed version has this features:
Avoid OOM or Stack size exceed generated with long strings in the current version v2.1.3
Time parsing improvements. It can parse a 10MB string in just 1 second while v2.1.3 requires more than 2 mins. Longer sizes crashes current version.
Keeps all features of v2.1.3
However, this is not enough for my application (I receive strings with more than 100MB). These strings are parsed in ~6 secs and this is too much for my application. With JSON, I can parse these strings in less than a second (600-800 ms) but I can't use this parser due to I use JSON5 standard and some features aren't supported by JSON.
All tests are passed correctly (but error position)
escaped sequences and special characters produce the same output
unsupported chars are detected and reported
However, I have 2 changes from current version in the error reporting:
I report the full error sequence. I mean, if it parses '"\u000g"', the current version reports an error like this:
JSON5: invalid character 'g' at 1:6
But my version reports this error:
JSON5: invalid sequence '\u000g' at 1:1
I guess this report is more descriptive than the previous one.
And the most important problem is that I can't identify the exact position of the error if I get a parsing error. So on parsing this string '"aaaa\\n\n\n\u000g"' the current v2.1.3 reports:
JSON5: invalid character 'g' at 2:12
but mine reports (by now):
JSON5: invalid sequence '\u000g' at 1:1
I could work on the error reporting to get a better location of the error but I'm quite sure it isn't possible report the real position due to escaped characters. Probably I could provide something like:
JSON5: invalid sequence '\u000g' near to 2:4
What do you suggest? What is more important, error reporting or execution time?
By now, I'll try to return the same location for error reporting but I'm not really sure if it is possible.
The text was updated successfully, but these errors were encountered:
Thanks for your work on this. Performance is something I haven't had time to work on, so this is a great help.
I like the error reporting changes. It makes things clearer. I would definitely say that accuracy is more important than performance though, especially since this is a reference parser.
I've submitted a previous PR (#233) with some improvements for string parsing. My proposed version has this features:
However, this is not enough for my application (I receive strings with more than 100MB). These strings are parsed in ~6 secs and this is too much for my application. With JSON, I can parse these strings in less than a second (600-800 ms) but I can't use this parser due to I use JSON5 standard and some features aren't supported by JSON.
So I've implemented a new parser for strings (https://github.com/jlguardi/json5/tree/string_replace) which is able to parse in 700ms and I keep almost all features of v2.1.3:
However, I have 2 changes from current version in the error reporting:
But my version reports this error:
I guess this report is more descriptive than the previous one.
but mine reports (by now):
I could work on the error reporting to get a better location of the error but I'm quite sure it isn't possible report the real position due to escaped characters. Probably I could provide something like:
What do you suggest? What is more important, error reporting or execution time?
By now, I'll try to return the same location for error reporting but I'm not really sure if it is possible.
The text was updated successfully, but these errors were encountered: