Document precisely what we validate for skipped values #1640
Comments
A couple of things here seem like we may want to change in terms of featureset:
I remember at least one user who would disagree with that.
I would be careful about adding too much weight to stage 1. We want to serve well the user that only wants to extract a tiny bit of information in a document that they "know" is correct.
@jkeiser What about a 'validate()' method on the document? Now that we have document rewinds... that could be useful. This way, people who don't need it do not have to pay a price for it.
I actually think we should have a (templated?) version of on demand that validates everything as you go. I think validate() is a good idea too. My change suggestions here are to try and get closer to the ideal "any errors in unused json that would cause you to get the wrong value later on will be detected."
Perhaps more succinctly: "simdjson will detect any JSON errors that would change the output of your program."
I think the default user will want some assurance it is safe. I can imagine a flag that goes the other way, too: "assume the document is valid."
We always validate whatever you use, as well as any objects and arrays that are part of the path to it. The idea is, you should never get the wrong value just because something you skipped has an error in it; but there are many errors that won't affect that.
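To make the distinction concrete, here is a minimal, self-contained C++ sketch of the idea. It is a toy model and not simdjson's implementation or API: the helper names `skip_string` and `read_string` are hypothetical, and the code only handles strings. Skipping checks just enough (the string is closed, and `\"` does not end it) that later values cannot be misread, while escape sequences are checked only when the string is actually read.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Toy model only; hypothetical helpers, not part of simdjson.

// Skip a JSON string that starts at json[pos] == '"'.
// Only checks that the string is closed, honoring backslash escapes so that
// \" does not terminate it. Escape contents are NOT validated here.
// Returns the index just past the closing quote, or nullopt if unclosed.
std::optional<std::size_t> skip_string(std::string_view json, std::size_t pos) {
  if (pos >= json.size() || json[pos] != '"') return std::nullopt;
  for (std::size_t i = pos + 1; i < json.size(); ++i) {
    if (json[i] == '\\') { ++i; continue; }  // step over the escaped character
    if (json[i] == '"') return i + 1;
  }
  return std::nullopt;  // ran off the end: the string was never closed
}

// Read a JSON string that the caller actually uses: escapes are validated.
// \u must be followed by four hex digits; this sketch checks them but keeps
// the \uXXXX text verbatim instead of decoding it.
std::optional<std::string> read_string(std::string_view json, std::size_t pos) {
  if (pos >= json.size() || json[pos] != '"') return std::nullopt;
  std::string out;
  for (std::size_t i = pos + 1; i < json.size(); ++i) {
    char c = json[i];
    if (c == '"') return out;
    if (c != '\\') { out.push_back(c); continue; }
    if (++i >= json.size()) return std::nullopt;
    switch (json[i]) {
      case '"':  out.push_back('"');  break;
      case '\\': out.push_back('\\'); break;
      case '/':  out.push_back('/');  break;
      case 'b':  out.push_back('\b'); break;
      case 'f':  out.push_back('\f'); break;
      case 'n':  out.push_back('\n'); break;
      case 'r':  out.push_back('\r'); break;
      case 't':  out.push_back('\t'); break;
      case 'u': {
        if (i + 4 >= json.size()) return std::nullopt;
        for (std::size_t k = 1; k <= 4; ++k) {
          if (!std::isxdigit(static_cast<unsigned char>(json[i + k]))) return std::nullopt;
        }
        out.append(json.substr(i - 1, 6));  // keep \uXXXX as-is (decoding omitted)
        i += 4;
        break;
      }
      default: return std::nullopt;  // unknown escape such as \p
    }
  }
  return std::nullopt;  // unclosed string
}
```

In this sketch, a string containing `\p` can be skipped without error, and a string that follows it is still read correctly; the same string is rejected only if the caller tries to read it. An unclosed string is rejected even when skipped, because it would change where every later value begins.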
A (partial?) list for documentation of what we do and don't validate for skipped values:
Strings:
- [...] `\"`). This means that even if a string has some invalid stuff inside, we validate enough that they cannot affect anything else.
- [...] `get_string()` or `unescaped_key()`). i.e. `\p` is not allowed, and `\u` must be followed by hex digits.
Numbers/Booleans/Null:
- [...] `-0ab-10 trrrue` as two values without a comma between them, whether you actually use the values or not.
Arrays/Objects:
- [...] `,` or `:` when you iterate or index an array or object.
- [...] `,`/`:` in an array or object if you fully iterate it.
- [...] `]` or `}` matches the opening one if you fully iterate the object/array.
Document: