Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Gson and JsonParser trailing data handling #2123

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

Marcono1234
Copy link
Collaborator

@Marcono1234 Marcono1234 commented May 22, 2022

Improves the trailing data handling for Gson and JsonParser, mainly addressing the following issues:

  • The old code was not detecting trailing data after a JSON null, e.g. null[], because it erroneously considered a null to always mean that the input was empty
  • The existing check for trailing data in the form of multiple top-level JSON elements was never working (?). While the parsing was done in lenient mode, the check reader.peek() != JsonToken.END_DOCUMENT was done separately, where the JsonReader was in strict mode again and therefore peek() itself threw a MalformedJsonException.
    With the new code added by this pull request the custom JsonSyntaxException is now thrown for multiple top-level JSON elements; however, a MalformedJsonException could likely still occur if the trailing data is not valid JSON data, but that is probably acceptable.

*
* @author Inderjeet Singh
* @author Joel Leitch
*/
public class JsonParserTest extends TestCase {
public class GsonParsingTest extends TestCase {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have renamed this class because there exists already a JsonParserTest (in a different package), and this test class is mainly checking the behavior of Gson methods.

@Marcono1234 Marcono1234 force-pushed the marcono1234/Gson-JsonParser-trailing-data-check branch from 0da3a56 to ee70912 Compare June 29, 2022 19:55
Copy link
Member

@eamonnmcmanus eamonnmcmanus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran this against all of Google's tests and did not find any problems, except:

  • references to a now-deleted internal method, which I think needs to be undeleted
  • tests that check for specific exceptions that are now different

I did have to merge with HEAD, and that implied adding a couple of @Test annotations to the now-JUnit4 JsonParserTest.

*/
public static JsonElement parse(JsonReader reader) throws JsonParseException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm finding a surprisingly large number of references to this method in Google's source base (including third-party code). In some cases it is because people have copied RuntimeTypeAdapterFactory into their own projects. So I think we probably need to keep this overload. Then of course we could continue to call it from RuntimeTypeAdapterFactory instead of the new (..., false) version.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, before #1959 RuntimeTypeAdapterFactory was using the internal Streams class. I have now added back that method, but marked it as deprecated to make users aware that it is declared in an internal Gson class, I hope that is ok.

Then of course we could continue to call it from RuntimeTypeAdapterFactory instead of the new (..., false) version.

The RuntimeTypeAdapterFactory variant which was still using this method was actually a copy of the class from the extras module. I have now removed that copy (and the enclosing test class RuntimeTypeAdapterFactoryFunctionalTest) and have moved the adjusted test code to the RuntimeTypeAdapterFactoryTest class in the extras module to avoid code duplication.

@Marcono1234
Copy link
Collaborator Author

tests that check for specific exceptions that are now different

Are these tests related to JSON with trailing data / multiple top level JSON values? And are the newly thrown exceptions more reasonable?
If not, which exceptions are now thrown? I thought I did not change any of the other exception handling logic.

@eamonnmcmanus
Copy link
Member

Are these tests related to JSON with trailing data / multiple top level JSON values? And are the newly thrown exceptions more reasonable?
If not, which exceptions are now thrown? I thought I did not change any of the other exception handling logic.

Thanks for prompting me to look at this in more detail. I think there are problems with the new code. I haven't looked at all the test failures, but the one I did look at failed because this:

JsonParser.parseString("$$$$////")

gets a JsonSyntaxException with the current code but returns a JsonPrimitive for the string "$$$$" with this change. That doesn't seem right.

@Marcono1234
Copy link
Collaborator Author

Marcono1234 commented Jul 24, 2022

gets a JsonSyntaxException with the current code but returns a JsonPrimitive for the string "$$$$" with this change. That doesn't seem right.

The problem is that JsonParser parses leniently. However, before this PR the JsonReader.peek() call to determine whether the end of the document has been reached was (erroneously?) done after the original lenient mode was restored (therefore the JsonReader was already strict again). You will also notice that in the message of the original exception you get for that input:

Use JsonReader.setLenient(true) to accept malformed JSON at line 1 column 6 path $

This indicates that the first / caused the exception and not the $ (the location information is slightly incorrect, see also #1764).

With the changes of this PR the peek() call is now (correctly?) done in lenient mode as well, which therefore parses the JSON as:

  • String $$$$
  • End of line comment start //
    • End of line comment //

If you don't want this to be changed, then the end of document checks can most likely be removed (or reduced to a peek() call discarding the result) because they are effectively dead code in their current form. Though explaining to a user when and how Gson is lenient becomes then slightly more confusing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants