Fixes #438 - Corrections to BigDecimal consistency #440

johnjaylward · 2018-10-02T16:54:00Z

What problem does this code solve?

Fixes opt/getBigDecimal to be consistent
Updates get<NumberType> operations to return the result of getNumber for consistency and less copy/paste of code
Updates opt<NumberType> operations to return the result of optNumber for consistency and less copy/paste of code
Performance: Updates JSONWriter to use a regex to decide if writing as a number is best.

Risks
Due to the incorrect behavior of getBigDecimal, some code may now return different values than previously. The new behavior tries to keep the underlying value as close as possible, same as the optBigDecimal implementation. This means that values like 1.234f will no longer return as new BigDecimal("1.234") and will instead return as new BigDecimal(1.234f). The caller of get/optBigDecimal can then choose the number of significant digits they need by using the BigDecimal.setScale and choosing a rounding mode.

Changes to the API?
no.

Will this require a new release?
yes.

Should the documentation be updated?
Updated JavaDoc to reflect the possible inconsistencies callers may see when using get/optBigDecimal

Does it break the unit tests?
No. There was no test case that covered verifying that opt/getBigDecimal values matched. New test cases were added. See stleary/JSON-Java-unit-test#88

Was any code refactored in this commit?
Yes,

JSONObject.writeValue and JSONWriter.valueToString. I updated how numbers are validated before outputting them. We now use a regex that properly matches JSON Numbers instead of the BigDecimal constructor. Performance testing between the two implementations shows the regex offers significant speed gains over the BigDecimal constructor in most use cases. (see the new test case in the test case PR). For the most common cases, the RegEx offers about 80% less runtime (9355ms for BigDecimal vs 1902ms RegEx on my machine). Adding in the worst case scenario (an extremely long string that looks like a number until the last digit) all the test cases ran at approximately 12417ms for BigDecimal and 10039ms for the RegEx. The savings are still there, but not nearly as significant.
The get<numberType> and opt<number type> methods have been changed to use the common getNumber and optNumber under the hood. Functionality should be identical, but there is now less duplicated code and a single (two really...) place where numbers are possibly coerced from other types instead of it happening in many code points.

Review status
ACCEPTED

johnjaylward · 2018-10-02T17:22:41Z

forgot to update JSONArray. I'll do that quick now.

* Performance: Updates JSONWriter to use a regex to decide if writing as a number is best.

stleary · 2018-10-04T02:13:09Z

Accepted, starting 3 day comment window. Thanks @johnjaylward, changes look good.

johnjaylward · 2018-10-04T20:04:31Z

@stleary can you review this latest commit too. fine if you want to restart the 3 day window. I was making sure we didn't have more cases of inconsistencies with numbers and decided to unify the Number methods a little more.

stleary · 2018-10-05T12:10:41Z

No worries, changes look reasonable, will verify the unit tests tonight.

guidomedina

Another drawback of this code clean up is, say you have the following Json, a number represented as String which I have seen in some APIs:

{ "a" : "3"}

because native calls to Xxx.parseXxx were removed for the following case an intermediary instance of Number will always be created generating more garbage to be GCed, in other words; you are always "boxing" to Number and unboxing to the request native value.

JSONArray.java

johnjaylward · 2019-08-14T13:24:30Z

Another drawback of this code clean up is, say you have the following Json, a number represented as String which I have seen in some APIs:
{ "a" : "3"}
because native calls to Xxx.parseXxx were removed for the following case an intermediary instance of Number will always be created generating more garbage to be GCed, in other words; you are always "boxing" to Number and unboxing to the request native value.

The values are all stored in a Map object, which means they are all boxed anyway. I don't think I added any extra boxing in that wasn't present before. Are there any areas you can point out specifically?

guidomedina · 2019-08-14T13:59:53Z

In the example I put a = "3" which is a String, in your case it is first wrapped into a Number and then the native value is returned, let's assume the number is Integer which is the extra instance.

In the old code String "3" would be passed to Integer.parseInt("3") which returns the a native int without any object allocation.

Please note that I'm talking about the Json {"a" : "3"} and not {"a" : 3}, the 1st example will store a String in the Map, the 2nd example which is the standard and most common it will be indeed a subclass of Number, because number as a String is supported by this API it is only natural to take this case into consideration as well.

johnjaylward · 2019-08-14T14:56:01Z

Thanks, I'll take a look. That was probably an over site on my part.

johnjaylward · 2019-08-14T16:46:25Z

@guidomedina , I looked closer at the full implementation and I believe the reason I made for going this route was as follows:

A JSON Text of {"key":"3"} will usually be passed to a JSON object like so: jo = New JSONObject("{\"key\":\"3\"}");. The JOSN parser will parse the String 3 into an Integer 3 and store the Integer in the map.

The only way we would see the conversion happening in the get/opt and making extra Objects for GC, would be something like these cases:
Conversion from XML:

String xml = "<root><key>3</key></root>";
JSONObject jo = XML.ToJSONObject(xml, true); // keep all values as strings
jo.getInt(key); // creates the intermediate Number which then calls Number.intValue()

Custom built up JSONObject/array:

JSONObject jo = new JSONObject();
jo.put("key", "3"); // explicitly put the value as a string
jo.getInt("key"); // creates the intermediate Number which then calls Number.intValue()

The first one could be avoided by pre-processing the string using XML.ToJSONObject(xml), but that comes with it's own issues.

The second one seems totally avoidable altogether if the coder had planed out ahead of time what types go into the keys instead of lazily putting in a String and depending on the conversions. The "parse" of the sting will likely take up much more CPU time than the Number intermediary creation and GC, but that would probably be best to actually write some performance tests against instead of making generalizations.

guidomedina · 2019-08-14T17:11:20Z

This is not always the case, in the utility industry in general IoT providers come in many flavors, we cannot control how a third party provides us their Json.

Creating an intermediary instance of any of the Number sub-classes has to parse first and then wrap the native value within a reference.

See, Number is an interface:

http://kickjava.com/src/java/lang/Number.java.htm
and to obtain an Integer instance it has to parse first, this is the same for any other Number subclass, here is the source code of valueOf method:

public static Integer valueOf(String s) throws NumberFormatException  {
    return new Integer(parseInt(s, 10));
}

johnjaylward · 2019-08-14T17:34:30Z

I double checked the implementation, and I misspoke. A call like this JSONObject jo = new JSONObject("{\"key\":\"3\"}"); will always store the "3" as a String. The parser only trys finding/coercing values that are unquoted, and if that fails, uses it as a string. So JSONObject jo = new JSONObject("{\"key\":3a}"); where 3a is unquoted in the JSON would try to be parsed as a Number first but fails, then will be stored as a String.

I'll look into updating the calls to get the extra Number instances removed. I don't really like so much copy/paste code, but I'm not seeing a more generic way of doing it.

guidomedina · 2019-08-14T17:36:57Z

There isn't really a generic way to do this unfortunately, see this other implementation for example: https://github.com/eclipse-vertx/vert.x/blob/master/src/main/java/io/vertx/core/json/JsonObject.java

johnjaylward · 2019-08-14T19:06:20Z

@guidomedina I finally remembered my final reasoning for changing the functions to always use the get/optNumber functions.

Previously, the parser for get/optInt would call Integer.parseInt(string). This would fail for things like {"key":"3.14"}. However, the same request for opt/getBigInteger would return a BigInteger with a value of 3. By switching to the more generic method and using the Number interface to get the underlying values, I was able to make these calls consistent.

jo = new JSONObject("{\"key\":\"3.14\"}");
jo.optInt("key", 0); // before the change, `0`, after the change `3`
jo.optLong("key", 0); // before the change, `0`, after the change `3`
jo.optBigInteger("key", 0); // before the change, `3`, after the change `3`

The end result was that we ended up with more GC, but more consistent functionality.

johnjaylward · 2019-08-14T19:15:49Z

Would you prefer the get<Number> methods to have the harder checks (i.e. no coercion from float/double to int) while the opt<Number> methods retain their more lax approach?

The changes in this PR aren't quite what I mentioned above (it was a few PRs to get where the current code is). This one, the opt<Number> methods where specifically changed from using a BigDecimal parser then down casting to an int via BigDecimal.intValue. In both cases (BigDecimal call directly, or the optNumber call, we end up with a short lived intermediate object to get the double/float/long/int.

Personally, I feel that the opt/get methods should use the same type conversions in the background, so using the same example {"key":"3.14"}, a call to getInt and optInt should return the same value of 3

johnjaylward mentioned this pull request Oct 2, 2018

Updated testcases for PR440 stleary/JSON-Java-unit-test#88

Merged

johnjaylward force-pushed the FixForBigDecimal branch 2 times, most recently from cdb3bfb to 6661986 Compare October 2, 2018 17:44

johnjaylward changed the title ~~Fixes #438~~ Fixes #438 - Corrections to BigDecimal consistency Oct 2, 2018

johnjaylward force-pushed the FixForBigDecimal branch 5 times, most recently from 46e7e2f to afcd9ab Compare October 2, 2018 19:23

John J. Aylward added 3 commits October 2, 2018 15:28

* Fixes opt/getBigDecimal to be consistent

a63fa03

* Performance: Updates JSONWriter to use a regex to decide if writing as a number is best.

cleanup of minor warnings

bc347d2

fix javadoc

30c1bd1

johnjaylward force-pushed the FixForBigDecimal branch from afcd9ab to 30c1bd1 Compare October 2, 2018 19:29

remove unneeded casts

71c6dd1

removes duplicate code in number getters

34cfe6d

stleary merged commit 1a811f1 into stleary:master Oct 6, 2018

guidomedina reviewed Aug 14, 2019

View reviewed changes

JSONArray.java Show resolved Hide resolved

johnjaylward deleted the FixForBigDecimal branch August 14, 2019 16:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixes #438 - Corrections to BigDecimal consistency #440

Fixes #438 - Corrections to BigDecimal consistency #440

johnjaylward commented Oct 2, 2018 •

edited

johnjaylward commented Oct 2, 2018

stleary commented Oct 4, 2018

johnjaylward commented Oct 4, 2018

stleary commented Oct 5, 2018

guidomedina left a comment •

edited

johnjaylward commented Aug 14, 2019

guidomedina commented Aug 14, 2019 •

edited

johnjaylward commented Aug 14, 2019

johnjaylward commented Aug 14, 2019 •

edited

guidomedina commented Aug 14, 2019 •

edited

johnjaylward commented Aug 14, 2019

guidomedina commented Aug 14, 2019

johnjaylward commented Aug 14, 2019

johnjaylward commented Aug 14, 2019

Fixes #438 - Corrections to BigDecimal consistency #440

Fixes #438 - Corrections to BigDecimal consistency #440

Conversation

johnjaylward commented Oct 2, 2018 • edited

johnjaylward commented Oct 2, 2018

stleary commented Oct 4, 2018

johnjaylward commented Oct 4, 2018

stleary commented Oct 5, 2018

guidomedina left a comment • edited

Choose a reason for hiding this comment

johnjaylward commented Aug 14, 2019

guidomedina commented Aug 14, 2019 • edited

johnjaylward commented Aug 14, 2019

johnjaylward commented Aug 14, 2019 • edited

guidomedina commented Aug 14, 2019 • edited

johnjaylward commented Aug 14, 2019

guidomedina commented Aug 14, 2019

johnjaylward commented Aug 14, 2019

johnjaylward commented Aug 14, 2019

johnjaylward commented Oct 2, 2018 •

edited

guidomedina left a comment •

edited

guidomedina commented Aug 14, 2019 •

edited

johnjaylward commented Aug 14, 2019 •

edited

guidomedina commented Aug 14, 2019 •

edited