Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hardcoded 15 significant digits in num_to_char.c #412

Open
cat-zeppelin opened this issue Feb 10, 2023 · 14 comments
Open

Hardcoded 15 significant digits in num_to_char.c #412

cat-zeppelin opened this issue Feb 10, 2023 · 14 comments

Comments

@cat-zeppelin
Copy link
Contributor

cat-zeppelin commented Feb 10, 2023

Accordingly to Standard for Floating-Point Arithmetic (IEEE754) if a double-precision number converted to string with 17 significant digits and then converted back to double-precision representation, the final result must match the original number.

But unfortunatelly in numerous places in num_to_char.c (e.g. row 30
int sig_digits = signif ? ceil(fmin(15, precision)) : 0;)
) the number of significant digits limited to 15, so that it is impossible to serialize a double precision number to json and deserialize it to the original value.

The current implementation returns FALSE for the simple check

json <- jsonlite::toJSON(pi, digits = I(15))
pi2 <- jsonlite::fromJSON(json)
identical(pi, pi2)
# > [1] FALSE

I've rebuilt the package locally with fixed 15 to 17 in num_to_char.c, and now it works as it must work. This code returns TRUE

json <- jsonlite::toJSON(pi, digits = I(17))
pi2 <- jsonlite::fromJSON(json)
identical(pi, pi2)
# > [1] TRUE
@jeroen
Copy link
Owner

jeroen commented Feb 10, 2023

Can you show us the json string for both examples?

@cat-zeppelin
Copy link
Contributor Author

jsonlite::toJSON(pi, digits = I(17))
[3.1415926535897931]
jsonlite::toJSON(pi, digits = I(15))
[3.14159265358979]

@jeroen
Copy link
Owner

jeroen commented Feb 10, 2023

OK and now look up the true value of π with 17 digits.

@cat-zeppelin
Copy link
Contributor Author

np, here it is
json <- jsonlite::toJSON(pi, digits = I(17))
pi2 <- jsonlite::fromJSON(json)
identical(pi, pi2)

> [1] TRUE

@cat-zeppelin
Copy link
Contributor Author

identical(as.numeric("3.1415926535897932"), as.numeric("3.1415926535897931"))
[1] TRUE

@jeroen
Copy link
Owner

jeroen commented Feb 10, 2023

Here is a hint ;) https://www.quora.com/Whats-the-17th-digit-of-pi

It may roundtrip, but the digit is not significant in the statistical sense. It is actually wrong in this case, so that is why we don't print it.

Anyway I guess we can change it, given that it only appears if users explicitly use I(17)

@cat-zeppelin
Copy link
Contributor Author

it is not said by standard that different decimal representation won't be the same in binary, rather it is said that 17 digits is minimum necessary number of significant digits to represent a floating point number with 53 bits mantisa in decimal representation

@cat-zeppelin
Copy link
Contributor Author

cat-zeppelin commented Feb 10, 2023

IEEE 754 says:
image

@jeroen
Copy link
Owner

jeroen commented Feb 10, 2023

it is not said by standard that different decimal representation won't be the same in binary, rather it is said that 17 digits is minimum necessary number of significant digits to represent a floating point number with 53 bits mantisa in decimal representation

Yes I understand it very well :) The question is: do we want a json string that is an accurate representation of the real number (pi in this case) or of the float value approximating that number (including the inaccurate noise part)? As a statistician I am inclined to the former, and and consider the 17th digit insignificant, because it is not accurate.

@cat-zeppelin
Copy link
Contributor Author

but the same may happen with 0.1 simpy. there is no exact representation in binary form of this simple decimal number.
we can not cover all cases of transforming binary to decimal exactelly. we just need to follow standarts that allow 1-to-1 forward and back transforming from binary to decimal and back

@jeroen
Copy link
Owner

jeroen commented Feb 10, 2023

we just need to follow standarts that allow 1-to-1 forward and back transforming from binary to decimal and back

Then the problem appears the other way around, with 17 digits we can no longer convert the decimal to binary and back.

x <- "[3.1415926535897931]"
toJSON(fromJSON(x), digits = I(17))

Anyway it's fine by me, can you send a PR with your change?

@cat-zeppelin
Copy link
Contributor Author

you use perhaps your original package that takes min between 15 and precision.
if you rebuild it then all good

x <- "[3.1415926535897931]"
jsonlite::toJSON(jsonlite::fromJSON(x), digits = I(17))
[3.1415926535897931]

another exmple of why we should not believe decimal represenation of binary number is below: (quite obvious)

x <- 0.099999999999999999
jsonlite::toJSON(x)
[0.1]

as for PR - I'll do

@Kodiologist
Copy link

@jeroen I'm guessing this issue should be closed now that the PR is in.

Relatedly, the help string for serializeJSON says "JSON is a text based format which leads to loss of precision when printing numbers", which isn't quite true. Rather, rounding causes loss of precision. I'd recommend setting the default digits value to NA for this and other functions and removing the warning. This will protect against loss of precision not requested by the user.

@jeroen
Copy link
Owner

jeroen commented Jun 26, 2023

@cat-zeppelin your change has caused some breakage. Can you please fix #420 ?

jeroen added a commit that referenced this issue Jun 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants