Feature: Retain excess precision during float conversions #441

Merged: 3 commits merged on Oct 21, 2021

Conversation

paupino (Owner) commented Oct 20, 2021

Fixes #438

Adds two functions, from_f32_retain and from_f64_retain, which retain the excess float bits when converting from f32 and f64 respectively.

hniksic commented Oct 21, 2021

Thanks for implementing this. As mentioned in #438, please do consider renaming the functions to from_f32_retain and from_f64_retain, which are nicer/shorter but still descriptive.

I tried this branch with several numbers, including the 0.1 example, and the results are as expected: Decimal::from_f32_retain_excess_bits(0.1) returns 0.100000001490116119384765625 and Decimal::from_f64_retain_excess_bits(0.1) returns 0.1000000000000000055511151231.
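The f32 result quoted above can be cross-checked without the crate. The following is a minimal sketch using only the standard library; it is not the implementation from this PR, and the helper names exact_decimal_f32 and format_decimal are hypothetical. It decodes the IEEE 754 bits of an f32 and uses the identity m * 2^-k = m * 5^k / 10^k to get the exact decimal digits.

```rust
/// Exact decimal expansion of a non-negative finite f32 as (digits, scale),
/// meaning value = digits / 10^scale. Hypothetical helper, not rust_decimal API.
/// Returns None for negative input (including -0.0), NaN, infinity, or when
/// the exact digits do not fit in a u128.
fn exact_decimal_f32(x: f32) -> Option<(u128, u32)> {
    if !x.is_finite() || x.is_sign_negative() {
        return None;
    }
    if x == 0.0 {
        return Some((0, 0));
    }
    let bits = x.to_bits();
    let exp_bits = ((bits >> 23) & 0xff) as i32;
    let frac = (bits & 0x007f_ffff) as u128;
    // Decode so that value = mantissa * 2^exp (subnormals have no implicit bit).
    let (mantissa, exp) = if exp_bits == 0 {
        (frac, -149)
    } else {
        (frac | 0x0080_0000, exp_bits - 150)
    };
    let (mut digits, mut scale) = if exp >= 0 {
        (mantissa.checked_mul(2u128.checked_pow(exp as u32)?)?, 0u32)
    } else {
        // m * 2^-k == m * 5^k / 10^k, so the digits are m * 5^k at scale k.
        let k = (-exp) as u32;
        (mantissa.checked_mul(5u128.checked_pow(k)?)?, k)
    };
    // Drop trailing zeros so the scale is minimal.
    while scale > 0 && digits % 10 == 0 {
        digits /= 10;
        scale -= 1;
    }
    Some((digits, scale))
}

/// Render (digits, scale) as a plain decimal string.
fn format_decimal(digits: u128, scale: u32) -> String {
    let s = format!("{:0width$}", digits, width = scale as usize + 1);
    let (int_part, frac_part) = s.split_at(s.len() - scale as usize);
    if frac_part.is_empty() {
        int_part.to_string()
    } else {
        format!("{}.{}", int_part, frac_part)
    }
}
```

For 0.1f32 this gives 27 fractional digits, which is within the scale limit of 28. The same approach for f64 would need wider integer arithmetic than u128, so it is left out of this sketch.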

The f64 result looked at first like it lacked a digit - it returns a decimal with mantissa() and scale() of 1000000000000000055511151231 and 28 respectively, where I expected 10000000000000000555111512313 and 29. But then I learned that the scale is capped at 28, so this is actually correct.

Out of curiosity, why is the maximum scale 28? The number of bits available after accounting for the mantissa and sign is quite large (31 bits, allowing scales up to a billion). I understand that the limit corresponds to the maximum scale of the mantissa, but it seems to unnecessarily limit the significant precision of numbers closer to zero.

paupino (Owner, Author) commented Oct 21, 2021

Cool - I've changed function names to be more concise.

@schungx also wrote some great comments about a precision of 28 vs 29 in issue #414 - highlights:

A 96-bit mantissa can only hold 28 full decimal digits (the 29th one incomplete), meaning that numeric results should always normalize to 10^28, throwing out the remainder (if <= 4) or adding one (if >= 5). Essentially rounding the mantissa to 28 digits.

...because 2^96 as a mantissa cannot represent 29 full digits (one of the digits is at most 7, missing 8 and 9). Therefore, you technically cannot represent exact precision of all 29-digit numbers. Which then leads to the fact that, in any calculation, the 29th digit will always be suspect due to imprecise precision.

Thus using a 96-bit mantissa gives you either: a 28-digit precise number, or... a 29-digit number which has only 28 digits if it starts with 8 or 9...

In addition, it allows us to possibly make performance optimizations for various operations - since we know what the constraints are.
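The digit-count argument quoted above can be checked with a few lines of standard-library Rust. This is only an illustrative sketch: the names MAX_MANTISSA, fits_in_mantissa and full_decimal_digits are hypothetical and are not part of the crate.

```rust
/// Largest value a 96-bit unsigned mantissa can hold: 2^96 - 1.
const MAX_MANTISSA: u128 = (1u128 << 96) - 1;

/// True if n can be stored in 96 bits.
fn fits_in_mantissa(n: u128) -> bool {
    n <= MAX_MANTISSA
}

/// Largest d such that every d-digit decimal integer fits in 96 bits.
fn full_decimal_digits() -> u32 {
    let mut d = 0;
    // 10^(d + 1) - 1 is the largest integer with d + 1 digits.
    while fits_in_mantissa(10u128.pow(d + 1) - 1) {
        d += 1;
    }
    d
}
```

MAX_MANTISSA is 79228162514264337593543950335, a 29-digit number whose leading digit is 7. Every 28-digit integer fits, but a 29-digit integer starting with 8 or 9 does not, so full_decimal_digits() returns 28.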

paupino (Owner, Author) commented Oct 21, 2021

Also, @hniksic thank you for taking the time to review! Very much appreciated.

paupino marked this pull request as ready for review October 21, 2021 15:14
paupino changed the title from "Feature: Float excess precision retention" to "Additional functions to retain additional float bits while parsing f32/f64" Oct 21, 2021
paupino changed the title from "Additional functions to retain additional float bits while parsing f32/f64" to "Feature: Retain excess precision during float conversions" Oct 21, 2021
paupino merged commit d762427 into master Oct 21, 2021
paupino deleted the issue/438 branch January 11, 2022 03:02
Linked issue: Feature request: Conversion from float retaining excess precision