Feature: Retain excess precision during float conversions #441

paupino · 2021-10-20T17:21:50Z

Fixes #438

Adds two functions, from_f32_retain and from_f64_retain, which retain the excess float bits when converting from f32 and f64 respectively.

hniksic · 2021-10-21T06:58:22Z

Thanks for implementing this. As mentioned in #438, please do consider renaming the functions to from_f32_retain and from_f64_retain, which are nicer/shorter but still descriptive.

I tried this branch with several numbers, including the 0.1 example, and the results are as expected: Decimal::from_f32_retain_excess_bits(0.1) returns 0.100000001490116119384765625 and Decimal::from_f64_retain_excess_bits(0.1) returns 0.1000000000000000055511151231.

The f64 result looked at first like it lacked a digit - it returns a decimal with mantissa() and scale() of 1000000000000000055511151231 and 28 respectively, where I expected 10000000000000000555111512313 and 29. But then I learned that the scale is capped at 28, so this is actually correct.
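To see where those digits come from, here is a minimal sketch that expands a float exactly using only integer arithmetic from the standard library. It is not the crate's implementation: the helper name `scaled_to_28_places` is hypothetical, and it only handles positive, normal values below 1. It decodes the f64 bit pattern into significand / 2^shift and then rounds the result to 28 decimal places.

```rust
/// Hypothetical helper (not part of rust_decimal): returns round(x * 10^28)
/// for a positive, normal f64 strictly below 1, using exact integer math.
fn scaled_to_28_places(x: f64) -> u128 {
    let bits = x.to_bits();
    let exp_field = ((bits >> 52) & 0x7ff) as u32;
    assert!(
        x > 0.0 && x < 1.0 && exp_field != 0,
        "sketch only handles normal values in (0, 1)"
    );

    // Restore the implicit leading 1 of the 53-bit significand.
    let significand = ((bits & ((1u64 << 52) - 1)) | (1u64 << 52)) as u128;

    // x == significand / 2^shift exactly; shift >= 53 because x < 1.
    let shift = 1075 - exp_field;

    // significand * 10^28 / 2^shift == significand * 5^28 / 2^(shift - 28).
    // The product stays below 2^119, so it cannot overflow a u128.
    let numerator = significand * 5u128.pow(28);
    let down = shift - 28;
    if down >= 120 {
        // The value is smaller than half a unit in the 28th place.
        return 0;
    }

    // Round half up while dividing by 2^down.
    (numerator + (1u128 << (down - 1))) >> down
}
```

Calling `scaled_to_28_places(0.1)` yields the 28-digit mantissa quoted above, and `scaled_to_28_places(0.1_f32 as f64)` yields the f32 digits followed by a trailing zero, since widening an f32 to f64 is exact.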

Out of curiosity, why is the maximum scale 28? The number of bits available after taking into account the mantissa and sign is quite large (31 bits, allowing scales up to a billion). I understand that the limit corresponds to the maximum scale of the mantissa, but it seems to unnecessarily limit the significant precision of numbers closer to zero.

paupino · 2021-10-21T15:12:51Z

Cool - I've changed the function names to be more concise.

@schungx also wrote some great comments about a precision of 28 vs 29 in issue #414 - highlights:

A 96-bit mantissa can only hold 28 full decimal digits (the 29th one incomplete), meaning that numeric results should always normalize to 10^28, throwing out the remainder (if <= 4) or adding one (if >= 5). Essentially rounding the mantissa to 28 digits.

...because 2^96 as a mantissa cannot represent 29 full digits (one of the digits is at most 7, missing 8 and 9). Therefore, you technically cannot represent exact precision of all 29-digit numbers. Which then leads to the fact that, in any calculation, the 29th digit will always be suspect due to imprecise precision.

Thus using a 96-bit mantissa gives you either: a 28-digit precise number, or... a 29-digit number which has only 28 digits if it starts with 8 or 9...

In addition, it allows us to possibly make performance optimizations for various operations - since we know what the constraints are.
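The 28 versus 29 digit argument quoted above can be checked with a few lines of integer arithmetic. This is a rough sketch using only the standard library; the names `MAX_MANTISSA_96`, `fits_in_96_bits`, and `decimal_digits` are illustrative and do not come from the crate.

```rust
/// Largest value an unsigned 96-bit mantissa can hold: 2^96 - 1.
const MAX_MANTISSA_96: u128 = (1u128 << 96) - 1;

/// True if `n` is representable in 96 bits.
fn fits_in_96_bits(n: u128) -> bool {
    n <= MAX_MANTISSA_96
}

/// Number of decimal digits needed to write `n`.
fn decimal_digits(mut n: u128) -> u32 {
    let mut digits = 0;
    loop {
        digits += 1;
        n /= 10;
        if n == 0 {
            return digits;
        }
    }
}
```

`MAX_MANTISSA_96` is 79228162514264337593543950335, which has 29 digits and begins with 7. Every 28-digit number fits (`fits_in_96_bits(10u128.pow(28) - 1)` is true), but a 29-digit number beginning with 8 or 9 does not (`fits_in_96_bits(8 * 10u128.pow(28))` is false).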

paupino · 2021-10-21T15:13:58Z

Also, @hniksic thank you for taking the time to review! Very much appreciated.

Demo Excess precision retention

5dc706e

paupino mentioned this pull request Oct 20, 2021

Feature request: Conversion from float retaining excess precision #438

Closed

Move logic into an explicit function

8a10728

Rename function to be more concise

c81d2b5

paupino marked this pull request as ready for review October 21, 2021 15:14

paupino changed the title ~~Feature: Float excess precision retention~~ Additional functions to retain additional float bits while parsing f32/f64 Oct 21, 2021

paupino changed the title ~~Additional functions to retain additional float bits while parsing f32/f64~~ Feature: Retain excess precision during float conversions Oct 21, 2021

paupino merged commit d762427 into master Oct 21, 2021

paupino deleted the issue/438 branch January 11, 2022 03:02
