Add `Decimal256` API #1914

viirya · 2022-06-20T04:50:01Z

Which issue does this PR close?

Closes #1913.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

codecov-commenter · 2022-06-20T05:08:04Z

Codecov Report

Merging #1914 (32be11a) into master (ded6316) will increase coverage by 0.01%.
The diff coverage is 90.72%.

@@            Coverage Diff             @@
##           master    #1914      +/-   ##
==========================================
+ Coverage   83.41%   83.43%   +0.01%     
==========================================
  Files         214      214              
  Lines       56991    57061      +70     
==========================================
+ Hits        47541    47610      +69     
- Misses       9450     9451       +1

| Impacted Files | Coverage Δ | |
|---|---|---|
| arrow/src/util/decimal.rs | `91.50% <90.62%> (+11.91%)` | ⬆️ |
| arrow/src/array/array_binary.rs | `94.18% <100.00%> (ø)` | |
| arrow/src/array/array_dictionary.rs | `91.53% <0.00%> (-0.39%)` | ⬇️ |
| parquet_derive/src/parquet_field.rs | `65.98% <0.00%> (ø)` | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

viirya · 2022-06-20T17:59:20Z

arrow/src/util/decimal.rs

-        self.value.partial_cmp(&other.value)
+impl Decimal128 {
+    /// Creates `Decimal128` from an `i128` value.
+    pub fn new_from_i128(precision: usize, scale: usize, value: i128) -> Self {


`new_from_i128` is still convenient to use, so I'm not sure we want to remove it. I've kept it for now.

It is mostly used internally in test code. Another option is not to expose it; perhaps we can make it crate-public (`pub(crate)`).

viirya · 2022-06-20T20:39:45Z

arrow/src/util/decimal.rs

+    /// returning an error. The bytes should be stored in little-endian order.
+    ///
+    /// Safety:
+    /// This method doesn't validate if the decimal value represented by the bytes


DecimalArray and DecimalBuilder already do value validation. So I skip validation here.

> DecimalArray and DecimalBuilder already do value validation. So I skip validation here.

If we can be sure of this, I think it looks good to me.
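
To make the point about skipped validation concrete, here is a minimal standalone sketch, not the PR's code, of decoding 16 little-endian bytes into an `i128` with no precision check. The name `decode_le_unchecked` is hypothetical, and the sketch assumes range checking happens elsewhere, as the comment above says it does in DecimalArray and DecimalBuilder.

```rust
/// Hypothetical helper, not part of arrow-rs: decodes exactly 16 little-endian
/// bytes into an `i128`. It checks only the length, never the decimal precision.
fn decode_le_unchecked(bytes: &[u8]) -> Option<i128> {
    if bytes.len() != 16 {
        return None;
    }
    let mut buf = [0u8; 16];
    buf.copy_from_slice(bytes);
    // Byte 0 is the least significant byte in little-endian order.
    Some(i128::from_le_bytes(buf))
}
```

Under these assumptions, `decode_le_unchecked(&i128::MAX.to_le_bytes())` returns `Some(i128::MAX)` even though that value has more digits than precision 38 allows, which is why the caller has to own the validation.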

viirya · 2022-06-20T20:41:27Z

arrow/src/util/decimal.rs

+    /// If the string representation cannot be fitted with the precision of the decimal,
+    /// the string will be truncated.


I considered whether the truncation is needed here. I eventually added it because `as_string` producing a string longer than the specified precision looks odd.

> DecimalArray and DecimalBuilder already do value validation. So I skip validation here.

Why do we need the truncation if we can always make sure that the value can be stored within the precision?

I just wonder whether users might use it on its own (i.e., without the value validation in DecimalArray and DecimalBuilder) and get confused.

alamb · 2022-06-21T19:51:17Z

I will try and find time to review this tomorrow

HaoYang670 · 2022-06-22T12:22:36Z

arrow/src/util/decimal.rs

+                if sign.len() == 1 {
+                    bound += 1;
+                }
+                let value_str = value_str[0..bound].to_string();


Why do we slice from the most significant digit?
If value_str = 1000, precision = 3, scale = 1, then the expected output is 10.0?

If the precision is 3, the maximum value is 999.
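
A small sketch of the behavior being asked about may help. It assumes the truncation keeps the leading `precision` digits of the unscaled value and then places the decimal point `scale` digits from the right. `format_truncated` is a hypothetical name, and the PR's actual implementation may differ in its details.

```rust
/// Hypothetical sketch, not the arrow-rs implementation: formats an unscaled
/// decimal value, keeping at most `precision` leading digits.
fn format_truncated(value: i128, precision: usize, scale: usize) -> String {
    let sign = if value < 0 { "-" } else { "" };
    let digits = value.unsigned_abs().to_string();
    // Truncate from the least significant side: keep the leading digits only.
    let kept = &digits[..digits.len().min(precision)];
    // Left-pad with zeros so there is at least one digit before the point.
    let padded = if kept.len() <= scale {
        format!("{:0>width$}", kept, width = scale + 1)
    } else {
        kept.to_string()
    };
    if scale == 0 {
        return format!("{}{}", sign, padded);
    }
    let (int_part, frac_part) = padded.split_at(padded.len() - scale);
    format!("{}{}.{}", sign, int_part, frac_part)
}
```

With these assumptions, `format_truncated(1000, 3, 1)` yields `10.0`, which is the output the question above describes, and `format_truncated(-1, 38, 2)` yields `-0.01`.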

liukun4515 · 2022-06-22T12:40:55Z

I will review this PR tomorrow. @viirya

alamb

I think it looks good to me after resolving comments from @HaoYang670 and @liukun4515

viirya · 2022-06-22T22:09:36Z

There seem to be some unrelated errors in the interop test between Go and C#:

################# FAILURES #################
FAILED TEST: datetime Go producing,  C# consuming
1 failures
1
Error: `docker-compose --file /home/runner/work/arrow-rs/arrow-rs/docker-compose.yml run --rm -e ARCHERY_INTEGRATION_WITH_RUST=1 conda-integration` exited with a non-zero exit code 1, see the process log above.

liukun4515 · 2022-06-23T09:36:03Z

arrow/src/util/decimal.rs

+        assert_eq!(value.to_string(), "-0.01");
+
+        bytes = i128::MAX.to_le_bytes();
+        let value = Decimal128::try_new_from_bytes(38, 2, &bytes).unwrap();


The decimal representation of `i128::MAX` (the value behind `i128::MAX.to_le_bytes()`) is 39 digits wide.
The input value, whether given as `i128` or as bytes, is larger than the precision allows, and that looks a bit weird to me.
For decimal(3,2), if we put 12345 and 12344 into it, `to_string` returns the same value for both,
but `Ord` and `Eq` treat them as different.
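
The mismatch described here can be reproduced in isolation. The following is a sketch under the assumption that printing keeps only the leading `precision` digits while comparison uses the full unscaled value. Both function names are hypothetical, and the sign is ignored for brevity.

```rust
/// Hypothetical helper: keeps only the leading `precision` digits of the
/// unscaled value, mimicking the truncation discussed in this thread.
/// The sign is dropped to keep the sketch short.
fn leading_digits(value: i128, precision: usize) -> String {
    let digits = value.unsigned_abs().to_string();
    digits[..digits.len().min(precision)].to_string()
}

/// Returns true when two distinct unscaled values collapse to the same
/// truncated digits, i.e. they would print alike but still compare unequal.
fn prints_alike_but_differs(a: i128, b: i128, precision: usize) -> bool {
    a != b && leading_digits(a, precision) == leading_digits(b, precision)
}
```

For example, `prints_alike_but_differs(12345, 12344, 3)` is true: both values truncate to the digits `123`, yet `12345 > 12344` still holds for the underlying integers.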

The `i128` maximum is larger than precision 38 allows, so truncation happens. I thought I could either put value validation into the Decimal structs or do string truncation. As I mentioned earlier, DecimalArray/DecimalBuilder already do value validation, so I feel it is redundant to do it here again. Ideally these Decimal structs are used only as input/output of DecimalArray. Truncation ensures that we won't produce a string longer than the precision. In normal usage, a Decimal struct should not be given a value larger than its precision, because that will be caught by DecimalArray/DecimalBuilder.

Alternatively, `to_string` could return an error if the string length is larger than the precision. But since DecimalBuilder can optionally skip value validation, we can probably end up with Decimal structs in that state, so returning an error does not seem to fit.
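
For reference, the arithmetic behind "larger than precision 38" can be checked with a short sketch. It assumes a value fits a precision `p` when its magnitude is at most 10^p - 1. The function names are hypothetical, and this is not necessarily how DecimalArray/DecimalBuilder implement their check.

```rust
/// Largest magnitude representable with `precision` decimal digits,
/// or `None` if 10^precision does not fit in an `i128`.
fn max_for_precision(precision: u32) -> Option<i128> {
    10_i128.checked_pow(precision).map(|p| p - 1)
}

/// Number of decimal digits in the magnitude of `value`.
fn decimal_digits(value: i128) -> usize {
    value.unsigned_abs().to_string().len()
}

/// Hypothetical range check: true when `value` needs at most `precision` digits.
fn fits_precision(value: i128, precision: u32) -> bool {
    match max_for_precision(precision) {
        Some(max) => value >= -max && value <= max,
        // 10^precision overflows i128, so every i128 fits.
        None => true,
    }
}
```

Here `decimal_digits(i128::MAX)` is 39, so `fits_precision(i128::MAX, 38)` is false, which is the case exercised by the test in the diff above.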

liukun4515

It almost looks good to me.

viirya · 2022-06-23T17:05:37Z

@alamb Can we include this in 17.0.0 too? This changes `Decimal128`, so it will be an API change in 18.0.0 if we defer it to 18.0.0. If that is okay, then that's fine.

viirya · 2022-06-23T17:17:50Z

If this does not need to make it into 17.0.0, I will leave it open for a while in case others want to comment.

alamb · 2022-06-23T19:29:43Z

Thanks @viirya and @liukun4515 for the careful review

I think this looks good enough to go for me -- given that @liukun4515 approved this PR, I will assume he is OK with merging as is (and we can improve it more in future PRs).

I don't quite follow all the discussion on #1914 (comment), so I don't know if there are any outstanding issues there we should be tracking or not 🤔 Let me know if it would help if I filed follow-on tasks.

viirya · 2022-06-23T19:36:28Z

Thank you @alamb @liukun4515 @HaoYang670

github-actions bot added the `arrow` label (Changes to the arrow crate) Jun 20, 2022

Add Decimal256

e488722

viirya force-pushed the decimal256 branch from fd59500 to e488722 Compare June 20, 2022 05:30

Dedup

0c369fc

viirya commented Jun 20, 2022

Truncate string representation by precision

cb4c555

viirya force-pushed the decimal256 branch from 39e78bd to cb4c555 Compare June 20, 2022 20:07

viirya commented Jun 20, 2022

HaoYang670 reviewed Jun 22, 2022

HaoYang670 reviewed Jun 22, 2022

HaoYang670 reviewed Jun 22, 2022

alamb approved these changes Jun 22, 2022

viirya and others added 5 commits June 22, 2022 13:37

Update arrow/src/util/decimal.rs

d75366c

Co-authored-by: Remzi Yang <59198230+HaoYang670@users.noreply.github.com>

Update arrow/src/util/decimal.rs

84a2075

Co-authored-by: Remzi Yang <59198230+HaoYang670@users.noreply.github.com>

Update arrow/src/util/decimal.rs

3346246

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

For review

40c1084

Fix clippy

3155441

HaoYang670 reviewed Jun 23, 2022

For review

d0677df

HaoYang670 reviewed Jun 23, 2022

Move another one

32be11a

liukun4515 reviewed Jun 23, 2022

liukun4515 approved these changes Jun 23, 2022

alamb merged commit f0df5e0 into apache:master Jun 23, 2022

alamb changed the title ~~Add Decimal256 API~~ Add `Decimal256` API Jun 23, 2022
