Cast: should get the round result for decimal to a decimal with smaller scale #3139

liukun4515 · 2022-11-19T07:58:41Z

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

liukun4515 · 2022-11-19T08:00:36Z

For now, this only implements the decimal128-to-decimal128 case.
If the approach looks good to everyone, I will fill in the other cases and add more test cases.

cc @viirya @tustvold

tustvold

I think this should consistently use wrapping or checked add, neg, div, rem, etc... This not only is consistent with other kernels, but avoids differences between release and debug builds
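To illustrate the debug/release point, here is a minimal sketch using only standard-library integer methods. The helper names are hypothetical and not part of arrow-rs. A plain `a + b` panics on overflow when overflow checks are enabled (the default for debug builds) and wraps when they are disabled (the default for release builds), whereas the explicit variants behave the same in both profiles.

```rust
// Hypothetical helpers for illustration only; not taken from arrow-rs.

/// Always wraps on overflow, regardless of build profile.
fn add_wrapping(a: i128, b: i128) -> i128 {
    a.wrapping_add(b)
}

/// Always reports overflow as `None`, regardless of build profile.
fn add_checked(a: i128, b: i128) -> Option<i128> {
    a.checked_add(b)
}

/// Returns `None` for division by zero and for the single overflowing
/// case `i128::MIN / -1`.
fn div_checked(a: i128, b: i128) -> Option<i128> {
    a.checked_div(b)
}
```

Which family to pick is a design choice: wrapping operations suit code where overflow has already been ruled out, while checked operations let the caller turn overflow into an error or a null.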

tustvold · 2022-11-21T16:54:09Z

arrow-cast/src/cast.rs

+                        let d = v / div;
+                        let r = v % div;


Suggested change

-let d = v / div;
-let r = v % div;
+let d = v.wrapping_div(div);
+let r = v.wrapping_rem(div);

tustvold · 2022-11-21T16:54:52Z

arrow-cast/src/cast.rs

+                        let d = v / div;
+                        let r = v % div;
+                        if v >= 0 && r >= half {
+                            d + 1


Suggested change

-d + 1
+d.wrapping_add(1)

tustvold · 2022-11-21T16:55:02Z

arrow-cast/src/cast.rs

+                        if v >= 0 && r >= half {
+                            d + 1
+                        } else if v < 0 && r <= neg_half {
+                            d - 1


Suggested change

-d - 1
+d.wrapping_sub(1)

tustvold · 2022-11-21T16:56:48Z

arrow-cast/src/cast.rs

@@ -1955,12 +1956,26 @@ fn cast_decimal_to_decimal_safe<const BYTE_WIDTH1: usize, const BYTE_WIDTH2: usi
        // For example, input_scale is 4 and output_scale is 3;
        // Original value is 11234_i128, and will be cast to 1123_i128.
        let div = 10_i128.pow((input_scale - output_scale) as u32);
+        let half = div / 2;
+        let neg_half = half.neg();


Suggested change

-let neg_half = half.neg();
+let neg_half = half.wrapping_neg();

As we've divided by 2 this can't overflow
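A short sketch of why the halved value is safe to negate, using a hypothetical helper name. For `i128`, negation overflows only for `i128::MIN`, and a value obtained by dividing by 2 can never be `i128::MIN`.

```rust
/// Hypothetical helper: computes the negative rounding threshold for a divisor.
/// `div / 2` always lies within [-2^126, 2^126 - 1], so negating it cannot
/// overflow and `wrapping_neg` never actually wraps here.
fn neg_half_of(div: i128) -> i128 {
    let half = div / 2;
    half.wrapping_neg()
}
```

For comparison, `i128::MIN.checked_neg()` returns `None`, while `(i128::MIN / 2).checked_neg()` returns `Some(2^126)`.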

tustvold · 2022-11-21T16:57:12Z

arrow-cast/src/cast.rs

+            // TODO: it's better to implement the neg
+            let neg_half = half * i256::from_i128(-1);


Suggested change

-// TODO: it's better to implement the neg
-let neg_half = half * i256::from_i128(-1);
+let neg_half = half.wrapping_neg();
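Putting the diff fragments and suggestions from this review together, the rounding for the Decimal128 case can be sketched roughly as follows. This is a standalone illustration assembled from the snippets quoted in the review, not the merged code: the function name and the `scale_diff` parameter are assumptions, and it presumes `1 <= scale_diff <= 38` so that `10^scale_diff` fits in an `i128`.

```rust
/// Hypothetical sketch: divide `v` by 10^scale_diff, rounding half away from zero.
/// Assumes scale_diff <= 38; a difference of 0 returns the value unchanged.
fn round_to_smaller_scale(v: i128, scale_diff: u32) -> i128 {
    if scale_diff == 0 {
        // Same scale: nothing to drop, so nothing to round.
        return v;
    }
    let div = 10_i128.pow(scale_diff);
    let half = div / 2;
    // Safe: a halved value is never i128::MIN.
    let neg_half = half.wrapping_neg();

    // `div` is a positive power of ten, so these never hit the
    // MIN / -1 or divide-by-zero cases.
    let d = v.wrapping_div(div);
    let r = v.wrapping_rem(div);

    if v >= 0 && r >= half {
        d.wrapping_add(1)
    } else if v < 0 && r <= neg_half {
        d.wrapping_sub(1)
    } else {
        d
    }
}
```

With an input scale of 4 and an output scale of 3 (`scale_diff` of 1), 11234 becomes 1123, 11235 becomes 1124, and -11235 becomes -1124.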

liukun4515 · 2022-11-22T02:23:16Z

I think this should consistently use wrapping or checked add, neg, div, rem, etc... This not only is consistent with other kernels, but avoids differences between release and debug builds

The changes I have made will not overflow.
It's good to be consistent between debug and release builds.

tustvold · 2022-11-23T13:39:45Z

Do you intend to switch to explicitly using wrapping / checked operations to ensure consistent behaviour across debug and release, and to be consistent with the other kernels?

liukun4515 · 2022-11-24T09:52:49Z

Do you intend to switch to explicitly using wrapping / checked operations to ensure consistent behaviour across debug and release, and to be consistent with the other kernels?

@tustvold

Sorry for the late reply, I forgot to push the changes.

tustvold

I think there is a logical conflict with one of the tests for negative scales

ursabot · 2022-11-25T07:12:49Z

Benchmark runs are scheduled for baseline = 2c86895 and contender = 187bf61. 187bf61 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

liukun4515 · 2022-11-26T00:49:18Z

I think there is a logical conflict with one of the tests for negative scales

Hi @tustvold, can you give an example to explain the conflict?

From #3152, I know that negative scale is supported in Arrow.
Before this, I did not know how negative scale was used.

liukun4515 · 2022-11-26T00:56:49Z

Maybe I understand what you mean from this commit: 2abbf89

But I need time to find out how other systems handle negative scale when casting.

liukun4515 · 2022-11-26T01:02:47Z

Take a decimal(10,-1) whose 128-bit integer is 123; its string value is 1230. If we cast it to decimal(10,-2), what should the 128-bit integer of the result be? @tustvold @viirya

tustvold · 2022-11-26T08:27:11Z

123

liukun4515 · 2022-11-27T01:55:45Z

123

I am confused by this. If the data type is decimal(10,-2) and the 128-bit integer is 123, it represents the value 12300, so the value has changed after casting.

I think the 128-bit integer should be 12 after casting to decimal(10,-2).

From the doc: https://arrow.apache.org/docs/python/generated/pyarrow.decimal128.html#pyarrow-decimal128

decimal128(5, -3) can exactly represent the number 12345000 (encoded internally as the 128-bit integer 12345), but neither 123450000 nor 1234500.
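The relationship described in the quoted documentation can be sketched as below. The helper is hypothetical and deliberately limited to scales that are zero or negative, where the logical value is simply the stored integer multiplied by a power of ten.

```rust
/// Hypothetical helper: logical value of a decimal whose scale is <= 0,
/// i.e. `unscaled * 10^(-scale)`. Returns `None` for positive scales
/// (out of scope for this sketch) or if the result overflows i128.
fn logical_value(unscaled: i128, scale: i8) -> Option<i128> {
    if scale > 0 {
        return None;
    }
    // Widen before negating so that scale = -128 does not overflow i8.
    let exponent = (-(scale as i32)) as u32;
    let factor = 10_i128.checked_pow(exponent)?;
    unscaled.checked_mul(factor)
}
```

Under this reading, the integer 12345 at scale -3 stands for 12345000, the integer 123 at scale -1 stands for 1230, and the integer 123 at scale -2 stands for 12300.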

tustvold · 2022-11-27T07:58:26Z

Apologies, I misread your example. If the integer value were 1230, casting would yield an integer value of 123 with the same string value. Casting an integer value of 123 with a corresponding string value of 1230 I would expect to result in an error, although #3203 would suggest something isn't quite right here yet.
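The expectation described in this comment, where a cast succeeds only if no digits are lost, could be sketched as follows. This is a hypothetical helper written to illustrate the comment, not the behaviour of arrow-rs; the PR itself rounds rather than erroring, and the thread leaves the question for negative scales open.

```rust
/// Hypothetical sketch: rescale to a smaller (or equal) scale only when the
/// dropped digits are all zero. Returns `None` when digits would be lost,
/// when the output scale is larger than the input scale (out of scope here),
/// or when 10^(input_scale - output_scale) does not fit in an i128.
fn rescale_exact(unscaled: i128, input_scale: i8, output_scale: i8) -> Option<i128> {
    if output_scale > input_scale {
        return None;
    }
    let diff = (input_scale as i32 - output_scale as i32) as u32;
    let div = 10_i128.checked_pow(diff)?;
    // `div` is a positive power of ten, so div/rem cannot overflow or trap.
    if unscaled.wrapping_rem(div) == 0 {
        Some(unscaled.wrapping_div(div))
    } else {
        None
    }
}
```

Going from scale -1 to scale -2, the integer 1230 maps to 123 with the logical value 12300 preserved, whereas the integer 123 (logical value 1230) has no exact representation at scale -2 and yields `None`. Under a rounding policy it would become 12 instead.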

liukun4515 requested a review from viirya November 19, 2022 07:58

github-actions bot added the arrow Changes to the arrow crate label Nov 19, 2022

liukun4515 requested a review from alamb November 19, 2022 07:59

liukun4515 mentioned this pull request Nov 21, 2022

Update to arrow and parquet 27.0.0 apache/datafusion#4199

Merged

liukun4515 force-pushed the decimal_round_#3137 branch from 949c6fd to fbc307d Compare November 21, 2022 14:45

tustvold reviewed Nov 21, 2022

liukun4515 requested a review from tustvold November 23, 2022 13:38

fix: cast decimal to decimal should be round the result

153781d

liukun4515 force-pushed the decimal_round_#3137 branch from fbc307d to 153781d Compare November 24, 2022 09:51

tustvold approved these changes Nov 24, 2022

Merge remote-tracking branch 'upstream/master' into decimal_round_#3137

2abbf89

viirya approved these changes Nov 25, 2022

tustvold merged commit 187bf61 into apache:master Nov 25, 2022

alamb mentioned this pull request Nov 25, 2022

Should be the rounding vs truncation when cast decimal to smaller scale #3137

Closed

liukun4515 deleted the decimal_round_#3137 branch November 26, 2022 00:52

viirya mentioned this pull request Nov 27, 2022

Add a cast test case for decimal negative scale #3203

Merged

liukun4515 mentioned this pull request Nov 29, 2022

Get the round result for decimal to a decimal with smaller scale #3224

Merged
