Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize: no validation for decimal array for cast/take #2551

Closed
wants to merge 3 commits into from

Conversation

liukun4515
Copy link
Contributor

@liukun4515 liukun4515 commented Aug 22, 2022

Which issue does this PR close?

part of #2313

From the closed pr #2357

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@github-actions github-actions bot added the arrow Changes to the arrow crate label Aug 22, 2022
@liukun4515 liukun4515 changed the title optimize: no validation for decimal array for cast optimize: no validation for decimal array for cast/take Aug 22, 2022
@liukun4515
Copy link
Contributor Author

liukun4515 commented Aug 22, 2022

cast decimal128 to decimal128 512
                        time:   [3.4851 us 3.5120 us 3.5413 us]
                        change: [-31.129% -29.411% -27.777%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

cast decimal128 to decimal256 512
                        time:   [102.42 us 104.41 us 106.54 us]
                        change: [-2.4822% -0.9755% +0.4987%] (p = 0.22 > 0.05)
                        No change in performance detected.

The performance of casting decimal has been improve significantly for the decimal128 to decimal128 about 30%, when the situation meet the no validation requirement.

@tustvold @alamb

@liukun4515
Copy link
Contributor Author

take decimal128 indices 1024
                        time:   [9.9399 us 10.042 us 10.158 us]
                        change: [-15.743% -14.530% -13.250%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  6 (6.00%) high mild
  6 (6.00%) high severe

take decimal128 null indices 1024
                        time:   [11.713 us 11.800 us 11.894 us]
                        change: [-25.771% -24.467% -22.984%] (p = 0.00 < 0.05)
                        Performance has improved.

The performance changes of take operation

Comment on lines +535 to +539
.collect::<Result<Decimal128Array>>()?;

// specify the precision/scale without the validation
unsafe {
array.with_precision_and_scale_without_validation(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is something extremely bizarre about this API, we are collecting into a Decimal128Array which to be consistent should be performing validation based on the default decimal precision/scale, and we are then converting it into a different array. I can't help feeling we should be consistently either validating or not validating, at the moment we are basically doing a random combination

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

collecting the i128 data to construct the DecimalArray128, we use the default precision which is MAX precision(38), so all the element will not be overflow.

The take operation will don't change the data and precision, so we can skip the validation, If we use the strict validation mode.

@tustvold
Copy link
Contributor

As we don't consistently validate the contents of Decimal arrays, we can't meaningfully reason about if a given operation will overflow, and so I don't really know how to review this or what this is even really doing... I'll try to drive some consensus forward on #2387

@tustvold
Copy link
Contributor

I believe this will be superseded by #2637 so marking as draft

@tustvold tustvold marked this pull request as draft September 29, 2022 07:46
@tustvold
Copy link
Contributor

#2857 removes decimal validation

@tustvold tustvold closed this Oct 13, 2022
@liukun4515
Copy link
Contributor Author

#2857 removes decimal validation

thanks @tustvold

@liukun4515 liukun4515 deleted the no_validation_#2313 branch October 18, 2022 09:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants