Add Decimal128 API and use it in DecimalArray and DecimalBuilder #1871

Merged
merged 3 commits into from Jun 16, 2022

Conversation

viirya
Member

@viirya viirya commented Jun 14, 2022

Which issue does this PR close?

Closes #1870.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@github-actions github-actions bot added arrow Changes to the arrow crate parquet Changes to the parquet crate labels Jun 14, 2022
use std::cmp::Ordering;

#[derive(Clone, Debug)]
pub struct Decimal128 {
Member Author

C++ Decimal128 implements some operators. I don't implement them in this change, which tries to be functionally equivalent to the current i128 API. We can consider adding these operators if they are needed.

@viirya viirya added the api-change Changes to the arrow API label Jun 14, 2022
@codecov-commenter

codecov-commenter commented Jun 14, 2022

Codecov Report

Merging #1871 (ecb026f) into master (cedaf8a) will decrease coverage by 0.00%.
The diff coverage is 90.19%.

@@            Coverage Diff             @@
##           master    #1871      +/-   ##
==========================================
- Coverage   83.46%   83.46%   -0.01%     
==========================================
  Files         201      202       +1     
  Lines       57014    57069      +55     
==========================================
+ Hits        47586    47630      +44     
- Misses       9428     9439      +11     
Impacted Files Coverage Δ
arrow/src/util/decimal.rs 79.59% <79.59%> (ø)
arrow/src/array/array_binary.rs 94.18% <100.00%> (-0.06%) ⬇️
arrow/src/array/builder.rs 86.98% <100.00%> (+0.09%) ⬆️
arrow/src/array/equal_json.rs 89.70% <100.00%> (ø)
arrow/src/array/iterator.rs 96.11% <100.00%> (ø)
arrow/src/compute/kernels/cast.rs 95.77% <100.00%> (ø)
arrow/src/compute/kernels/sort.rs 95.67% <100.00%> (ø)
arrow/src/compute/kernels/take.rs 95.27% <100.00%> (ø)
parquet/src/arrow/arrow_reader.rs 96.87% <100.00%> (ø)
parquet/src/arrow/arrow_writer/mod.rs 97.53% <100.00%> (ø)
... and 1 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

}

impl PartialEq<Self> for Decimal128 {
fn eq(&self, other: &Self) -> bool {
Member Author

@viirya viirya Jun 14, 2022

I'm not entirely sure about comparing Decimal128 values. C++ Decimal128 just compares the two uint64 values, and we currently compare i128 directly as well (e.g., in the ord kernel).

But I still wonder whether it is correct to compare two values with different scales. E.g., 100_i128 (scale 2) and 100_i128 (scale 3): aren't they "1.00" and "0.100" respectively?

So I put an assert here to check the scale.

Contributor

I think the assert is needed.
If two Decimal128 values have different types (precision or scale), we can't compare their i128 values directly.
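
To make the concern in this thread concrete, here is a minimal sketch of scale-checked comparison. The struct is a simplified stand-in, not the code in this PR: only the name new_from_i128 comes from the diff, while the field layout, the getters, and the panic message are assumptions made for illustration.

```rust
use std::cmp::Ordering;

/// Simplified, hypothetical stand-in for the `Decimal128` type under review.
#[derive(Clone, Debug)]
pub struct Decimal128 {
    precision: usize,
    scale: usize,
    value: i128,
}

impl Decimal128 {
    pub fn new_from_i128(precision: usize, scale: usize, value: i128) -> Self {
        Self { precision, scale, value }
    }

    pub fn precision(&self) -> usize {
        self.precision
    }

    pub fn scale(&self) -> usize {
        self.scale
    }

    /// Panics when the two scales differ, because the raw integers are then
    /// expressed in different units and cannot be compared directly.
    fn assert_same_scale(&self, other: &Self) {
        assert_eq!(
            self.scale, other.scale,
            "cannot compare Decimal128 values with different scales"
        );
    }
}

impl PartialEq for Decimal128 {
    fn eq(&self, other: &Self) -> bool {
        self.assert_same_scale(other);
        self.value == other.value
    }
}

impl Eq for Decimal128 {}

impl PartialOrd for Decimal128 {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for Decimal128 {
    fn cmp(&self, other: &Self) -> Ordering {
        self.assert_same_scale(other);
        self.value.cmp(&other.value)
    }
}
```

With this sketch, comparing 100 at scale 2 against 100 at scale 3 panics instead of silently reporting "1.00" and "0.100" as equal. An alternative design would rescale one operand before comparing; that is not what the assert discussed here does.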

Contributor

@tustvold tustvold left a comment

I like this. My only major question is how we generalise this to Decimal256 without having to duplicate lots of code.


use std::cmp::Ordering;

#[derive(Debug)]
Contributor

Perhaps add a doc comment or something similar here.

}
}

pub fn new_from_i128(precision: usize, scale: usize, value: i128) -> Self {
Contributor

I presume we will want to make Decimal256 and Decimal128 generic versions of the same impl, so I wonder how methods like this, which explicitly name the type, will translate. Maybe new_from_raw?

Member Author

I have considered making them generic. Because I only implement Decimal128 for now, new_from_i128 is there to make Decimal128 fit into the existing code.

As a next step, I will implement Decimal256 and try to generalise it with Decimal128. If that works, I think new_from_bytes (maybe renamed to new_from_raw) will be the generalised API, and new_from_i128 will be removed.

Member Author

I think we have some time before the next release to add Decimal256 and generalise the API.
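
One possible shape for the generalisation discussed in this thread is a single type parameterised by byte width, with Decimal128 and Decimal256 as aliases. This is only a sketch under stated assumptions: BasicDecimal, raw_value, and as_i128 are hypothetical names, new_from_raw is the name floated above, and returning Option on a length mismatch is an illustrative choice rather than what the PR does.

```rust
use std::convert::TryFrom;

/// Hypothetical generic decimal backed by `N` little-endian bytes.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct BasicDecimal<const N: usize> {
    precision: usize,
    scale: usize,
    value: [u8; N],
}

impl<const N: usize> BasicDecimal<N> {
    /// Builds a decimal from raw little-endian bytes.
    /// Returns `None` when `bytes` is not exactly `N` bytes long.
    pub fn new_from_raw(precision: usize, scale: usize, bytes: &[u8]) -> Option<Self> {
        let value = <[u8; N]>::try_from(bytes).ok()?;
        Some(Self { precision, scale, value })
    }

    pub fn precision(&self) -> usize {
        self.precision
    }

    pub fn scale(&self) -> usize {
        self.scale
    }

    pub fn raw_value(&self) -> &[u8; N] {
        &self.value
    }
}

pub type Decimal128 = BasicDecimal<16>;
pub type Decimal256 = BasicDecimal<32>;

impl Decimal128 {
    /// Width-specific accessor: only the 128-bit variant maps onto `i128`.
    pub fn as_i128(&self) -> i128 {
        i128::from_le_bytes(self.value)
    }
}
```

In this layout the constructor no longer names an integer type, so the same method serves both widths, and anything that only makes sense for 128 bits lives in the width-specific impl block.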

@liukun4515
Contributor

I want to take a look at this PR; please hold it.


impl Eq for Decimal128 {}

impl Decimal128 {
Contributor

Do we need a function to get the type of the decimal128 value?

Member Author

Decimal(p, s)? We can add it. Because it is easy to add an API but harder to remove one, the API starts out minimal. For now I keep it as small as possible, only on par with existing functionality.

@liukun4515
Contributor

@viirya
A question that is not about this PR:
How do we represent Decimal256 in Rust? How does C++ implement it?
I'm not familiar with the Arrow C++ version.

@viirya
Member Author

viirya commented Jun 16, 2022

@viirya A question that is not about this PR: How do we represent Decimal256 in Rust? How does C++ implement it? I'm not familiar with the Arrow C++ version.

As in C++ Arrow Decimal256, we can represent the integer as an array of its parts. C++ Arrow Decimal256 uses a uint64_t array.
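
As a rough illustration of the "array of parts" idea, the sketch below stores a 256-bit two's-complement integer as four u64 limbs. Int256 and all of its methods are hypothetical names, and the least-significant-limb-first order is an assumption made for this example, not a statement about how the C++ or Rust implementation lays out its limbs.

```rust
/// Hypothetical 256-bit two's-complement integer stored as four 64-bit
/// limbs, least significant limb first (an assumed order for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Int256 {
    limbs: [u64; 4],
}

impl Int256 {
    /// Widens an `i128` by sign-extending it into the two upper limbs.
    pub fn from_i128(v: i128) -> Self {
        let lo = v as u64; // bits 0..64
        let hi = (v >> 64) as u64; // bits 64..128 (arithmetic shift)
        let ext = if v < 0 { u64::MAX } else { 0 };
        Self { limbs: [lo, hi, ext, ext] }
    }

    /// The sign bit is the top bit of the most significant limb.
    pub fn is_negative(&self) -> bool {
        (self.limbs[3] >> 63) == 1
    }

    /// Serialises the value as 32 little-endian bytes.
    pub fn to_le_bytes(&self) -> [u8; 32] {
        let mut out = [0u8; 32];
        for (i, limb) in self.limbs.iter().enumerate() {
            out[i * 8..(i + 1) * 8].copy_from_slice(&limb.to_le_bytes());
        }
        out
    }

    /// Limb-wise addition with carry propagation; wraps on 256-bit overflow.
    pub fn wrapping_add(self, other: Self) -> Self {
        let mut limbs = [0u64; 4];
        let mut carry = false;
        for i in 0..4 {
            let (s1, c1) = self.limbs[i].overflowing_add(other.limbs[i]);
            let (s2, c2) = s1.overflowing_add(carry as u64);
            limbs[i] = s2;
            carry = c1 || c2;
        }
        Self { limbs }
    }
}
```

Adding 1 to i128::MAX in this representation carries into bit 127 and stays positive, which is the kind of headroom a 256-bit decimal provides over an i128-backed one.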

Contributor

@liukun4515 liukun4515 left a comment

LGTM

@viirya
Member Author

viirya commented Jun 16, 2022

Thank you @tustvold @liukun4515. I'm going to merge this and keep working on Decimal256 and generalising them.

@viirya viirya merged commit f0bf7f9 into apache:master Jun 16, 2022
let as_array = bytes.try_into();
let value = match as_array {
    Ok(v) if bytes.len() == 16 => i128::from_le_bytes(v),
    _ => panic!("Input to Decimal128 is not 128bit integer."),
Member

Wouldn't it be better to return a Result instead of panicking here?

Member Author

Good idea. I will change it to Result. Thanks.
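
A fallible version of the conversion might look roughly like the sketch below. DecimalError and i128_from_le_slice are hypothetical names introduced for illustration; the actual change would presumably use the crate's own error type, which is not shown in this thread.

```rust
use std::convert::TryFrom;
use std::fmt;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum DecimalError {
    InvalidLength { expected: usize, actual: usize },
}

impl fmt::Display for DecimalError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DecimalError::InvalidLength { expected, actual } => write!(
                f,
                "input to Decimal128 must be {} bytes, got {}",
                expected, actual
            ),
        }
    }
}

impl std::error::Error for DecimalError {}

/// Decodes a little-endian `i128` from a byte slice, returning an error
/// instead of panicking when the slice is not exactly 16 bytes long.
pub fn i128_from_le_slice(bytes: &[u8]) -> Result<i128, DecimalError> {
    <[u8; 16]>::try_from(bytes)
        .map(i128::from_le_bytes)
        .map_err(|_| DecimalError::InvalidLength {
            expected: 16,
            actual: bytes.len(),
        })
}
```

The slice-to-array conversion already fails for any length other than 16, so the explicit length guard in the match arm above is not needed in this form.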

@alamb alamb changed the title Add Decimal128 API and use it in DecimalArray and DecimalBuilder Add Decimal128 API and use it in DecimalArray and DecimalBuilder Jun 23, 2022
Labels
api-change Changes to the arrow API arrow Changes to the arrow crate parquet Changes to the parquet crate
Development

Successfully merging this pull request may close these issues.

Add Decimal128 API and use it in DecimalArray and DecimalBuilder
5 participants