New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add arithmetic and comparison kernels from DF + small modifications #2770
Conversation
I'll note that I seem to have picked the worst time to try to move these kernels here, as things aren't working correctly right now. However, I tested this change with DF and this revision of arrow-rs https://github.com/apache/arrow-rs/tree/5e83ef9cc7e426171f4cb9451fa004c55c7c95be and everything compiled and all tests passed in both DF and arrow-rs. @liukun4515 for your review. I'll add a similar draft PR for DataFusion so you can see both sides of the move. Other half of this PR is in apache/datafusion#3590 |
# Conflicts: # arrow/src/compute/kernels/arithmetic.rs
Never mind, everything is working fine. I've moved all the DF decimal kernel code here and copied the tests over into the correct locations. Ready for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, thank you. A couple of thoughts
- Some doc comments would be nice
- The overflow semantics of these are a bit confused, the operations themselves are not checked but with_precision_and_scale checks for overflow in the result (Decimal Precision Validation #2387 is likely related)
- Some benchmarks would be good
- We should probably look to use the binary and try_binary kernels as the performance will be significantly better (might need Replace DecimalArray with PrimitiveArray #2637 or some other shenanigans)
Ok(()) | ||
} | ||
|
||
// #[test] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these tests commented out, I presume because they aren't working as expected?
I think this is only part of #1200 as it doesn't hook up the kernels in the dyn kernels |
) -> Result<Decimal128Array> { | ||
let divide = 10_i128.pow(left.scale() as u32); | ||
let array = arith_decimal_scalar(left, right, |left, right| { | ||
let result = BigInt::from(left) * BigInt::from(right) / divide; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we use i128 for this? The performance should be better than a variable width encoding?
I will review it tomorrow, please hold this pr. |
As it is, I've only copied over what was in DataFusion, but I think that's not suitable for arrow-rs's public APIs. Specifically, the math takes rust primitives (i128) with no checks on precision or scale or any indication what the precision or scale might or should be for the scalar. Also, as a user of arrow-rs, you'd expect to be able to use the same functions to do math on decimals as you do all other number types, but you can't, because they're fundamentally different types here. Is there a way to make DecimalArray a PrimitiveArray? |
Yes (see #2637), and this would definitely be the best path forward if it is tractable. I can try to find some time over the weekend to have a play around with this. I have some ideas, but I'm loathe to subject someone else to my potentially half-baked ideas 😅 |
Do we need to hold this pr and implement them by the proposal in #2637? |
My personal preference would to hold off on this if the consensus is to move forward with making DecimalArray a PrimitiveArray. However, this may not be ready in time for this weeks release, so if there is urgency to add this functionality I don't wish to block it. |
No, there's no urgency in that way. I'll close this PR in favor of the more complete plan for decimal implementation in the other issue. |
Which issue does this PR close?
Closes #1200.
Closes #2766.
Rationale for this change
Per issues, these kernels belong in this project, not in DF. The goal is to get these into arrow-rs where they belong and then do further iteration to improve Decimal128 support.
What changes are included in this PR?
num
crate's BigInt to do division, instead of f64, which aren't suitable for decimal division.Are there any user-facing changes?
addition of these kernels to the public API