Add divide_opt kernel which produce null values on division by zero error #2710
cc @sunchao
arrow/src/compute/kernels/arity.rs
.into_iter()
.zip(iter_b.into_iter())
.map(|(item_a, item_b)| {
if let (Some(a), Some(b)) = (item_a, item_b) {
What if both `a` and `b` are all non-null? Can we implement a fast path for that case?
You mean both input arrays are entirely non-null? Is there a fast path for that? Because `op` can return `None` even for non-null inputs, we can't use the usual trick of iterating over all values and computing the null buffer separately.
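A minimal sketch of that point, using plain `Vec`s and a hypothetical `NullableI32` struct rather than arrow's array types (all names here are assumptions for illustration). For an infallible op, the validity mask is just the AND of the input masks and can be built independently of the values. Once the op can yield null for valid inputs, the mask depends on each computed result.

```rust
/// Hypothetical stand-in for a nullable primitive array (not an arrow type):
/// a values buffer plus a parallel validity mask of the same length.
#[derive(Debug, PartialEq)]
pub struct NullableI32 {
    pub values: Vec<i32>,
    pub valid: Vec<bool>,
}

/// Infallible op: values and validity can be computed in two separate passes.
pub fn add_wrapping(a: &NullableI32, b: &NullableI32) -> NullableI32 {
    let values = a
        .values
        .iter()
        .zip(&b.values)
        .map(|(x, y)| x.wrapping_add(*y))
        .collect();
    let valid = a.valid.iter().zip(&b.valid).map(|(x, y)| *x && *y).collect();
    NullableI32 { values, valid }
}

/// Fallible op: a slot can become null even when both inputs are valid,
/// so validity has to be derived while each division is evaluated.
pub fn div_or_null(a: &NullableI32, b: &NullableI32) -> NullableI32 {
    let len = a.values.len().min(b.values.len());
    let mut values = Vec::with_capacity(len);
    let mut valid = Vec::with_capacity(len);
    for i in 0..len {
        let ok = a.valid[i] && b.valid[i] && b.values[i] != 0;
        // Null slots carry a placeholder value of 0.
        values.push(if ok { a.values[i].wrapping_div(b.values[i]) } else { 0 });
        valid.push(ok);
    }
    NullableI32 { values, valid }
}
```

In `add_wrapping` the two passes never look at each other's data; in `div_or_null` the validity of a slot cannot be known without inspecting the divisor.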
Yeah, I'm thinking that in that case we can skip this `if let` clause and potentially eliminate the branching cost.
Oh, you mean iterating over the values of the two arrays without the `if` check? Okay, let me add one.
Added a fast path that simply iterates over all values without the `if` check.
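For readers following along, here is a rough sketch of what such a fast path could look like. It is not the code from this PR: `Column` and `binary_opt` are hypothetical names, and `validity: None` stands in for an array whose null count is zero.

```rust
/// Hypothetical stand-in for a primitive array (not an arrow type).
pub struct Column {
    pub values: Vec<i32>,
    /// `None` means every slot is valid; `Some(mask)` marks valid slots.
    pub validity: Option<Vec<bool>>,
}

/// Applies a fallible binary `op` slot by slot, producing `None` for null slots
/// and for slots where `op` itself returns `None`.
pub fn binary_opt<F>(a: &Column, b: &Column, op: F) -> Vec<Option<i32>>
where
    F: Fn(i32, i32) -> Option<i32>,
{
    if a.validity.is_none() && b.validity.is_none() {
        // Fast path: neither input has nulls, so no per-slot validity check.
        return a
            .values
            .iter()
            .zip(&b.values)
            .map(|(x, y)| op(*x, *y))
            .collect();
    }

    // Slow path: check both inputs' validity before calling `op`.
    let is_valid = |c: &Column, i: usize| c.validity.as_ref().map_or(true, |v| v[i]);
    (0..a.values.len().min(b.values.len()))
        .map(|i| {
            if is_valid(a, i) && is_valid(b, i) {
                op(a.values[i], b.values[i])
            } else {
                None
            }
        })
        .collect()
}
```

The output still needs a per-slot null decision in both paths because `op` can return `None`; the fast path only removes the check on the inputs.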
LGTM
/// Unlike `divide` or `divide_checked`, division by zero produces a null value instead of
/// returning an `Err`. This also doesn't check for overflow; overflowing results just wrap
/// around.
pub fn divide_opt<T>(
A note for other reviewers, even post-review: we need a divide kernel that doesn't return an error on overflow (it wraps instead) or on division by zero (it produces null instead). We could consider changing `divide`'s division-by-zero behavior, but that would degrade `divide`'s performance. So for now the best option seems to be this separate kernel.
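To make the two behaviors concrete, here is a scalar-level sketch on `i32`. Both function names are hypothetical and neither is taken from the arrow crate; the error-returning variant only illustrates the contrast described above.

```rust
/// Error-returning variant (illustrative only): division by zero and
/// overflow are both reported as `Err`.
pub fn div_checked_scalar(a: i32, b: i32) -> Result<i32, String> {
    if b == 0 {
        return Err("divide by zero".to_string());
    }
    a.checked_div(b)
        .ok_or_else(|| format!("overflow computing {} / {}", a, b))
}

/// Null-producing variant: division by zero yields `None` (a null slot),
/// and overflow wraps around instead of being reported.
pub fn div_opt_scalar(a: i32, b: i32) -> Option<i32> {
    if b == 0 {
        None
    } else {
        Some(a.wrapping_div(b))
    }
}
```

For `i32`, the only overflowing division is `i32::MIN / -1`, which `wrapping_div` wraps back to `i32::MIN`.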
Thanks.
Benchmark runs are scheduled for baseline = 7e47fa6 and contender = 4f52a25. 4f52a25 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Which issue does this PR close?
Closes #2709.
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?