Add divide_opt kernel which produce null values on division by zero error #2710

Merged
viirya merged 3 commits into apache:master on Sep 13, 2022

Conversation

viirya
Member

@viirya viirya commented Sep 12, 2022

Which issue does this PR close?

Closes #2709.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@github-actions (bot) added the "arrow" label (Changes to the arrow crate) on Sep 12, 2022
@viirya
Member Author

viirya commented Sep 12, 2022

cc @sunchao

    .into_iter()
    .zip(iter_b.into_iter())
    .map(|(item_a, item_b)| {
        if let (Some(a), Some(b)) = (item_a, item_b) {
Member

What if a and b are both entirely non-null? Can we implement a fast path for that case?

Member Author

You mean the case where both input arrays contain no nulls? Is there a fast path for that? Because op can produce None even on non-null inputs, we cannot use the usual trick of iterating over all values and computing the null buffer separately.
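
To make that constraint concrete, here is a minimal sketch in plain Rust. It uses `Vec<Option<i32>>` as a stand-in for an Arrow array, and the helper names (`validity_from_inputs`, `div_or_none`, `binary_opt`) are hypothetical, not part of the arrow crate. It illustrates that deriving the output validity from the inputs alone gives the wrong answer once the op itself can return None.

```rust
/// Validity an infallible op would have: the output is null only where an
/// input is null. (Hypothetical helper, not an arrow API.)
fn validity_from_inputs(a: &[Option<i32>], b: &[Option<i32>]) -> Vec<bool> {
    a.iter()
        .zip(b)
        .map(|(x, y)| x.is_some() && y.is_some())
        .collect()
}

/// An op that can yield None even when both inputs are present.
fn div_or_none(a: i32, b: i32) -> Option<i32> {
    if b == 0 {
        None
    } else {
        Some(a.wrapping_div(b))
    }
}

/// Element-wise application. The result's validity depends on the values,
/// so it has to be decided while visiting each pair.
fn binary_opt(a: &[Option<i32>], b: &[Option<i32>]) -> Vec<Option<i32>> {
    a.iter()
        .zip(b)
        .map(|(x, y)| match (x, y) {
            (Some(x), Some(y)) => div_or_none(*x, *y),
            _ => None,
        })
        .collect()
}
```

With a = [6, 1, null] and b = [3, 0, 2], the input-derived validity marks the second slot as valid, while the actual result there is null because the divisor is zero.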

Member

Yeah, I'm thinking that in that case we can avoid this if let clause and potentially eliminate the branching cost.

Member Author

Oh, you mean iterating over the values of the two arrays without the if check? Okay, let me add one.

Member Author

Added the fast path, which simply iterates over all values without the if check.
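
For readers following along, the shape of such a fast path might look like the sketch below. This is an assumption-laden model in plain Rust, not the code merged in this PR: `Int32Column`, `div_or_none`, and `divide_opt_model` are hypothetical names, and a `Vec<bool>` stands in for Arrow's validity bitmap.

```rust
/// Toy column: `validity == None` means the column has no nulls.
/// (A stand-in for an Arrow array, not one of the arrow crate's types.)
struct Int32Column {
    values: Vec<i32>,
    validity: Option<Vec<bool>>,
}

/// Op that may yield None for present inputs (here: the divisor is zero).
fn div_or_none(a: i32, b: i32) -> Option<i32> {
    if b == 0 { None } else { Some(a.wrapping_div(b)) }
}

fn divide_opt_model(a: &Int32Column, b: &Int32Column) -> Vec<Option<i32>> {
    let pairs = a.values.iter().zip(&b.values);
    match (&a.validity, &b.validity) {
        // Fast path: neither input has nulls, so no per-element validity check.
        (None, None) => pairs.map(|(x, y)| div_or_none(*x, *y)).collect(),
        // General path: consult validity before applying the op.
        (av, bv) => pairs
            .enumerate()
            .map(|(i, (x, y))| {
                let a_ok = av.as_ref().map_or(true, |v| v[i]);
                let b_ok = bv.as_ref().map_or(true, |v| v[i]);
                if a_ok && b_ok { div_or_none(*x, *y) } else { None }
            })
            .collect(),
    }
}
```

Note that the zero-divisor test inside the op still branches on the fast path; only the per-element null check is removed.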

Member

@sunchao sunchao left a comment

LGTM

/// Unlike `divide` or `divide_checked`, division by zero yields a null value instead of
/// returning an `Err`. This kernel also does not check for overflow; an overflowing
/// result simply wraps around.
pub fn divide_opt<T>(
Member Author

As a note for other reviewers, even post-review: we need a divide kernel that does not return an error on overflow (it wraps instead) or on division by zero (it produces null instead). We could consider changing divide's division-by-zero behavior, but that would degrade divide's performance. So for now the best option seems to be this separate kernel.
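
The per-element difference between the two behaviors can be sketched as follows. These are scalar models written for illustration only, under the assumption that the checked variant fails on both division by zero and overflow; the function names are hypothetical and this is not the arrow crate's implementation.

```rust
/// Model of a checked division: any failure is reported as an error.
fn divide_checked_model(a: i32, b: i32) -> Result<i32, String> {
    if b == 0 {
        return Err("divide by zero".to_string());
    }
    // `checked_div` returns None on overflow (i32::MIN / -1).
    a.checked_div(b)
        .ok_or_else(|| format!("overflow: {} / {}", a, b))
}

/// Model of the `divide_opt` behavior described above: a zero divisor gives
/// None (null), and overflow wraps instead of failing.
fn divide_opt_model(a: i32, b: i32) -> Option<i32> {
    if b == 0 {
        None
    } else {
        Some(a.wrapping_div(b))
    }
}
```

In this model, `i32::MIN / -1` is an error for the checked variant but wraps back to `i32::MIN` for the opt variant, and a zero divisor becomes None rather than an error.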

@viirya viirya merged commit 4f52a25 into apache:master Sep 13, 2022
@viirya
Member Author

viirya commented Sep 13, 2022

Thanks.

@ursabot

ursabot commented Sep 13, 2022

Benchmark runs are scheduled for baseline = 7e47fa6 and contender = 4f52a25. 4f52a25 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
