Add NaN handling in dyn scalar comparison kernels #2830

Merged
merged 7 commits into from Oct 6, 2022

Conversation

viirya
Member

@viirya viirya commented Oct 6, 2022

Which issue does this PR close?

Closes #2829.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@github-actions github-actions bot added the arrow Changes to the arrow crate label Oct 6, 2022
@tustvold
Contributor

tustvold commented Oct 6, 2022

I wonder if it might be cleaner to use a trait for this that can be derived for T::Native, much like we do for arithmetic? Using the dyn downcasting machinery feels like a bit of a hack...

@viirya
Member Author

viirya commented Oct 6, 2022

> I wonder if it might be cleaner to use a trait for this that can be derived for T::Native, much like we do for arithmetic? Using the dyn downcasting machinery feels like a bit of a hack...

Hm, okay. Yeah, the current approach doesn't look very clear because of the downcasting and try_cast.
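
For illustration, the trait-based approach suggested above might look roughly like the sketch below. The trait name `NanAwareCmp` and the helpers `eq_scalar` and `lt_scalar` are hypothetical and are not taken from the arrow crate. The NaN convention used here (NaN equals NaN and sorts above every other value) is an assumption for the example, not something this thread confirms.

```rust
/// Hypothetical per-native-type comparison trait (not the arrow crate's API).
trait NanAwareCmp: Copy {
    fn is_eq(self, rhs: Self) -> bool;
    fn is_lt(self, rhs: Self) -> bool;
}

// Integers have no NaN, so the plain operators are enough.
macro_rules! impl_int_cmp {
    ($($t:ty),*) => {$(
        impl NanAwareCmp for $t {
            fn is_eq(self, rhs: Self) -> bool { self == rhs }
            fn is_lt(self, rhs: Self) -> bool { self < rhs }
        }
    )*};
}
impl_int_cmp!(i32, i64);

// Floats: assumed convention is NaN == NaN and NaN is the largest value.
macro_rules! impl_float_cmp {
    ($($t:ty),*) => {$(
        impl NanAwareCmp for $t {
            fn is_eq(self, rhs: Self) -> bool {
                (self.is_nan() && rhs.is_nan()) || self == rhs
            }
            fn is_lt(self, rhs: Self) -> bool {
                // NaN is never less than anything; any non-NaN is less than NaN.
                !self.is_nan() && (rhs.is_nan() || self < rhs)
            }
        }
    )*};
}
impl_float_cmp!(f32, f64);

/// Compares every value to the scalar; no downcasting is needed because
/// the comparison is resolved statically through the trait.
fn eq_scalar<T: NanAwareCmp>(values: &[T], scalar: T) -> Vec<bool> {
    values.iter().map(|&v| v.is_eq(scalar)).collect()
}

fn lt_scalar<T: NanAwareCmp>(values: &[T], scalar: T) -> Vec<bool> {
    values.iter().map(|&v| v.is_lt(scalar)).collect()
}
```

With this shape, the float-specific NaN logic lives in one trait impl, and the generic kernels stay identical for integers and floats.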

@viirya viirya marked this pull request as draft October 6, 2022 05:38
@@ -46,6 +46,7 @@ pub(crate) mod native_op {
+ Div<Output = Self>
+ Rem<Output = Self>
+ Zero
+ num::ToPrimitive
Contributor

Why this addition?

Member Author

Because kernels like eq_dyn_scalar have this type bound. In the dyn_compare_scalar macro, the ToPrimitive API is used on the input scalar.

Contributor

But the dyn_scalar kernels don't use ArrowNativeTypeOp?

Member Author

@viirya viirya Oct 6, 2022

I put the is_eq, is_ne, ... APIs into ArrowNativeTypeOp. If I don't add this bound, the compiler complains:

error[E0599]: no method named `to_f64` found for type parameter `T` in the current scope
    --> arrow/src/compute/kernels/comparison.rs:1228:50
     |
1228 |                 let right = try_to_type!($RIGHT, to_f64)?;
     |                                                  ^^^^^^ method not found in `T`
...
1338 | pub fn eq_dyn_scalar<T>(left: &dyn Array, right: T) -> Result<BooleanArray>
     |                      - method `to_f64` not found for this type parameter              
...
1340 |     T: ArrowNativeTypeOp + num::ToPrimitive,
     |                          ++++++++++++++++++

So I pulled the type bound num::ToPrimitive into ArrowNativeTypeOp.

I made all dyn_scalar kernels use ArrowNativeTypeOp now.
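
As a rough sketch of why the supertrait makes the error above go away: a method can only be called on a type parameter if one of its bounds provides it. The names below are hypothetical stand-ins: `ToF64` plays the role of num::ToPrimitive, `NativeOp` plays the role of ArrowNativeTypeOp, and `eq_f64_scalar` is a simplified kernel over a plain slice rather than an Arrow array. The NaN-equals-NaN rule in `is_eq` is an assumption for the example.

```rust
/// Hypothetical stand-in for num::ToPrimitive (only the one method used here).
trait ToF64 {
    fn to_f64(&self) -> Option<f64>;
}
impl ToF64 for i32 {
    fn to_f64(&self) -> Option<f64> { Some(*self as f64) }
}
impl ToF64 for f64 {
    fn to_f64(&self) -> Option<f64> { Some(*self) }
}

/// Hypothetical stand-in for ArrowNativeTypeOp with the conversion trait
/// pulled in as a supertrait, as described in the comment above.
trait NativeOp: Copy + ToF64 {
    fn is_eq(self, rhs: Self) -> bool;
}
impl NativeOp for i32 {
    fn is_eq(self, rhs: Self) -> bool { self == rhs }
}
impl NativeOp for f64 {
    // Assumed convention: NaN compares equal to NaN.
    fn is_eq(self, rhs: Self) -> bool {
        (self.is_nan() && rhs.is_nan()) || self == rhs
    }
}

/// Because `ToF64` is a supertrait of `NativeOp`, the single bound
/// `T: NativeOp` is enough for `right.to_f64()` to resolve. Without the
/// supertrait (and without a separate `ToF64` bound) this call would fail
/// with E0599.
fn eq_f64_scalar<T: NativeOp>(left: &[f64], right: T) -> Option<Vec<bool>> {
    let right = right.to_f64()?;
    Some(left.iter().map(|&v| v.is_eq(right)).collect())
}
```

The cost of this design is that every implementor of the op trait must also support the conversion, which is the concern raised in the next comment.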

Contributor

@tustvold tustvold Oct 6, 2022

Oh I see. Could we possibly have the dyn_scalar kernels just carry the constraint ToPrimitive + ArrowNativeTypeOp instead of unifying the two traits?

This not only avoids this leaking to the arithmetic kernels, which will complicate porting decimals over, but it also seems wrong that the dyn kernels are performing type coercion on the scalar value at all, and I would like to keep the door open to removing this down the line.
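
A minimal sketch of that alternative, keeping the two bounds separate: only the scalar kernel asks for the conversion trait, so an arithmetic kernel can accept types that cannot be converted. All names here are hypothetical (`ToF64` stands in for num::ToPrimitive, `NativeOp` for ArrowNativeTypeOp, and `Decimal` is an illustrative fixed-point type, not the arrow crate's decimal).

```rust
/// Hypothetical stand-in for num::ToPrimitive.
trait ToF64 {
    fn to_f64(&self) -> Option<f64>;
}

/// Hypothetical stand-in for ArrowNativeTypeOp, with no conversion supertrait.
trait NativeOp: Copy {
    fn add_op(self, rhs: Self) -> Self;
    fn is_eq(self, rhs: Self) -> bool;
}

/// Illustrative fixed-point value: supports the ops but has no `ToF64` impl.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Decimal(i128);

impl NativeOp for Decimal {
    fn add_op(self, rhs: Self) -> Self { Decimal(self.0 + rhs.0) }
    fn is_eq(self, rhs: Self) -> bool { self.0 == rhs.0 }
}

impl NativeOp for f64 {
    fn add_op(self, rhs: Self) -> Self { self + rhs }
    // Assumed convention: NaN compares equal to NaN.
    fn is_eq(self, rhs: Self) -> bool {
        (self.is_nan() && rhs.is_nan()) || self == rhs
    }
}
impl ToF64 for f64 {
    fn to_f64(&self) -> Option<f64> { Some(*self) }
}

/// Arithmetic kernel: needs only the op trait, so `Decimal` is accepted.
fn add_scalar<T: NativeOp>(values: &[T], rhs: T) -> Vec<T> {
    values.iter().map(|&v| v.add_op(rhs)).collect()
}

/// Scalar comparison kernel: the conversion requirement stays local to
/// this function's `where` clause instead of leaking into `NativeOp`.
fn eq_dyn_scalar_sketch<T>(left: &[f64], right: T) -> Option<Vec<bool>>
where
    T: NativeOp + ToF64,
{
    let right = right.to_f64()?;
    Some(left.iter().map(|&v| v.is_eq(right)).collect())
}
```

If the scalar coercion were removed later, only the `where` clause on the comparison kernel would need to change.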

Member Author

Hmm, I found that the addition is not necessary. Removed it now.

Contributor

Thank you - FWIW I filed #2837 to deal with the somewhat surprising, at least to me, behaviour of these scalar kernels

@github-actions github-actions bot added the arrow-flight Changes to the arrow-flight crate label Oct 6, 2022
@github-actions github-actions bot removed the arrow-flight Changes to the arrow-flight crate label Oct 6, 2022
@viirya viirya marked this pull request as ready for review October 6, 2022 15:40
Contributor

@tustvold tustvold left a comment

Thank you for this 👍

@viirya
Member Author

viirya commented Oct 6, 2022

The failed CI is for labeling only.

@viirya
Member Author

viirya commented Oct 6, 2022

Thanks @tustvold for reviewing.

@viirya viirya merged commit c93ce39 into apache:master Oct 6, 2022
@ursabot

ursabot commented Oct 6, 2022

Benchmark runs are scheduled for baseline = f8c4037 and contender = c93ce39. c93ce39 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
