Add DictionaryArray::key function #1912

Merged: 1 commit, Jun 20, 2022

@alamb (Contributor) commented Jun 19, 2022

Which issue does this PR close?

Closes #1911

Rationale for this change

See #1911

What changes are included in this PR?

  1. Add DictionaryArray::key function
  2. Tests
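
For readers unfamiliar with dictionary encoding, the idea behind the new accessor can be sketched with a toy type. This is a minimal sketch, not the arrow crate's code: `ToyDictionary`, its fields, and the `value` helper are hypothetical names made up for illustration, and only the `Option<usize>` shape and the panic message follow the snippet quoted in the review below.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a dictionary-encoded array (NOT the arrow type).
struct ToyDictionary {
    /// One optional key per row; `None` models a null slot.
    keys: Vec<Option<i32>>,
    /// The distinct values that the keys index into.
    values: Vec<String>,
}

impl ToyDictionary {
    /// Returns the key at row `i` as a `usize`, or `None` if the slot is null.
    /// Panics if `i` is out of bounds or if the stored key is negative.
    fn key(&self, i: usize) -> Option<usize> {
        self.keys[i].map(|k| usize::try_from(k).expect("Dictionary index not usize"))
    }

    /// Looks up the decoded value for row `i` by going through `key`.
    fn value(&self, i: usize) -> Option<&str> {
        self.key(i).map(|k| self.values[k].as_str())
    }
}
```

With keys `[Some(1), None, Some(0)]` and values `["a", "b"]`, `key(0)` yields `Some(1)`, `key(1)` yields `None` for the null slot, and `value(2)` resolves to `"a"`.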

Are there any user-facing changes?

Yes: a new public function, DictionaryArray::key.

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jun 19, 2022
@codecov-commenter

Codecov Report

Merging #1912 (dcdbc50) into master (ded6316) will decrease coverage by 0.00%.
The diff coverage is 84.61%.

@@            Coverage Diff             @@
##           master    #1912      +/-   ##
==========================================
- Coverage   83.41%   83.41%   -0.01%     
==========================================
  Files         214      214              
  Lines       56991    57004      +13     
==========================================
+ Hits        47541    47550       +9     
- Misses       9450     9454       +4     
Impacted Files Coverage Δ
arrow/src/array/array_dictionary.rs 91.53% <84.61%> (-0.39%) ⬇️
parquet_derive/src/parquet_field.rs 65.75% <0.00%> (-0.23%) ⬇️
parquet/src/encodings/encoding.rs 93.43% <0.00%> (-0.20%) ⬇️


@waynexia (Member) left a comment

This is a nice interface 👍 I left a comment about that expect.

self.keys
.value(i)
.to_usize()
.expect("Dictionary index not usize")
Member commented:

In most cases this expect won't panic, but do we need to keep the same behavior as https://github.com/apache/arrow-datafusion/blob/080c32400ddfa2d45b5bebb820184eac8fd5a03a/datafusion/common/src/scalar.rs#L342-L358 ? If so, the return type can be either Result<Option<_>> or Option<_>. I'm OK with both (hard to choose...).
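
The two return types under discussion can be compared side by side. This is a rough sketch over a plain slice of optional `i32` keys, not the arrow or DataFusion code: the function names `key_or_panic` and `try_key` and the `String` error type are assumptions chosen for illustration.

```rust
use std::convert::TryFrom;

/// Panicking flavour: `None` for a null slot, panic on a negative key.
fn key_or_panic(keys: &[Option<i32>], i: usize) -> Option<usize> {
    keys[i].map(|k| usize::try_from(k).expect("Dictionary index not usize"))
}

/// Fallible flavour: a null slot is `Ok(None)`, a negative key is `Err`.
/// The `String` error is a placeholder for whatever error type a crate uses.
fn try_key(keys: &[Option<i32>], i: usize) -> Result<Option<usize>, String> {
    match keys[i] {
        None => Ok(None),
        Some(k) => usize::try_from(k)
            .map(Some)
            .map_err(|_| format!("invalid dictionary key {} at row {}", k, i)),
    }
}
```

The `Option` form keeps call sites short and treats a negative key as a broken invariant, while the `Result<Option<_>>` form pushes that case to the caller at the cost of an extra layer to unwrap.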

Contributor commented:

Panicking should be fine: it's a validation failure if the array contains negative indexes. TBH I keep meaning to change all these checked conversions to numeric casts (i.e. as). I wouldn't be surprised if this led to non-trivial performance benefits.
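
The trade-off between a checked conversion and an `as` cast can be illustrated in isolation. This is a sketch with hypothetical helper names (`checked_index`, `cast_index`), and it makes no claim about the size of any performance difference.

```rust
use std::convert::TryFrom;

/// Checked conversion: a negative key is rejected (here as `None`).
fn checked_index(key: i32) -> Option<usize> {
    usize::try_from(key).ok()
}

/// Numeric cast: no check is performed, so a negative key is
/// sign-extended and reinterpreted as a very large `usize`.
fn cast_index(key: i32) -> usize {
    key as usize
}
```

For valid (non-negative) keys both helpers agree. For `-1`, `checked_index` returns `None` while `cast_index` returns `usize::MAX`, so the cast is only safe when the array has already been validated to contain no negative keys.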

Contributor Author commented:

Filed #1918 to track

@viirya (Member) left a comment

Looks good to me. Currently I need to call value on the keys array, so this should be more convenient.

@tustvold tustvold merged commit 9059cbf into apache:master Jun 20, 2022
@alamb alamb deleted the alamb/dictionary_key branch June 21, 2022 12:38