Support casting from Utf8/LargeUtf8 to Binary/LargeBinary #2402

Closed
Dandandan opened this issue Aug 10, 2022 · 4 comments · Fixed by #2456
Labels
arrow Changes to the arrow crate enhancement Any new improvement worthy of a entry in the changelog

Comments

@Dandandan
Contributor

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Casting from utf8 to binary is missing.

Describe the solution you'd like

Implement casts from Utf8 to Binary and from LargeUtf8 to LargeBinary.

Describe alternatives you've considered

n/a
Additional context

n/a

@Dandandan Dandandan added enhancement Any new improvement worthy of a entry in the changelog arrow Changes to the arrow crate labels Aug 10, 2022
@HaoYang670
Contributor

I guess this should be straightforward: just change the data type.

// SAFETY: Utf8 and Binary share the same offsets + values layout.
let binary_array_data = unsafe {
    utf8_array_data.into_builder().data_type(DataType::Binary).build_unchecked()
};

This could be a good first issue for developers who want to get familiar with the arrow data type.
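
To illustrate why changing only the data type is enough, here is a minimal, std-only sketch of the variable-length layout that string and binary arrays share: one offsets buffer plus one values buffer. The names `VarLenData`, `Kind`, `from_strs`, and `into_binary` are hypothetical stand-ins, not the arrow crate's API, and null handling is left out for brevity.

```rust
// Simplified model of a variable-length array. NOT the arrow crate's ArrayData.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Utf8,
    Binary,
}

#[derive(Debug, Clone, PartialEq)]
struct VarLenData {
    kind: Kind,
    offsets: Vec<i32>, // length = number of elements + 1
    values: Vec<u8>,   // concatenated bytes of all elements
}

impl VarLenData {
    // Build a Utf8-labelled array from string slices.
    fn from_strs(items: &[&str]) -> Self {
        let mut offsets = vec![0i32];
        let mut values = Vec::new();
        for s in items {
            values.extend_from_slice(s.as_bytes());
            offsets.push(values.len() as i32);
        }
        VarLenData { kind: Kind::Utf8, offsets, values }
    }

    // The "cast": both buffers are moved unchanged, only the label differs.
    fn into_binary(self) -> Self {
        VarLenData { kind: Kind::Binary, ..self }
    }

    // Bytes of element `i`, located through the offsets buffer.
    fn value(&self, i: usize) -> &[u8] {
        let start = self.offsets[i] as usize;
        let end = self.offsets[i + 1] as usize;
        &self.values[start..end]
    }
}
```

In this model, calling `into_binary()` on an array built with `from_strs(&["hello", "", "arrow"])` yields the same offsets and bytes under a different label, which is the property the suggestion above relies on.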

@psvri
Contributor

psvri commented Aug 13, 2022

I want to pick this up, but I am a little confused. If I am not mistaken, we already have a function to convert into a binary array: https://docs.rs/arrow/20.0.0/arrow/array/fn.as_generic_binary_array.html

So, based on my understanding, if we need to implement this, we need to implement the trait From<GenericStringArray> for GenericBinaryArray.
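
As a rough sketch of what such a conversion could look like, the std-only model below implements `From` once, generically over the offset type, so the same impl covers both the i32-offset case (Utf8 to Binary) and the i64-offset case (LargeUtf8 to LargeBinary). `StringArrayModel` and `BinaryArrayModel` are hypothetical stand-ins for the arrow types, and validity (null) buffers are omitted.

```rust
// Hypothetical stand-ins for GenericStringArray / GenericBinaryArray.
// `O` plays the role of the offset type: i32 for Utf8/Binary,
// i64 for LargeUtf8/LargeBinary.
#[derive(Debug, PartialEq)]
struct StringArrayModel<O> {
    offsets: Vec<O>,
    values: Vec<u8>, // assumed invariant: valid UTF-8 between offsets
}

#[derive(Debug, PartialEq)]
struct BinaryArrayModel<O> {
    offsets: Vec<O>,
    values: Vec<u8>, // arbitrary bytes, no encoding invariant
}

// One generic impl covers both offset widths.
impl<O> From<StringArrayModel<O>> for BinaryArrayModel<O> {
    fn from(s: StringArrayModel<O>) -> Self {
        // Valid UTF-8 is always a valid byte sequence, so the buffers
        // are moved as-is and nothing needs to be re-validated.
        BinaryArrayModel { offsets: s.offsets, values: s.values }
    }
}
```

The opposite direction (binary to string) could not be written this way, because arbitrary bytes would first have to be checked for UTF-8 validity.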

@HaoYang670
Contributor

then we need to implement the trait From<GenericStringArray> for GenericBinaryArray.

I guess this is what @Dandandan wants.

@Dandandan
Contributor Author

Thanks for the interest in this issue. I think the description wasn't fully clear.

Currently, the cast kernel in Arrow doesn't support casting string arrays to binary arrays, so my suggestion is to implement this. It should indeed be pretty straightforward to convert an existing string array into a binary array.

@alamb alamb changed the title Support casting from utf8 to binary Support casting from Utf8/LargeUtf8 to Binary/LargeBinary Aug 17, 2022