Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

not panic when given all nones to FixedSizeBinaryArray::try_from_sparse_iter #1904

Closed
wants to merge 1 commit into from

Conversation

frolovdev
Copy link
Contributor

Which issue does this PR close?

Closes #1390

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jun 17, 2022
@frolovdev frolovdev changed the title not panic not panic when given all nones to Jun 17, 2022
@frolovdev frolovdev changed the title not panic when given all nones to not panic when given all nones to FixedSizeBinaryArray::try_from_sparse_iter Jun 17, 2022
.map(|(col, field)| (col.data_type(), field.data_type()))
.enumerate()
.find(type_not_match);
let is_all_nones = columns.iter().all(|x| {
Copy link
Contributor

@tustvold tustvold Jun 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really not sure about this, perhaps @alamb can weigh in here, skipping schema checks just because the arrays are empty feels likely to cause surprise down the line.

If we do anything in this vein, I think it would be better to special case just empty fixed size arrays with fixed size length 0, and allow them to compare equal to other fixed size arrays of the same type. This would both make the intent clear and limit the potential blast radius.

My personal preference would be to make the try_from_sparse_iter method take the length of the arrays as an argument, rather than adding workarounds in other parts of the codebase.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My personal preference would be to make the try_from_sparse_iter method take the length of the arrays as an argument, rather than adding workarounds in other parts of the codebase

I personally agree with @tustvold. We could add an argument value_len: Option<usize>, such as (not tested):

    pub fn try_from_sparse_iter<T, U>(mut iter: T, value_len: Option<usize>) -> Result<Self>
    where
        T: Iterator<Item = Option<U>>,
        U: AsRef<[u8]>,

And the function will return an error if iter is all-nones and value_len is None. Maybe we could return something such as (not tested):

ArrowError::InvalidArgumentError("cannot infer the value length of the array")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My personal preference would be to make the try_from_sparse_iter method take the length of the arrays as an argument, rather than adding workarounds in other parts of the codebase

I think this sounds like the best of several non ideal options.

@tustvold tustvold marked this pull request as draft June 28, 2022 07:27
@tustvold
Copy link
Contributor

Marking as draft to make it easier to see what is awaiting review, please feel free to unmark once ready for re-review 😄

@tustvold
Copy link
Contributor

Closing this as #3054 has added support for this

@tustvold tustvold closed this Nov 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FixedSizeBinaryArray::try_from_sparse_iter failed when given all Nones
4 participants