Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Boolean take kernel does not handle null indices correctly #2057

Closed
jhorstmann opened this issue Jul 13, 2022 · 0 comments · Fixed by #2064
Closed

Boolean take kernel does not handle null indices correctly #2057

jhorstmann opened this issue Jul 13, 2022 · 0 comments · Fixed by #2064
Labels
arrow Changes to the arrow crate bug

Comments

@jhorstmann
Copy link
Contributor

Describe the bug

The boolean take kernel does not handle null values in the indices array, instead it always accesses the bitmap and values at the masked off index. Usually this index is the default of 0, but if it is out of bounds this will lead to a panic.

To Reproduce

#[test]
fn test_take_bool_nullable_index() {
    // indices where the masked invalid elements would be out of bounds
    let index_data = ArrayData::try_new(
        DataType::Int32,
        6,
        Some(Buffer::from_iter(vec![false, true, false, true, false, true])),
        0,
        vec![Buffer::from_iter(vec![99, 0, 999, 1, 9999, 2])],
        vec![]
    ).unwrap();
    let index = UInt32Array::from(index_data);
    test_take_boolean_arrays(
        vec![Some(true), None, Some(false)],
        &index,
        None,
        vec![None, Some(true), None, None, None, Some(false)],
    );
}
assertion failed: i < (self.bits.len() << 3)
thread 'compute::kernels::take::tests::test_take_bool_nullable_index' panicked at 'assertion failed: i < (self.bits.len() << 3)', arrow/src/bitmap.rs:55:9

Expected behavior

Null indices should be ignored.

Additional context

Could be solved by using PrimitiveArray::iter to iterate over indices. If that reduces performance, some of it could probably be regained by #1857

The performance of the take kernel might also be improved by taking value and validity bits separately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants