Struct equality on slices has false negatives #514

bjchambers · 2021-06-30T16:27:53Z

Describe the bug

A struct array that is sliced to a subset is not equal to a struct array created from just that subset of data. Stepping through the debugger I think it is because some of the null/value comparisons drop the offset and end up comparing the wrong part of the array.

To Reproduce

Failing test case here:
bjchambers@ccc8a4c

Expected behavior
Struct array slices compare equal to equivalent struct arrays.

Or, the printing of the struct arrays shows how they differ.


bjchambers · 2021-07-03T15:49:26Z

I think the problem here may be more widespread than I originally thought. It looks like the equality methods are inconsistent in how they handle the offset. For instance:

pub fn equal(lhs: &ArrayData, rhs: &ArrayData) -> bool {
    let lhs_nulls = lhs.null_buffer();
    let rhs_nulls = rhs.null_buffer();
    utils::base_equal(lhs, rhs)
        && lhs.null_count() == rhs.null_count()
        && utils::equal_nulls(lhs, rhs, lhs_nulls, rhs_nulls, 0, 0, lhs.len())
        && equal_values(lhs, rhs, lhs_nulls, rhs_nulls, 0, 0, lhs.len())
}

This starts with lhs_start and rhs_start as 0, which suggests that equal_nulls and equal_values are expected to add the array offset to the start themselves.

#[inline]
pub(super) fn equal_nulls(
    lhs: &ArrayData,
    rhs: &ArrayData,
    lhs_nulls: Option<&Buffer>,
    rhs_nulls: Option<&Buffer>,
    lhs_start: usize,
    rhs_start: usize,
    len: usize,
) -> bool {
    let lhs_null_count = count_nulls(lhs_nulls, lhs_start, len);
    let rhs_null_count = count_nulls(rhs_nulls, rhs_start, len);
    if lhs_null_count > 0 || rhs_null_count > 0 {
        let lhs_values = lhs_nulls.unwrap().as_slice();
        let rhs_values = rhs_nulls.unwrap().as_slice();
        equal_bits(
            lhs_values,
            rhs_values,
            lhs_start + lhs.offset(),
            rhs_start + rhs.offset(),
            len,
        )
    } else {
        true
    }
}

This uses lhs_start directly when counting nulls (without adding the offset), but then adds the offset when calling equal_bits. Other parts of the equality code (boolean, primitives, etc.) seem similarly inconsistent about whether the start values should already include the offset.
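To make the mismatch concrete, here is a minimal, self-contained sketch. It is not the arrow-rs code: `Sliced`, `count_unset`, and both `null_count_*` helpers are hypothetical stand-ins, and the bitmap is assumed to be LSB-first with a set bit meaning "valid".

```rust
/// Returns true if bit `i` is set in an LSB-first bitmap.
fn get_bit(bits: &[u8], i: usize) -> bool {
    bits[i / 8] & (1u8 << (i % 8)) != 0
}

/// Counts unset (null) bits in `len` positions starting at absolute bit `start`.
fn count_unset(bits: &[u8], start: usize, len: usize) -> usize {
    (start..start + len).filter(|&i| !get_bit(bits, i)).count()
}

/// Hypothetical stand-in for a sliced array: a validity bitmap plus a window.
struct Sliced<'a> {
    validity: &'a [u8],
    offset: usize,
    len: usize,
}

/// Mirrors the inconsistent pattern: `start` is used without the offset,
/// so the count covers the wrong part of the bitmap for a sliced array.
fn null_count_ignoring_offset(a: &Sliced<'_>, start: usize, len: usize) -> usize {
    count_unset(a.validity, start, len)
}

/// Treats `start` as relative to the offset and adds the offset before
/// touching the raw bitmap.
fn null_count_with_offset(a: &Sliced<'_>, start: usize, len: usize) -> usize {
    count_unset(a.validity, start + a.offset, len)
}
```

With a bitmap of `0b0000_1111` and a slice at offset 4 of length 4, the first helper counts zero nulls while the second counts four. In a function shaped like equal_nulls above, a zero count on both sides would skip the bit comparison entirely.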

I can take a stab at fixing it for at least the case identified, but I'm somewhat curious what the intended approach is. It seems like there are two options:

1. The lhs_start and rhs_start already include the offset, so none of the helper methods should add it again. This way the offset would be added just once (up front) and nothing else would need to, but it may run into more problems if some of the helper methods do still add the offset (e.g., running off the end of the array). It may also make it harder to use lhs_start and rhs_start with public methods on the array data that respect the offset.
2. The lhs_start and rhs_start are always relative to the corresponding offset. When indexing into the data using methods that aren't relative to the offset, the offset needs to be added.

It seems like option 2 is most consistent with how things are currently implemented.
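As a rough illustration of option 2, here is a toy model in which start positions are always relative to the array's offset and the offset is added in exactly one place. `ToyArray`, `equal_range`, and `equal` are hypothetical names for this sketch, not the arrow-rs API, and the slice copies its buffers for simplicity where a real slice would share them.

```rust
/// Hypothetical simplified array: raw buffers plus a logical window.
#[derive(Clone)]
struct ToyArray {
    validity: Vec<bool>,
    values: Vec<i32>,
    offset: usize,
    len: usize,
}

impl ToyArray {
    fn new(items: &[Option<i32>]) -> Self {
        ToyArray {
            validity: items.iter().map(|v| v.is_some()).collect(),
            values: items.iter().map(|v| v.unwrap_or(0)).collect(),
            offset: 0,
            len: items.len(),
        }
    }

    /// Keeps the same buffer contents and only shifts the window.
    fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len);
        ToyArray { offset: self.offset + offset, len, ..self.clone() }
    }

    /// Logical accessor: the only place where the offset is added.
    fn get(&self, i: usize) -> Option<i32> {
        let raw = self.offset + i;
        if self.validity[raw] { Some(self.values[raw]) } else { None }
    }
}

/// `lhs_start` and `rhs_start` are relative to each array's offset.
fn equal_range(
    lhs: &ToyArray,
    rhs: &ToyArray,
    lhs_start: usize,
    rhs_start: usize,
    len: usize,
) -> bool {
    (0..len).all(|i| lhs.get(lhs_start + i) == rhs.get(rhs_start + i))
}

/// Top-level equality starts both sides at logical position 0.
fn equal(lhs: &ToyArray, rhs: &ToyArray) -> bool {
    lhs.len == rhs.len && equal_range(lhs, rhs, 0, 0, lhs.len)
}
```

Under this convention, a slice of a longer array compares equal to a freshly built array holding the same logical elements, which is the expected behavior described in this issue.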

bjchambers · 2021-07-15T03:53:07Z

I think that #389 seems to have fixed this. I'll make a PR with the corresponding test case to prevent regression.

bjchambers · 2021-07-21T00:26:35Z

Hrm, upon closer inspection, that didn't fix this.

The tests are still failing, and I think I'm out of my depth debugging this. It looks like something isn't lining up with the null bits or the null counts in the sliced structs.

@nevi-me do you have any ideas on what may be failing, or suggestions for how to find the problems? I don't have a strong enough grasp on how it's expected to work, so it's unclear what is wrong when I'm debugging. Any hints would be helpful.

alamb · 2021-09-09T16:23:10Z

@bjchambers -- #691 (comment) fixed something related to equality of lists, perhaps it is related to this one as well

tustvold · 2022-09-27T18:06:58Z

I believe this was fixed by #1589; either way, the test now passes.

bjchambers added the bug label Jun 30, 2021

This was referenced Jul 3, 2021

nullif operating on ArrayRef #510

Closed

Change nullif to support arbitrary arrays #521

Closed

nevi-me mentioned this issue Jul 11, 2021

make slice work for nested types #389

Merged

bjchambers mentioned this issue Jul 15, 2021

Add test for equality on slices of a struct array #555

Closed

tustvold added a commit to tustvold/arrow-rs that referenced this issue Sep 27, 2022

Add struct equality test case (apache#514)

d08ec7f

tustvold mentioned this issue Sep 27, 2022

Add struct equality test case (#514) #2791

Merged

tustvold closed this as completed in #2791 Sep 27, 2022

tustvold added a commit that referenced this issue Sep 27, 2022

Add struct equality test case (#514) (#2791)

7639f28
