Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Struct equality on slices has false negatives #514

Closed
bjchambers opened this issue Jun 30, 2021 · 5 comments · Fixed by #2791
Closed

Struct equality on slices has false negatives #514

bjchambers opened this issue Jun 30, 2021 · 5 comments · Fixed by #2791
Labels

Comments

@bjchambers
Copy link
Contributor

bjchambers commented Jun 30, 2021

Describe the bug

A struct array that is sliced to a subset is not equal to a struct array created from just that subset of data. Stepping through the debugger I think it is because some of the null/value comparisons drop the offset and end up comparing the wrong part of the array.

To Reproduce

Failing test case here:
bjchambers@ccc8a4c

Expected behavior
Struct array slices compare equal to equivalent struct arrays.

Or, the printing of the struct arrays shows how they differ.

@bjchambers bjchambers added the bug label Jun 30, 2021
@bjchambers
Copy link
Contributor Author

I think the problem here may be more widespread than I originall thought. It looks like the equality methods are inconsistent in how they handle offset. For instance:

pub fn equal(lhs: &ArrayData, rhs: &ArrayData) -> bool {
    let lhs_nulls = lhs.null_buffer();
    let rhs_nulls = rhs.null_buffer();
    utils::base_equal(lhs, rhs)
        && lhs.null_count() == rhs.null_count()
        && utils::equal_nulls(lhs, rhs, lhs_nulls, rhs_nulls, 0, 0, lhs.len())
        && equal_values(lhs, rhs, lhs_nulls, rhs_nulls, 0, 0, lhs.len())
}

This starts has lhs_start and rhs_start as 0. This would suggest that equal_nulls and equal_values should add the start to the offset.

#[inline]
pub(super) fn equal_nulls(
    lhs: &ArrayData,
    rhs: &ArrayData,
    lhs_nulls: Option<&Buffer>,
    rhs_nulls: Option<&Buffer>,
    lhs_start: usize,
    rhs_start: usize,
    len: usize,
) -> bool {
    let lhs_null_count = count_nulls(lhs_nulls, lhs_start, len);
    let rhs_null_count = count_nulls(rhs_nulls, rhs_start, len);
    if lhs_null_count > 0 || rhs_null_count > 0 {
        let lhs_values = lhs_nulls.unwrap().as_slice();
        let rhs_values = rhs_nulls.unwrap().as_slice();
        equal_bits(
            lhs_values,
            rhs_values,
            lhs_start + lhs.offset(),
            rhs_start + rhs.offset(),
            len,
        )
    } else {
        true
    }
}

This uses the lhs_start directly for counting nulls (which doesn't add the offset), but then adds the offset in when calling equal_bits. Looking at other parts of equality (such as boolean, primitives, etc.) it seems that it is inconsistent as to whether the offset should be included or not in the start values.

I can a stab at fixing it for at least the case identified, but I'm somewhat curious what the intended approach is. It seems like there are two options:

  1. The lhs_start and rhs_start already include the offset, so none of the helper methods should add it again. This seems like it could include the offset just once (up front) and then everything else wouldn't need to, but it may run into more problems if some of the helper methods do include the offset (eg., running off the end of the array, etc.). Also may make it harder to use the lhs_start and rhs_start with public methods on the array data that respect offset.
  2. The lhs_start and rhs_start are always relative the corresponding offset. When indexing into the data using methods that aren't relative the offset, it needs to be added.

It seems like option 2 is most consistent with how things are currently implemented.

@bjchambers
Copy link
Contributor Author

I think that #389 seems to have fixed this. I'll make a PR with the corresponding test case to prevent regression.

@bjchambers
Copy link
Contributor Author

Hrm, upon closer inspection, that didn't fix this.

The tests are still failing, and I think I'm outside my depth debugging. It looks like something isn't lining up with the null bits or the null counts in the sliced structs.

@nevi-me do you have any ideas on what may be failing, or suggestions for how to find t he problems? I don't have a strong enough grasp on how it's expected to work, so it's unclear what is wrong when I'm debugging. Any hints would be helpful.

@alamb
Copy link
Contributor

alamb commented Sep 9, 2021

@bjchambers -- #691 (comment) fixed something related to equality of lists, perhaps it is related to this one as well

tustvold added a commit to tustvold/arrow-rs that referenced this issue Sep 27, 2022
@tustvold
Copy link
Contributor

I believe this was fixed by #1589, either way the test now passes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants