Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DecimalArray::from_fixed_size_list_array fails when offset > 0 #1958

Closed
HaoYang670 opened this issue Jun 28, 2022 · 1 comment · Fixed by #1964
Closed

DecimalArray::from_fixed_size_list_array fails when offset > 0 #1958

HaoYang670 opened this issue Jun 28, 2022 · 1 comment · Fixed by #1964
Labels
arrow Changes to the arrow crate bug

Comments

@HaoYang670
Copy link
Contributor

HaoYang670 commented Jun 28, 2022

Describe the bug
from_fixed_size_list_array cannot return a correct result when array.offset() > 0

To Reproduce

    #[test]
    fn test_decimal_array_from_fixed_size_list() {
        let value_data = ArrayData::builder(DataType::UInt8)
            .len(48)
            .add_buffer(Buffer::from_slice_ref(&[12_i128, 34, 56]))
            .build()
            .unwrap();
        
        let null_buffer = Buffer::from_slice_ref(&[0b_101]);

        // Construct a list array from the above two
        let list_data_type = DataType::FixedSizeList(
            Box::new(Field::new("item", DataType::UInt8, false)),
            16,
        );
        let list_data = ArrayData::builder(list_data_type.clone())
            .len(2)
            .null_bit_buffer(Some(null_buffer))
            .offset(1)
            .add_child_data(value_data.clone())
            .build()
            .unwrap();
        let list_array = FixedSizeListArray::from(list_data);
        let decimal = DecimalArray::from_fixed_size_list_array(list_array, 38, 0);

        assert_eq!(decimal.len(), 2);
        assert!(decimal.is_null(0));
        assert_eq!(decimal.value_as_string(1), "56".to_string());
    }

Expected behavior
Pass the test.

Additional context
The reason of the failure is that we drop the offset when copying value_buffer and null_buffer:
https://github.com/apache/arrow-rs/blob/master/arrow/src/array/array_binary.rs#L846-L849

A straight way is to add the offset:

        let builder = ArrayData::builder(DataType::Decimal(precision, scale))
            .len(v.len())
            .add_buffer(v.data_ref().child_data()[0].buffers()[0].clone())
            .null_bit_buffer(v.data_ref().null_buffer().cloned())
            .offset(v.offset());

However, the answer is still not 100% correct when the offset of the child data > 0. (Nested offsets always make me confused 😅).

The way I prefer is to build decimal array from FixedSizeBinaryArray and drop this method, so that we can avoid nested offsets. Maybe we can file another issue to track this.

@tustvold
Copy link
Contributor

This is related to #1726

@alamb alamb added the arrow Changes to the arrow crate label Jul 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants