Fix the behavior of from_fixed_size_list when offset > 0 #1964

Merged
merged 2 commits into from Jun 29, 2022
43 changes: 39 additions & 4 deletions arrow/src/array/array_binary.rs
@@ -831,22 +831,26 @@ impl DecimalArray {
         precision: usize,
         scale: usize,
     ) -> Self {
+        let child_data = &v.data_ref().child_data()[0];
Contributor:
Somewhat tangential to this PR, but what happens to the child data's null buffer? Perhaps worth a docstring saying it is ignored?

Contributor Author:
Great catch!
I am not sure whether we should drop the child's null buffer or combine the two bitmaps (list_nulls & child_nulls).

That is why I would like to drop this function and build the decimal array from a FixedSizeBinary array instead.
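
The `list_nulls & child_nulls` option mentioned above could look roughly like the following sketch. It uses only the Rust standard library rather than the arrow crate; `bit_is_set` and `combined_validity` are hypothetical helpers, and the LSB-first bitmap layout (1 = valid) is an assumption chosen to match Arrow's validity bitmaps. A slot counts as valid only when the list slot and every child element beneath it are valid.

```rust
/// Read bit `i` of an LSB-first packed validity bitmap (1 = valid).
fn bit_is_set(bitmap: &[u8], i: usize) -> bool {
    bitmap[i / 8] & (1u8 << (i % 8)) != 0
}

/// Hypothetical helper: slot `i` is valid only if the list slot is valid
/// AND all `width` child elements under that slot are valid.
/// A missing bitmap (`None`) is treated as "everything valid".
fn combined_validity(
    list_nulls: Option<&[u8]>,
    list_offset: usize,
    child_nulls: Option<&[u8]>,
    child_offset: usize,
    width: usize,
    len: usize,
) -> Vec<bool> {
    (0..len)
        .map(|i| {
            let slot = list_offset + i;
            let list_valid = list_nulls.map_or(true, |b| bit_is_set(b, slot));
            let child_valid = child_nulls.map_or(true, |b| {
                (0..width).all(|j| bit_is_set(b, child_offset + slot * width + j))
            });
            list_valid && child_valid
        })
        .collect()
}

fn main() {
    // Two slots of width 2. The list bitmap marks both slots valid;
    // the child bitmap marks child element 3 (in the second slot) as null.
    let list_nulls = [0b0000_0011u8];
    let child_nulls = [0b0000_0111u8];
    let validity =
        combined_validity(Some(&list_nulls[..]), 0, Some(&child_nulls[..]), 0, 2, 2);
    println!("{:?}", validity);
}
```

The diff in this PR does not do this: it passes only the list array's null buffer to the builder, so the child's null buffer is ignored.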

         assert_eq!(
-            v.data_ref().child_data()[0].child_data().len(),
+            child_data.child_data().len(),
             0,
             "DecimalArray can only be created from list array of u8 values \
              (i.e. FixedSizeList<PrimitiveArray<u8>>)."
         );
         assert_eq!(
-            v.data_ref().child_data()[0].data_type(),
+            child_data.data_type(),
             &DataType::UInt8,
             "DecimalArray can only be created from FixedSizeList<u8> arrays, mismatched data types."
         );

+        let list_offset = v.offset();
+        let child_offset = child_data.offset();
         let builder = ArrayData::builder(DataType::Decimal(precision, scale))
             .len(v.len())
-            .add_buffer(v.data_ref().child_data()[0].buffers()[0].clone())
-            .null_bit_buffer(v.data_ref().null_buffer().cloned());
+            .add_buffer(v.data_ref().child_data()[0].buffers()[0].slice(child_offset))
HaoYang670 marked this conversation as resolved.
+            .null_bit_buffer(v.data_ref().null_buffer().cloned())
+            .offset(list_offset);

         let array_data = unsafe { builder.build_unchecked() };
         Self::from(array_data)
@@ -1677,6 +1681,37 @@ mod tests {
         );
     }

+    #[test]
+    fn test_decimal_array_from_fixed_size_list() {
+        let value_data = ArrayData::builder(DataType::UInt8)
+            .offset(16)
Contributor Author:
Add offset in the child array

+            .len(48)
+            .add_buffer(Buffer::from_slice_ref(&[99999_i128, 12, 34, 56]))
+            .build()
+            .unwrap();
+
+        let null_buffer = Buffer::from_slice_ref(&[0b101]);
+
+        // Construct a list array from the above two
+        let list_data_type = DataType::FixedSizeList(
+            Box::new(Field::new("item", DataType::UInt8, false)),
+            16,
+        );
+        let list_data = ArrayData::builder(list_data_type)
+            .len(2)
+            .null_bit_buffer(Some(null_buffer))
+            .offset(1)
Contributor Author:
Add offset in the list array

+            .add_child_data(value_data)
+            .build()
+            .unwrap();
+        let list_array = FixedSizeListArray::from(list_data);
+        let decimal = DecimalArray::from_fixed_size_list_array(list_array, 38, 0);
+
+        assert_eq!(decimal.len(), 2);
+        assert!(decimal.is_null(0));
+        assert_eq!(decimal.value_as_string(1), "56".to_string());
Contributor Author:
I think this is the expected result, because slicing a list array does not push the offset down to its child array.
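
The offset arithmetic behind that expected result can be traced with a small standalone sketch. It uses only the Rust standard library, not the arrow crate; `le_bytes`, `value_at`, and `is_null` are hypothetical helpers, and a little-endian byte layout is assumed for the `i128` values. The child offset (16 bytes) is applied by slicing the buffer, and the list offset (1 slot) is then applied in units of 16-byte values, mirroring the `.slice(child_offset)` and `.offset(list_offset)` calls in the diff.

```rust
const WIDTH: usize = 16; // bytes per decimal value (one i128)

/// Pack i128 values into a byte buffer, assuming little-endian layout.
fn le_bytes(values: &[i128]) -> Vec<u8> {
    let mut out = Vec::with_capacity(values.len() * WIDTH);
    for v in values {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

/// Drop `child_offset` bytes from the buffer, then skip `list_offset`
/// whole values and read value `i`.
fn value_at(buffer: &[u8], child_offset: usize, list_offset: usize, i: usize) -> i128 {
    let sliced = &buffer[child_offset..];
    let start = (list_offset + i) * WIDTH;
    let mut bytes = [0u8; WIDTH];
    bytes.copy_from_slice(&sliced[start..start + WIDTH]);
    i128::from_le_bytes(bytes)
}

/// Look up slot `i` in an LSB-first validity bitmap (0 = null),
/// shifted by the list offset.
fn is_null(bitmap: &[u8], list_offset: usize, i: usize) -> bool {
    let bit = list_offset + i;
    bitmap[bit / 8] & (1u8 << (bit % 8)) == 0
}

fn main() {
    let buffer = le_bytes(&[99999_i128, 12, 34, 56]);
    let nulls = [0b101u8];

    // Same offsets as the test: child offset 16, list offset 1.
    assert!(is_null(&nulls, 1, 0)); // bit 1 of 0b101 is 0
    assert!(!is_null(&nulls, 1, 1)); // bit 2 of 0b101 is 1
    assert_eq!(value_at(&buffer, 16, 1, 1), 56);

    // Ignoring both offsets reads a different value at the same index.
    assert_eq!(value_at(&buffer, 0, 0, 1), 12);
    println!("ok");
}
```

With both offsets honored, index 1 resolves to the fourth value in the buffer (56), which matches the assertion in the test; with both offsets ignored, the same index would land on 12.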

+    }
+
     #[test]
     fn test_fixed_size_binary_array_from_iter() {
         let input_arg = vec![vec![1, 2], vec![3, 4], vec![5, 6]];