Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support specifying capacities for ListArrays in MutableArrayData #1884

Closed
jhorstmann opened this issue Jun 15, 2022 · 1 comment · Fixed by #1885
Closed

Support specifying capacities for ListArrays in MutableArrayData #1884

jhorstmann opened this issue Jun 15, 2022 · 1 comment · Fixed by #1885
Labels
arrow Changes to the arrow crate enhancement Any new improvement worthy of a entry in the changelog

Comments

@jhorstmann
Copy link
Contributor

jhorstmann commented Jun 15, 2022

Describe the bug

MutableArrayData::with_capacities allows specifying the capacity for concatenating arrays. Unfortunately the Capacities::List variant does not seem to be supported by the method, leading to a lot of reallocations when concatenating list arrays.

Thanks to @pbillaut for the investigation!

To Reproduce

The following test panics with

Capacities: List(6, Some(Array(15))) not yet supported
thread 'array::transform::tests::test_list_append_with_capacities' panicked at 'Capacities: List(6, Some(Array(15))) not yet supported', arrow/src/array/transform/mod.rs:398:18
    #[test]
    fn test_list_append_with_capacities() -> Result<()> {
        let mut builder = ListBuilder::<Int64Builder>::new(Int64Builder::new(24));
        builder.values().append_slice(&[1, 2, 3])?;
        builder.append(true)?;
        builder.values().append_slice(&[4, 5])?;
        builder.append(true)?;
        builder.values().append_slice(&[6, 7, 8])?;
        builder.values().append_slice(&[9, 10, 11])?;
        builder.append(true)?;
        let a = builder.finish();

        let a_builder = Int64Builder::new(24);
        let mut a_builder = ListBuilder::<Int64Builder>::new(a_builder);
        a_builder.values().append_slice(&[12, 13])?;
        a_builder.append(true)?;
        a_builder.append(true)?;
        a_builder.values().append_slice(&[14, 15])?;
        a_builder.append(true)?;
        let b = a_builder.finish();

        let _mutable = MutableArrayData::with_capacities(
            vec![a.data(), b.data()], false,
            Capacities::List(6, Some(Box::new(Capacities::Array(15)))));

        Ok(())
    }

Expected behavior

Capacities::List should be supported and initialize the correct buffer size for child arrays.

Additional context
Capacities support was added in #382

@jhorstmann jhorstmann added the bug label Jun 15, 2022
@alamb alamb added enhancement Any new improvement worthy of a entry in the changelog arrow Changes to the arrow crate and removed bug labels Jun 23, 2022
@alamb alamb changed the title Specifying capacities for ListArrays is not supported in MutableArrayData Support specifying capacities for ListArrays in MutableArrayData Jun 23, 2022
@alamb
Copy link
Contributor

alamb commented Jun 23, 2022

I am going to presumptively say this is really a feature enhancement rather than a bug, though I can see the case for either view.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate enhancement Any new improvement worthy of a entry in the changelog
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants