Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

As standard layout method #616

Merged
merged 30 commits into from Jun 19, 2019
Merged

As standard layout method #616

merged 30 commits into from Jun 19, 2019

Conversation

andrei-papou
Copy link
Contributor

PR for #532

Added as_contiguous method to ArrayBase which accepts array by reference returns a copy of it in standard layout.

@LukeMathWalker
Copy link
Member

This is quite needed to simplify the story around manipulation of strides, nice.
It should also be fairly easy to approach with some property-based testing using quickcheck, asserting that a random array returns true when probed using is_standard_layout.
We should check these classes of arrays:

  • the weird ones (0 dimensional, multidimensional with some 0-length axes, empty multidimensional arrays);
  • C and F layout for random multidimensional arrays;
  • non-contiguous multidimensional arrays (as you are doing already in the existing unit test).

Another important piece is guarantees, to be added in the method documentation: what happens if I pass in an array that is already owned and arranged with standard layout? Do you still create a copy or do you return the same array?
Any guarantee we promise should be tested, to check for regressions.

@jturner314
Copy link
Member

I think this functionality is important, but I'd like to refine the return type.

what happens if I pass in an array that is already owned and arranged with standard layout? Do you still create a copy or do you return the same array?

Based on the current signature of the method (&ArrayBase -> Array), the only option is to copy it, which is unfortunate. As a result, I'd prefer to change the method to

fn as_standard_layout(&self) -> ArrayCow<'_, A, D>

to avoid copying the data when the array is already in the correct layout.

ArrayCow<'a, A, D> would be a new type for as_standard_layout and similar methods analogous to Cow from the standard library. We'll need to add a new CowRepr type and implement the appropriate *Data* traits for it (see data_traits.rs). See #390 for some related discussion. I can provide more guidance if that would be helpful.

@andrei-papou
Copy link
Contributor Author

I've reworked the implementation to use the copy-on-write strategy. It is still a draft, but could you please take a look and tell me whether I am on a right track? Thanks a lot.

@jturner314
Copy link
Member

Yes, it looks like you're on the right track. I would change the wording a little (rename Temp to Owned and .as_contiguous() to .as_standard_layout()), but otherwise it looks good on first glance.

tests/array.rs Outdated Show resolved Hide resolved
Moved all `as_standard_layout`-related tests to the separate module
@andrei-papou
Copy link
Contributor Author

Seems like I should also write some test for the new ArrayCow type. Should I test only its custom methods? Or maybe also *Data* trait implementations? I need a bit of guidance here. Thanks.

The old implementation incorrectly handled the strides when the array
was not in standard layout. (It created a `Vec` corresponding to an
array in standard layout but didn't update the strides accordingly.)
The new implementation calls `.to_owned()`, which handles the problem
of efficiently creating an owned array, and then moves all the field
values. Additionally, the `ensure_is_owned` function has been removed
because it's equivalent to `DataMut::ensure_unique`.
These functions were used only in the implementation of
`CowRepr::to_owned`. They may turn out to be useful in the future, but
until there's a clear use case that is not served by `.view()` and
`.into_owned()`, let's remove them.
The old implementation panicked when the variants were different; the
new implementation performs the necessary clone instead of panicking.
The variant of `self` after calling `clone_from_with_ptr` should match
the variant of `other`. If the variants are initially different, the
data from `other` needs to be cloned, and `self` needs to be changed
to that variant.
This is analogous to `impl_owned_array`, `impl_views`, and
`impl_raw_views`.
In the vast majority of cases, `A` will implement `Clone`. However,
`ArrayCow` may still be useful for cases where `A` does not implement
`Clone`.
The compiler should have a slightly easier time making this efficient
because there are fewer type conversions this way.
@jturner314
Copy link
Member

I created andrei-papou/ndarray#4 with some refinements.

I'd like to test the following for ArrayCow (for both the View and Owned variants) with non-standard layout (permuted axes as well as axis inversions and steps other than 1):

  • Mutate an element. For the View variant, make sure that the array changes to the Owned variant and that the element values are correct. For the Owned variant, make sure that array.as_ptr() is the same before and after (which indicates that the data was not cloned), and make sure that the element values are correct.

  • Call .clone() on the array. Make sure that the data variant (View vs. Owned), element values, shape, and strides are the same in the original and the clone.

  • Use .clone_from() to clone from one ArrayCow to another, both when they initially have the same variant and when the variants are initially different. Make sure that the data variant (View vs. Owned), element values, shape, and strides in the destination array match the source.

  • Call .into_owned(). For the View variant, make sure that the array changes to the Owned variant and that the element values are correct. For the Owned variant, make sure that array.as_ptr() is the same before and after (which indicates that the data was not cloned), and make sure that the element values are correct.

These tests should be sufficient to check the behavior of CowRepr.

Other than tests, the only remaining piece is documentation (normal doc comments as well as mentioning ArrayCow in the ArrayBase and ndarray_for_numpy_users documents).

Would you be willing to move the ArrayCow stuff into a separate PR?

@jturner314 jturner314 changed the title As contiguous method As standard layout method May 6, 2019
Refine Cow stuff and as_standard_layout impl
@andrei-papou
Copy link
Contributor Author

@jturner314 Thanks for your refinements and guidance. As you suggested, I've moved all the ArrayCow stuff to the separate PR #632 and added some tests there.

@andrei-papou andrei-papou mentioned this pull request May 27, 2019
src/impl_methods.rs Outdated Show resolved Hide resolved
src/impl_methods.rs Show resolved Hide resolved
src/impl_methods.rs Outdated Show resolved Hide resolved
tests/array.rs Show resolved Hide resolved
tests/array.rs Outdated Show resolved Hide resolved
- reduced the amount of the boilerplate code in related tests
- added explicit assertion about the size of the source array when
  copying it to the standard layout.
- added documentation string to the new method
- removed redundant import
@andrei-papou
Copy link
Contributor Author

@jturner314 I've updated the PR according to your comments. Any more suggestions on this?

@jturner314 jturner314 merged commit 5c2da21 into rust-ndarray:master Jun 19, 2019
@jturner314
Copy link
Member

I added a little more to the docs and merged the the PR. Thanks for working on this and the previous PR! This is very useful functionality that's been necessary for quite a while.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants