Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better shape: Deprecate reshape, into_shape #1310

Merged
merged 10 commits into from Mar 10, 2024
Merged

Better shape: Deprecate reshape, into_shape #1310

merged 10 commits into from Mar 10, 2024

Conversation

bluss
Copy link
Member

@bluss bluss commented Jul 25, 2023

into_shape has some slightly questionable default behaviour w.r.t how it handles memory layout. The newer to_shape() fixes those problems! You should use to_shape() if you can.

Into shape doesn't do it /wrong/ but when it has a successful result, it preserves memory layout. C in gives C out. F in gives F out. But if you don't know what memory layout you have going in, into_shape can surprise you.


In this PR:

  • into_shape() deprecated
  • into_shape_with_order() new
  • reshape() deprecated
  • into_shape_clone() new

It's not a breaking change, but it introduces deprecations that will be quite loud.

Examples:

  • into_shape unchanged but deprecated
  • .into_shape_with_order((2, 3)) same as Order::RowMajor below
  • .into_shape_with_order(((2, 3), Order::RowMajor)) reshapes row major arrays, errors on other.
  • .into_shape_with_order(((2, 3), Order::ColMajor)) reshapes col major arrays, errors on other.

Into shape tries to be too great method:

  • User doesn't have to specify index ordering, it just guesses
  • It doesn't fall back to copy or clone
    • Useful for: Owned arrays with non-clonable elements
    • Useful for views: We can't fall back to cloning when reshaping a view

Into shape has a niche. Especially the no-copy shape transformation of views is very useful.
But the interface is not great about communicating the magic, in some ways.

This PR is a draft because it's not yet decided exactly what the into shape interface should be.
If we could, the order argument should always have been present. But we'd like to be backwards compatible as far as possible as we can, there must be many thousands of into shape calls out there.

@nilgoyette
Copy link
Collaborator

This PR is a draft because it's not yet decided exactly what the into shape interface should be.

You've made it like to_shape (using ShapeArg) This is coherent and thus more simple for the users, and it's backward compatible. Seems good!

@bluss
Copy link
Member Author

bluss commented Jul 26, 2023

to_shape assumes RowMajor when nothing is specified by the caller, while into_shape picks whichever seems to fit with the array that is given (true for into_shape both before this PR and after it in its current shape).

to_shape is great because its only precondition for success (not error) is that the requested shape has the same number of elements. It can handle either ordering request. Simple math for the user :)

into_shape is great because it can reshape views. But it has a precondition that the input array is exactly C-contig or F-contig, unfortunately.

Reshaping by following RowMajor or ColumnMajor index ordering has different effects, it has a different result. This is why it's quite dangerous to do what into_shape does, which is to just pick whatever seems to fit with the given array. The distinction might have been invisible to the user.

It's a documentation and API issue, how to, for into_shape and reshaping in general

  • Continue providing easy to use APIs
  • Make the preconditions clear
  • Find the right level of "Do What I Mean"

I'd like feedback on the tradeoffs involved. Continuing like this, with into_shape and to_shape having subtly different defaults might be ok? But it's kind of a sleeping footgun. More docs and more focus on using to_shape as a primary option (it's already placed more prominently in the docs, but I guess old code and examples use into shape throughout).

@bluss
Copy link
Member Author

bluss commented Jul 30, 2023

If I could, I would deprecate and remove into_shape (because it has the "wrong" ordering defaults.). Then add it back with the right defaults.

But I think that causes too much churn in user code, so I'm looking for the second best option.

@nilgoyette
Copy link
Collaborator

If I could, I would deprecate and remove into_shape (because it has the "wrong" ordering defaults.). Then add it back with the right defaults.

I hesitate to give feedback on subjects that I don't feel confident in, but in this particular case... IMO, you should do exactly what you wrote

  • having a "sleeping footgun" is much worse than deprecating a method.
  • you're not at v1 so a breaking change is not unexpected for the users
  • "now" is the best time to do something that could/should have been done ealier :)

@bluss
Copy link
Member Author

bluss commented Aug 2, 2023

For me, into_shape is a very commonly used method so it feels like it would break every program out there. Hence I'm looking for a smoother transition than this.

Even contemplating doing something like

  • into_shape() deprecated
  • into_shape_with_order() new

And after the deprecation period, into_shape_with_order is renamed to into_shape. Maybe the same dance for reshape.

@bluss bluss changed the title Better shape: Deprecate reshape and improve into_shape Better shape: Deprecate reshape, into_shape Aug 3, 2023
@bluss bluss force-pushed the better-into-shape branch 2 times, most recently from cd69772 to 9cbd736 Compare February 26, 2024 20:31
@bluss bluss marked this pull request as ready for review March 9, 2024 10:57
@bluss bluss force-pushed the better-into-shape branch 2 times, most recently from be19c43 to 7e7fc8a Compare March 9, 2024 11:17
@bluss bluss requested a review from adamreichold March 9, 2024 11:17
@bluss
Copy link
Member Author

bluss commented Mar 9, 2024

@adamreichold if you have time, would be cool if you had a look. I think it's ready to go

@adamreichold
Copy link
Collaborator

I plan to but I am not sure when this weekend I will be able to look into it. Are you on a schedule here or would merging it the beginning of next still be fine?

@bluss
Copy link
Member Author

bluss commented Mar 9, 2024

@adamreichold no stress. It's not on any schedule. I've been away for months and so on, would be weird and rude if I asked that anything would suddenly move at a faster pace(!) so I'm not going to do that 🙂 As one of the main maintainers I will continue to sometimes just decide to merge some changes on my own, which is a way I want all the maintainers to work, merge if they think it doesn't require more feedback.

into_shape_with_order:

If an index ordering is not specified, the default is `RowMajor`.  The
operation will only succeed if the array's memory layout is compatible
with the index ordering.

into_shape:

`.into_shape()` "moves" elements differently depending on if the input
array is C-contig or F-contig, it follows the index order that
corresponds to the memory order.

Because of this, the method is deprecated. That reshapes depend on
memory order is not intuitive.
Copy link
Collaborator

@adamreichold adamreichold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits, but nothing functional.

I agree with the change, but the diff itself speaks volumes as to the downstream breakage, so I think this should also get a dedicated entry in RELEASES.md.

README-quick-start.md Outdated Show resolved Hide resolved
src/impl_methods.rs Show resolved Hide resolved
src/impl_methods.rs Outdated Show resolved Hide resolved
src/impl_methods.rs Outdated Show resolved Hide resolved
src/impl_methods.rs Outdated Show resolved Hide resolved
Co-authored-by: Adam Reichold <adamreichold@users.noreply.github.com>
@bluss bluss added this to the 0.16.0 milestone Mar 10, 2024
@bluss
Copy link
Member Author

bluss commented Mar 10, 2024

Every PR gets an entry in RELEASE.md, but normally, I've been writing that at release time. (Would be great to have a merge conflict free way of doing it in the PR, like the CPython project does.). As you say, this requires a bigger notice.

@adamreichold
Copy link
Collaborator

adamreichold commented Mar 10, 2024

(Would be great to have a merge conflict free way of doing it in the PR, like the CPython project does.)

Over at PyO3, we use Towncrier to semi-automatically produce a changelog from separate "newsfragement" Markdown files. The affinity to Python tooling in general made the choice a bit easier, but I suspect there exist similar tools written in Rust if required.

It does entail quite a bit of setup work up front, but once done writing the changelog entries becomes a breeze (and we even check their presence via the CI via a label to suppress that if the PR does not require an entry).

@bluss bluss added this pull request to the merge queue Mar 10, 2024
@bluss
Copy link
Member Author

bluss commented Mar 10, 2024

Thank you. I have learned a bit (doubtful, but maybe) from other maintainers and trying to be more relaxed about the maintainership style. This is good enough, let's go.

With that said, I'll still try my best to say no to code that breaks Rust's rules (relevant for safety), which we can often spot in code reviews, even if it takes some effort..

Merged via the queue into master with commit ba8f45c Mar 10, 2024
8 checks passed
@bluss bluss deleted the better-into-shape branch March 10, 2024 13:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants