Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEP: Deprecate fastCopyAndTranspose #22313

Merged
merged 2 commits into from Oct 7, 2022

Conversation

rossbar
Copy link
Contributor

@rossbar rossbar commented Sep 19, 2022

I propose to deprecate the fastCopyAndTranspose function, which is currently accessible in the top-level numpy namespace.

Note that fastCopyAndTranspose was (prior to this PR) a public name bound to a private function multiarray._fastCopyAndTranspose. There was a single usage of the function in linalg, but it was straightforward to reorganize things to use the private function instead. It's worth investigating whether the private function is even needed1, but I'm inclined to leave that for a separate PR and just focus on the deprecation to start.

Re: downstream impact --- I grepped through some of the bigger projects in the ecosystem (scipy, matplotlib, scikit-learn, & astropy) and didn't find any instances of fastCopyAndTranspose. I'm happy to do a more thorough search if desired.

Footnotes

  1. rudimentary local benchmarking showed chaining the copy/transpose methods (i.e. a.copy().T) was 3x faster than _fastCopyAndTranspose.

@charris
Copy link
Member

charris commented Sep 20, 2022

(i.e. a.copy().T) was 3x faster than _fastCopyAndTranspose

I suspect the cause is that a.copy().T just makes a copy and fools with the strides, i.e., it is not a real transpose. A true transpose that changes the memory layout is hard to make efficient, and is a nice problem to analyse. I recall doing it with a sort just for kicks.

EDIT: a.T.copy() changes the layout:

In [2]: %timeit a.copy().T
621 µs ± 5.28 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [3]: %timeit a.T.copy()
2.48 ms ± 30.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each

and

In [4]: a.T.copy().flags   # changes memory layout to keep C contiguous
Out[4]: 
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False

In [5]: a.copy().T.flags  # Doesn't change memory layout, is F contiguous.
Out[5]: 
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False

@seberg
Copy link
Member

seberg commented Sep 20, 2022

Having a "fast copy and transpose" operation in NumPy (presumably via a blocked iterator) would be amazing. And in principle we could ship some custom stuff inside this function.

But my feeling would be that this is OK to deprecate because arr.T.copy() should know how to make this fast. Ideally the user doesn't need to know about it?

and transpose methods instead, e.g. ``arr.copy().T``
"""
warnings.warn(
"fastCopyAndTranspose is deprecated. Use ``arr.copy().T`` instead",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"fastCopyAndTranspose is deprecated. Use ``arr.copy().T`` instead",
"fastCopyAndTranspose is deprecated. Use ``arr.T.copy()`` instead",

You might also want to mention the "order" argument?

It desn't need much (not C code), but it would be nice to test that the warning is given. One option: Just keep the old tests and add with pytest.warns(DeprecationWarning):, rather than adding a new test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option: Just keep the old tests and add with pytest.warns(DeprecationWarning):, rather than adding a new test.

The reason I did it this way was that the "private" _fastCopyAndTranspose function is still used, so I reorganized the tests so that the private function is tested instead. I will add back the tests for the public function w/ the warnings catcher, as you're right that those tests should only be removed when the deprecation expires!

Also 👍 for the suggestion - I need to add it to the deprecation directive in the docs too...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay the deprecation message is updated in 7b43315 and the tests of the public fn (with the deprecation check) are added back in bd9236d.

I didn't know how best to mention the order kwarg without going into too much detail for a DeprecationWarning, so I simply went with order='K' - LMK what you think!

@mhvk
Copy link
Contributor

mhvk commented Sep 20, 2022

Saw this by chance. The relevant speed comparison then is:

In [10]: a = np.arange(1e6).reshape(1000, 1000)

In [11]: %timeit np.fastCopyAndTranspose(a)
2.16 ms ± 5.15 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [12]: %timeit a.T.copy()
2.17 ms ± 12.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

So, currently there is no difference at least for this size array (nor with an array a factor 100 larger).
...
...
...
well, that is no surprise since in the end it calls array_fastCopyAndTranspose which passes on to PyArray_CopyAndTranspose which does a transpose and then copies. So, it is not especially fast!

@charris
Copy link
Member

charris commented Sep 20, 2022

which does a transpose and then copies

Heh :)

@charris
Copy link
Member

charris commented Sep 20, 2022

Are you sure about the order='K'? It keeps the order of the axis swap transpose, not the original array. So C contiguous becomes F contiguous. There is a note in the original code saying that was planned for the mythical NumPy 2.0 :)

    /* TODO: Change this to NPY_KEEPORDER for NumPy 2.0 */
    ret = (PyArrayObject *)PyArray_NewCopy(tmp, NPY_CORDER);

@rossbar
Copy link
Contributor Author

rossbar commented Sep 20, 2022

Are you sure about the order='K'?

Nope, I'm not sure at all :) I assumed users might expect the transposed memory layout, but I'm not sure. Ideally we could just say something like "use order='K' to preserve the transposed order", but I didn't want to make the DeprecationWarning message that verbose --- again, I'm wide open to opinions here!

@charris
Copy link
Member

charris commented Sep 20, 2022

Well, it was only used in one place that we know of, so I suppose we really only need to worry about that. Since the deprecation is for the original, it would probably be best to keep the original C order for the replacement suggestion.

@seberg
Copy link
Member

seberg commented Sep 21, 2022

Yeah, lets just keep it without any order (which is C in this case). My thought was we could just write .copy("C") or use "F" as order if the original array is fortran order.

@seberg
Copy link
Member

seberg commented Sep 21, 2022

Sorry, should have looked closer earlier :/. Should we just deprecate this in C? And with that I mean not just move the Deprecation there but also put it into PyArray_CopyAndTranspose so that the C-API version also goes away.

Even in C, the implementation can just call Transpose and then copy (transpose can be passed NULL in C so the PyArray_CopyAndTranpose implementation is more complex than it needs to be).

If it is in C, it would be nice to have a very brief test in test_deprecations.py or manually test the path with DeprecationWarning set to raise.

Also realized that it should have a release to check all boxes probably.

@charris
Copy link
Member

charris commented Sep 29, 2022

I'm happy to put this in as is, the C function can be removed after the deprecation. The only worry is if someone is actually using the C function, and I am doubtful of that. The C function may listed as public, but the underscore suggests that it is private.

@seberg
Copy link
Member

seberg commented Sep 29, 2022

Yeah, both routes seem good. I had briefly checked scipy (doesn't use the C version) and the C version is not documented at all, and also replaced with ~5 lines of code Transpose+Copy (just additional error check and maybe array conversion).

@mattip
Copy link
Member

mattip commented Sep 29, 2022

Release note needed. Deprecating the C version seems reasonable.

numpy/linalg/linalg.py Outdated Show resolved Hide resolved
numpy/core/numeric.py Outdated Show resolved Hide resolved
@rossbar rossbar added triage review Issue/PR to be discussed at the next triage meeting triaged Issue/PR that was discussed in a triage meeting and removed triage review Issue/PR to be discussed at the next triage meeting labels Oct 5, 2022
Copy link
Contributor Author

@rossbar rossbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This latest iteration is based on the approach discussed at the triage meeting:

  • Remove the internal _fastCopyAndTranspose function now to get read of any aliasing/wrapping of the private Python function. fastCopyAndTranspose now directly wraps the underlying C function.
  • ... which means there only needs to be one warning at the C level, which warns whenever a user calls either the Python (np.fastCopyAndTranspose) or C (PyArray_CopyAndTranspose) functions.
  • Other minor fixups like using DEPRECATE instead of PyErr_WarnFormat for better grepability

This PR definitely needs to be squashed, as it's really 3 separate deprecation attempts in 1. I wanted to preserve the full history to provide context for the discussion on GH. If this latest changeset looks good, LMK and I'm happy to squash this down.

numpy/linalg/linalg.py Outdated Show resolved Hide resolved
Deprecate the fastCopyAndTranspose function from the Python API, and the
underlying PyArray_CopyAndTranspose function from the C-API.

Also removes an internal, private function _fastCopyAndTranspose which
was the original Python wrapper around the C-function.
@rossbar rossbar force-pushed the deprecate-fastCopyAndTranspose branch from 2c442b6 to 06c380c Compare October 6, 2022 18:19
@mattip mattip merged commit ae6a352 into numpy:main Oct 7, 2022
@mattip
Copy link
Member

mattip commented Oct 7, 2022

Thanks @rossbar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
07 - Deprecation triaged Issue/PR that was discussed in a triage meeting
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants