Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add namedtuple return types to linalg functions that return tuples #22786

Merged
merged 5 commits into from May 17, 2023

Conversation

asmeurer
Copy link
Member

That is, eig(), eigh(), qr(), slogdet(), and svd(). For those functions that return non-tuples with certain keyword arguments, the return type is unchanged. This change should be completely backwards compatible.

The namedtuple attribute names come from the array API specification (see, e.g.,
https://data-apis.org/array-api/latest/extensions/generated/signatures.linalg.eigh.html), with the exception of eig() which is just the same as eigh(). The name of the namedtuple object itself is not part of the specification or the public API.

I have not used a namedtuple for the tuple output for qr(mode='raw'), which returns (h, tau).

This updates the docstrings to use the updated namedtuple return names, and also the examples to use those names more consistently. This also updates the tests to check each function for the namedtuple attributes at least once.

That is, eig(), eigh(), qr(), slogdet(), and svd(). For those functions that
return non-tuples with certain keyword arguments, the return type is
unchanged. This change should be completely backwards compatible.

The namedtuple attribute names come from the array API specification (see,
e.g.,
https://data-apis.org/array-api/latest/extensions/generated/signatures.linalg.eigh.html),
with the exception of eig() which is just the same as eigh(). The name of the
namedtuple object itself is not part of the specification or the public API.

I have not used a namedtuple for the tuple output for qr(mode='raw'), which
returns (h, tau).

This updates the docstrings to use the updated namedtuple return names, and
also the examples to use those names more consistently. This also updates the
tests to check each function for the namedtuple attributes at least once.
@@ -17,6 +17,7 @@
import functools
import operator
import warnings
from typing import NamedTuple
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious about this, it isn't the same as collections.namedtuple. Is typing.NamedTuple the compatible version?

eigenvectors: NDArray

class QRResult(NamedTuple):
Q: NDArray
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit curious about the choice of upper case here and below. Is there a standard for these names?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This NDArray type name is already present in NumPy. Unless you mean the name of the namedtuple.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Public names of type annotations we have, like np.typing.NDArray, tend to be upper-case. I think to not confuse them with lower-case objects

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was referring to "Q".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless you mean the name of the namedtuple.

That was what I was referring to. Looks like NamedTuple is what we want, but am curious if the projects that we want to be compatible with also use it.

Copy link
Member Author

@asmeurer asmeurer Dec 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The uppercase field names are part of the array API spec and were discussed here data-apis/array-api#295. I suppose NumPy could, if it wanted, provide the same fields under multiple names if you really don't like the ones the array API chose, like

class QRResult(NamedTuple):
    Q: NDArray
    R: NDArray

    @property 
    def q(self):
        return self.Q

    @property 
    def r(self):
        return self.R

(although personally I think you should just use these names)

The name of the named tuple is completely arbitrary. We can change it to whatever.

As far as I can tell NamedTuple is basically equivalent to the classic namedtuple. I think the biggest difference is that NamedTuple supports type annotations, which aren't used in this file yet but presumably they will be at some point. I guess it also makes it easier to add other methods or attributes if you ever wanted to do that.

I don't think it really makes a difference whether you use NamedTuple or namedtuple. I think it could even be something else like a dataclass, so long as it also has the correct field names and also defines enough methods to act like a tuple.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell NamedTuple is basically equivalent to the classic namedtuple

I think NamedTuple is better, for instance, it allows accessing the fields like so: a.Q, but that wouldn't work for namedtuple. The declaration of the fields also differs, for instance the declarations here error if namedtuple is substituted. The upshot is that I think one or the other should be chosen for the standard for portable code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

namedtuple works like that too:

>>> namedtuple('test', ['a', 'b'])(1, 2).a
1

I think the only real difference is that NamedTuple lets you specify type annotations. And the class definition is arguably easier to read. As far as the user is concerned though they are identical.

@charris
Copy link
Member

charris commented Dec 13, 2022

Could you fix the lint errors? They don't look very difficult.

@seberg
Copy link
Member

seberg commented Dec 20, 2022

I would lean towards also using a named tuple for (h, tau), it seems strange to not be consistent for a single function. But not a strong opinion.

The changes need to be copied into the typing stubs as well still. Otherwise, I expect the only thing would indeed be small style fixups to make the linter happier.

@charris
Copy link
Member

charris commented Feb 23, 2023

The changes need to be copied into the typing stubs as well still

@asmeurer Be good to finish this up.

@charris
Copy link
Member

charris commented Apr 4, 2023

We should finish this before the 1.25 release.

@seberg seberg added this to the 1.25.0 release milestone Apr 4, 2023
@eric-wieser
Copy link
Member

eric-wieser commented Apr 4, 2023

The name of the namedtuple object itself is not part of the specification or the public API.

What are users expected to do when writing a type annotation for the return value, if ret : QRResult = qr(x) is using private API?

@asmeurer
Copy link
Member Author

asmeurer commented Apr 4, 2023

So maybe the namedtuple classes should be added to the API then. Maybe @BvB93 can give some insight? I'm not really knowledgeable enough about typing stuff to know the right answer here.

@rgommers
Copy link
Member

rgommers commented Apr 5, 2023

I think there's still issues with NamedTuple typing in Python (e.g., this is still true probably: https://stackoverflow.com/questions/60430110/python-typing-support-for-namedtuple). I suspect that the linalg.pyi stubs for these functions need a tweak here, and that tuple unpacking will still work unchanged with mypy. Actually typing the use of qr(x).Q seems tricky.

So maybe the namedtuple classes should be added to the API then.

Not to the np.linalg API, because they're not useful for anything but static typing, and that's not enough of a reason to add them to the public (runtime) API I'd say. We do have numpy.typing for type annotations that users may need, so they could be added there perhaps? That right now has only four entries, so this would grow it quite a bit - but that seems fine. Perhaps would need to be np.typing.linalg.xxx?

@eric-wieser
Copy link
Member

I suppose it would also be reasonable to think about the static types later; at worst users of heavy static typing will have to continue to use the tuple api rather than .Q and .R, which is no different to what they do today.

@BvB93
Copy link
Member

BvB93 commented Apr 5, 2023

I think there's still issues with NamedTuple typing in Python (e.g., this is still true probably: https://stackoverflow.com/questions/60430110/python-typing-support-for-namedtuple). I suspect that the linalg.pyi stubs for these functions need a tweak here, and that tuple unpacking will still work unchanged with mypy. Actually typing the use of qr(x).Q seems tricky.

Runtime support for generic (i.e. parameterizable) named tuples was (re?)-introduced last year again (xref python/cpython#92027), so the issue mentioned in stackoverflow here should be irrelevant now. That, and the named tuples introduced in this PR are currently not generic anyway.

So maybe the namedtuple classes should be added to the API then. Maybe @BvB93 can give some insight? I'm not really knowledgeable enough about typing stuff to know the right answer here.

Named tuples do use a nominal sub-typing pattern just like normal classes (and contrary to protocols), so would have to expose these named tuples somewhere lest, as pointed out by Eric, the users only option would be to resort to the use of private numpy API.

That right now has only four entries, so this would grow it quite a bit - but that seems fine. Perhaps would need to be np.typing.linalg.xxx?

Either that, or if the returning of named tuples is going to be a common thing then something akin to np.typing.named_tuple maybe?

numpy/linalg/linalg.py Outdated Show resolved Hide resolved
numpy/linalg/linalg.py Show resolved Hide resolved
@rgommers
Copy link
Member

rgommers commented Apr 6, 2023

Either that, or if the returning of named tuples is going to be a common thing then something akin to np.typing.named_tuple maybe?

We don't have that many functions that return multiple arrays which can be named sensibly, so I suspect this isn't necessary.

@seberg
Copy link
Member

seberg commented May 8, 2023

Branching for 1.25 will be approaching, might be nice to push this over the finish line.

charris and others added 2 commits May 17, 2023 12:22
Co-authored-by: Bas van Beek <43369155+BvB93@users.noreply.github.com>
@charris
Copy link
Member

charris commented May 17, 2023

The circleci failure seems to be because the main version of .circleci/config.yml is not being used. Will ignore.

@charris charris merged commit 126b46c into numpy:main May 17, 2023
52 of 57 checks passed
@charris
Copy link
Member

charris commented May 17, 2023

Thanks @asmeurer and @BvB93 .

@seberg
Copy link
Member

seberg commented Jun 21, 2023

Just linking because I didn't really expect this pattern of someone noticing: HIPS/autograd#597 (comment)

They have a dictionary for the result types, which probably only contains tuple.

@rgommers
Copy link
Member

🤔 I think I'd consider that an issue in their code, changing tuple to namedtuple seems like a thing that is okay to do, just like changing any other return type to a more specific subclass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants