Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: Add inline type hintings for stats.qmc #13833

Merged
merged 1 commit into from
May 19, 2021

Conversation

V0lantis
Copy link
Contributor

@V0lantis V0lantis commented Apr 10, 2021

Following @tupui and @rgommers comment on #13576, I am submitting this PR to remove _qmc.pyi to add inline type hinting inside _qmc.py

Reference issue

None

What does this implement/fix?

This implementation remove a type files (qmc.pyi) and replace it by inline typing directly inside _qmc.py

@V0lantis V0lantis requested a review from tupui as a code owner April 10, 2021 19:18
@V0lantis
Copy link
Contributor Author

V0lantis commented Apr 10, 2021

I wonder what is the problem : (on my computer)

$ python -u runtests.py --mypy
Building, see build.log...
Build OK (0:00:52.238457 elapsed)
Success: no issues found in 679 source files
$ python --version            
Python 3.9.2

@tupui tupui added static typing Items related to static typing scipy.stats labels Apr 10, 2021
@tupui
Copy link
Member

tupui commented Apr 10, 2021

Thanks! I will pull the branch and check tmr.

Do you plan to also do the same for the discrepancy? I thought we were talking about this instead in the other PR. But this is also wanted 👍

@V0lantis
Copy link
Contributor Author

Oh yes, of course ! I forgot that it was also possible in cython files.

Copy link
Contributor Author

@V0lantis V0lantis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had some comment on line that may not be clear from first sight. I hope this time that all tests will pass

scipy/stats/_qmc.py Outdated Show resolved Hide resolved
Comment on lines 289 to 291
workers = int(workers)
if workers == -1:
workers = os.cpu_count()
if workers is None:
cpu_count = os.cpu_count()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created here this variable since os.cpu_count is returning Optional[int] and workers is a plain int. Since a couple of lines before, I use workers = int(workers), and int is receiving Union[str, bytes, SupportsInt, _SupportsIndex, _SupportsTrunc] and not None, I couldn't change workers type into Optional[int]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was not working to have workers: Optional[int] = int(workers)?

scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
@tupui
Copy link
Member

tupui commented Apr 11, 2021

Seems like we cannot import numpy.typing. My understanding is that it is just available since NumPy 1.20.

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/vsts/work/1/s/build/testenv/lib/python3.7/site-packages/scipy/signal/__init__.py", line 310, in <module>
    from ._peak_finding import *
  File "/home/vsts/work/1/s/build/testenv/lib/python3.7/site-packages/scipy/signal/_peak_finding.py", line 8, in <module>
    from scipy.stats import scoreatpercentile
  File "/home/vsts/work/1/s/build/testenv/lib/python3.7/site-packages/scipy/stats/__init__.py", line 443, in <module>
    from . import qmc
  File "/home/vsts/work/1/s/build/testenv/lib/python3.7/site-packages/scipy/stats/qmc.py", line 234, in <module>
    from ._qmc import *
  File "/home/vsts/work/1/s/build/testenv/lib/python3.7/site-packages/scipy/stats/_qmc.py", line 30, in <module>
    import numpy.typing as npt
ModuleNotFoundError: No module named 'numpy.typing'

Comment on lines -15 to -16
if TYPE_CHECKING:
import numpy.typing as npt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed for NumPy < 1.20

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not used in SciPy/stats/_entropy.py where there are also type annotations

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure because it uses directly in the type hints np.typing then this is not evaluated at runtime. But here you import numpy.typing so this will only work for NumPy > 1.2. Hence the failing jobs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok will try this solution then

scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
def __init__(self, mean, cov=None, cov_root=None, inv_transform=True,
engine=None, seed=None):
def __init__(
self, mean: np.typing.ArrayLike, cov: Optional[np.typing.ArrayLike] = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example here. I have no idea what is happening here and I don't think any IDE can help me out here to understand what the signature is.

If I remember correctly for keywords it will automatically register as Optional.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ok for an IDE. We need the Optional for mypy as it's not enough that the default is None. If that was your concern.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we can write ArrayLike | None maybe instead of this garbled stuff?

Copy link
Member

@tupui tupui Apr 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Union? I think there might be a misunderstanding here. There are 2 parameters here mean and cov

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional[npt.ArrayLike] or without the npt but not sure about the |. Is this supposed to replace the Optional too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently yes (https://www.python.org/dev/peps/pep-0604/#proposal). Not sure about the best practices we want here. But as we will still need to write the default for python itself, IMO this will look redundant as ArrayLike | None = None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't change it then

@@ -263,7 +285,7 @@ def discrepancy(sample, iterative=False, method="CD", workers=1):
if not (np.all(sample >= 0) and np.all(sample <= 1)):
raise ValueError("Sample is not in unit hypercube")

workers = int(workers)
workers: Optional[int] = int(workers)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so this is not possible : scipy/stats/_qmc.py:288: error: Name 'workers' already defined on line 168 [no-redef]

Copy link
Member

@tirthasheshpatel tirthasheshpatel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made some suggestions on failing CI.

scipy/stats/_qmc.py Outdated Show resolved Hide resolved
super().__init__(d=d, seed=seed)
self.centered = centered

def random(self, n=1):
def random(self, n: int = 1) -> Union[np.number, np.ndarray]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def random(self, n: int = 1) -> Union[np.number, np.ndarray]:
def random(self, n: int = 1) -> np.ndarray:

In which case can this return np.number? Line no. 867 seems to always output a np.ndarray. I think the problem is on line 863 where the samples is assigned a float value. So, mypy misassigns float to samples. Doing # type: ignore[return-value] on line 874 would be OK. I don't know if there is a better solution. Maybe use cast?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is if someone uses this function and expects a float, he will have a type error in his IDE (because the IDE considers that it should be an np.ndarray). Am I correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why would someone expect a float? From my understanding, the output is always a ndarray. So, the IDE will correctly show a type error, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with @tirthasheshpatel, all random are returning np.ndarray. The original types from _qmc.pyi should be correct and not require much adjustment as we already iterated over this.

Copy link
Contributor Author

@V0lantis V0lantis Apr 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I returned to the previous state in 7a599c6 and add a ignore type at the return value

Copy link
Contributor Author

@V0lantis V0lantis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you the 3 of you for your comments. I reviewed them in the past commits

scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
super().__init__(d=d, seed=seed)
self.centered = centered

def random(self, n=1):
def random(self, n: int = 1) -> Union[np.number, np.ndarray]:
Copy link
Contributor Author

@V0lantis V0lantis Apr 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I returned to the previous state in 7a599c6 and add a ignore type at the return value

def __init__(self, mean, cov=None, cov_root=None, inv_transform=True,
engine=None, seed=None):
def __init__(
self, mean: np.typing.ArrayLike, cov: Optional[np.typing.ArrayLike] = None,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't change it then

Copy link
Member

@tupui tupui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might have an autoformatting setting to adjust to avoid this double indentation issue.

Comment on lines 40 to 43
IntNumber = Union[int, np.integer]
DecimalNumber = Union[float, np.floating]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be used as before through the file. Here is some background #10844 (review)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is, with the old types (in _qmc.pyi) we have got a lot of types errors when putting them into inline types :

scipy/stats/_qmc.py:400: error: Argument 1 to "ones" has incompatible type "Union[int, number[Any]]"; expected "Union[int, Sequence[int]]"  [arg-type]
scipy/stats/_qmc.py:435: error: Slice index must be an integer or None  [misc]
scipy/stats/_qmc.py:440: error: Slice index must be an integer or None  [misc]
scipy/stats/_qmc.py:492: error: Argument 1 to "zeros" has incompatible type "Union[int, integer[Any]]"; expected "Union[int, Sequence[int]]"  [arg-type]
scipy/stats/_qmc.py:773: error: Incompatible types in assignment (expression has type "Union[int, number[Any]]", variable has type "int")  [assignment]
scipy/stats/_qmc.py:774: error: Argument 1 to "reshape" of "_ArrayOrScalarCommon" has incompatible type "Union[int, integer[Any]]"; expected "int"  [arg-type]
scipy/stats/_qmc.py:774: error: Argument 2 to "reshape" of "_ArrayOrScalarCommon" has incompatible type "Union[int, integer[Any]]"; expected "int"  [arg-type]
scipy/stats/_qmc.py:873: error: Argument 1 to "range" has incompatible type "Union[int, integer[Any]]"; expected "int"  [arg-type]
scipy/stats/_qmc.py:878: error: Incompatible types in assignment (expression has type "Union[int, number[Any]]", variable has type "int")  [assignment]
scipy/stats/_qmc.py:1028: error: Argument 1 to "zeros" has incompatible type "Tuple[Union[int, integer[Any]], Any]"; expected "Union[int, Sequence[int]]"  [arg-type]
scipy/stats/_qmc.py:1032: error: Argument 1 to "zeros" has incompatible type "Union[int, integer[Any]]"; expected "Union[int, Sequence[int]]"  [arg-type]
scipy/stats/_qmc.py:1071: error: Argument 1 to "empty" has incompatible type "Tuple[Union[int, integer[Any]], Union[int, integer[Any]]]"; expected "Union[int, Sequence[int]]"  [arg-type]
scipy/stats/_qmc.py:1075: error: Unsupported operand types for & ("int" and "number[Any]")  [operator]
scipy/stats/_qmc.py:1075: error: Unsupported operand types for & ("integer[Any]" and "number[Any]")  [operator]
scipy/stats/_qmc.py:1075: note: Both left and right operands are unions
scipy/stats/_qmc.py:1084: error: Slice index must be an integer or None  [misc]
scipy/stats/_qmc.py:1089: error: Incompatible types in assignment (expression has type "Union[int, number[Any]]", variable has type "int")  [assignment]
scipy/stats/_qmc.py:1112: error: Unsupported left operand type for & ("number[Any]")  [operator]
scipy/stats/_qmc.py:1112: note: Both left and right operands are unions
scipy/stats/_qmc.py:1120: error: Argument 1 to "random" of "Sobol" has incompatible type "Union[Any, number[Any]]"; expected "Union[int, integer[Any]]"  [arg-type]
scipy/stats/_qmc.py:1303: error: unused 'type: ignore' comment
scipy/stats/_qmc.py:1304: error: Module has no attribute "norm"  [attr-defined]
scipy/stats/_qmc.py:1313: error: Argument 1 to "reshape" of "_ArrayOrScalarCommon" has incompatible type "Union[int, integer[Any]]"; expected "int"  [arg-type]
scipy/stats/_qmc.py:1315: error: Slice index must be an integer or None  [misc]

Copy link
Member

@tupui tupui Apr 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rgommers apparently mypy complains about _IntegerType = Union[int, np.integer]. Is it fine then to use plain int?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's odd, it shouldn't matter if the annotation is inline or in a stub file. Just to make sure: this is with the most recent NumPy version?


scipy/stats/_qmc.py:492: error: Argument 1 to "zeros" has incompatible type "Union[int, integer[Any]]"; expected "Union[int, Sequence[int]]"  [arg-type]

The current definition of _ShapeLike is Union[int, Sequence[int]], so that indeed doesn't accept numpy integer types. That may be an oversight - @BvB93 can you comment?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That definitely seems to be an oversight, yes. At a quick glance int either has to be replaced with the typing.SupportsIndex protocol or, alternatively, a union of int and np.integer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunatly, the nightly release is from 4 days ago (https://anaconda.org/scipy-wheels-nightly/numpy/files?sort=time&sort_order=desc). I am trying to build with @BvB93 modifications right now (so the main release).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should communicate about new guidelines on how to type hint before we start adding "wrong" things.

We're going to find things like this as we go. How about we keep a wiki page with pointers for now? I feel like we're not at the point where it's easy/productive to write a section of the contributing guide yet.

Copy link
Contributor Author

@V0lantis V0lantis Apr 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to say that with the news modifications, we have new MyPy errors unrelated to this PR:

scipy/special/orthogonal.pyi:273: error: Variable "numpy.poly1d" is not valid as a type  [valid-type]
scipy/special/orthogonal.pyi:273: note: See https://mypy.readthedocs.io/en/latest/common_issues.html#variables-vs-type-aliases
scipy/special/orthogonal.pyi:273: error: Invalid base class "numpy.poly1d"  [misc]
scipy/fft/_pocketfft/tests/test_real_transforms.py:256: error: Invalid index type "Tuple[partial[Any], Type[int], int]" for "Dict[Tuple[partial[Any], Type[floating[Any]], int], int]"; expected type "Tuple[partial[Any], Type[floating[Any]], int]"  [index]

I must admit I feel a little bit lost. What should be the next steps ?

Here are the full MyPys' logs :

Build OK (0:21:38.436868 elapsed)
scipy/special/orthogonal.pyi:273: error: Variable "numpy.poly1d" is not valid as a type  [valid-type]
scipy/special/orthogonal.pyi:273: note: See https://mypy.readthedocs.io/en/latest/common_issues.html#variables-vs-type-aliases
scipy/special/orthogonal.pyi:273: error: Invalid base class "numpy.poly1d"  [misc]
scipy/fft/_pocketfft/tests/test_real_transforms.py:256: error: Invalid index type "Tuple[partial[Any], Type[int], int]" for "Dict[Tuple[partial[Any], Type[floating[Any]], int], int]"; expected type "Tuple[partial[Any], Type[floating[Any]], int]"  [index]
scipy/linalg/decomp.py:30: error: Invalid index type "str" for "Dict[generic, Callable[..., ndarray[Any, dtype[Any]]]]"; expected type "generic"  [index]
scipy/linalg/tests/test_basic.py:26: error: Unsupported operand types for + ("List[Type[object]]" and "List[Type[object]]")  [operator]
scipy/stats/_qmc.py:377: error: Argument 3 to "_cy_wrapper_update_discrepancy" has incompatible type "Union[float, floating[Any]]"; expected "float"  [arg-type]
scipy/stats/_qmc.py:435: error: Slice index must be an integer or None  [misc]
scipy/stats/_qmc.py:440: error: Slice index must be an integer or None  [misc]
scipy/stats/_qmc.py:774: error: Argument 1 to "reshape" of "_ArrayOrScalarCommon" has incompatible type "Union[int, integer[Any]]"; expected "int"  [arg-type]
scipy/stats/_qmc.py:774: error: Argument 2 to "reshape" of "_ArrayOrScalarCommon" has incompatible type "Union[int, integer[Any]]"; expected "int"  [arg-type]
scipy/stats/_qmc.py:870: error: Incompatible types in assignment (expression has type "ndarray[Any, dtype[floating[_64Bit]]]", variable has type "float")  [assignment]
scipy/stats/_qmc.py:873: error: Argument 1 to "range" has incompatible type "Union[int, integer[Any]]"; expected "int"  [arg-type]
scipy/stats/_qmc.py:1042: error: Item "RandomState" of "Union[Generator, RandomState]" has no attribute "integers"  [union-attr]
scipy/stats/_qmc.py:1044: error: Item "Generator" of "Union[Generator, RandomState]" has no attribute "randint"  [union-attr]
scipy/stats/_qmc.py:1084: error: Slice index must be an integer or None  [misc]
scipy/stats/_qmc.py:1278: error: Incompatible return value type (got "Union[Any, bool_, ndarray[Any, dtype[bool_]]]", expected "ndarray[Any, Any]")  [return-value]
scipy/stats/_qmc.py:1311: error: Argument 1 to "reshape" of "_ArrayOrScalarCommon" has incompatible type "Union[int, integer[Any]]"; expected "int"  [arg-type]
scipy/stats/_qmc.py:1313: error: Slice index must be an integer or None  [misc]
Found 18 errors in 5 files (checked 679 source files)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed @rgommers. I am not sure I can help here as I am struggling to get all these sorry. I know about normal typing but not NumPy specifics like these. I can review the document though if it can help.

Copy link
Contributor

@BvB93 BvB93 Apr 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to say that with the news modifications, we have new MyPy errors unrelated to this PR:

scipy/special/orthogonal.pyi:273: error: Variable "numpy.poly1d" is not valid as a type  [valid-type]
scipy/special/orthogonal.pyi:273: note: See https://mypy.readthedocs.io/en/latest/common_issues.html#variables-vs-type-aliases
scipy/special/orthogonal.pyi:273: error: Invalid base class "numpy.poly1d"  [misc]

Great, it appears that mypy does not allow the subclassing of objects annotated as Type[Any],
only explicitly defined classes. This is annoying, but it's something that we can fix in numpy with
a small followup on numpy/numpy@b9d9e03 (see numpy/numpy#18780).

scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
@V0lantis
Copy link
Contributor Author

I think you might have an autoformatting setting to adjust to avoid this double indentation issue.

Yes I agree, but shouldn't it be like this ? (https://www.python.org/dev/peps/pep-0008/#indentation) :

# Add 4 spaces (an extra level of indentation) to distinguish arguments from the rest.
def long_function_name(
        var_one, var_two, var_three,
        var_four):
    print(var_one)

@tupui
Copy link
Member

tupui commented Apr 14, 2021

I think you might have an autoformatting setting to adjust to avoid this double indentation issue.

Yes I agree, but shouldn't it be like this ? (https://www.python.org/dev/peps/pep-0008/#indentation) :

# Add 4 spaces (an extra level of indentation) to distinguish arguments from the rest.
def long_function_name(
        var_one, var_two, var_three,
        var_four):
    print(var_one)

I am puzzled here! Thanks for pointing this out. I will have to check. I really thought things like this were not correct

# Hanging indents should add a level.
foo = long_function_name(
    var_one, var_two,
    var_three, var_four)

But should be

# Hanging indents should add a level.
foo = long_function_name(
    var_one, var_two,
    var_three, var_four
)

@tupui
Copy link
Member

tupui commented Apr 14, 2021

According to flake8 (E125), both this and my suggestion are fine. As you wish then if both are evaluated as valid. (Feel free to resolve)

@V0lantis
Copy link
Contributor Author

According to flake8 (E125), both this and my suggestion are fine. As you wish then if both are evaluated as valid. (Feel free to resolve)

Thank you @tupui. Unfortunately I wasn't able to put the same types as before because they raise a lot of errors which were't there with _qmc.pyi file.


# Based on scipy._lib._util.check_random_state
def check_random_state(seed=None):
def check_random_state(seed: SeedType = None) -> Union[np.random.Generator,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd strongly advise you to avoid returning unions whenever possible, as they can be a pain the deal with for end-users.
For example, the Union can readily be removed via the use of two overloads.

if TYPE_CHECKING:
    _GT = TypeVar("_GT", bound=Union[np.random.Generator, np.random.RandomState])

@overload
def check_random_state(seed: Optional[IntNumber] = ...) -> np.random.Generator:
    ...
@overload
def check_random_state(seed: _GT) -> _GT:
    ...
def check_random_state(seed=None):
    pass  # Insert implementation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks scary... But ok we can do this here as in general we don't have much unions.

def discrepancy(
sample: npt.ArrayLike,
iterative: bool = False,
method: str = "CD",
Copy link
Contributor

@BvB93 BvB93 Apr 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since only a specific number of valid strings can be passed to method it'd be worthwhile to use a Literal here.

EDIT: edited the example for the sake of clarity.

if TYPE_CHECKING:
    import sys
    if sys.version_info >= (3, 8:
        from typing import Literal
    else:
        from typing_extensions import Literal

def discrepancy(
    *args, **kwargs,  # Insert all other parameters, etc, etc
    method: Literal["CD", "WD", "MD", "L2-star"],
): ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But then if we do this, I suggest we put this in _utils or some place we can reuse this kind of logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there is a disagreement here, I am waiting for more input :-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless the type of this method parameter is very common in scipy I'd suggest to just directly assign it to the method parameter without the inclusion of any type aliases.

Copy link
Member

@tupui tupui Apr 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify what I ment. In a lot of stats functions we have a literal (alternative). I am suggesting we could use the same logic in such cases, hence have a common helper to import Literal and other things. Otherwise we will add quite some boilerplate to a lot of files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in f9eb736. I didn't put the Literal in _util though, since the method arguments are quite specific to the discrepancy function

sample: npt.ArrayLike,
iterative: bool = False,
method: str = "CD",
workers: int = 1) -> DecimalNumber:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above: try to avoid returning unions.
Based on the annotations in https://github.com/scipy/scipy/blob/master/scipy/stats/_qmc_cy.pyi it seems to return a normal float.

Suggested change
workers: int = 1) -> DecimalNumber:
workers: int = 1) -> float:

@V0lantis
Copy link
Contributor Author

V0lantis commented Apr 15, 2021

I have updated Nightly CPython workflow in .github/workflow/linux.yml's with python version 3.9. It is just for testing purposes and I intend to remove my changes where all the tests are passing.
Basically, I just do a pull on the main numpy's branch. It is just a test, I am not well experienced with those kinds of things. The goal is to see the latest changes on NumPy types.

We can then switch .github/workflows/linux.yml to use numpy nightlies to test, because this won't be the last time we need a fix in numpy.

@rgommers I would be very interested in "getting my hands dirty" and try to make a PR upon Scipy's workflows to integrate NumPy's Nightly release. But I just saw that the Nightly Released was due every Sunday. Wouldn't it be better to do it rather every night ? For example, @BvB93 latest changes haven't been released yet in the nightly released. If we have to wait for every Sunday to test if our changes are passing, it is rather long.
Or is my method more relevant, i.e. pulling NumPy's main branch and building it directly in the workflow ?

@rgommers
Copy link
Member

@rgommers I would be very interested in "getting my hands dirty" and try to make a PR upon Scipy's workflows to integrate NumPy's Nightly release.

That'd be great, thanks.

But I just saw that the Nightly Released was due every Sunday. Wouldn't it be better to do it rather every night ?

The reason is that we have limited CI over on https://github.com/MacPython/numpy-wheels after TravisCI became less generous.

For example, @BvB93 latest changes haven't been released yet in the nightly released. If we have to wait for every Sunday to test if our changes are passing, it is rather long.
Or is my method more relevant, i.e. pulling NumPy's main branch and building it directly in the workflow ?

If we know we need a change and don't want to wait for the new nightly (which shouldn't happen that often), we can always trigger a one-off build on https://github.com/MacPython/numpy-wheels. That'd be better than building numpy from source - that's a bit slow, if there's something wrong with it it'll then break both numpy and scipy CI.

@V0lantis
Copy link
Contributor Author

V0lantis commented Apr 16, 2021

What should we do for the errors which are not related to my the PR, following the changes from @BvB93. Should I first submit a PR which is going to resolve them ?
Errors :

scipy/special/orthogonal.pyi:273: error: Variable "numpy.poly1d" is not valid as a type  [valid-type]
22
scipy/special/orthogonal.pyi:273: note: See https://mypy.readthedocs.io/en/latest/common_issues.html#variables-vs-type-aliases
23
scipy/special/orthogonal.pyi:273: error: Invalid base class "numpy.poly1d"  [misc]
24
scipy/fft/_pocketfft/tests/test_real_transforms.py:256: error: Invalid index type "Tuple[partial[Any], Type[int], int]" for "Dict[Tuple[partial[Any], Type[floating[Any]], int], int]"; expected type "Tuple[partial[Any], Type[floating[Any]], int]"  [index]

Also, there is an error in Azure tests : AttributeError: module 'numpy.random' has no attribute 'Generator'. Is this related to some workflows failures ?

@rgommers
Copy link
Member

What should we do for the errors which are not related to my the PR, following the changes from @BvB93. Should I first submit a PR which is going to resolve them ?

A separate PR to fix those sounds great.

Also, there is an error in Azure tests : AttributeError: module 'numpy.random' has no attribute 'Generator'. Is this related to some workflows failures ?

This will be due to an old numpy version. You cannot use Generator unconditionally, it was introduced in numpy 1.17 and we still have 1.16.5 as minimum.

Comment on lines 48 to 49
SeedType = Optional[Union[IntNumber, np.random.Generator,
np.random.RandomState]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving these lines to the if TYPE_CHECKING: block should get rid of the numpy 1.16-related AttributeError (#13833 (comment)).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I move all types declaration in the if statement, including IntNumber and every other types I wish to declare ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For statements that can be executed just fine during runtime this won't be (strictly) necessary.
It can still be done however, but this would be more of a stylistic choice; one I don't have any strong opinions about.

@@ -401,7 +446,7 @@ def n_primes(n):
677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761,
769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857,
859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947,
953, 967, 971, 977, 983, 991, 997][:n]
953, 967, 971, 977, 983, 991, 997])[:n]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BvB93 Do you know how to to type n ?
For now, we have the following error :

scipy/stats/_qmc.py:449: error: Slice index must be an integer or None  [misc]
scipy/stats/_qmc.py:454: error: Slice index must be an integer or None  [misc]

because builtin.slice is not accepting other types than int or None (see python/typing#159)
Apparently (python/mypy#2410 (comment)), NumPy slice method accepts other types. Do you know how to correct this ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the current mypy issue, I'm afraid the only way to deal with this false positive is to add a # type: ignore[misc] comment.

*,
iterative: bool = False,
method: str = "CD",
workers: int = 1) -> float:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we use IntegerNumber and DecimalNumber everywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am really not an expert in this area. Sometime, function are expecting bare int like range function or table , for example in [1, 2, 3][n], n should be a bare python int and not numpy int.
What do you think about that @BvB93 ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both range and list can currently take any object implementing the __index__ protocol,
so using n: np.integer = ... should be fine.

Note that the most recent mypy release still has the range input types defined builtins.int, a more
recent typeshed update has loosened their types to aforementioned protocol:
https://github.com/python/typeshed/blob/740f67245033a4e647706869a01f67d8c079fd7f/stdlib/builtins.pyi#L904-L911

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks both Updated in 85a580f

# Since Generator was introduced in numpy 1.17, the following condition is needed for
# backward compatibility
if TYPE_CHECKING:
SeedType = Optional[Union[IntNumber, np.random.Generator,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like these definitions should be in utils as we are likely going to need them everywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree as well. Going to do it then

scipy/stats/_qmc.py Show resolved Hide resolved
@V0lantis
Copy link
Contributor Author

Looks good overall, though annotations are still missing in a number of important places:

  • check_random_state function, or all attributes taking its output will need an annotation

Are you sure of that ? I thought it was the purpose of @overload decorator. Line 56 :

@overload
def check_random_state(seed: Optional[IntNumber] = ...) -> np.random.Generator:
    ...

@overload
def check_random_state(seed: GeneratorType) -> GeneratorType:
    ...


# Based on scipy._lib._util.check_random_state
def check_random_state(seed=None):
  • n_primes function, or all attributes taking its output will need an annotation

Thanks, updated in : 4ca1693

  • Sobol.fast_forward method

Updated : def fast_forward(self, n: IntNumber) -> Sobol:

  • Sobol.MAXDIM attribute (class variable, so needs typing.ClassVar)
  • Sobol.MAXBIT attribute (class variable, so needs typing.ClassVar)

Updated as well in 4ca1693

Thank for your review ! :-) @BvB93


Copy link
Contributor Author

@V0lantis V0lantis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tupui for the review

*,
iterative: bool = False,
method: str = "CD",
workers: int = 1) -> float:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am really not an expert in this area. Sometime, function are expecting bare int like range function or table , for example in [1, 2, 3][n], n should be a bare python int and not numpy int.
What do you think about that @BvB93 ?

scipy/stats/_qmc.py Show resolved Hide resolved
# Since Generator was introduced in numpy 1.17, the following condition is needed for
# backward compatibility
if TYPE_CHECKING:
SeedType = Optional[Union[IntNumber, np.random.Generator,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree as well. Going to do it then

Copy link
Contributor

@BvB93 BvB93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure of that ? I thought it was the purpose of @overload decorator. Line 56 :

You are absolutely correct here, I somehow missed it when glancing over the code...

scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/utils.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
scipy/stats/_qmc.py Outdated Show resolved Hide resolved
Copy link
Contributor Author

@V0lantis V0lantis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you again @tupui for your review :-). We are getting close (finally) 🙃

scipy/stats/utils.py Outdated Show resolved Hide resolved
scipy/_utils.py Outdated Show resolved Hide resolved
@tupui tupui added this to the 1.7.0 milestone May 14, 2021
@rgommers rgommers changed the title MAINT: Add inline type hintings MAINT: Add inline type hintings for stats.qmc May 14, 2021
Copy link
Member

@tupui tupui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @V0lantis! This will help further typing (the discussion and also what you added in _util.

Side note: before merging though we should have a cleaner history. I would either rebase and squash some commits together, and change the messages to comply with our naming conventions (I guess for type hints this should be MAINT: ...). Or we squash merge and clean the single commit message (co-authors are preserved, and we remove unnecessary things).

import warnings

import numpy as np

if TYPE_CHECKING:
import numpy.typing as npt
from typing import Literal
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from typing import Literal
from typing_extensions import Literal

Note that the Literal import will currently fail for python 3.7 (as it does not exist in python 3.7!).

You'll need either a version check here or an unconditional import from typing_extensions,
the latter which exposes either a backport or a re-export of Literal, depending on the python version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fyi the import will have to remain within the if TYPE_CHECKING block as typing_extensions is not a runtime dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, I misunderstood :

or an unconditional import from typing_extensions

I though unconditional meant outside the if TYPE_CHECKING.
Thank you very much @BvB93

Copy link
Contributor Author

@V0lantis V0lantis May 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that Scipy doesn't have typing_extensions.
The error can be seen here :

scipy/stats/__init__.py:444: in <module>
    from . import qmc
scipy/stats/qmc.py:234: in <module>
    from ._qmc import *
scipy/stats/_qmc.py:16: in <module>
    from typing_extensions import Literal
E   ModuleNotFoundError: No module named 'typing_extensions'

I found a similar issue here. Should I add typing_extensions in scipy requirements or something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ModuleNotFoundError is from the CI run prior to 136845f.
The error should not re-appear again, as the module is now no longer imported during runtime.

@tupui
Copy link
Member

tupui commented May 17, 2021

@BvB93 is this ok for you with the latest changes? (CI failure is unrelated and now fixed on master)

Copy link
Contributor

@BvB93 BvB93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, looks good to me.

@tupui
Copy link
Member

tupui commented May 17, 2021

Thanks everyone for the reviews!

@V0lantis let me know what you prefer for the git cleanup before I integrate your PR.
Also just seeing this now, could you merge the two if TYPE_CHECKING together? (non blocking but would be better).

@V0lantis
Copy link
Contributor Author

Thanks everyone for the reviews!

@V0lantis let me know what you prefer for the git cleanup before I integrate your PR.
Also just seeing this now, could you merge the two if TYPE_CHECKING together? (non blocking but would be better).

I have squashed commits into one, but I am not seeing if TYPE_CHECKING repeats...

Comment on lines +20 to +22
if TYPE_CHECKING:
import numpy.typing as npt
from typing_extensions import Literal
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Over here.

Comment on lines 44 to 47
if TYPE_CHECKING:
from scipy._lib._util import (
DecimalNumber, GeneratorType, IntNumber, SeedType
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And over here.

@tupui
Copy link
Member

tupui commented May 17, 2021

Thanks everyone for the reviews!

@V0lantis let me know what you prefer for the git cleanup before I integrate your PR.

Also just seeing this now, could you merge the two if TYPE_CHECKING together? (non blocking but would be better).

I have squashed commits into one, but I am not seeing if TYPE_CHECKING repeats...

Thanks for the cleanup! I will nitpick a bit now sorry, but could you edit the message to add @BvB93 as a co-author at least. It's perfect to squash but we should try to keep this co-authoring as mush as we can 😃

@V0lantis
Copy link
Contributor Author

Thank you both for your patience and your thoughtful comments 🙏

Copy link
Member

@tupui tupui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @V0lantis and @BvB93 for all the work and others for reviewing!

If anyone has time, it would be really helpful to start a page in the wiki. I can do a draft if you guys don't have time. Let me know 😃

Merging once it's green.

* create IntNumber and DecimalNumber type and store them inside
  `scipy/_lib/_util.py`
* Remove `qmc.pyi` anotation file

Co-authored-by: Bas van Beek <43369155+BvB93@users.noreply.github.com>
Co-authored-by: Pamphile ROY <roy.pamphile@gmail.com>
@tupui tupui merged commit 2a4942d into scipy:master May 19, 2021
@V0lantis V0lantis deleted the maint/add_type_hints_qmc branch August 30, 2021 08:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
scipy.stats static typing Items related to static typing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants