Type checking with mypy (#2655)
* Type checking with mypy

The rest of the scientific Python stack doesn't seem to support type
annotations yet, but that's OK -- we can use this incrementally in xarray when
it seems appropriate, and may catch a few bugs. I'm especially excited to use
this for internal functions, where we don't always bother with full docstrings
(e.g., what is the type of the ``variables`` argument?).
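
A hypothetical example of the kind of internal function this lets us annotate
(the helper name and its signature are made up, not part of this change), using
the comment-style annotations that also appear in this diff:

    from typing import Any, Mapping

    from xarray.core.variable import Variable

    def _update_variable_attrs(variables, new_attrs):
        # type: (Mapping[Any, Variable], Mapping[str, Any]) -> None
        # The type comment tells mypy (and readers) exactly what ``variables``
        # holds, even when the function has no docstring.
        for var in variables.values():
            var.attrs.update(new_attrs)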

This includes:
1. various minor fixes to ensure that "mypy xarray" passes.
2. adding "mypy xarray" to our lint check on Travis-CI.

For reference, see "Using mypy with an existing codebase":
https://mypy.readthedocs.io/en/stable/existing_code.html

Question: are we OK with (2)? This means Travis-CI will fail if your code
causes mypy to error.

* Lint fix

* DOC: document mypy, don't run it in travis

* document how to run mypy

* fix type annotation

* Pin pytest to avoid pytest-cov failure

see pytest-dev/pytest-cov#253

* Revert pytest pinning

* Revert "Revert pytest pinning"

This reverts commit cd187a6.

* Revert "Pin pytest to avoid pytest-cov failure"

This reverts commit 87ba452.
shoyer committed Jan 8, 2019
1 parent ede3e01 commit 6963164
Showing 28 changed files with 179 additions and 136 deletions.
3 changes: 2 additions & 1 deletion ci/requirements-py36.yml
@@ -14,7 +14,7 @@ dependencies:
- pytest-cov
- pytest-env
- coveralls
- flake8
- pycodestyle
- numpy
- pandas
- scipy
@@ -32,3 +32,4 @@ dependencies:
- lxml
- pip:
- cfgrib>=0.9.2
- mypy==0.650
1 change: 1 addition & 0 deletions ci/requirements-py37.yml
@@ -29,3 +29,4 @@ dependencies:
- pydap
- pip:
- cfgrib>=0.9.2
- mypy==0.650
24 changes: 15 additions & 9 deletions doc/contributing.rst
@@ -345,28 +345,34 @@ the more common ``PEP8`` issues:
- passing arguments should have spaces after commas, e.g. ``foo(arg1, arg2, kw1='bar')``

:ref:`Continuous Integration <contributing.ci>` will run
the `flake8 <http://pypi.python.org/pypi/flake8>`_ tool
the `pycodestyle <http://pypi.python.org/pypi/pycodestyle>`_ tool
and report any stylistic errors in your code. Therefore, it is helpful before
submitting code to run the check yourself::

flake8
pycodestyle xarray

If you install `isort <https://github.com/timothycrosley/isort>`_ and
`flake8-isort <https://github.com/gforcada/flake8-isort>`_, this will also show
any errors from incorrectly sorted imports. These aren't currently enforced in
CI. To automatically sort imports, you can run::
Other recommended but optional tools for checking code quality (not currently
enforced in CI):

isort -y
- `mypy <http://mypy-lang.org/>`_ performs static type checking, which can
  make it easier to catch bugs. Please run ``mypy xarray`` if you annotate any
  code with `type hints <https://docs.python.org/3/library/typing.html>`_;
  see the short example after this list.
- `flake8 <http://pypi.python.org/pypi/flake8>`_ includes a few more automated
checks than those enforced by pycodestyle.
- `isort <https://github.com/timothycrosley/isort>`_ will highlight
incorrectly sorted imports. ``isort -y`` will automatically fix them. See
also `flake8-isort <https://github.com/gforcada/flake8-isort>`_.
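
For example (a made-up snippet, not taken from the xarray codebase), once a
function carries type hints, ``mypy xarray`` can flag bugs such as a mismatched
return type::

    from typing import List

    def first_name(names: List[str]) -> str:
        # mypy reports: Incompatible return value type
        # (got "List[str]", expected "str"); the fix is ``names[0]``
        return names[:1]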

Note that your code editor probably supports extensions that can show results
of these checks inline as you type.

Backwards Compatibility
~~~~~~~~~~~~~~~~~~~~~~~

Please try to maintain backward compatibility. *xarray* has a growing number of users with
lots of existing code, so don't break it if at all possible. If you think breakage is
required, clearly state why as part of the pull request. Also, be careful when changing
method signatures and add deprecation warnings where needed. Also, add the deprecated
sphinx directive to the deprecated functions or methods.
method signatures and add deprecation warnings where needed.
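
For illustration, a typical deprecation shim might look like the following
(both function names are hypothetical)::

    import warnings

    def new_function(data, dim=None):
        return data  # stand-in for the real implementation

    def old_function(data, dim=None):
        """Deprecated alias kept for backward compatibility."""
        warnings.warn(
            'old_function is deprecated; use new_function instead',
            FutureWarning, stacklevel=2)
        return new_function(data, dim=dim)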

.. _contributing.ci:

58 changes: 58 additions & 0 deletions setup.cfg
@@ -20,6 +20,64 @@ default_section=THIRDPARTY
known_first_party=xarray
multi_line_output=4

# Most of the numerical computing stack doesn't have type annotations yet.
[mypy-bottleneck.*]
ignore_missing_imports = True
[mypy-cdms2.*]
ignore_missing_imports = True
[mypy-cf_units.*]
ignore_missing_imports = True
[mypy-cfgrib.*]
ignore_missing_imports = True
[mypy-cftime.*]
ignore_missing_imports = True
[mypy-dask.*]
ignore_missing_imports = True
[mypy-distributed.*]
ignore_missing_imports = True
[mypy-h5netcdf.*]
ignore_missing_imports = True
[mypy-h5py.*]
ignore_missing_imports = True
[mypy-iris.*]
ignore_missing_imports = True
[mypy-matplotlib.*]
ignore_missing_imports = True
[mypy-Nio.*]
ignore_missing_imports = True
[mypy-numpy.*]
ignore_missing_imports = True
[mypy-netCDF4.*]
ignore_missing_imports = True
[mypy-netcdftime.*]
ignore_missing_imports = True
[mypy-pandas.*]
ignore_missing_imports = True
[mypy-PseudoNetCDF.*]
ignore_missing_imports = True
[mypy-pydap.*]
ignore_missing_imports = True
[mypy-pytest.*]
ignore_missing_imports = True
[mypy-rasterio.*]
ignore_missing_imports = True
[mypy-scipy.*]
ignore_missing_imports = True
[mypy-seaborn.*]
ignore_missing_imports = True
[mypy-toolz.*]
ignore_missing_imports = True
[mypy-zarr.*]
ignore_missing_imports = True

# written by versioneer
[mypy-xarray._version]
ignore_errors = True
# version spanning code is hard to type annotate (and most of this module will
# be going away soon anyways)
[mypy-xarray.core.pycompat]
ignore_errors = True

[versioneer]
VCS = git
style = pep440
3 changes: 2 additions & 1 deletion xarray/backends/file_manager.py
@@ -1,5 +1,6 @@
import contextlib
import threading
from typing import Any, Dict
import warnings

from ..core import utils
@@ -13,7 +14,7 @@
assert FILE_CACHE.maxsize, 'file cache must be at least size one'


REF_COUNTS = {}
REF_COUNTS = {} # type: Dict[Any, int]

_DEFAULT_MODE = utils.ReprObject('<unused>')

3 changes: 2 additions & 1 deletion xarray/backends/locks.py
@@ -1,5 +1,6 @@
import multiprocessing
import threading
from typing import Any, MutableMapping
import weakref

try:
@@ -20,7 +21,7 @@
NETCDFC_LOCK = SerializableLock()


_FILE_LOCKS = weakref.WeakValueDictionary()
_FILE_LOCKS = weakref.WeakValueDictionary() # type: MutableMapping[Any, threading.Lock] # noqa


def _get_threaded_lock(key):
9 changes: 5 additions & 4 deletions xarray/coding/cftime_offsets.py
@@ -43,6 +43,7 @@
import re
from datetime import timedelta
from functools import partial
from typing import ClassVar, Optional

import numpy as np

@@ -74,7 +75,7 @@ def get_date_type(calendar):


class BaseCFTimeOffset(object):
_freq = None
_freq = None # type: ClassVar[str]

def __init__(self, n=1):
if not isinstance(n, int):
@@ -254,9 +255,9 @@ def onOffset(self, date):


class YearOffset(BaseCFTimeOffset):
_freq = None
_day_option = None
_default_month = None
_freq = None # type: ClassVar[str]
_day_option = None # type: ClassVar[str]
_default_month = None # type: ClassVar[int]

def __init__(self, n=1, month=None):
BaseCFTimeOffset.__init__(self, n)
12 changes: 7 additions & 5 deletions xarray/coding/variables.py
@@ -1,6 +1,7 @@
"""Coders for individual Variable objects."""
from __future__ import absolute_import, division, print_function

from typing import Any
import warnings
from functools import partial

@@ -126,11 +127,12 @@ def pop_to(source, dest, key, name=None):
return value


def _apply_mask(data, # type: np.ndarray
encoded_fill_values, # type: list
decoded_fill_value, # type: Any
dtype, # type: Any
): # type: np.ndarray
def _apply_mask(
data: np.ndarray,
encoded_fill_values: list,
decoded_fill_value: Any,
dtype: Any,
) -> np.ndarray:
"""Mask all matching values in a NumPy arrays."""
data = np.asarray(data, dtype=dtype)
condition = False
2 changes: 1 addition & 1 deletion xarray/core/alignment.py
@@ -31,7 +31,7 @@ def _get_joiner(join):
raise ValueError('invalid value for join: %s' % join)


_DEFAULT_EXCLUDE = frozenset()
_DEFAULT_EXCLUDE = frozenset() # type: frozenset


def align(*objects, **kwargs):
4 changes: 2 additions & 2 deletions xarray/core/common.py
@@ -24,7 +24,7 @@ def wrapped_func(self, dim=None, axis=None, skipna=None,
return self.reduce(func, dim, axis,
skipna=skipna, allow_lazy=True, **kwargs)
else:
def wrapped_func(self, dim=None, axis=None,
def wrapped_func(self, dim=None, axis=None, # type: ignore
**kwargs):
return self.reduce(func, dim, axis,
allow_lazy=True, **kwargs)
@@ -56,7 +56,7 @@ def wrapped_func(self, dim=None, skipna=None,
numeric_only=numeric_only, allow_lazy=True,
**kwargs)
else:
def wrapped_func(self, dim=None, **kwargs):
def wrapped_func(self, dim=None, **kwargs): # type: ignore
return self.reduce(func, dim,
numeric_only=numeric_only, allow_lazy=True,
**kwargs)
42 changes: 23 additions & 19 deletions xarray/core/computation.py
@@ -8,6 +8,10 @@
import operator
from collections import Counter
from distutils.version import LooseVersion
from typing import (
AbstractSet, Any, Dict, Iterable, List, Mapping, Union, Tuple,
TYPE_CHECKING, TypeVar
)

import numpy as np

@@ -16,8 +20,11 @@
from .merge import expand_and_merge_variables
from .pycompat import OrderedDict, basestring, dask_array_type
from .utils import is_dict_like
from .variable import Variable
if TYPE_CHECKING:
from .dataset import Dataset

_DEFAULT_FROZEN_SET = frozenset()
_DEFAULT_FROZEN_SET = frozenset() # type: frozenset
_NO_FILL_VALUE = utils.ReprObject('<no-fill-value>')
_DEFAULT_NAME = utils.ReprObject('<default-name>')
_JOINS_WITHOUT_FILL_VALUES = frozenset({'inner', 'exact'})
@@ -111,8 +118,7 @@ def to_gufunc_string(self):
return str(alt_signature)


def result_name(objects):
# type: List[object] -> Any
def result_name(objects: list) -> Any:
# use the same naming heuristics as pandas:
# https://github.com/blaze/blaze/issues/458#issuecomment-51936356
names = {getattr(obj, 'name', _DEFAULT_NAME) for obj in objects}
@@ -138,10 +144,10 @@ def _get_coord_variables(args):


def build_output_coords(
args, # type: list
signature, # type: _UFuncSignature
exclude_dims=frozenset(), # type: set
):
args: list,
signature: _UFuncSignature,
exclude_dims: AbstractSet = frozenset(),
) -> 'List[OrderedDict[Any, Variable]]':
"""Build output coordinates for an operation.
Parameters
@@ -159,7 +165,6 @@
-------
OrderedDict of Variable objects with merged coordinates.
"""
# type: (...) -> List[OrderedDict[Any, Variable]]
input_coords = _get_coord_variables(args)

if exclude_dims:
Expand Down Expand Up @@ -220,17 +225,15 @@ def apply_dataarray_ufunc(func, *args, **kwargs):
return out


def ordered_set_union(all_keys):
# type: List[Iterable] -> Iterable
def ordered_set_union(all_keys: List[Iterable]) -> Iterable:
result_dict = OrderedDict()
for keys in all_keys:
for key in keys:
result_dict[key] = None
return result_dict.keys()


def ordered_set_intersection(all_keys):
# type: List[Iterable] -> Iterable
def ordered_set_intersection(all_keys: List[Iterable]) -> Iterable:
intersection = set(all_keys[0])
for keys in all_keys[1:]:
intersection.intersection_update(keys)
@@ -284,9 +287,9 @@ def _as_variables_or_variable(arg):

def _unpack_dict_tuples(
result_vars, # type: Mapping[Any, Tuple[Variable]]
num_outputs, # type: int
num_outputs, # type: int
):
# type: (...) -> Tuple[Dict[Any, Variable]]
# type: (...) -> Tuple[Dict[Any, Variable], ...]
out = tuple(OrderedDict() for _ in range(num_outputs))
for name, values in result_vars.items():
for value, results_dict in zip(values, out):
@@ -438,8 +441,11 @@ def apply_groupby_ufunc(func, *args):
return combined


def unified_dim_sizes(variables, exclude_dims=frozenset()):
# type: Iterable[Variable] -> OrderedDict[Any, int]
def unified_dim_sizes(
variables: Iterable[Variable],
exclude_dims: AbstractSet = frozenset(),
) -> 'OrderedDict[Any, int]':

dim_sizes = OrderedDict()

for var in variables:
@@ -460,11 +466,9 @@ def unified_dim_sizes(variables, exclude_dims=frozenset()):

SLICE_NONE = slice(None)

# A = TypeVar('A', numpy.ndarray, dask.array.Array)


def broadcast_compat_data(variable, broadcast_dims, core_dims):
# type: (Variable[A], tuple, tuple) -> A
# type: (Variable, tuple, tuple) -> Any
data = variable.data

old_dims = variable.dims
3 changes: 2 additions & 1 deletion xarray/core/dataarray.py
@@ -771,7 +771,8 @@ def __deepcopy__(self, memo=None):
return self.copy(deep=True)

# mutable objects should not be hashable
__hash__ = None
# https://github.com/python/mypy/issues/4266
__hash__ = None # type: ignore

@property
def chunks(self):
