diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md index a442c7fc..8fe5d78f 100644 --- a/.github/pull_request_template.md +++ b/.github/pull_request_template.md @@ -5,5 +5,6 @@ Add a short description describing the pull request (PR) here. - [ ] closes issue #xxxx - [ ] is documented - [ ] PEP8 compliant code +- [ ] type hints for functions and methods - [ ] tests added / passed - [ ] Example Notebook (for new features) diff --git a/doc/concepts/armamodel.ipynb b/doc/concepts/armamodel.ipynb index a82fe9b2..ab192e3d 100644 --- a/doc/concepts/armamodel.ipynb +++ b/doc/concepts/armamodel.ipynb @@ -1,13 +1,14 @@ { "cells": [ { + "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# ARMA(1,1) Noise Model for Pastas\n", "*R.A. Collenteur, University of Graz, May 2020*\n", "\n", - "In this notebook an Autoregressive-Moving-Average (ARMA(1,1)) noise model is developed for Pastas models. This new noise model is tested against synthetic data generated with Numpy or Statsmodels' ARMA model. This noise model is tested on head time series with a regular time step.\n", + "In this notebook an Autoregressive-Moving-Average (ARMA(1,1)) noise model is developed for Pastas models. This new noise model is tested against synthetic data generated with NumPy or Statsmodels' ARMA model. This noise model is tested on head time series with a regular time step.\n", "\n", "
\n", " \n", @@ -82,11 +83,12 @@ ] }, { + "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Generate ARMA(1,1) noise and add it to the synthetic heads\n", - "In the following code-block, noise is generated using an ARMA(1,1) process using Numpy. An alternative procedure is available from Statsmodels (commented out now). More information about the ARMA model can be found on the [statsmodels website](https://www.statsmodels.org/dev/generated/statsmodels.tsa.arima_process.ArmaProcess.html). The noise is added to the head series generated in the previous code-block." + "In the following code-block, noise is generated using an ARMA(1,1) process using NumPy. An alternative procedure is available from Statsmodels (commented out now). More information about the ARMA model can be found on the [statsmodels website](https://www.statsmodels.org/dev/generated/statsmodels.tsa.arima_process.ArmaProcess.html). The noise is added to the head series generated in the previous code-block." 
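The ARMA(1,1) generation described in this notebook cell can be sketched with NumPy alone; the coefficient values `alpha` (AR part) and `beta` (MA part) below are illustrative, not the ones used in the notebook:

```python
import numpy as np

# Sketch of ARMA(1,1) noise generation: each value combines an AR(1) term
# on the previous noise value, an MA(1) term on the previous innovation,
# and the current white-noise innovation.
def generate_arma11_noise(n, alpha=0.9, beta=0.3, sigma=1.0, seed=1234):
    rng = np.random.RandomState(seed)
    epsilon = rng.normal(0.0, sigma, n)  # white innovations
    noise = np.zeros(n)
    for i in range(1, n):
        noise[i] = alpha * noise[i - 1] + epsilon[i] + beta * epsilon[i - 1]
    return noise

noise = generate_arma11_noise(1000)
```

With a fixed `RandomState` seed the series is reproducible, which is what makes the notebook's comparison against Statsmodels' `ArmaProcess` meaningful.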
] }, { @@ -107,7 +109,7 @@ "# arma = stats.tsa.ArmaProcess(ar, ma)\n", "# noise = arma.generate_sample(head[0].index.size)*np.std(head.values) * 0.1\n", "\n", - "# generate samples using Numpy\n", + "# generate samples using NumPy\n", "random_seed = np.random.RandomState(1234)\n", "\n", "noise = random_seed.normal(0,1,len(head)) * np.std(head.values) * 0.1\n", @@ -281,7 +283,7 @@ ], "metadata": { "kernelspec": { - "display_name": "Python 3 (ipykernel)", + "display_name": "pastas_dev", "language": "python", "name": "python3" }, @@ -295,7 +297,12 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.7" + "version": "3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 08:41:22) [MSC v.1929 64 bit (AMD64)]" + }, + "vscode": { + "interpreter": { + "hash": "29475f8be425919747d373d827cb41e481e140756dd3c75aa328bf3399a0138e" + } } }, "nbformat": 4, diff --git a/doc/concepts/noisemodel.ipynb b/doc/concepts/noisemodel.ipynb index cd35a1e3..cdcdba9d 100644 --- a/doc/concepts/noisemodel.ipynb +++ b/doc/concepts/noisemodel.ipynb @@ -1,13 +1,14 @@ { "cells": [ { + "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Testing an AR(1) Noise Model for Pastas\n", "*R.A. Collenteur, University of Graz, May 2020*\n", "\n", - "In this notebook the classical Autoregressive AR1 noise model is tested for Pastas models. This noise model is tested against synthetic data generated with Numpy or Statsmodels' ARMA model. This noise model is tested on head time series with a regular time step." + "In this notebook the classical Autoregressive AR1 noise model is tested for Pastas models. This noise model is tested against synthetic data generated with NumPy or Statsmodels' ARMA model. This noise model is tested on head time series with a regular time step." 
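The AR(1) recursion tested in this notebook reduces to a single lagged term; a minimal NumPy sketch (series length and scaling are illustrative):

```python
import numpy as np

# AR(1) noise: a[i] = alpha * a[i-1] + epsilon[i], starting from a[0] = 0.
rng = np.random.RandomState(1234)
alpha = 0.8
epsilon = rng.normal(0, 1, 500)  # white innovations
a = np.zeros_like(epsilon)
for i in range(1, len(a)):
    a[i] = alpha * a[i - 1] + epsilon[i]
```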
] }, { @@ -90,7 +91,7 @@ "np.random.seed(1234)\n", "alpha= 0.8\n", "\n", - "# generate samples using Numpy\n", + "# generate samples using NumPy\n", "random_seed = np.random.RandomState(1234)\n", "noise = random_seed.normal(0,1,len(head)) * np.std(head.values) * 0.2\n", "a = np.zeros_like(head[0])\n", @@ -239,7 +240,7 @@ ], "metadata": { "kernelspec": { - "display_name": "Python 3 (ipykernel)", + "display_name": "pastas_dev", "language": "python", "name": "python3" }, @@ -253,7 +254,12 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.7" + "version": "3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 08:41:22) [MSC v.1929 64 bit (AMD64)]" + }, + "vscode": { + "interpreter": { + "hash": "29475f8be425919747d373d827cb41e481e140756dd3c75aa328bf3399a0138e" + } } }, "nbformat": 4, diff --git a/doc/examples/02_fix_parameters.ipynb b/doc/examples/02_fix_parameters.ipynb index 7aabdeef..0bbc8660 100644 --- a/doc/examples/02_fix_parameters.ipynb +++ b/doc/examples/02_fix_parameters.ipynb @@ -146,13 +146,14 @@ ] }, { + "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### First time series model\n", "Once the time series are read from the data files, a time series model can be constructed by going through the following three steps:\n", "\n", - "1. Creat a `Model` object by passing it the observed head series. Store your model in a variable so that you can use it later on. \n", + "1. Create a `Model` object by passing it the observed head series. Store your model in a variable so that you can use it later on. \n", "2. Add the stresses that are expected to cause the observed head variation to the model. In this example, this is only the recharge series. For each stress, a `StressModel` object needs to be created. Each `StressModel` object needs three input arguments: the time series of the stress, the response function that is used to simulate the effect of the stress, and a name. 
In addition, it is recommended to specify the `kind` of series, which is used to perform a number of checks on the series and fix problems when needed. This checking and fixing of problems (for example, what to substitute for a missing value) depends on the kind of series. In this case, the time series of the stress is stored in the variable `recharge`, the Gamma function is used to simulate the response, the series will be called `'recharge'`, and the kind is `prec` which stands for precipitation. One of the other keyword arguments of the `StressModel` class is `up`, which means that a positive stress results in an increase (up) of the head. The default value is `True`, which we use in this case as a positive recharge will result in the heads going up. Each `StressModel` object needs to be stored in a variable, after which it can be added to the model. \n", "3. When everything is added, the model can be solved. The default option is to minimize the sum of the squares of the errors between the observed and modeled heads. 
" ] diff --git a/doc/examples/example_uncertainty.py b/doc/examples/example_uncertainty.py index d7ade861..59557886 100644 --- a/doc/examples/example_uncertainty.py +++ b/doc/examples/example_uncertainty.py @@ -29,12 +29,12 @@ # # Plot some results axes = ml.plots.results(tmin="2010", tmax="2015", figsize=(10, 6)) axes[0].fill_between(df.index, df.iloc[:, 0], df.iloc[:, 1], color="gray", - zorder=-1, alpha=0.5, label="95% Prediction interval") + zorder=-1, alpha=0.5, label="95% Prediction interval") axes[0].legend(ncol=3) df = ml.fit.ci_contribution("recharge", tmin="2010", tmax="2015") axes[2].fill_between(df.index, df.iloc[:, 0], df.iloc[:, 1], color="gray", - zorder=-1, alpha=0.5, label="95% confidence") + zorder=-1, alpha=0.5, label="95% confidence") df = ml.fit.ci_step_response("recharge", alpha=0.05, n=1000) axes[3].fill_between(df.index, df.iloc[:, 0], df.iloc[:, 1], color="gray", - zorder=-1, alpha=0.5, label="95% confidence") + zorder=-1, alpha=0.5, label="95% confidence") diff --git a/doc/examples/groundwater_paper/Ex1_simple_model/Example1.ipynb b/doc/examples/groundwater_paper/Ex1_simple_model/Example1.ipynb index 6bfee40b..07dc2c1d 100644 --- a/doc/examples/groundwater_paper/Ex1_simple_model/Example1.ipynb +++ b/doc/examples/groundwater_paper/Ex1_simple_model/Example1.ipynb @@ -666,7 +666,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.7" + "version": "3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 08:41:22) [MSC v.1929 64 bit (AMD64)]" }, "vscode": { "interpreter": { diff --git a/pastas/io/base.py b/pastas/io/base.py index 989accad..306db317 100644 --- a/pastas/io/base.py +++ b/pastas/io/base.py @@ -9,10 +9,13 @@ import pastas as ps from pandas import to_numeric +# Type Hinting +from pastas.typing import Model + logger = getLogger(__name__) -def load(fname, **kwargs): +def load(fname: str, **kwargs) -> Model: """Method to load a Pastas Model from file. 
Parameters @@ -44,14 +47,18 @@ def load(fname, **kwargs): ml = _load_model(data) - logger.info("Pastas Model from file %s successfully loaded. This file " - "was created with Pastas %s. Your current version of Pastas " - "is: %s", fname, data["file_info"]["pastas_version"], - ps.__version__) + logger.info( + "Pastas Model from file %s successfully loaded. This file " + "was created with Pastas %s. Your current version of Pastas " + "is: %s", + fname, + data["file_info"]["pastas_version"], + ps.__version__, + ) return ml -def _load_model(data): +def _load_model(data: dict) -> Model: """Internal method to create a model from a dictionary.""" # Create model oseries = ps.TimeSeries(**data["oseries"]) @@ -76,8 +83,9 @@ def _load_model(data): else: noise = False - ml = ps.Model(oseries, constant=constant, noisemodel=noise, name=name, - metadata=metadata) + ml = ps.Model( + oseries, constant=constant, noisemodel=noise, name=name, metadata=metadata + ) if "settings" in data.keys(): ml.settings.update(data["settings"]) @@ -88,13 +96,15 @@ def _load_model(data): for name, ts in data["stressmodels"].items(): # Deal with old StressModel2 files for version 0.22.0. Remove in 0.23.0. if ts["stressmodel"] == "StressModel2": - logger.warning("StressModel2 is removed since Pastas 0.22.0 and " - "is replaced by the RechargeModel using a Linear " - "recharge model. Make sure to save this file " - "again using Pastas version 0.22.0 as this file " - "cannot be loaded in newer Pastas versions. This " - "will automatically update your model to the newer " - "RechargeModel stress model.") + logger.warning( + "StressModel2 is removed since Pastas 0.22.0 and " + "is replaced by the RechargeModel using a Linear " + "recharge model. Make sure to save this file " + "again using Pastas version 0.22.0 as this file " + "cannot be loaded in newer Pastas versions. This " + "will automatically update your model to the newer " + "RechargeModel stress model." 
+ ) ts["stressmodel"] = "RechargeModel" ts["recharge"] = "Linear" ts["prec"] = ts["stress"][0] @@ -125,7 +135,7 @@ def _load_model(data): ts["rfunc"] = getattr(ps.rfunc, ts["rfunc"])(**rfunc_kwargs) if "recharge" in ts.keys(): recharge_kwargs = {} - if 'recharge_kwargs' in ts: + if "recharge_kwargs" in ts: recharge_kwargs = ts.pop("recharge_kwargs") ts["recharge"] = getattr( ps.recharge, ts["recharge"])(**recharge_kwargs) @@ -171,7 +181,7 @@ def _load_model(data): return ml -def dump(fname, data, **kwargs): +def dump(fname: str, data: dict, **kwargs): """Method to save a pastas-model to a file. Parameters diff --git a/pastas/io/men.py b/pastas/io/men.py index df104a5f..fc0e6add 100644 --- a/pastas/io/men.py +++ b/pastas/io/men.py @@ -14,13 +14,17 @@ from ..utils import datetime2matlab -def load(fname): +def load(fname: str) -> NotImplementedError: raise NotImplementedError("This is not implemented yet. See the " "reads-section for a Menyanthes-read") -def dump(fname, data, version=3, verbose=True): - # version can also be a specific version, like '2.x.g.t (beta)', or an integer (see below) +def dump(fname: str, + data: dict, + version: int = 3, + verbose: bool = True) -> None: + # version can also be a specific version, + # like '2.x.g.t (beta)', or an integer (see below) if version == 3: version = '3.x.b.c (gamma)' elif version == 2: diff --git a/pastas/io/pas.py b/pastas/io/pas.py index 1548dbda..688ecd26 100644 --- a/pastas/io/pas.py +++ b/pastas/io/pas.py @@ -16,12 +16,12 @@ logger = getLogger(__name__) -def load(fname): +def load(fname: str) -> dict: data = json.load(open(fname), object_hook=pastas_hook) return data -def pastas_hook(obj): +def pastas_hook(obj: dict): for key, value in obj.items(): if key in ["tmin", "tmax", "date_modified", "date_created"]: val = Timestamp(value) @@ -56,7 +56,7 @@ def pastas_hook(obj): return obj -def dump(fname, data): +def dump(fname: str, data: dict) -> None: json.dump(data, open(fname, 'w'), indent=4, cls=PastasEncoder) 
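The `pas.py` load/dump pair above relies on the standard-library `json` round-trip pattern: a `JSONEncoder` subclass serializes objects `json` cannot handle natively, and an `object_hook` restores them on load. A minimal stdlib sketch of that pattern (the `tmin`/`tmax` field names mirror the Pastas hook; `DateEncoder`/`date_hook` are hypothetical stand-ins):

```python
import json
from datetime import datetime

class DateEncoder(json.JSONEncoder):
    # Called only for objects the default encoder cannot serialize.
    def default(self, o):
        if isinstance(o, datetime):
            return o.isoformat()
        return super().default(o)

def date_hook(obj):
    # Called for every decoded dict; restore known timestamp fields.
    for key, value in obj.items():
        if key in ("tmin", "tmax"):
            obj[key] = datetime.fromisoformat(value)
    return obj

data = {"tmin": datetime(2010, 1, 1), "name": "model"}
text = json.dumps(data, cls=DateEncoder, indent=4)
restored = json.loads(text, object_hook=date_hook)
```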
logger.info("%s file successfully exported", fname) @@ -67,7 +67,7 @@ class PastasEncoder(json.JSONEncoder): Notes ----- Currently supported formats are: DataFrame, Series, - Timedelta, TimeStamps. + Timedelta, Timestamps. see: https://docs.python.org/3/library/json.html """ diff --git a/pastas/model.py b/pastas/model.py index ab266c3a..64de8417 100644 --- a/pastas/model.py +++ b/pastas/model.py @@ -1,25 +1,36 @@ -"""This module contains the Model class in Pastas.""" +# This module contains the Model class in Pastas. +# Python Dependencies from collections import OrderedDict from itertools import combinations from logging import getLogger from os import getlogin +# External Dependencies import numpy as np from pandas import (DataFrame, Series, Timedelta, Timestamp, - date_range, concat, to_timedelta) - -from .decorators import get_stressmodel -from .io.base import _load_model, dump -from .modelstats import Statistics -from .noisemodels import NoiseModel -from .modelplots import Plotting -from .solver import LeastSquares -from .stressmodels import Constant -from .timeseries import TimeSeries -from .utils import (_get_dt, _get_time_offset, frequency_is_supported, - get_sample, validate_name) -from .version import __version__ + date_range, concat, to_timedelta, DatetimeIndex) + +# Internal Pastas +from pastas.decorators import get_stressmodel +from pastas.io.base import _load_model, dump +from pastas.modelstats import Statistics +from pastas.noisemodels import NoiseModel +from pastas.modelplots import Plotting +from pastas.solver import LeastSquares +from pastas.stressmodels import Constant +from pastas.timeseries import TimeSeries +from pastas.transform import ThresholdTransform +from pastas.utils import (_get_dt, _get_time_offset, frequency_is_supported, + get_sample, validate_name) +from pastas.version import __version__ + +# Type Hinting +from typing import Union, Optional, Tuple, List +from pastas.typing import StressModel, Solver, TimestampType, ArrayLike +from 
pastas.typing import NoiseModel as NoiseModelType +from pastas.typing import TimeSeries as TimeSeriesType +from pastas.typing import Model as ModelType class Model: @@ -59,8 +70,9 @@ class Model: >>> ml = Model(oseries) """ - def __init__(self, oseries, constant=True, noisemodel=True, name=None, - metadata=None, freq="D"): + def __init__(self, oseries: Union[Series, TimeSeriesType], constant: bool = True, + noisemodel: bool = True, name: Optional[str] = None, + metadata: Optional[dict] = None, freq: str = "D") -> None: self.logger = getLogger(__name__) @@ -133,7 +145,9 @@ def __repr__(self): const=not self.constant is None, noise=not self.noisemodel is None) - def add_stressmodel(self, stressmodel, replace=False): + def add_stressmodel(self, + stressmodel: StressModel, + replace: bool = False) -> None: """Add a stressmodel to the main model. Parameters @@ -187,7 +201,7 @@ def add_stressmodel(self, stressmodel, replace=False): "overlap with ml.oseries.") self._check_stressmodel_compatibility() - def add_constant(self, constant): + def add_constant(self, constant: Constant) -> None: """Add a Constant to the time series Model. Parameters @@ -204,7 +218,7 @@ def add_constant(self, constant): self.parameters = self.get_init_parameters(initial=False) self._check_stressmodel_compatibility() - def add_transform(self, transform): + def add_transform(self, transform: ThresholdTransform): """Add a Transform to the time series Model. Parameters @@ -226,7 +240,7 @@ def add_transform(self, transform): self.parameters = self.get_init_parameters(initial=False) self._check_stressmodel_compatibility() - def add_noisemodel(self, noisemodel): + def add_noisemodel(self, noisemodel: NoiseModelType) -> None: """Adds a noisemodel to the time series Model. 
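The dedicated `pastas.typing` module imported above lets type names be shared across modules without circular imports at runtime. The same pattern can be sketched with the stdlib `typing.TYPE_CHECKING` guard; `mypackage` and `Model` here are hypothetical stand-ins, not the real Pastas classes:

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Only evaluated by static type checkers, never at runtime,
    # so no circular import can occur.
    from mypackage import Model  # hypothetical import

def load(fname: str) -> "Model":  # string annotation avoids a runtime lookup
    ...

def solve(tmin: Optional[str] = None, noise: bool = True) -> None:
    ...
```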
Parameters @@ -251,7 +265,7 @@ def add_noisemodel(self, noisemodel): self.parameters = self.get_init_parameters(initial=False) @get_stressmodel - def del_stressmodel(self, name): + def del_stressmodel(self, name: str): """Method to safely delete a stress model from the Model. Parameters @@ -268,7 +282,7 @@ def del_stressmodel(self, name): self.stressmodels.pop(name, None) self.parameters = self.get_init_parameters(initial=False) - def del_constant(self): + def del_constant(self) -> None: """Method to safely delete the Constant from the Model.""" if self.constant is None: self.logger.warning("No constant is present in this model.") @@ -276,7 +290,7 @@ def del_constant(self): self.constant = None self.parameters = self.get_init_parameters(initial=False) - def del_transform(self): + def del_transform(self) -> None: """Method to safely delete the transform from the Model.""" if self.transform is None: self.logger.warning("No transform is present in this model.") @@ -284,7 +298,7 @@ def del_transform(self): self.transform = None self.parameters = self.get_init_parameters(initial=False) - def del_noisemodel(self): + def del_noisemodel(self) -> None: """Method to safely delete the noise model from the Model.""" if self.noisemodel is None: self.logger.warning("No noisemodel is present in this model.") @@ -292,8 +306,13 @@ def del_noisemodel(self): self.noisemodel = None self.parameters = self.get_init_parameters(initial=False) - def simulate(self, p=None, tmin=None, tmax=None, freq=None, warmup=None, - return_warmup=False): + def simulate(self, + p: Optional[ArrayLike] = None, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + warmup: Optional[float] = None, + return_warmup: bool = False) -> Series: """Method to simulate the time series model. Parameters @@ -312,7 +331,7 @@ def simulate(self, p=None, tmin=None, tmax=None, freq=None, warmup=None, String with the frequency the stressmodels are simulated. 
Must be one of the following: (D, h, m, s, ms, us, ns) or a multiple of that e.g. "7D". - warmup: float or int, optional + warmup: float, optional Warmup period (in Days). return_warmup: bool, optional Return the simulation including the warmup period or not, @@ -383,7 +402,12 @@ sim.name = 'Simulation' return sim - def residuals(self, p=None, tmin=None, tmax=None, freq=None, warmup=None): + def residuals(self, + p: Optional[ArrayLike] = None, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + warmup: Optional[float] = None) -> Series: """Method to calculate the residual series. Parameters @@ -402,7 +426,7 @@ String with the frequency the stressmodels are simulated. Must be one of the following: (D, h, m, s, ms, us, ns) or a multiple of that e.g. "7D". - warmup: float or int, optional + warmup: float, optional Warmup period (in Days). Returns @@ -452,7 +476,12 @@ res.name = "Residuals" return res - def noise(self, p=None, tmin=None, tmax=None, freq=None, warmup=None): + def noise(self, + p: Optional[ArrayLike] = None, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + warmup: Optional[float] = None) -> Series: """Method to simulate the noise when a noisemodel is present. 
Parameters @@ -509,8 +538,12 @@ def noise(self, p=None, tmin=None, tmax=None, freq=None, warmup=None): noise = self.noisemodel.simulate(res, p) return noise - def noise_weights(self, p=None, tmin=None, tmax=None, freq=None, - warmup=None): + def noise_weights(self, + p: Optional[list] = None, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + warmup: Optional[float] = None) -> ArrayLike: """Internal method to calculate the noise weights.""" # Get parameters if none are provided if p is None: @@ -524,8 +557,11 @@ def noise_weights(self, p=None, tmin=None, tmax=None, freq=None, return weights - def observations(self, tmin=None, tmax=None, freq=None, - update_observations=False): + def observations(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + update_observations: bool = False) -> Series: """Method that returns the observations series used for calibration. Parameters @@ -586,8 +622,15 @@ def observations(self, tmin=None, tmax=None, freq=None, oseries_calib = self.oseries_calib return oseries_calib - def initialize(self, tmin=None, tmax=None, freq=None, warmup=None, - noise=None, weights=None, initial=True, fit_constant=True): + def initialize(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + warmup: Optional[float] = None, + noise: Optional[bool] = None, + weights: Optional[Series] = None, + initial: bool = True, + fit_constant: bool = True) -> None: """Method to initialize the model. 
This method is called by the solve-method, but can also be @@ -641,9 +684,18 @@ def initialize(self, tmin=None, tmax=None, freq=None, warmup=None, self.parameters.loc["constant_d", "initial"] = 0.0 self.normalize_residuals = True - def solve(self, tmin=None, tmax=None, freq=None, warmup=None, noise=True, - solver=None, report=True, initial=True, weights=None, - fit_constant=True, **kwargs): + def solve(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + warmup: Optional[float] = None, + noise: bool = True, + solver: Optional[Solver] = None, + report: bool = True, + initial: bool = True, + weights: Optional[Series] = None, + fit_constant: bool = True, + **kwargs) -> None: """Method to solve the time series model. Parameters @@ -658,13 +710,13 @@ def solve(self, tmin=None, tmax=None, freq=None, warmup=None, noise=True, String with the frequency the stressmodels are simulated. Must be one of the following (D, h, m, s, ms, us, ns) or a multiple of that e.g. "7D". - warmup: float or int, optional + warmup: float, optional Warmup period (in Days) for which the simulation is calculated, but not used for the calibration period. noise: bool, optional Argument that determines if a noisemodel is used (only if present). The default is noise=True. - solver: pastas.solver.BaseSolver class, optional + solver: pastas.solver.Solver class, optional Class used to solve the model. Options are: ps.LeastSquares (default) or ps.LmfitSolve. A class is needed, not an instance of the class! 
@@ -744,8 +796,14 @@ def solve(self, tmin=None, tmax=None, freq=None, warmup=None, noise=True, output = None print(self.fit_report(output=output)) - def set_parameter(self, name, initial=None, vary=None, pmin=None, - pmax=None, optimal=None, move_bounds=False): + def set_parameter(self, + name: str, + initial: Optional[float] = None, + vary: Optional[bool] = None, + pmin: Optional[float] = None, + pmax: Optional[float] = None, + optimal: Optional[float] = None, + move_bounds: bool = False) -> None: """Method to change the parameter properties. Parameters @@ -824,7 +882,7 @@ def set_parameter(self, name, initial=None, vary=None, pmin=None, if optimal is not None: self.parameters.loc[name, "optimal"] = optimal - def _get_time_offset(self, freq): + def _get_time_offset(self, freq: str) -> Timedelta: """Internal method to get the time offsets from the stressmodels. Parameters @@ -858,7 +916,12 @@ def _get_time_offset(self, freq): else: return Timedelta(0) - def _get_sim_index(self, tmin, tmax, freq, warmup, update_sim_index=False): + def _get_sim_index(self, + tmin: Timestamp, + tmax: Timestamp, + freq: str, + warmup: Timedelta, + update_sim_index: bool = False) -> DatetimeIndex: """Internal method to get the simulation index, including the warmup. Parameters @@ -900,7 +963,10 @@ def _get_sim_index(self, tmin, tmax, freq, warmup, update_sim_index=False): sim_index = self.sim_index return sim_index - def get_tmin(self, tmin=None, use_oseries=True, use_stresses=False): + def get_tmin(self, + tmin: Optional[TimestampType] = None, + use_oseries: bool = True, + use_stresses: bool = False) -> Timestamp: """Method that checks and returns valid values for tmin. 
Parameters @@ -958,7 +1024,10 @@ def get_tmin(self, tmin=None, use_oseries=True, use_stresses=False): return tmin - def get_tmax(self, tmax=None, use_oseries=True, use_stresses=False): + def get_tmax(self, + tmax: Optional[TimestampType] = None, + use_oseries: bool = True, + use_stresses: bool = False) -> Timestamp: """Method that checks and returns valid values for tmax. Parameters @@ -1019,7 +1088,9 @@ def get_tmax(self, tmax=None, use_oseries=True, use_stresses=False): return tmax - def get_init_parameters(self, noise=None, initial=True): + def get_init_parameters(self, + noise: Optional[bool] = None, + initial: bool = True) -> DataFrame: """Method to get all initial parameters from the individual objects. Parameters @@ -1059,7 +1130,7 @@ def get_init_parameters(self, noise=None, initial=True): return parameters - def get_parameters(self, name=None): + def get_parameters(self, name: Optional[str] = None) -> ArrayLike: """Method to obtain the parameters needed for calculation. This method is used by the simulation, residuals and the noise @@ -1072,8 +1143,8 @@ def get_parameters(self, name=None): Returns ------- - p: numpy.ndarray - Numpy array with the parameters used in the time series model. + p: array_like + NumPy array with the parameters used in the time series model. """ if name: p = self.parameters.loc[self.parameters.name == name] @@ -1089,12 +1160,12 @@ def get_parameters(self, name=None): return parameters.to_numpy(dtype=float) - def get_stressmodel_names(self): + def get_stressmodel_names(self) -> List[str]: """Returns list of stressmodel names.""" return list(self.stressmodels.keys()) @get_stressmodel - def get_stressmodel_settings(self, name): + def get_stressmodel_settings(self, name: str) -> Union[dict, None]: """Method to obtain the time series settings for a stress model. 
Parameters @@ -1116,9 +1187,15 @@ def get_stressmodel_settings(self, name): return settings @get_stressmodel - def get_contribution(self, name, tmin=None, tmax=None, freq=None, - warmup=None, istress=None, return_warmup=False, - p=None): + def get_contribution(self, + name: str, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + warmup: Optional[float] = None, + istress: Optional[int] = None, + return_warmup: bool = False, + p: Optional[ArrayLike] = None) -> Series: """Method to get the contribution of a stressmodel. Parameters @@ -1187,7 +1264,7 @@ def get_contribution(self, name, tmin=None, tmax=None, freq=None, return contrib - def get_contributions(self, split=True, **kwargs): + def get_contributions(self, split: bool = True, **kwargs) -> List[Series]: """Method to get contributions of all stressmodels. Parameters @@ -1220,7 +1297,10 @@ def get_contributions(self, split=True, **kwargs): contribs.append(contrib) return contribs - def get_transform_contribution(self, tmin=None, tmax=None): + def get_transform_contribution( + self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None) -> Series: """Method to get the contribution of a transform. Parameters @@ -1244,8 +1324,11 @@ def get_transform_contribution(self, tmin=None, tmax=None): sim_org = ml.simulate(tmin=tmin, tmax=tmax) return sim - sim_org - def get_output_series(self, tmin=None, tmax=None, add_contributions=True, - split=True): + def get_output_series(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + add_contributions: bool = True, + split: bool = True) -> DataFrame: """Method to get all the modeled output time series from the Model. 
Parameters @@ -1296,8 +1379,13 @@ df = concat(df, axis=1) return df - def _get_response(self, block_or_step, name, p=None, dt=None, add_0=False, - **kwargs): + def _get_response(self, + block_or_step: str, + name: str, + p: Optional[ArrayLike] = None, + dt: Optional[float] = None, + add_0: bool = False, + **kwargs) -> Series: """Internal method to compute the block and step response. Parameters @@ -1314,7 +1402,7 @@ timestep for the response function. add_0: bool, optional Add a zero at t=0. - kwargs + kwargs: dict, passed to rfunc.step() or rfunc.block() Returns ------- @@ -1352,7 +1440,12 @@ return response @get_stressmodel - def get_block_response(self, name, p=None, add_0=False, dt=None, **kwargs): + def get_block_response(self, + name: str, + p: Optional[ArrayLike] = None, + add_0: bool = False, + dt: Optional[float] = None, + **kwargs) -> Series: """Method to obtain the block response for a stressmodel. The optimal parameters are used when available, initial otherwise. @@ -1369,6 +1462,7 @@ Adds 0 at t=0 at the start of the response, defaults to False. dt: float, optional timestep for the response function. + kwargs: dict Returns ------- @@ -1380,7 +1474,12 @@ p=p, add_0=add_0, **kwargs) @get_stressmodel - def get_step_response(self, name, p=None, add_0=False, dt=None, **kwargs): + def get_step_response(self, + name: str, + p: Optional[ArrayLike] = None, + add_0: bool = False, + dt: Optional[float] = None, + **kwargs) -> Series: """Method to obtain the step response for a stressmodel. The optimal parameters are used when available, initial otherwise. 
@@ -1408,7 +1507,10 @@ def get_step_response(self, name, p=None, add_0=False, dt=None, **kwargs): p=p, add_0=add_0, **kwargs) @get_stressmodel - def get_response_tmax(self, name, p=None, cutoff=0.999): + def get_response_tmax(self, + name: str, + p: ArrayLike = None, + cutoff: float = 0.999) -> float: """Method to get the tmax used for the response function. Parameters @@ -1445,8 +1547,16 @@ def get_response_tmax(self, name, p=None, cutoff=0.999): return tmax @get_stressmodel - def get_stress(self, name, tmin=None, tmax=None, freq=None, warmup=None, - istress=None, return_warmup=False, p=None): + def get_stress( + self, + name: str, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + warmup: Optional[float] = None, + istress: Optional[int] = None, + return_warmup: bool = False, + p: Optional[ArrayLike] = None) -> Union[Series, List[Series]]: """Method to obtain the stress(es) from the stressmodel. Parameters @@ -1463,7 +1573,7 @@ def get_stress(self, name, tmin=None, tmax=None, freq=None, warmup=None, String with the frequency the stressmodels are simulated. Must be one of the following: (D, h, m, s, ms, us, ns) or a multiple of that e.g. "7D". - warmup: float or int, optional + warmup: float, optional Warmup period (in Days). istress: int, optional When multiple stresses are present in a stressmodel, this keyword @@ -1512,7 +1622,7 @@ def get_stress(self, name, tmin=None, tmax=None, freq=None, warmup=None, return stress - def _get_file_info(self): + def _get_file_info(self) -> dict: """Internal method to get the file information. Returns @@ -1536,7 +1646,10 @@ def _get_file_info(self): return file_info - def fit_report(self, output="basic", warnings=True, warnbounds=None): + def fit_report(self, + output: str = "basic", + warnings: bool = True, + warnbounds: Optional[bool] = None) -> str: """Method that reports on the fit after a model is optimized. 
Parameters @@ -1686,7 +1799,7 @@ def fit_report(self, output="basic", warnings=True, warnbounds=None): return report - def _check_response_tmax(self, cutoff=None): + def _check_response_tmax(self, cutoff: Optional[float] = None) -> DataFrame: """Internal method to check whether response tmax is smaller than calibration period. @@ -1720,7 +1833,7 @@ def _check_response_tmax(self, cutoff=None): return check - def _check_parameters_bounds(self): + def _check_parameters_bounds(self) -> Tuple[Series, Series]: """Internal method to check if the optimal parameters are close to pmin or pmax. @@ -1764,7 +1877,7 @@ def _check_parameters_bounds(self): return lowerhit, upperhit - def to_dict(self, series=True, file_info=True): + def to_dict(self, series: bool = True, file_info: bool = True) -> dict: """Method to export a model to a dictionary. Parameters @@ -1814,7 +1927,7 @@ def to_dict(self, series=True, file_info=True): return data - def to_file(self, fname, series=True, **kwargs): + def to_file(self, fname: str, series: Union[bool, str] = True, **kwargs) -> None: """Method to save the Pastas model to a file. Parameters @@ -1842,7 +1955,7 @@ def to_file(self, fname, series=True, **kwargs): # Write the dicts to a file return dump(fname, data, **kwargs) - def copy(self, name=None): + def copy(self, name: Optional[str] = None) -> ModelType: """Method to copy a model. 
Parameters @@ -1866,7 +1979,7 @@ def copy(self, name=None): ml.name = name return ml - def _check_stressmodel_compatibility(self): + def _check_stressmodel_compatibility(self) -> None: """Internal method to check if the stressmodels are compatible with the model.""" for sm in self.stressmodels.values(): diff --git a/pastas/modelcompare.py b/pastas/modelcompare.py index ddeef528..ab50ccb8 100644 --- a/pastas/modelcompare.py +++ b/pastas/modelcompare.py @@ -7,6 +7,9 @@ import pastas as ps from warnings import warn +from typing import Optional, Tuple, List +from pastas.typing import Model, Axes + class CompareModels: """Class for visually comparing pastas Models. @@ -53,7 +56,7 @@ class CompareModels: mc.figure.savefig("modelcomparison.png") """ - def __init__(self, models=None): + def __init__(self, models: Optional[List[Model]] = None) -> None: """Initialize model compare class. Parameters @@ -70,7 +73,10 @@ def __init__(self, models=None): self.adjust_height = False self.smdict = None - def initialize_figure(self, mosaic=None, figsize=(10, 8), cmap="tab10"): + def initialize_figure(self, + mosaic: Optional[List[List[str]]] = None, + figsize: Tuple[int, int] = (10, 8), + cmap: str = "tab10") -> None: """initialize a custom figure based on a mosaic. Parameters @@ -91,9 +97,12 @@ def initialize_figure(self, mosaic=None, figsize=(10, 8), cmap="tab10"): self.axes = axes self.cmap = plt.get_cmap(cmap) - def initialize_adjust_height_figure( - self, mosaic=None, figsize=(10, 8), cmap="tab10", smdict=None - ): + def initialize_adjust_height_figure(self, + mosaic: Optional[List[ + List[str]]] = None, + figsize: Tuple[int, int] = (10, 8), + cmap: str = "tab10", + smdict: Optional[dict] = None) -> None: """initialize subplots based on a mosaic with equal vertical scales.
The height of each subplot is calculated based on the y-data limits in @@ -199,13 +208,13 @@ def initialize_adjust_height_figure( heights_list += [mosfrac[ky]] self.mosaic = mosaic - figure, axes = plt.subplot_mosaic( + fig, axes = plt.subplot_mosaic( self.mosaic, figsize=figsize, gridspec_kw=dict(height_ratios=heights_list), ) - self.figure = figure + self.figure = fig self.axes = axes self.cmap = plt.get_cmap(cmap) @@ -214,7 +223,7 @@ def initialize_adjust_height_figure( if axlbl in ["sim", "res"] or axlbl.startswith("con"): self.axes[axlbl].autoscale(enable=None, axis="y", tight=True) - def get_unique_stressmodels(self, models=None): + def get_unique_stressmodels(self, models: Optional[List[Model]] = None) -> List[str]: """Get all unique stressmodel names. Parameters @@ -232,7 +241,9 @@ def get_unique_stressmodels(self, models=None): return sm_unique - def get_default_mosaic(self, n_stressmodels=None): + def get_default_mosaic(self, + n_stressmodels: Optional[int] = None ) -> List[List[str]]: """Get default mosaic for matplotlib.subplot_mosaic(). Parameters @@ -261,7 +272,7 @@ def get_default_mosaic(self, n_stressmodels=None): return mosaic - def get_tmin_tmax(self, models=None): + def get_tmin_tmax(self, models: Optional[List[Model]] = None) -> DataFrame: """get tmin and tmax of all models. Parameters @@ -281,7 +292,7 @@ def get_tmin_tmax(self, models=None): return tmintmax - def get_metrics(self, models=None, metric_selection=None): + def get_metrics(self, + models: Optional[List[Model]] = None, + metric_selection: Optional[List[str]] = None) -> DataFrame: """get metrics of all models in a DataFrame.
Parameters @@ -310,8 +323,10 @@ def get_metrics(self, models=None, metric_selection=None): return metrics def get_parameters( - self, models=None, param_col="optimal", param_selection=None - ): + self, + models: Optional[List[Model]] = None, + param_col: str = "optimal", + param_selection: Optional[List[str]] = None) -> DataFrame: """get parameter values of all models in a DataFrame. Parameters @@ -320,7 +335,7 @@ def get_parameters( list of models to get parameters for, by default None param_col : str, optional name of parameter column to obtain, by default "optimal" - param_selection : str, optional + param_selection : list of str, optional string to filter parameter selection, by default None Returns @@ -346,7 +361,9 @@ def get_parameters( else: return params - def get_diagnostics(self, models=None, diag_col="P-value"): + def get_diagnostics(self, + models: Optional[List[Model]] = None, + diag_col: str = "P-value") -> DataFrame: """Get p-values of statistical tests in a DataFrame. Parameters @@ -366,7 +383,7 @@ def get_diagnostics(self, models=None, diag_col="P-value"): return diags.transpose() - def plot_oseries(self, axn="sim"): + def plot_oseries(self, axn: str = "sim") -> None: """Plot all oseries, unless all oseries are the same. Parameters @@ -403,7 +420,7 @@ def plot_oseries(self, axn="sim"): linewidth=0.5, ) - def plot_simulation(self, axn="sim"): + def plot_simulation(self, axn: str = "sim") -> None: """plot model simulation. Parameters @@ -424,7 +441,7 @@ def plot_simulation(self, axn="sim"): color=self.cmap(i), ) - def plot_residuals(self, axn="res"): + def plot_residuals(self, axn: str = "res") -> None: """plot residuals. Parameters @@ -440,11 +457,11 @@ def plot_residuals(self, axn="res"): self.axes[axn].plot( residuals.index, residuals.values, - label=f"Residuals", + label="Residuals", color=self.cmap(i), ) - def plot_noise(self, axn="res"): + def plot_noise(self, axn: str = "res") -> None: """plot noise. 
Parameters @@ -461,12 +478,15 @@ def plot_noise(self, axn="res"): self.axes[axn].plot( noise.index, noise.values, - label=f"Noise", + label="Noise", linestyle="--", color=f"C{i}", ) - def plot_response(self, smdict=None, axn="rf{i}", response="step"): + def plot_response(self, + smdict: Optional[dict] = None, + axn: str = "rf{i}", + response: str = "step") -> None: """plot step or block responses. Parameters @@ -504,7 +524,7 @@ def plot_response(self, smdict=None, axn="rf{i}", response="step"): for j, namlist in self.smdict.items(): for smn in namlist: # skip if contribution not in model - if not smn in ml.stressmodels: + if smn not in ml.stressmodels: continue if response == "step": step = ml.get_step_response(smn, add_0=True) @@ -523,7 +543,10 @@ def plot_response(self, smdict=None, axn="rf{i}", response="step"): color=self.cmap(i), ) - def plot_contribution(self, smdict=None, axn="con{i}", normalized=False): + def plot_contribution(self, + smdict: Optional[dict] = None, + axn: str = "con{i}", + normalized: bool = False) -> None: """plot stressmodel contributions. Parameters @@ -565,7 +588,7 @@ def plot_contribution(self, smdict=None, axn="con{i}", normalized=False): for i, ml in enumerate(self.models): for j, namlist in self.smdict.items(): for smn in namlist: - if not smn in ml.stressmodels: + if smn not in ml.stressmodels: continue for con in ml.get_contributions(split=False): if smn in con.name: @@ -584,7 +607,9 @@ def plot_contribution(self, smdict=None, axn="con{i}", normalized=False): color=self.cmap(i), ) - def plot_stress(self, axn="stress", names=None): + def plot_stress(self, + axn: str = "stress", + names: Optional[List[str]] = None) -> None: """plot stresses time series. Parameters @@ -611,7 +636,7 @@ def plot_stress(self, axn="stress", names=None): color=self.cmap(i), ) - def plot_acf(self, axn="acf"): + def plot_acf(self, axn: str = "acf") -> None: """plot autocorrelation plot. 
Parameters @@ -642,7 +667,9 @@ def plot_acf(self, axn="acf"): label=label, ) - def plot_table(self, axn="table", df=None): + def plot_table(self, + axn: str = "table", + df: Optional[DataFrame] = None) -> None: """plot dataframe as table Parameters @@ -669,9 +696,10 @@ def plot_table(self, axn="table", df=None): self.axes[axn].set_xticks([]) self.axes[axn].set_yticks([]) - def plot_table_params( - self, axn="tab", param_col="optimal", param_selection=None - ): + def plot_table_params(self, + axn: str = "tab", + param_col: str = "optimal", + param_selection: Optional[List[str]] = None) -> None: """plot model parameters table. Parameters @@ -680,7 +708,7 @@ def plot_table_params( name of labeled axes to plot table on, by default "tab" param_col : str, optional name of parameter column to include, by default "optimal" - param_selection : str, optional + param_selection : list of str, optional string to filter parameter names that are included in table, by default None """ @@ -698,7 +726,10 @@ def plot_table_params( cols = params.columns.to_list()[-1:] + params.columns.to_list()[:-1] self.plot_table(axn=axn, df=params[cols]) - def plot_table_metrics(self, axn="met", metric_selection=None): + def plot_table_metrics( + self, + axn: str = "met", + metric_selection: Optional[List[str]] = None) -> None: """plot metrics table. Parameters @@ -732,7 +763,9 @@ def plot_table_metrics(self, axn="met", metric_selection=None): cols = metrics.columns.to_list()[-1:] + metrics.columns.to_list()[:-1] self.plot_table(axn=axn, df=metrics[cols].round(2)) - def plot_table_diagnostics(self, axn="diag", diag_col="P-value"): + def plot_table_diagnostics(self, + axn: str = "diag", + diag_col: str = "P-value") -> None: """plot diagnostics table. 
Parameters @@ -751,7 +784,7 @@ def plot_table_diagnostics(self, axn="diag", diag_col="P-value"): cols = diags.columns.to_list()[-1:] + diags.columns.to_list()[:-1] self.plot_table(axn=axn, df=diags[cols]) - def share_xaxes(self, axes): + def share_xaxes(self, axes: List[Axes]) -> None: """share x-axes. Parameters @@ -765,7 +798,7 @@ def share_xaxes(self, axes): for t in iax.get_xticklabels(): t.set_visible(False) - def share_yaxes(self, axes): + def share_yaxes(self, axes: List[Axes]) -> None: """share y-axes. Parameters @@ -778,15 +811,15 @@ def share_yaxes(self, axes): def plot( self, - smdict=None, - normalized=False, - param_selection=None, - figsize=(10, 8), - grid=True, - legend=True, - adjust_height=False, - legend_kwargs=None, - ): + smdict: Optional[dict] = None, + normalized: bool = False, + param_selection: Optional[list] = None, + figsize: Optional[tuple] = (10, 8), + grid: bool = True, + legend: bool = True, + adjust_height: bool = False, + legend_kwargs: Optional[dict] = None, + ) -> None: """plot the models in a comparison plot. The resulting plot is similar to `ml.plots.results()`. diff --git a/pastas/modelplots.py b/pastas/modelplots.py index 731dcd03..d1eba0a7 100644 --- a/pastas/modelplots.py +++ b/pastas/modelplots.py @@ -14,6 +14,10 @@ from .plots import series, diagnostics, cum_frequency, \ _table_formatter_params, _table_formatter_stderr +# Type Hinting +from typing import Optional, List, Union +from pastas.typing import Axes, Figure, TimestampType, Model + logger = logging.getLogger(__name__) @@ -30,17 +34,24 @@ class Plotting: """ - def __init__(self, ml): + def __init__(self, ml: Model) -> None: self.ml = ml # Store a reference to the model class - def __repr__(self): + def __repr__(self) -> str: msg = "This module contains all the built-in plotting options that " \ "are available." 
return msg @model_tmin_tmax - def plot(self, tmin=None, tmax=None, oseries=True, simulation=True, - ax=None, figsize=None, legend=True, **kwargs): + def plot(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + oseries: bool = True, + simulation: bool = True, + ax: Optional[Axes] = None, + figsize: Optional[tuple] = None, + legend: bool = True, + **kwargs) -> Axes: """Make a plot of the observed and simulated series. Parameters @@ -97,9 +108,16 @@ def plot(self, tmin=None, tmax=None, oseries=True, simulation=True, return ax @model_tmin_tmax - def results(self, tmin=None, tmax=None, figsize=(10, 8), split=False, - adjust_height=True, return_warmup=False, block_or_step='step', - fig=None, **kwargs): + def results(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + figsize: tuple = (10, 8), + split: bool = False, + adjust_height: bool = True, + return_warmup: bool = False, + block_or_step: str = 'step', + fig: Optional[Figure] = None, + **kwargs) -> Axes: """Plot different results in one window to get a quick overview. 
Parameters @@ -135,13 +153,15 @@ def results(self, tmin=None, tmax=None, figsize=(10, 8), split=False, o = self.ml.observations(tmin=tmin, tmax=tmax) o_nu = self.ml.oseries.series.drop(o.index) if return_warmup: - o_nu = o_nu[tmin - self.ml.settings['warmup']: tmax] + o_nu = o_nu[tmin - self.ml.settings['warmup']:tmax] else: - o_nu = o_nu[tmin: tmax] - sim = self.ml.simulate(tmin=tmin, tmax=tmax, + o_nu = o_nu[tmin:tmax] + sim = self.ml.simulate(tmin=tmin, + tmax=tmax, return_warmup=return_warmup) res = self.ml.residuals(tmin=tmin, tmax=tmax) - contribs = self.ml.get_contributions(split=split, tmin=tmin, + contribs = self.ml.get_contributions(split=split, + tmin=tmin, tmax=tmax, return_warmup=return_warmup) @@ -278,9 +298,17 @@ def results(self, tmin=None, tmax=None, figsize=(10, 8), split=False, return fig.axes @model_tmin_tmax - def decomposition(self, tmin=None, tmax=None, ytick_base=True, split=True, - figsize=(10, 8), axes=None, name=None, - return_warmup=False, min_ylim_diff=None, **kwargs): + def decomposition(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + ytick_base: bool = True, + split: bool = True, + figsize: tuple = (10, 8), + axes: Optional[Axes] = None, + name: Optional[str] = None, + return_warmup: bool = False, + min_ylim_diff: Optional[float] = None, + **kwargs) -> Axes: """Plot the decomposition of a time-series in the different stresses. 
Parameters @@ -412,8 +440,15 @@ def decomposition(self, tmin=None, tmax=None, ytick_base=True, split=True, return axes @model_tmin_tmax - def diagnostics(self, tmin=None, tmax=None, figsize=(10, 5), bins=50, - acf_options=None, fig=None, alpha=0.05, **kwargs): + def diagnostics(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + figsize: tuple = (10, 5), + bins: int = 50, + acf_options: Optional[dict] = None, + fig: Optional[Figure] = None, + alpha: float = 0.05, + **kwargs) -> Axes: """Plot a window that helps in diagnosing basic model assumptions. Parameters @@ -474,8 +509,12 @@ def diagnostics(self, tmin=None, tmax=None, figsize=(10, 5), bins=50, **kwargs) @model_tmin_tmax - def cum_frequency(self, tmin=None, tmax=None, ax=None, figsize=(5, 2), - **kwargs): + def cum_frequency(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + ax: Optional[Axes] = None, + figsize: tuple = (5, 2), + **kwargs) -> Axes: """Plot the cumulative frequency for the observations and simulation. Parameters @@ -503,8 +542,11 @@ def cum_frequency(self, tmin=None, tmax=None, ax=None, figsize=(5, 2), obs = self.ml.observations(tmin=tmin, tmax=tmax) return cum_frequency(obs, sim, ax=ax, figsize=figsize, **kwargs) - def block_response(self, stressmodels=None, ax=None, figsize=None, - **kwargs): + def block_response(self, + stressmodels: Optional[List[str]] = None, + ax: Optional[Axes] = None, + figsize: Optional[tuple] = None, + **kwargs) -> Axes: """Plot the block response for a specific stressmodels. Parameters @@ -542,8 +584,11 @@ def block_response(self, stressmodels=None, ax=None, figsize=None, plt.legend(legend) return ax - def step_response(self, stressmodels=None, ax=None, figsize=None, - **kwargs): + def step_response(self, + stressmodels: Optional[List[str]] = None, + ax: Optional[Axes] = None, + figsize: Optional[tuple] = None, + **kwargs) -> Axes: """Plot the step response for a specific stressmodels. 
Parameters @@ -578,14 +623,20 @@ def step_response(self, stressmodels=None, ax=None, figsize=None, return ax @model_tmin_tmax - def stresses(self, tmin=None, tmax=None, cols=1, split=True, sharex=True, - figsize=(10, 8), **kwargs): + def stresses(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + cols: int = 1, + split: bool = True, + sharex: bool = True, + figsize: tuple = (10, 8), + **kwargs) -> Axes: """This method creates a graph with all the stresses used in the model. Parameters ---------- - tmin - tmax + tmin: str or pd.Timestamp, optional + tmax: str or pd.Timestamp, optional cols: int number of columns used for plotting. split: bool, optional @@ -597,7 +648,7 @@ def stresses(self, tmin=None, tmax=None, cols=1, split=True, sharex=True, Returns ------- - axes: matplotlib.axes + axes: matplotlib.axes.Axes matplotlib axes instance. """ stresses = _get_stress_series(self.ml, split=split) @@ -623,10 +674,17 @@ def stresses(self, tmin=None, tmax=None, cols=1, split=True, sharex=True, return axes @model_tmin_tmax - def contributions_pie(self, tmin=None, tmax=None, ax=None, - figsize=None, split=True, partition='std', - wedgeprops=None, startangle=90, - autopct='%1.1f%%', **kwargs): + def contributions_pie(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + ax: Optional[Axes] = None, + figsize: Optional[tuple] = None, + split: bool = True, + partition: str = 'std', + wedgeprops: Optional[dict] = None, + startangle: float = 90.0, + autopct: str = "%1.1f%%", + **kwargs) -> Axes: """Make a pie chart of the contributions. This plot is based on the TNO Groundwatertoolbox.
@@ -691,8 +749,13 @@ def contributions_pie(self, tmin=None, tmax=None, ax=None, return ax @model_tmin_tmax - def stacked_results(self, tmin=None, tmax=None, figsize=(10, 8), - stacklegend=False, stacklegend_kws=None, **kwargs): + def stacked_results(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + figsize: tuple = (10, 8), + stacklegend: bool = False, + stacklegend_kws: Optional[dict] = None, + **kwargs) -> Axes: """Create a results plot, similar to `ml.plots.results()`, in which the individual contributions of stresses (in stressmodels with multiple stresses) are stacked. @@ -712,7 +775,7 @@ def stacked_results(self, tmin=None, tmax=None, figsize=(10, 8), axes: list of axes objects """ - # %% Contribution per stress on model results plot + # Contribution per stress on model results plot def custom_sort(t): """Sort by mean contribution.""" return t[1].mean() @@ -774,7 +837,11 @@ def custom_sort(t): return axes @model_tmin_tmax - def series(self, tmin=None, tmax=None, split=True, **kwargs): + def series(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + split: bool = True, + **kwargs) -> Axes: """Method to plot all the time series going into a Pastas Model. Parameters @@ -806,8 +873,13 @@ def series(self, tmin=None, tmax=None, split=True, **kwargs): return axes @model_tmin_tmax - def summary_pdf(self, tmin=None, tmax=None, fname=None, dpi=150, - results_kwargs={}, diagnostics_kwargs={}): + def summary_pdf(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + fname: Optional[str] = None, + dpi: int = 150, + results_kwargs: Optional[dict] = None, + diagnostics_kwargs: Optional[dict] = None) -> Figure: """Create a PDF file (A4) with the results and diagnostics plot. 
Parameters @@ -830,6 +902,13 @@ def summary_pdf(self, tmin=None, tmax=None, fname=None, dpi=150, """ if fname is None: fname = "{}.pdf".format(self.ml.name) + + if results_kwargs is None: + results_kwargs = {} + + if diagnostics_kwargs is None: + diagnostics_kwargs = {} + pdf = PdfPages(fname) fig = plt.figure(figsize=(8.27, 11.69), dpi=50) @@ -849,7 +928,7 @@ def summary_pdf(self, tmin=None, tmax=None, fname=None, dpi=150, return fig -def _get_height_ratios(ylims): +def _get_height_ratios(ylims: List[Union[list, tuple]]) -> List[float]: height_ratios = [] for ylim in ylims: hr = ylim[1] - ylim[0] @@ -859,7 +938,7 @@ def _get_height_ratios(ylims): return height_ratios -def _get_stress_series(ml, split=True): +def _get_stress_series(ml, split: bool = True) -> List[Series]: stresses = [] for name in ml.stressmodels.keys(): nstress = len(ml.stressmodels[name].stress) diff --git a/pastas/modelstats.py b/pastas/modelstats.py index 2457e278..d15e3426 100644 --- a/pastas/modelstats.py +++ b/pastas/modelstats.py @@ -28,12 +28,26 @@ from .decorators import model_tmin_tmax from .stats import diagnostics, metrics +# Type Hinting +from typing import Optional, List +from pastas.typing import Model, TimestampType + class Statistics: # Save all statistics that can be calculated. - ops = ["rmse", "rmsn", "sse", "mae", "nse", "evp", "rsq", "bic", "aic", ] - - def __init__(self, ml): + ops = [ + "rmse", + "rmsn", + "sse", + "mae", + "nse", + "evp", + "rsq", + "bic", + "aic", + ] + + def __init__(self, ml: Model): """This class provides statistics to to pastas Model class. Parameters @@ -58,7 +72,11 @@ def __repr__(self): return msg @model_tmin_tmax - def rmse(self, tmin=None, tmax=None, weighted=False, **kwargs): + def rmse(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + weighted: bool = False, + **kwargs) -> float: """Root mean squared error of the residuals. 
Parameters @@ -77,7 +95,11 @@ def rmse(self, tmin=None, tmax=None, weighted=False, **kwargs): return metrics.rmse(res=res, weighted=weighted, **kwargs) @model_tmin_tmax - def rmsn(self, tmin=None, tmax=None, weighted=False, **kwargs): + def rmsn(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + weighted: bool = False, + **kwargs) -> float: """Root mean squared error of the noise. Parameters @@ -104,7 +126,9 @@ def rmsn(self, tmin=None, tmax=None, weighted=False, **kwargs): return metrics.rmse(res=res, weighted=weighted, **kwargs) @model_tmin_tmax - def sse(self, tmin=None, tmax=None): + def sse(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None) -> float: """Sum of the squares of the error (SSE) Parameters @@ -120,7 +144,11 @@ def sse(self, tmin=None, tmax=None): return metrics.sse(res=res) @model_tmin_tmax - def mae(self, tmin=None, tmax=None, weighted=False, **kwargs): + def mae(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + weighted: bool = False, + **kwargs) -> float: """Mean Absolute Error (MAE) of the residuals. Parameters @@ -139,7 +167,11 @@ def mae(self, tmin=None, tmax=None, weighted=False, **kwargs): return metrics.mae(res=res, weighted=weighted, **kwargs) @model_tmin_tmax - def nse(self, tmin=None, tmax=None, weighted=False, **kwargs): + def nse(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + weighted: bool = False, + **kwargs) -> float: """Nash-Sutcliffe coefficient for model fit . 
Parameters @@ -159,7 +191,11 @@ def nse(self, tmin=None, tmax=None, weighted=False, **kwargs): return metrics.nse(obs=obs, res=res, weighted=weighted, **kwargs) @model_tmin_tmax - def pearsonr(self, tmin=None, tmax=None, weighted=False, **kwargs): + def pearsonr(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + weighted: bool = False, + **kwargs) -> float: """Compute the (weighted) Pearson correlation (r). Parameters @@ -179,7 +215,11 @@ def pearsonr(self, tmin=None, tmax=None, weighted=False, **kwargs): return metrics.pearsonr(obs=obs, sim=sim, weighted=weighted, **kwargs) @model_tmin_tmax - def evp(self, tmin=None, tmax=None, weighted=False, **kwargs): + def evp(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + weighted: bool = False, + **kwargs) -> float: """Explained variance percentage. Parameters @@ -199,7 +239,11 @@ def evp(self, tmin=None, tmax=None, weighted=False, **kwargs): return metrics.evp(obs=obs, res=res, weighted=weighted, **kwargs) @model_tmin_tmax - def rsq(self, tmin=None, tmax=None, weighted=False, **kwargs): + def rsq(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + weighted: bool = False, + **kwargs) -> float: """R-squared. Parameters @@ -219,7 +263,11 @@ def rsq(self, tmin=None, tmax=None, weighted=False, **kwargs): return metrics.rsq(obs=obs, res=res, weighted=weighted, **kwargs) @model_tmin_tmax - def kge_2012(self, tmin=None, tmax=None, weighted=False, **kwargs): + def kge_2012(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + weighted: bool = False, + **kwargs) -> float: """Kling-Gupta Efficiency. 
Parameters @@ -239,7 +287,9 @@ def kge_2012(self, tmin=None, tmax=None, weighted=False, **kwargs): return metrics.kge_2012(obs=obs, sim=sim, weighted=weighted, **kwargs) @model_tmin_tmax - def bic(self, tmin=None, tmax=None): + def bic(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None) -> float: """Bayesian Information Criterium (BIC). Parameters @@ -261,7 +311,9 @@ def bic(self, tmin=None, tmax=None): return metrics.bic(res=res, nparam=nparam) @model_tmin_tmax - def aic(self, tmin=None, tmax=None): + def aic(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None) -> float: """Akaike Information Criterium (AIC). Parameters @@ -282,7 +334,10 @@ def aic(self, tmin=None, tmax=None): return metrics.aic(res=res, nparam=nparam) @model_tmin_tmax - def summary(self, tmin=None, tmax=None, stats=None): + def summary(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + stats: Optional[List[str]] = None) -> DataFrame: """Returns a Pandas DataFrame with goodness-of-fit metrics. 
Parameters @@ -321,8 +376,12 @@ def summary(self, tmin=None, tmax=None, stats=None): return stats @model_tmin_tmax - def diagnostics(self, tmin=None, tmax=None, alpha=0.05, stats=(), - float_fmt="{0:.2f}"): + def diagnostics(self, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + alpha: float = 0.05, + stats: tuple = (), + float_fmt: str = "{0:.2f}") -> DataFrame: if self.ml.noisemodel and self.ml.settings["noise"]: series = self.ml.noise(tmin=tmin, tmax=tmax) nparam = self.ml.noisemodel.nparam @@ -330,5 +389,8 @@ def diagnostics(self, tmin=None, tmax=None, alpha=0.05, stats=(), series = self.ml.residuals(tmin=tmin, tmax=tmax) nparam = 0 - return diagnostics(series=series, alpha=alpha, nparam=nparam, - stats=stats, float_fmt=float_fmt) + return diagnostics(series=series, + alpha=alpha, + nparam=nparam, + stats=stats, + float_fmt=float_fmt) diff --git a/pastas/noisemodels.py b/pastas/noisemodels.py index 18e3a9c0..eeb1374d 100644 --- a/pastas/noisemodels.py +++ b/pastas/noisemodels.py @@ -26,19 +26,23 @@ from .decorators import njit, set_parameter from .utils import check_numba +# Type Hinting +from typing import Optional +from pastas.typing import ArrayLike + __all__ = ["NoiseModel", "ArmaModel"] class NoiseModelBase: _name = "NoiseModelBase" - def __init__(self): + def __init__(self) -> None: self.nparam = 1 self.name = "noise" self.parameters = DataFrame( columns=["initial", "pmin", "pmax", "vary", "name"]) - def set_init_parameters(self, oseries=None): + def set_init_parameters(self, oseries: Optional[Series] = None) -> None: if oseries is not None: pinit = np.diff(oseries.index.to_numpy()) / Timedelta("1D") pinit = np.median(pinit) @@ -48,7 +52,7 @@ def set_init_parameters(self, oseries=None): "noise") @set_parameter - def _set_initial(self, name, value): + def _set_initial(self, name: str, value: float) -> None: """Internal method to set the initial parameter value. 
Notes @@ -58,7 +62,7 @@ def _set_initial(self, name, value): self.parameters.loc[name, "initial"] = value @set_parameter - def _set_pmin(self, name, value): + def _set_pmin(self, name: str, value: float) -> None: """Internal method to set the minimum value of the noisemodel. Notes @@ -68,7 +72,7 @@ def _set_pmin(self, name, value): self.parameters.loc[name, "pmin"] = value @set_parameter - def _set_pmax(self, name, value): + def _set_pmax(self, name: str, value: float) -> None: """Internal method to set the maximum parameter values. Notes @@ -78,7 +82,7 @@ def _set_pmax(self, name, value): self.parameters.loc[name, "pmax"] = value @set_parameter - def _set_vary(self, name, value): + def _set_vary(self, name: str, value: float) -> None: """Internal method to set if the parameter is varied. Notes @@ -87,11 +91,11 @@ def _set_vary(self, name, value): """ self.parameters.loc[name, "vary"] = value - def to_dict(self): + def to_dict(self) -> dict: return {"type": self._name} @staticmethod - def weights(res, p): + def weights(res, p) -> int: return 1 @@ -126,14 +130,14 @@ class NoiseModel(NoiseModelBase): """ _name = "NoiseModel" - def __init__(self, norm=True): + def __init__(self, norm: bool = True) -> None: NoiseModelBase.__init__(self) self.norm = norm self.nparam = 1 self.set_init_parameters() @staticmethod - def simulate(res, p): + def simulate(res: Series, p: ArrayLike) -> Series: """Simulate noise from the residuals. Parameters @@ -155,7 +159,7 @@ def simulate(res, p): * res.values[:-1]) return Series(data=v, index=res.index, name="Noise") - def weights(self, res, p): + def weights(self, res: Series, p: ArrayLike) -> Series: """Method to calculate the weights for the noise. Parameters @@ -163,8 +167,8 @@ def weights(self, res, p): res: pandas.Series Pandas Series with the residuals to compute the weights for. The Series index must be a DatetimeIndex. - p: numpy.ndarray - numpy array with the parameters used in the noise model. 
+ p: array_like + NumPy array with the parameters used in the noise model. Returns ------- @@ -214,13 +218,13 @@ class ArmaModel(NoiseModelBase): """ _name = "ArmaModel" - def __init__(self): + def __init__(self) -> None: check_numba() NoiseModelBase.__init__(self) self.nparam = 2 self.set_init_parameters() - def set_init_parameters(self, oseries=None): + def set_init_parameters(self, oseries: Optional[Series] = None) -> None: if oseries is not None: pinit = np.diff(oseries.index.to_numpy()) / Timedelta("1D") pinit = np.median(pinit) @@ -231,7 +235,7 @@ def set_init_parameters(self, oseries=None): self.parameters.loc["noise_beta"] = (1., -np.inf, np.inf, True, "noise") - def simulate(self, res, p): + def simulate(self, res: Series, p: ArrayLike) -> Series: alpha = p[0] beta = p[1] @@ -242,7 +246,8 @@ def simulate(self, res, p): @staticmethod @njit - def calculate_noise(res, odelt, alpha, beta): + def calculate_noise(res: ArrayLike, odelt: ArrayLike, alpha: float, + beta: float) -> ArrayLike: # Create an array to store the noise a = np.zeros_like(res) a[0] = res[0] diff --git a/pastas/plots.py b/pastas/plots.py index 0e6ec980..fd464dff 100644 --- a/pastas/plots.py +++ b/pastas/plots.py @@ -7,20 +7,25 @@ import matplotlib.pyplot as plt import numpy as np -from pandas import DataFrame, Timestamp +from pandas import Series, DataFrame, Timestamp from scipy.stats import gaussian_kde, norm, probplot from pastas.modelcompare import CompareModels from pastas.stats.core import acf as get_acf from pastas.stats.metrics import evp, rmse +# Type Hinting +from typing import Optional, List, Tuple +from pastas.typing import ArrayLike, Axes, Figure, TimestampType, Model + + logger = logging.getLogger(__name__) __all__ = ["compare", "series", "acf", "diagnostics", "cum_frequency", "TrackSolve"] -def compare(models, adjust_height=False, **kwargs): +def compare(models: List[Model], adjust_height: bool = False, **kwargs) -> Axes: """Plot multiple Pastas models in one figure to visually compare
models. Note @@ -52,9 +57,17 @@ def compare(models, adjust_height=False, **kwargs): return mc.axes -def series(head=None, stresses=None, hist=True, kde=False, titles=True, - tmin=None, tmax=None, labels=None, figsize=(10, 5)): - """Plot all the input time series in a single plot. +def series( + head: Optional[Series] = None, + stresses: Optional[List[Series]] = None, + hist: bool = True, + kde: bool = False, + titles: bool = True, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + labels: Optional[List[str]] = None, + figsize: tuple = (10, 5)) -> Axes: + """Plot all the input time series in a single figure. Parameters ---------- @@ -70,7 +83,7 @@ def series(head=None, stresses=None, hist=True, kde=False, titles=True, scipy.gaussian_kde using scott to calculate the estimator bandwidth. Returns the number of observations, mean, skew and kurtosis. titles: bool - Set the titles or not. Taken from the name attribute of the Series. + Set the titles or not. Taken from the name attribute of the series. tmin: str or pd.Timestamp tmax: str or pd.Timestamp labels: List of str @@ -193,8 +206,14 @@ def series(head=None, stresses=None, hist=True, kde=False, titles=True, return axes -def acf(series, alpha=0.05, lags=365, acf_options=None, smooth_conf=True, - color='k', ax=None, figsize=(5, 2)): +def acf(series: Series, + alpha: float = 0.05, + lags: int = 365, + acf_options: Optional[dict] = None, + smooth_conf: bool = True, + color: str = "k", + ax: Optional[Axes] = None, + figsize: tuple = (5, 2)) -> Axes: """Plot of the autocorrelation function of a time series.
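The `acf` plot above visualizes a lagged, normalized covariance. A minimal stdlib-only sketch of the underlying quantity (illustrative only; the real computation lives in `pastas.stats.core.acf` and also handles irregular time steps):

```python
def autocorrelation(x, lag):
    # Sample autocorrelation at a given lag (biased estimator): c_k / c_0,
    # where c_k is the lag-k autocovariance of the series x.
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    ck = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / n
    return ck / c0

# An alternating series is strongly negatively correlated at lag 1:
print(autocorrelation([1.0, -1.0] * 4, 1))  # -0.875
```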
Parameters @@ -260,9 +279,15 @@ def acf(series, alpha=0.05, lags=365, acf_options=None, smooth_conf=True, return ax -def diagnostics(series, sim=None, alpha=0.05, bins=50, acf_options=None, - figsize=(10, 5), fig=None, heteroscedasicity=True, - **kwargs): +def diagnostics(series: Series, + sim: Optional[Series] = None, + alpha: float = 0.05, + bins: int = 50, + acf_options: Optional[dict] = None, + figsize: tuple = (10, 5), + fig: Optional[Figure] = None, + heteroscedasicity: bool = True, + **kwargs) -> List[Axes]: """Plot that helps in diagnosing basic model assumptions. Parameters @@ -270,7 +295,7 @@ def diagnostics(series, sim=None, alpha=0.05, bins=50, acf_options=None, series: pandas.Series Pandas Series with the residual time series to diagnose. sim: pandas.Series, optional - Pandas Series with the simulated time series. Used to diagnose on + pandas.Series with the simulated time series. Used to diagnose on heteroscedasticity. Ignored if heteroscedasticity is set to False. alpha: float, optional Significance level to calculate the (1-alpha)-confidence intervals. @@ -382,8 +407,11 @@ def diagnostics(series, sim=None, alpha=0.05, bins=50, acf_options=None, return fig.axes -def cum_frequency(obs, sim=None, ax=None, figsize=(5, 2)): - """Plot of the cumulative frequency of a time series. +def cum_frequency(obs: Series, + sim: Optional[Series] = None, + ax: Optional[Axes] = None, + figsize: tuple = (5, 2)) -> Axes: + """Plot the cumulative frequency of a time series.
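Conceptually, the `cum_frequency` plot shows the empirical cumulative frequency of the observations. A simplified sketch of that quantity, without the plotting and pandas machinery:

```python
def cum_frequency(obs):
    # Empirical cumulative frequency: for each sorted value, the fraction
    # of observations less than or equal to it.
    s = sorted(obs)
    n = len(s)
    return [(v, (i + 1) / n) for i, v in enumerate(s)]
```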
Parameters ---------- @@ -433,10 +461,10 @@ class TrackSolve: pastas Model to track tmin : str or pandas.Timestamp, optional start time for simulation, by default None which - defaults to first index in ml.oseries.series + defaults to the first index in ml.oseries.series tmax : str or pandas.Timestamp, optional end time for simulation, by default None which - defaults to last index in ml.oseries.series + defaults to the last index in ml.oseries.series update_iter : int, optional if visualizing optimization progress, update plot every update_iter iterations, by default nparam @@ -485,7 +513,12 @@ class TrackSolve: Access the resulting figure through `track.fig`. """ - def __init__(self, ml, tmin=None, tmax=None, update_iter=None): + def __init__(self, + ml: Model, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + update_iter: Optional[int] = None) -> None: + logger.warning("TrackSolve feature under development. If you find any " "bugs please post an issue on GitHub: " "https://github.com/pastas/pastas/issues") @@ -534,13 +567,13 @@ def __init__(self, ml, tmin=None, tmax=None, update_iter=None): # calculate EVP self.evp = np.array([evp(obs=self.obs, res=res)]) - def track_solve(self, params): + def track_solve(self, params: ArrayLike) -> None: """Append parameters to self.parameters DataFrame and update itercount, rmse values and evp.
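`TrackSolve` works by handing its `track_solve` method to the solver as a callback that receives each trial parameter vector. The pattern, stripped to its essentials (a hypothetical minimal class, not the pastas implementation):

```python
class SolveTracker:
    """Record every parameter vector an optimizer proposes."""

    def __init__(self):
        self.itercount = 0
        self.parameters = []

    def track_solve(self, params):
        # called once per iteration with the current parameter array
        self.itercount += 1
        self.parameters.append(list(params))

tracker = SolveTracker()
for trial in ([1.0, 2.0], [1.1, 1.9]):  # stand-in for solver iterations
    tracker.track_solve(trial)
```

In practice the bound method would be passed as a callback, e.g. `ml.solve(callback=track.plot_track_solve)` as the class docstring shows.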
Parameters ---------- - params : np.array + params : array_like array containing parameters """ # update tmin/tmax and freq once after starting solve @@ -565,28 +598,28 @@ def track_solve(self, params): # recalculate EVP self.evp = np.r_[self.evp, evp(obs=self.obs, res=r_res)] - def _update_axes(self): + def _update_axes(self) -> None: """extend xlim if number of iterations exceeds current window.""" for iax in self.axes[1:]: iax.set_xlim(right=self.viewlim) self.fig.canvas.draw() - def _update_settings(self): + def _update_settings(self) -> None: self.tmin = self.ml.settings["tmin"] self.tmax = self.ml.settings["tmax"] self.freq = self.ml.settings["freq"] - def _noise(self, params): + def _noise(self, params: ArrayLike) -> ArrayLike: """get noise. Parameters ---------- - params: np.array + params: array_like array containing parameters Returns ------- - noise: np.array + noise: array_like array containing noise """ noise = self.ml.noise(p=params, @@ -594,7 +627,7 @@ def _noise(self, params): tmax=self.tmax) return noise - def _residuals(self, params): + def _residuals(self, params: ArrayLike) -> ArrayLike: """calculate residuals. Parameters @@ -604,7 +637,7 @@ Returns ------- - res: np.array + res: array_like array containing residuals """ res = self.ml.residuals(p=params, @@ -612,20 +645,22 @@ def _residuals(self, params): tmax=self.tmax) return res - def _simulate(self): + def _simulate(self) -> Series: """simulate model with last entry in self.parameters. Returns ------- sim: pd.Series - series containing model evaluation + Series containing model evaluation """ sim = self.ml.simulate(p=self.parameters.iloc[-1, :].values, tmin=self.tmin, tmax=self.tmax, freq=self.ml.settings["freq"]) return sim - def initialize_figure(self, figsize=(10, 8), dpi=100): + def initialize_figure(self, + figsize: Tuple[int, int] = (10, 8), + dpi: int = 100) -> Figure: """Initialize figure for plotting optimization progress.
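One subtlety with fixed-size tuple hints like the `figsize` annotation: `Tuple[int]` denotes a one-element tuple, so a (width, height) pair should be spelled `Tuple[int, int]` (or `Tuple[int, ...]` for variable length). A quick check that such hints are introspectable, using a hypothetical stand-in function:

```python
from typing import Tuple, get_type_hints

def initialize_figure(figsize: Tuple[int, int] = (10, 8), dpi: int = 100) -> dict:
    # hypothetical stand-in that just returns the settings a figure would use
    return {"figsize": figsize, "dpi": dpi}

hints = get_type_hints(initialize_figure)
print(hints["figsize"])  # typing.Tuple[int, int]
```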
Parameters @@ -658,7 +693,7 @@ def initialize_figure(self, figsize=(10, 8), dpi=100): self.ax0.set_ylabel("head") self.ax0.set_title( "Iteration: {0} (EVP: {1:.2f}%)".format(self.itercount, - self.evp[-1])) + self.evp[-1])) self.ax0.legend(loc=(0, 1), frameon=False, ncol=2) omax = self.obs.max() omin = self.obs.min() @@ -725,7 +760,7 @@ def initialize_figure(self, figsize=(10, 8), dpi=100): self.fig.tight_layout() return self.fig - def plot_track_solve(self, params): + def plot_track_solve(self, params: ArrayLike) -> None: """Method to plot model simulation while model is being solved. Pass this method to ml.solve(), e.g.: @@ -734,7 +769,7 @@ def plot_track_solve(self, params): Parameters ---------- - params : np.array + params : array_like array containing parameters """ if not hasattr(self, "fig"): @@ -779,11 +814,11 @@ def plot_track_solve(self, params): # update title self.ax0.set_title( "Iteration: {0} (EVP: {1:.2f}%)".format(self.itercount, - self.evp[-1])) + self.evp[-1])) plt.pause(1e-10) self.fig.canvas.draw() - def plot_track_solve_history(self, fig=None): + def plot_track_solve_history(self, fig: Optional[Figure] = None) -> List[Axes]: """Plot optimization history. Parameters @@ -812,7 +847,7 @@ def plot_track_solve_history(self, fig=None): return fig.axes -def _table_formatter_params(s): +def _table_formatter_params(s: float) -> str: """Internal method for formatting parameters in tables in Pastas plots. Parameters @@ -835,7 +870,7 @@ def _table_formatter_params(s): return f"{s:.2f}" -def _table_formatter_stderr(s): +def _table_formatter_stderr(s: float) -> str: """Internal method for formatting stderrs in tables in Pastas plots. 
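The `_table_formatter_params` and `_table_formatter_stderr` helpers turn raw floats into compact strings for the parameter tables. A hedged sketch of that kind of formatter (the magnitude thresholds here are assumptions for illustration, not pastas' exact rules):

```python
import math

def format_param(s: float) -> str:
    # compact table formatting: blank for NaN, scientific notation for
    # very large or very small magnitudes, two decimals otherwise
    if math.isnan(s):
        return ""
    if s != 0 and (abs(s) >= 1e5 or abs(s) < 1e-2):
        return f"{s:.2e}"
    return f"{s:.2f}"
```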
Parameters diff --git a/pastas/read/dinoloket.py b/pastas/read/dinoloket.py index a3437303..4d5d0a8d 100644 --- a/pastas/read/dinoloket.py +++ b/pastas/read/dinoloket.py @@ -29,8 +29,8 @@ def read_dino(fname, variable='Stand_cm_tov_NAP', factor=0.01): ts: pastas.TimeSeries returns a Pastas TimeSeries object or a list of objects. """ - warnings.warn("The read module of pastas is deprecated please use hydropandas instead -> https://hydropandas.readthedocs.io", DeprecationWarning) - + warnings.warn( + "The read module of pastas is deprecated please use hydropandas instead -> https://hydropandas.readthedocs.io", DeprecationWarning) # Read the file dino = DinoGrondwaterstand(fname) @@ -79,8 +79,8 @@ def read_dino_level_gauge(fname, variable='Stand_cm_tov_NAP', factor=0.01): ts: pastas.TimeSeries returns a Pastas TimeSeries object or a list of objects. """ - warnings.warn("The read module of pastas is deprecated please use hydropandas instead -> https://hydropandas.readthedocs.io", DeprecationWarning) - + warnings.warn( + "The read module of pastas is deprecated please use hydropandas instead -> https://hydropandas.readthedocs.io", DeprecationWarning) # Read the file dino = DinoPeilschaal(fname) @@ -113,7 +113,8 @@ def read_dino_level_gauge(fname, variable='Stand_cm_tov_NAP', factor=0.01): class DinoGrondwaterstand: def __init__(self, fname): - warnings.warn("The read module of pastas is deprecated please use hydropandas instead -> https://hydropandas.readthedocs.io", DeprecationWarning) + warnings.warn( + "The read module of pastas is deprecated please use hydropandas instead -> https://hydropandas.readthedocs.io", DeprecationWarning) with open(fname, 'r') as f: # lees de header line = f.readline() @@ -245,7 +246,8 @@ def __init__(self, fname): class DinoPeilschaal: def __init__(self, fname): - warnings.warn("The read module of pastas is deprecated please use hydropandas instead -> https://hydropandas.readthedocs.io", DeprecationWarning) + warnings.warn( + "The read 
module of pastas is deprecated please use hydropandas instead -> https://hydropandas.readthedocs.io", DeprecationWarning) with open(fname, 'r') as f: # lees de header diff --git a/pastas/read/knmi.py b/pastas/read/knmi.py index 213b8841..455c6ddf 100644 --- a/pastas/read/knmi.py +++ b/pastas/read/knmi.py @@ -27,7 +27,8 @@ def read_knmi(fname, variables='RD'): ts: pastas.TimeSeries returns a Pastas TimeSeries object or a list of objects. """ - warnings.warn("The read module of pastas is deprecated please use hydropandas instead -> https://hydropandas.readthedocs.io", DeprecationWarning) + warnings.warn( + "The read module of pastas is deprecated please use hydropandas instead -> https://hydropandas.readthedocs.io", DeprecationWarning) knmi = KnmiStation.fromfile(fname) if variables is None: @@ -109,14 +110,15 @@ class KnmiStation: """ def __init__(self, *args, **kwargs): - warnings.warn("The read module of pastas is deprecated please use hydropandas instead -> https://hydropandas.readthedocs.io", DeprecationWarning) + warnings.warn( + "The read module of pastas is deprecated please use hydropandas instead -> https://hydropandas.readthedocs.io", DeprecationWarning) self.stations = DataFrame() self.variables = dict() self.data = DataFrame() if len(args) > 0 or len(kwargs) > 0: warnings.warn("In the future use KnmiStation.download(**kwargs) " - "instead of KnmiStation(**kwargs)", FutureWarning) + "instead of KnmiStation(**kwargs)", FutureWarning) self._download(*args, **kwargs) # diable download method, as old code will call this again self.download = lambda *args, **kwargs: None diff --git a/pastas/read/waterbase.py b/pastas/read/waterbase.py index 4ca242f0..13fd90ee 100644 --- a/pastas/read/waterbase.py +++ b/pastas/read/waterbase.py @@ -38,7 +38,8 @@ def read_waterbase(fname, locations=None, variable="NUMERIEKEWAARDE", the xy-coordinates are calculates as the mean xy-coordinate in case these values are not unique. 
""" - warnings.warn("The read module of pastas is deprecated please use hydropandas instead -> https://hydropandas.readthedocs.io", DeprecationWarning) + warnings.warn( + "The read module of pastas is deprecated please use hydropandas instead -> https://hydropandas.readthedocs.io", DeprecationWarning) ts = [] df = read_csv(fname, delimiter=";", index_col="Date", decimal=",", diff --git a/pastas/recharge.py b/pastas/recharge.py index bc15ec4a..147b9224 100644 --- a/pastas/recharge.py +++ b/pastas/recharge.py @@ -41,9 +41,13 @@ where, power from pandas import DataFrame -from pastas.decorators import njit +from .decorators import njit from .utils import check_numba +# Type Hinting +from typing import Tuple +from pastas.typing import ArrayLike + logger = getLogger(__name__) @@ -52,13 +56,13 @@ class RechargeBase: """ - def __init__(self, **kwargs): + def __init__(self) -> None: self.snow = False self.nparam = 0 - self.kwargs = kwargs + self.kwargs = {} @staticmethod - def get_init_parameters(name="recharge"): + def get_init_parameters(name: str = "recharge") -> DataFrame: """Method to obtain the initial parameters. Parameters @@ -94,17 +98,18 @@ class Linear(RechargeBase): """ _name = "Linear" - def __init__(self): + def __init__(self) -> None: RechargeBase.__init__(self) self.nparam = 1 - def get_init_parameters(self, name="recharge"): + def get_init_parameters(self, name: str = "recharge") -> DataFrame: parameters = DataFrame( columns=["initial", "pmin", "pmax", "vary", "name"]) parameters.loc[name + "_f"] = (-1.0, -2.0, 0.0, True, name) return parameters - def simulate(self, prec, evap, p, **kwargs): + def simulate(self, prec: ArrayLike, evap: ArrayLike, p: ArrayLike, + **kwargs) -> ArrayLike: """Simulate the precipitation excess flux. 
Parameters @@ -124,12 +129,12 @@ def simulate(self, prec, evap, p, **kwargs): """ return add(prec, multiply(evap, p)) - def get_water_balance(self, prec, evap, p, **kwargs): + def get_water_balance(self, prec: ArrayLike, evap: ArrayLike, p: ArrayLike, + **kwargs) -> DataFrame: ea = multiply(evap, p) r = add(prec, multiply(evap, p)) - data = DataFrame(data=vstack((prec, ea, -r)).T, + return DataFrame(data=vstack((prec, ea, -r)).T, columns=["P", "Ea", "R"]) - return data class FlexModel(RechargeBase): @@ -181,10 +186,12 @@ class FlexModel(RechargeBase): """ _name = "FlexModel" - def __init__(self, interception=True, snow=False, gw_uptake=False): + def __init__(self, + interception: bool = True, + snow: bool = False, + gw_uptake: bool = False): check_numba() - RechargeBase.__init__(self, interception=interception, snow=snow, - gw_uptake=gw_uptake) + RechargeBase.__init__(self) self.snow = snow self.interception = interception self.gw_uptake = gw_uptake @@ -196,7 +203,7 @@ def __init__(self, interception=True, snow=False, gw_uptake=False): if self.snow: self.nparam += 2 - def get_init_parameters(self, name="recharge"): + def get_init_parameters(self, name: str = "recharge") -> DataFrame: parameters = DataFrame( columns=["initial", "pmin", "pmax", "vary", "name"]) parameters.loc[name + "_srmax"] = (250.0, 1e-5, 1e3, True, name) @@ -214,17 +221,23 @@ def get_init_parameters(self, name="recharge"): return parameters - def simulate(self, prec, evap, temp, p, dt=1.0, return_full=False, - **kwargs): + def simulate(self, + prec: ArrayLike, + evap: ArrayLike, + temp: ArrayLike, + p: ArrayLike, + dt: float = 1.0, + return_full: bool = False, + **kwargs) -> ArrayLike: """Simulate the soil water balance model. Parameters ---------- - prec: numpy.Array + prec: array_like Precipitation flux in mm/d. Must have the same length as evap. - evap: numpy.Array + evap: array_like Potential evaporation flux in mm/d. - temp: numpy.Array + temp: array_like Temperature in degrees Celcius. 
p: array_like array_like object with the values as floats representing the @@ -233,11 +246,11 @@ def simulate(self, prec, evap, temp, p, dt=1.0, return_full=False, time step for the calculation of the recharge. Only dt=1 is possible now. return_full: bool - return all fluxes and states as Numpy arrays. + return all fluxes and states as NumPy arrays. Returns ------- - r: numpy.Array + r: array_like Recharge flux calculated by the model. """ @@ -293,15 +306,20 @@ def simulate(self, prec, evap, temp, p, dt=1.0, return_full=False, @staticmethod @njit - def get_root_zone_balance(pe, ep, srmax=250.0, lp=0.25, ks=100.0, - gamma=4.0, dt=1.0): + def get_root_zone_balance(pe: ArrayLike, + ep: ArrayLike, + srmax: float = 250.0, + lp: float = 0.25, + ks: float = 100.0, + gamma: float = 4.0, + dt: float = 1.0) -> Tuple[ArrayLike]: """Method to compute the water balance of the root zone reservoir. Parameters ---------- - pe: numpy.Array + pe: array_like Effective precipitation flux in mm/d. - ep: numpy.Array + ep: array_like Potential evaporation flux in mm/d. srmax: float, optional Maximum storage capacity of the root zone. @@ -317,16 +335,16 @@ def get_root_zone_balance(pe, ep, srmax=250.0, lp=0.25, ks=100.0, Returns ------- - sr: numpy.Array + sr: array_like Storage in the root zone reservoir. - r: numpy.Array + r: array_like Recharge flux in mm/d - ea: numpy.Array + ea: array_like Evaporation flux in mm/d. Consists of transpiration and soil evaporation. Does not include interception evaporation. - q: numpy.Array + q: array_like surface runoff flux in mm/d. - pe: numpy.Array + pe: array_like Incoming infiltration flux in mm/d. 
Notes @@ -366,15 +384,18 @@ def get_root_zone_balance(pe, ep, srmax=250.0, lp=0.25, ks=100.0, @staticmethod @njit - def get_interception_balance(pr, ep, simax=2.0, dt=1.0): + def get_interception_balance(pr: ArrayLike, + ep: ArrayLike, + simax: float = 2.0, + dt: float = 1.0) -> Tuple[ArrayLike]: """Method to compute the water balance of the interception reservoir. Parameters ---------- - pr: numpy.Array - Numpy Array with rainfall in mm/day. - ep: numpy.Array - Numpy Array with potential evaporation in mm/day. + pr: array_like + NumPy Array with rainfall in mm/day. + ep: array_like + NumPy Array with potential evaporation in mm/day. simax: float, optional storage capacity of the interception reservoir. dt: float @@ -382,11 +403,11 @@ def get_interception_balance(pr, ep, simax=2.0, dt=1.0): Returns ------- - si: numpy.Array + si: array_like Interception storage. - ei: numpy.Array + ei: array_like Interception evaporation. - pi: numpy.Array + pi: array_like Incoming rainfall that is intercepted. Notes @@ -420,25 +441,28 @@ def get_interception_balance(pr, ep, simax=2.0, dt=1.0): @staticmethod @njit - def get_snow_balance(prec, temp, tt=0.0, k=2.0): + def get_snow_balance(prec: ArrayLike, + temp: ArrayLike, + tt: float = 0.0, + k: float = 2.0) -> Tuple[ArrayLike]: """Method to compute the water balance of the snow reservoir. Parameters ---------- - prec: numpy.Array - Numpy Array with precipitation in mm/day. - temp: numpy.Array - Numpy Array with the mean daily temperature in degree Celsius. + prec: array_like + NumPy Array with precipitation in mm/day. + temp: array_like + NumPy Array with the mean daily temperature in degree Celsius. tt: float, optional k: float, optional Returns ------- - ss: numpy.Array + ss: array_like storage in the snow reservoir. - ps: numpy.Array + ps: array_like snowfall flux in mm/d. - m: numpy.Array + m: array_like snow melt flux in mm/d. 
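The `@njit`-decorated balance methods rely on `check_numba()` and a local `njit` import so the models still run when Numba is missing. A sketch of that optional-acceleration pattern, paired with a degree-day melt rule of the kind `get_snow_balance(prec, temp, tt, k)` uses (parameter values illustrative):

```python
try:
    from numba import njit  # compiled when numba is available
except ImportError:
    def njit(func=None, **kwargs):
        # no-op fallback decorator when numba is not installed
        if func is None:
            return lambda f: f
        return func

@njit
def melt_rate(temp: float, tt: float = 0.0, k: float = 2.0) -> float:
    # degree-day snowmelt: melt is proportional to temperature above
    # the threshold tt, with melt factor k (mm/d per degree)
    if temp <= tt:
        return 0.0
    return k * (temp - tt)
```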
Notes @@ -468,9 +492,15 @@ def get_snow_balance(prec, temp, tt=0.0, k=2.0): return ss[:-1], ps, -m - def get_water_balance(self, prec, evap, temp, p, dt=1.0, **kwargs): - data = self.simulate(prec=prec, evap=evap, temp=temp, p=p, dt=dt, - return_full=True, **kwargs) + def get_water_balance(self, + prec: ArrayLike, + evap: ArrayLike, + temp: ArrayLike, + p: ArrayLike, + dt: float = 1.0, + **kwargs) -> DataFrame: + data = self.simulate(prec=prec, evap=evap, temp=temp, + p=p, dt=dt, return_full=True, **kwargs) columns = ["State Root zone (Sr)", "Recharge (R)", "Actual evaporation (Ea)", "Surface Runoff (Q)", @@ -484,21 +514,23 @@ def get_water_balance(self, prec, evap, temp, p, dt=1.0, **kwargs): if self.snow: columns += ["State Snow (Ss)", "Snowfall (Ps)", "Snowmelt (M)", ] - data = DataFrame(data=vstack(data).T, columns=columns) - return data + return DataFrame(data=vstack(data).T, columns=columns) - def check_snow_balance(self, prec, temp, **kwargs): - ss, ps, m = self.get_snow_balance(prec, temp, **kwargs) + def check_snow_balance(self, prec: ArrayLike, temp: ArrayLike, + **kwargs) -> float: + ss, ps, m = self.get_snow_balance(prec, temp, **kwargs) error = (ss[0] - ss[-1] + (ps + m).sum()) return error - def check_interception_balance(self, prec, evap, **kwargs): - si, ei, pi = self.get_interception_balance(prec, evap, **kwargs) + def check_interception_balance(self, prec: ArrayLike, evap: ArrayLike, + **kwargs) -> float: + si, ei, pi = self.get_interception_balance(prec, evap, **kwargs) error = (si[0] - si[-1] + (pi + ei).sum()) return error - def check_root_zone_balance(self, prec, evap, **kwargs): - sr, r, ea, q, pe = self.get_root_zone_balance(prec, evap, **kwargs) + def check_root_zone_balance(self, prec: ArrayLike, evap: ArrayLike, + **kwargs) -> float: + sr, r, ea, q, pe = self.get_root_zone_balance(prec, evap, **kwargs) error = (sr[0] - sr[-1] + (r + ea + q + pe).sum()) return error @@ -526,12 +558,12 @@ class Berendrecht(RechargeBase): """ _name = "Berendrecht" -
def __init__(self) -> None: check_numba() RechargeBase.__init__(self) self.nparam = 7 - def get_init_parameters(self, name="recharge"): + def get_init_parameters(self, name: str = "recharge") -> DataFrame: parameters = DataFrame( columns=["initial", "pmin", "pmax", "vary", "name"]) parameters.loc[name + "_fi"] = (0.9, 0.7, 1.3, False, name) @@ -543,14 +575,20 @@ parameters.loc[name + "_ks"] = (100.0, 1, 1e4, True, name) return parameters - def simulate(self, prec, evap, p, dt=1.0, return_full=False, **kwargs): + def simulate(self, + prec: ArrayLike, + evap: ArrayLike, + p: ArrayLike, + dt: float = 1.0, + return_full: bool = False, + **kwargs) -> Tuple[ArrayLike]: """Simulate the recharge flux. Parameters ---------- - prec: numpy.Array + prec: array_like Precipitation flux in mm/d. Has to have the same length as evap. - evap: numpy.Array + evap: array_like Potential evapotranspiration flux in mm/d. p: array_like array_like object with the values as floats representing the @@ -561,7 +599,7 @@ Returns ------- - r: numpy.Array + r: array_like Recharge flux calculated by the model. """ @@ -575,8 +613,16 @@ def simulate(self, prec, evap, p, dt=1.0, return_full=False, **kwargs): @staticmethod @njit - def get_recharge(prec, evap, fi=1.0, fc=1.0, sr=0.5, de=250.0, l=-2.0, - m=0.5, ks=50.0, dt=1.0): + def get_recharge(prec: ArrayLike, + evap: ArrayLike, + fi: float = 1.0, + fc: float = 1.0, + sr: float = 0.5, + de: float = 250.0, + l: float = -2.0, + m: float = 0.5, + ks: float = 50.0, + dt: float = 1.0) -> Tuple[ArrayLike]: """ Internal method used for the recharge calculation. If Numba is available, this method is significantly faster.
@@ -608,7 +654,12 @@ def get_recharge(prec, evap, fi=1.0, fc=1.0, sr=0.5, de=250.0, l=-2.0, s[t + 1] = s[t] + dt / de * (pe[t] - ea[t] - r[t]) return r, s, ea, pe - def get_water_balance(self, prec, evap, p, dt=1.0, **kwargs): + def get_water_balance(self, + prec: ArrayLike, + evap: ArrayLike, + p: ArrayLike, + dt: float = 1.0, + **kwargs) -> DataFrame: r, s, ea, pe = self.simulate(prec, evap, p=p, dt=dt, return_full=True, **kwargs) s = s * p[3] # Because S is computed dimensionless in this model @@ -643,7 +694,7 @@ class Peterson(RechargeBase): with the parameters: - .. math:: + .. math:: \hat{S_{cap}} = 10^{S_{cap}}; \hat{k_{sat}} = 10^{k_{sat}}; \hat{\beta} = 10^{\beta} @@ -653,12 +704,12 @@ class Peterson(RechargeBase): """ _name = "Peterson" - def __init__(self): + def __init__(self) -> None: check_numba() RechargeBase.__init__(self) self.nparam = 5 - def get_init_parameters(self, name="recharge"): + def get_init_parameters(self, name: str = "recharge") -> DataFrame: parameters = DataFrame( columns=["initial", "pmin", "pmax", "vary", "name"]) parameters.loc[name + "_scap"] = (1.5, 0.5, 3.0, True, name) @@ -668,14 +719,20 @@ def get_init_parameters(self, name="recharge"): parameters.loc[name + "_gamma"] = (1.0, 0.0, 2.0, True, name) return parameters - def simulate(self, prec, evap, p, dt=1.0, return_full=False, **kwargs): + def simulate(self, + prec: ArrayLike, + evap: ArrayLike, + p: ArrayLike, + dt: float = 1.0, + return_full: bool = False, + **kwargs) -> ArrayLike: """Simulate the recharge flux. Parameters ---------- - prec: numpy.Array + prec: array_like Precipitation flux in mm/d. Must have the same length as evap. - evap: numpy.Array + evap: array_like Potential evapotranspiration flux in mm/d. p: array_like array_like object with the values as floats representing the @@ -685,7 +742,7 @@ def simulate(self, prec, evap, p, dt=1.0, return_full=False, **kwargs): Returns ------- - r: numpy.Array + r: array_like Recharge flux calculated by the model. 
""" @@ -699,8 +756,14 @@ def simulate(self, prec, evap, p, dt=1.0, return_full=False, **kwargs): @staticmethod @njit - def get_recharge(prec, evap, scap=1.0, alpha=1.0, - ksat=1.0, beta=0.5, gamma=1.0, dt=1.0): + def get_recharge(prec: ArrayLike, + evap: ArrayLike, + scap: float = 1.0, + alpha: float = 1.0, + ksat: float = 1.0, + beta: float = 0.5, + gamma: float = 1.0, + dt: float = 1.0): """ Internal method used for the recharge calculation. If Numba is available, this method is significantly faster. @@ -728,7 +791,12 @@ def get_recharge(prec, evap, scap=1.0, alpha=1.0, max(0.0, sm[t] + (pe[t] - ea[t] - r[t]) * dt)) return r, sm[1:], ea, pe - def get_water_balance(self, prec, evap, p, dt=1.0, **kwargs): + def get_water_balance(self, + prec: ArrayLike, + evap: ArrayLike, + p: ArrayLike, + dt: float = 1.0, + **kwargs) -> DataFrame: r, s, ea, pe = self.simulate(prec, evap, p=p, dt=dt, return_full=True, **kwargs) data = DataFrame(data=vstack((s, pe, ea, r)).T, diff --git a/pastas/reservoir.py b/pastas/reservoir.py index a4882dc8..2c8ebe15 100644 --- a/pastas/reservoir.py +++ b/pastas/reservoir.py @@ -14,19 +14,23 @@ import numpy as np from numpy import float64 -from scipy.integrate import solve_ivp # only used in simulateold -from pandas import DataFrame +# from scipy.integrate import solve_ivp # only used in simulateold +from pandas import DataFrame, Series from pastas.decorators import njit +# Type Hinting +from pastas.typing import ArrayLike + + class ReservoirBase: """Base class for reservoir classes.""" - def __init__(self): + def __init__(self) -> None: self.temp = False self.nparam = 0 @staticmethod - def get_init_parameters(name="recharge"): + def get_init_parameters(name: str = "recharge") -> DataFrame: """Method to obtain the initial parameters. 
Parameters @@ -43,7 +47,12 @@ def get_init_parameters(name="recharge"): columns=["initial", "pmin", "pmax", "vary", "name"]) return parameters - def simulate(self, prec, evap, p, dt=1.0, **kwargs): + def simulate(self, + prec: Series, + evap: Series, + p: ArrayLike, + dt: float = 1.0, + **kwargs) -> ArrayLike: pass @@ -60,28 +69,28 @@ class Reservoir1(ReservoirBase): """ _name = "Reservoir1" - def __init__(self, initialhead): + def __init__(self, initialhead: float) -> None: ReservoirBase.__init__(self) self.nparam = 4 self.initialhead = initialhead - def get_init_parameters(self, name="reservoir"): + def get_init_parameters(self, name: str = "reservoir") -> DataFrame: parameters = DataFrame( columns=["initial", "pmin", "pmax", "vary", "name"]) parameters.loc[name + "_S"] = (0.1, 0.001, 1, True, name) parameters.loc[name + "_c"] = (100, 1, 5000, True, name) dmean = self.initialhead - parameters.loc[name + "_d"] = (dmean, dmean - 10, + parameters.loc[name + "_d"] = (dmean, dmean - 10, dmean + 10, True, name) parameters.loc[name + "_f"] = (-1.0, -2.0, 0.0, True, name) return parameters - - def simulate(self, prec, evap, p): + + def simulate(self, prec: Series, evap: Series, p: ArrayLike) -> ArrayLike: return self.simulatehead(prec.values, evap.values, p) @staticmethod @njit - def simulatehead(prec, evap, p): + def simulatehead(prec: ArrayLike, evap: ArrayLike, p: ArrayLike) -> ArrayLike: """Simulate the head in the reservoir Parameters @@ -109,7 +118,7 @@ def simulatehead(prec, evap, p): return h[1:] # def simulateold(self, prec, evap, p, **kwargs): -# """Implementation using solve_ivp - too slow and +# """Implementation using solve_ivp - too slow and # Parameters # ---------- @@ -129,11 +138,11 @@ def simulatehead(prec, evap, p): # tmax = len(prec) # eps = 1e-6 # t = np.linspace(1 + eps, tmax - eps, tmax) -# path2 = solve_ivp(self.dhdt, (0, t[-1]), y0=[d], t_eval=t, rtol=1e-4, +# path2 = solve_ivp(self.dhdt, (0, t[-1]), y0=[d], t_eval=t, rtol=1e-4, # max_step=1, 
method='RK23', args=(prec, evap, p)) # h = path2.y[0] # return h - + # def dhdt(self, t, h, prec, evap, p): # S, c, d, f = p # #print(t, int(t), int(t)) @@ -142,6 +151,7 @@ def simulatehead(prec, evap, p): # #rv += -expit(100 * (h[0] - 19.6)) * (h[0] - 19.6) / 20 # return rv + class Reservoir2(ReservoirBase): """Single reservoir with outflow at two heights @@ -155,30 +165,30 @@ class Reservoir2(ReservoirBase): """ _name = "Reservoir2" - def __init__(self, initialhead): + def __init__(self, initialhead: float) -> None: ReservoirBase.__init__(self) self.nparam = 6 self.initialhead = initialhead - def get_init_parameters(self, name="reservoir"): + def get_init_parameters(self, name: str = "reservoir") -> DataFrame: parameters = DataFrame( columns=["initial", "pmin", "pmax", "vary", "name"]) parameters.loc[name + "_S"] = (0.1, 0.001, 1, True, name) parameters.loc[name + "_c"] = (100, 1, 5000, True, name) dmean = self.initialhead - parameters.loc[name + "_d"] = (dmean, dmean - 10, + parameters.loc[name + "_d"] = (dmean, dmean - 10, dmean + 10, True, name) parameters.loc[name + "_f"] = (-1.0, -2.0, 0.0, True, name) parameters.loc[name + "_c2"] = (100, 1, 1000, True, name) parameters.loc[name + "_deld"] = (0.01, 0.001, 10, True, name) return parameters - - def simulate(self, prec, evap, p): + + def simulate(self, prec: Series, evap: Series, p: ArrayLike) -> ArrayLike: return self.simulatehead(prec.values, evap.values, p) @staticmethod @njit - def simulatehead(prec, evap, p): + def simulatehead(prec: ArrayLike, evap: ArrayLike, p: ArrayLike) -> ArrayLike: """Simulate the head in the reservoir Parameters @@ -206,4 +216,4 @@ def simulatehead(prec, evap, p): h[i] = h[i - 1] + delt * (rech / S - (h[i - 1] - d) / (c * S)) if h[i - 1] > d2: h[i] -= delt * (h[i - 1] - d2) / (c2 * S) - return h[1:] \ No newline at end of file + return h[1:] diff --git a/pastas/rfunc.py b/pastas/rfunc.py index 50368bec..a0f82984 100644 --- a/pastas/rfunc.py +++ b/pastas/rfunc.py @@ -17,6 +17,10 @@ 
except ModuleNotFoundError: prange = range +# Type Hinting +from typing import Optional, Union +from pastas.typing import ArrayLike + logger = getLogger(__name__) __all__ = ["Gamma", "Exponential", "Hantush", "Polder", "FourParam", @@ -27,14 +31,16 @@ class RfuncBase: _name = "RfuncBase" - def __init__(self, **kwargs): + def __init__(self, **kwargs) -> None: self.up = True self.meanstress = 1 self.cutoff = 0.999 self.kwargs = kwargs - def _set_init_parameter_settings(self, up=True, meanstress=1, - cutoff=0.999): + def _set_init_parameter_settings(self, + up: bool = True, + meanstress: float = 1.0, + cutoff: float = 0.999) -> None: self.up = up # Completely arbitrary number to prevent division by zero if 1e-8 > meanstress > 0: @@ -44,7 +50,7 @@ def _set_init_parameter_settings(self, up=True, meanstress=1, self.meanstress = meanstress self.cutoff = cutoff - def get_init_parameters(self, name): + def get_init_parameters(self, name: str) -> DataFrame: """Get initial parameters and bounds. It is called by the stressmodel. Parameters @@ -59,7 +65,7 @@ def get_init_parameters(self, name): """ pass - def get_tmax(self, p, cutoff=None): + def get_tmax(self, p: ArrayLike, cutoff: Optional[float] = None) -> float: """Method to get the response time for a certain cutoff. Parameters @@ -78,7 +84,11 @@ def get_tmax(self, p, cutoff=None): """ pass - def step(self, p, dt=1, cutoff=None, maxtmax=None): + def step(self, + p: ArrayLike, + dt: float = 1.0, + cutoff: Optional[float] = None, + maxtmax: Optional[int] = None) -> ArrayLike: """Method to return the step function. Parameters @@ -95,12 +105,16 @@ def step(self, p, dt=1, cutoff=None, maxtmax=None): Returns ------- - s: numpy.array + s: array_like Array with the step response. """ pass - def block(self, p, dt=1, cutoff=None, maxtmax=None): + def block(self, + p: ArrayLike, + dt: float = 1.0, + cutoff: Optional[float] = None, + maxtmax: Optional[int] = None) -> ArrayLike: """Method to return the block function. 
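`RfuncBase.block` derives the block response as the first difference of the step response (`np.append(s[0], np.subtract(s[1:], s[:-1]))`). The same relationship in a self-contained plain-Python sketch, using an exponential step response of the form `Exponential.step` computes:

```python
import math

def exponential_step(A: float, a: float, n: int, dt: float = 1.0) -> list:
    # step response s(t) = A * (1 - exp(-t / a)), sampled at t = dt, ..., n*dt
    return [A * (1.0 - math.exp(-(i * dt) / a)) for i in range(1, n + 1)]

def block_from_step(s: list) -> list:
    # block response: first differences of the step response
    return [s[0]] + [s2 - s1 for s1, s2 in zip(s[:-1], s[1:])]

s = exponential_step(1.0, 10.0, 100)
b = block_from_step(s)
# by construction, the block response sums back to the step response
```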
Parameters @@ -117,13 +131,13 @@ def block(self, p, dt=1, cutoff=None, maxtmax=None): Returns ------- - s: numpy.array + s: array_like Array with the block response. """ s = self.step(p, dt, cutoff, maxtmax) return np.append(s[0], np.subtract(s[1:], s[:-1])) - def impulse(self, t, p): + def impulse(self, t: ArrayLike, p: ArrayLike) -> ArrayLike: """Method to return the impulse response function. Parameters @@ -140,7 +154,7 @@ def impulse(self, t, p): Returns ------- - s: numpy.array + s: array_like Array with the impulse response. Note @@ -149,7 +163,11 @@ def impulse(self, t, p): """ pass - def get_t(self, p, dt, cutoff, maxtmax=None): + def get_t(self, + p: ArrayLike, + dt: float, + cutoff: float, + maxtmax: Optional[int] = None) -> ArrayLike: """Internal method to determine the times at which to evaluate the step-response, from t=0. @@ -169,7 +187,7 @@ def get_t(self, p, dt, cutoff, maxtmax=None): Returns ------- - t: numpy.array + t: array_like Array with the times """ if isinstance(dt, np.ndarray): @@ -207,11 +225,11 @@ class Gamma(RfuncBase): """ _name = "Gamma" - def __init__(self): + def __init__(self) -> None: RfuncBase.__init__(self) self.nparam = 3 - def get_init_parameters(self, name): + def get_init_parameters(self, name: str) -> DataFrame: parameters = DataFrame( columns=['initial', 'pmin', 'pmax', 'vary', 'name']) if self.up: @@ -230,20 +248,24 @@ def get_init_parameters(self, name): parameters.loc[name + '_a'] = (10, 0.01, 1e4, True, name) return parameters - def get_tmax(self, p, cutoff=None): + def get_tmax(self, p: ArrayLike, cutoff: Optional[float] = None) -> float: if cutoff is None: cutoff = self.cutoff return gammaincinv(p[1], cutoff) * p[2] - def gain(self, p): + def gain(self, p: ArrayLike) -> float: return p[0] - def step(self, p, dt=1, cutoff=None, maxtmax=None): + def step(self, + p: ArrayLike, + dt: float = 1.0, + cutoff: Optional[float] = None, + maxtmax: Optional[int] = None) -> ArrayLike: t = self.get_t(p, dt, cutoff, maxtmax) s = 
p[0] * gammainc(p[1], t / p[2]) return s - def impulse(self, t, p): + def impulse(self, t: ArrayLike, p: ArrayLike) -> ArrayLike: A, n, a = p ir = A * t ** (n - 1) * np.exp(-t / a) / (a ** n * gamma(n)) return ir @@ -273,11 +295,11 @@ class Exponential(RfuncBase): """ _name = "Exponential" - def __init__(self): + def __init__(self) -> None: RfuncBase.__init__(self) self.nparam = 2 - def get_init_parameters(self, name): + def get_init_parameters(self, name: str) -> DataFrame: parameters = DataFrame( columns=['initial', 'pmin', 'pmax', 'vary', 'name']) if self.up: @@ -294,20 +316,24 @@ def get_init_parameters(self, name): parameters.loc[name + '_a'] = (10, 0.01, 1000, True, name) return parameters - def get_tmax(self, p, cutoff=None): + def get_tmax(self, p: ArrayLike, cutoff: Optional[float] = None) -> float: if cutoff is None: cutoff = self.cutoff return -p[1] * np.log(1 - cutoff) - def gain(self, p): + def gain(self, p: ArrayLike) -> float: return p[0] - def step(self, p, dt=1, cutoff=None, maxtmax=None): + def step(self, + p: ArrayLike, + dt: float = 1.0, + cutoff: Optional[float] = None, + maxtmax: Optional[int] = None) -> ArrayLike: t = self.get_t(p, dt, cutoff, maxtmax) s = p[0] * (1.0 - np.exp(-t / p[1])) return s - def impulse(self, t, p): + def impulse(self, t: ArrayLike, p: ArrayLike) -> ArrayLike: A, a = p ir = A / a * np.exp(-t / a) return ir @@ -345,7 +371,7 @@ class HantushWellModel(RfuncBase): """ _name = "HantushWellModel" - def __init__(self, use_numba=False, quad=False): + def __init__(self, use_numba: bool = False, quad: bool = False) -> None: RfuncBase.__init__(self, use_numba=use_numba, quad=quad) self.distances = None self.nparam = 3 @@ -359,10 +385,10 @@ def __init__(self, use_numba=False, quad=False): if self.use_numba: self.use_numba = check_numba_scipy() - def set_distances(self, distances): + def set_distances(self, distances: ArrayLike) -> None: self.distances = distances - def get_init_parameters(self, name): + def get_init_parameters(self, name: str) -> DataFrame:
if self.distances is None: raise(Exception('distances is None. Set using method' ' set_distances() or use Hantush.')) @@ -389,7 +415,7 @@ def get_init_parameters(self, name): return parameters @staticmethod - def _get_distance_from_params(p): + def _get_distance_from_params(p: ArrayLike) -> float: if len(p) == 3: r = 1.0 logger.info("No distance passed to HantushWellModel, " @@ -398,7 +424,7 @@ def _get_distance_from_params(p): r = p[3] return r - def get_tmax(self, p, cutoff=None): + def get_tmax(self, p: ArrayLike, cutoff: Optional[float] = None) -> float: r = self._get_distance_from_params(p) # approximate formula for tmax if cutoff is None: @@ -411,7 +437,7 @@ def get_tmax(self, p, cutoff=None): else: return lambertw(1 / ((1 - cutoff) * k0rho)).real * a - def gain(self, p, r=None): + def gain(self, p: ArrayLike, r: Optional[float] = None) -> float: if r is None: r = self._get_distance_from_params(p) rho = 2 * r * np.exp(p[2] / 2) @@ -419,12 +445,12 @@ def gain(self, p, r=None): @staticmethod @njit - def _integrand_hantush(y, b): + def _integrand_hantush(y: float, b: float) -> float: return np.exp(-y - (b / y)) / y @staticmethod @njit(parallel=True) - def numba_step(A, a, b, r, t): + def numba_step(A: float, a: float, b: float, r: float, t: ArrayLike) -> ArrayLike: rho = 2 * r * np.exp(b / 2) rhosq = rho**2 k0rho = k0(rho) @@ -442,7 +468,7 @@ def numba_step(A, a, b, r, t): return A * F / 2 @staticmethod - def numpy_step(A, a, b, r, t): + def numpy_step(A: float, a: float, b: float, r: float, t: ArrayLike) -> ArrayLike: rho = 2 * r * np.exp(b / 2) rhosq = rho**2 k0rho = k0(rho) @@ -457,7 +483,8 @@ def numpy_step(A, a, b, r, t): tau2 + rhosq / (4 * tau2)) return A * F / 2 - def quad_step(self, A, a, b, r, t): + def quad_step(self, A: float, a: float, b: float, r: float, + t: ArrayLike) -> ArrayLike: F = np.zeros_like(t) brsq = np.exp(b) * r**2 u = a * brsq / t @@ -466,7 +493,11 @@ def quad_step(self, A, a, b, r, t): u[i], np.inf, args=(brsq,))[0] return F * A / 2 
- def step(self, p, dt=1, cutoff=None, maxtmax=None): + def step(self, + p: ArrayLike, + dt: float = 1.0, + cutoff: Optional[float] = None, + maxtmax: Optional[int] = None) -> ArrayLike: A, a, b = p[:3] r = self._get_distance_from_params(p) t = self.get_t(p, dt, cutoff, maxtmax) @@ -481,7 +512,12 @@ def step(self, p, dt=1, cutoff=None, maxtmax=None): return self.numpy_step(A, a, b, r, t) @staticmethod - def variance_gain(A, b, var_A, var_b, cov_Ab, r=1.0): + def variance_gain(A: float, + b: float, + var_A: float, + var_b: float, + cov_Ab: float, + r: Union[float, ArrayLike] = 1.0) -> Union[float, ArrayLike]: """Calculate variance of the gain from parameters A and b. Variance of the gain is calculated based on propagation of @@ -508,13 +544,13 @@ def variance_gain(A, b, var_A, var_b, cov_Ab, r=1.0): cov_Ab : float covariance between A and b, can be obtained from the covariance matrix (e.g. ml.fit.pcov) - r : float or np.array, optional + r : float or array_like, optional distance(s) between observation well and stress(es), default value is 1.0 Returns ------- - var_gain : float or np.array + var_gain : float or array_like variance of the gain calculated based on propagation of uncertainty of parameters A and b.
@@ -564,7 +600,7 @@ class Hantush(RfuncBase): """ _name = "Hantush" - def __init__(self, use_numba=False, quad=False): + def __init__(self, use_numba: bool = False, quad: bool = False) -> None: RfuncBase.__init__(self, use_numba=use_numba, quad=quad) self.nparam = 3 self.use_numba = use_numba @@ -577,7 +613,7 @@ def __init__(self, use_numba=False, quad=False): if self.use_numba: self.use_numba = check_numba_scipy() - def get_init_parameters(self, name): + def get_init_parameters(self, name: str) -> DataFrame: parameters = DataFrame( columns=['initial', 'pmin', 'pmax', 'vary', 'name']) if self.up: @@ -593,7 +629,7 @@ def get_init_parameters(self, name): parameters.loc[name + '_b'] = (1, 1e-6, 25, True, name) return parameters - def get_tmax(self, p, cutoff=None): + def get_tmax(self, p: ArrayLike, cutoff: Optional[float] = None) -> float: # approximate formula for tmax if cutoff is None: cutoff = self.cutoff @@ -602,17 +638,17 @@ def get_tmax(self, p, cutoff=None): return lambertw(1 / ((1 - cutoff) * k0(rho))).real * a @staticmethod - def gain(p): + def gain(p: ArrayLike) -> float: return p[0] @staticmethod @njit - def _integrand_hantush(y, b): + def _integrand_hantush(y: float, b: float) -> float: return np.exp(-y - (b / y)) / y @staticmethod @njit(parallel=True) - def numba_step(A, a, b, t): + def numba_step(A: float, a: float, b: float, t: ArrayLike) -> ArrayLike: rho = 2 * np.sqrt(b) rhosq = rho**2 k0rho = k0(rho) @@ -630,7 +666,7 @@ def numba_step(A, a, b, t): return A * F / (2 * k0rho) @staticmethod - def numpy_step(A, a, b, t): + def numpy_step(A: float, a: float, b: float, t: ArrayLike) -> ArrayLike: rho = 2 * np.sqrt(b) rhosq = rho**2 k0rho = k0(rho) @@ -646,7 +682,7 @@ def numpy_step(A, a, b, t): tau2 + rhosq / (4 * tau2)) return A * F / (2 * k0rho) - def quad_step(self, A, a, b, t): + def quad_step(self, A: float, a: float, b: float, t: ArrayLike) -> ArrayLike: F = np.zeros_like(t) u = a * b / t for i in range(0, len(t)): @@ -654,7 +690,11 @@ def 
quad_step(self, A, a, b, t): u[i], np.inf, args=(b,))[0] return F * A / (2 * k0(2 * np.sqrt(b))) - def step(self, p, dt=1, cutoff=None, maxtmax=None): + def step(self, + p: ArrayLike, + dt: float = 1.0, + cutoff: Optional[float] = None, + maxtmax: Optional[int] = None) -> ArrayLike: A, a, b = p t = self.get_t(p, dt, cutoff, maxtmax) @@ -667,7 +707,7 @@ def step(self, p, dt=1, cutoff=None, maxtmax=None): else: # otherwise numpy is faster return self.numpy_step(A, a, b, t) - def impulse(self, t, p): + def impulse(self, t: ArrayLike, p: ArrayLike) -> ArrayLike: A, a, b = p ir = A / (2 * t * k0(2 * np.sqrt(b))) * np.exp(-t / a - a * b / t) return ir @@ -693,11 +733,11 @@ class Polder(RfuncBase): """ _name = "Polder" - def __init__(self): + def __init__(self) -> None: RfuncBase.__init__(self) self.nparam = 3 - def get_init_parameters(self, name): + def get_init_parameters(self, name: str) -> DataFrame: parameters = DataFrame( columns=['initial', 'pmin', 'pmax', 'vary', 'name']) parameters.loc[name + '_A'] = (1, 0, 2, True, name) @@ -705,7 +745,7 @@ def get_init_parameters(self, name): parameters.loc[name + '_b'] = (1, 1e-6, 25, True, name) return parameters - def get_tmax(self, p, cutoff=None): + def get_tmax(self, p: ArrayLike, cutoff: Optional[float] = None) -> float: if cutoff is None: cutoff = self.cutoff _, a, b = p @@ -716,14 +756,18 @@ def get_tmax(self, p, cutoff=None): tmax = a * y ** 2 return tmax - def gain(self, p): + def gain(self, p: ArrayLike) -> float: # the steady state solution of Mazure g = p[0] * np.exp(-np.sqrt(4 * p[2])) if not self.up: g = -g return g - def step(self, p, dt=1, cutoff=None, maxtmax=None): + def step(self, + p: ArrayLike, + dt: float = 1.0, + cutoff: Optional[float] = None, + maxtmax: Optional[int] = None) -> ArrayLike: t = self.get_t(p, dt, cutoff, maxtmax) A, a, b = p s = A * self.polder_function(np.sqrt(b), np.sqrt(t / a)) @@ -732,13 +776,13 @@ def step(self, p, dt=1, cutoff=None, maxtmax=None): s = -s return s - def impulse(self, t,
p): + def impulse(self, t: ArrayLike, p: ArrayLike) -> ArrayLike: A, a, b = p ir = A * t ** (-1.5) * np.exp(-t / a - b / t) return ir @staticmethod - def polder_function(x, y): + def polder_function(x: float, y: float) -> float: s = 0.5 * np.exp(2 * x) * erfc(x / y + y) + \ 0.5 * np.exp(-2 * x) * erfc(x / y - y) return s @@ -760,11 +804,11 @@ class One(RfuncBase): """ _name = "One" - def __init__(self): + def __init__(self) -> None: RfuncBase.__init__(self) self.nparam = 1 - def get_init_parameters(self, name): + def get_init_parameters(self, name: str) -> DataFrame: parameters = DataFrame( columns=['initial', 'pmin', 'pmax', 'vary', 'name']) if self.up: @@ -778,19 +822,27 @@ def get_init_parameters(self, name): self.meanstress, np.nan, np.nan, True, name) return parameters - def get_tmax(self, p, cutoff=None): + def get_tmax(self, p: ArrayLike, cutoff: Optional[float] = None) -> float: return 0. - def gain(self, p): + def gain(self, p: ArrayLike) -> float: return p[0] - def step(self, p, dt=1, cutoff=None, maxtmax=None): + def step(self, + p: ArrayLike, + dt: float = 1.0, + cutoff: Optional[float] = None, + maxtmax: Optional[int] = None) -> ArrayLike: if isinstance(dt, np.ndarray): return p[0] * np.ones(len(dt)) else: return p[0] * np.ones(1) - def block(self, p, dt=1, cutoff=None, maxtmax=None): + def block(self, + p: ArrayLike, + dt: float = 1.0, + cutoff: Optional[float] = None, + maxtmax: Optional[int] = None) -> ArrayLike: return p[0] * np.ones(1) @@ -820,12 +872,12 @@ class FourParam(RfuncBase): """ _name = "FourParam" - def __init__(self, quad=False): + def __init__(self, quad: bool = False) -> None: RfuncBase.__init__(self, quad=quad) self.nparam = 4 self.quad = quad - def get_init_parameters(self, name): + def get_init_parameters(self, name: str) -> DataFrame: parameters = DataFrame( columns=['initial', 'pmin', 'pmax', 'vary', 'name']) if self.up: @@ -845,10 +897,10 @@ def get_init_parameters(self, name): return parameters @staticmethod - def function(t, 
p): + def function(t: float, p: ArrayLike) -> float: return (t ** (p[1] - 1)) * np.exp(-t / p[2] - p[2] * p[3] / t) - def get_tmax(self, p, cutoff=None): + def get_tmax(self, p: ArrayLike, cutoff: Optional[float] = None) -> float: if cutoff is None: cutoff = self.cutoff @@ -883,10 +935,14 @@ def get_tmax(self, p, cutoff=None): return np.searchsorted(y, cutoff) @staticmethod - def gain(p): + def gain(p: ArrayLike) -> float: return p[0] - def step(self, p, dt=1, cutoff=None, maxtmax=None): + def step(self, + p: ArrayLike, + dt: float = 1.0, + cutoff: Optional[float] = None, + maxtmax: Optional[int] = None) -> ArrayLike: if self.quad: t = self.get_t(p, dt, cutoff, maxtmax) @@ -968,11 +1024,11 @@ class DoubleExponential(RfuncBase): """ _name = "DoubleExponential" - def __init__(self): + def __init__(self) -> None: RfuncBase.__init__(self) self.nparam = 4 - def get_init_parameters(self, name): + def get_init_parameters(self, name: str) -> DataFrame: parameters = DataFrame( columns=['initial', 'pmin', 'pmax', 'vary', 'name']) if self.up: @@ -991,7 +1047,7 @@ def get_init_parameters(self, name): parameters.loc[name + '_a2'] = (10, 0.01, 5000, True, name) return parameters - def get_tmax(self, p, cutoff=None): + def get_tmax(self, p: ArrayLike, cutoff: Optional[float] = None) -> float: if cutoff is None: cutoff = self.cutoff if p[2] > p[3]: # a1 > a2 @@ -999,10 +1055,14 @@ def get_tmax(self, p, cutoff=None): else: # a1 < a2 return -p[3] * np.log(1 - cutoff) - def gain(self, p): + def gain(self, p: ArrayLike) -> float: return p[0] - def step(self, p, dt=1, cutoff=None, maxtmax=None): + def step(self, + p: ArrayLike, + dt: float = 1.0, + cutoff: Optional[float] = None, + maxtmax: Optional[int] = None) -> ArrayLike: t = self.get_t(p, dt, cutoff, maxtmax) s = p[0] * (1 - ((1 - p[1]) * np.exp(-t / p[2]) + p[1] * np.exp(-t / p[3]))) @@ -1038,27 +1098,31 @@ class Edelman(RfuncBase): """ _name = "Edelman" - def __init__(self): + def __init__(self) -> None: RfuncBase.__init__(self) 
self.nparam = 1 - def get_init_parameters(self, name): + def get_init_parameters(self, name: str) -> DataFrame: parameters = DataFrame( columns=['initial', 'pmin', 'pmax', 'vary', 'name']) beta_init = 1.0 parameters.loc[name + '_beta'] = (beta_init, 0, 1000, True, name) return parameters - def get_tmax(self, p, cutoff=None): + def get_tmax(self, p: ArrayLike, cutoff: Optional[float] = None) -> float: if cutoff is None: cutoff = self.cutoff return 1. / (p[0] * erfcinv(cutoff * erfc(0))) ** 2 @staticmethod - def gain(p): + def gain(p: ArrayLike) -> float: return 1. - def step(self, p, dt=1, cutoff=None, maxtmax=None): + def step(self, + p: ArrayLike, + dt: float = 1.0, + cutoff: Optional[float] = None, + maxtmax: Optional[int] = None) -> ArrayLike: t = self.get_t(p, dt, cutoff, maxtmax) s = erfc(1 / (p[0] * np.sqrt(t))) return s @@ -1101,12 +1165,12 @@ class Kraijenhoff(RfuncBase): """ _name = "Kraijenhoff" - def __init__(self, n_terms=10): + def __init__(self, n_terms: int = 10) -> None: RfuncBase.__init__(self, n_terms=n_terms) self.nparam = 3 self.n_terms = n_terms - def get_init_parameters(self, name): + def get_init_parameters(self, name: str) -> DataFrame: parameters = DataFrame( columns=['initial', 'pmin', 'pmax', 'vary', 'name']) if self.up: @@ -1124,16 +1188,20 @@ def get_init_parameters(self, name): parameters.loc[name + '_b'] = (0, 0, 0.499999, True, name) return parameters - def get_tmax(self, p, cutoff=None): + def get_tmax(self, p: ArrayLike, cutoff: Optional[float] = None) -> float: if cutoff is None: cutoff = self.cutoff return - p[1] * np.log(1 - cutoff) @staticmethod - def gain(p): + def gain(p: ArrayLike) -> float: return p[0] - def step(self, p, dt=1, cutoff=None, maxtmax=None): + def step(self, + p: ArrayLike, + dt: float = 1.0, + cutoff: Optional[float] = None, + maxtmax: Optional[int] = None) -> ArrayLike: t = self.get_t(p, dt, cutoff, maxtmax) h = 0 for n in range(self.n_terms): @@ -1158,11 +1226,12 @@ class Spline(RfuncBase): cutoff: float 
proportion after which the step function is cut off. default is 0.999. this parameter is ignored by Points - t: list - times at which the response function is defined kind: string see scipy.interpolate.interp1d. Most useful for a smooth response - function are ‘quadratic’ and ‘cubic’. + function are 'quadratic' and 'cubic'. + t: list + times at which the response function is defined + Notes ----- @@ -1175,14 +1244,15 @@ class Spline(RfuncBase): """ _name = "Spline" - def __init__(self, kind='quadratic', - t=[1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]): + def __init__(self, kind: str = 'quadratic', t: Optional[list] = None) -> None: RfuncBase.__init__(self, kind=kind, t=t) self.kind = kind + if t is None: + t = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024] self.t = t self.nparam = len(t) + 1 - def get_init_parameters(self, name): + def get_init_parameters(self, name: str) -> DataFrame: parameters = DataFrame( columns=['initial', 'pmin', 'pmax', 'vary', 'name']) if self.up: @@ -1206,13 +1276,17 @@ def get_init_parameters(self, name): return parameters - def get_tmax(self, p, cutoff=None): + def get_tmax(self, p: ArrayLike, cutoff: Optional[float] = None) -> float: return self.t[-1] - def gain(self, p): + def gain(self, p: ArrayLike) -> float: return p[0] - def step(self, p, dt=1, cutoff=None, maxtmax=None): + def step(self, + p: ArrayLike, + dt: float = 1.0, + cutoff: Optional[float] = None, + maxtmax: Optional[int] = None) -> ArrayLike: f = interp1d(self.t, p[1:len(self.t) + 1], kind=self.kind) t = self.get_t(p, dt, cutoff, maxtmax) s = p[0] * f(t) diff --git a/pastas/solver.py b/pastas/solver.py index e16142ef..59093420 100644 --- a/pastas/solver.py +++ b/pastas/solver.py @@ -13,10 +13,15 @@ from logging import getLogger import numpy as np -from pandas import DataFrame +from pandas import DataFrame, Series from scipy.linalg import svd from scipy.optimize import least_squares +# Type Hinting +from typing import Optional, Tuple, Union +from pastas.typing import 
ArrayLike, CallBack, Function, Model + + logger = getLogger(__name__) @@ -27,10 +32,11 @@ class BaseSolver: Attributes ---------- model: pastas.Model instance - pcor: pandas.DataFrame - Pandas DataFrame with the correlation between the optimized parameters. pcov: pandas.DataFrame Pandas DataFrame with the correlation between the optimized parameters. + pcor: pandas.DataFrame + Pandas DataFrame with the correlation between the optimized parameters, + derived from pcov. nfev: int Number of times the model is called during optimization. result: object @@ -39,7 +45,12 @@ class BaseSolver: """ - def __init__(self, ml, pcov=None, nfev=None, obj_func=None, **kwargs): + def __init__(self, + ml: Model, + pcov: Optional[DataFrame] = None, + nfev: Optional[int] = None, + obj_func: Optional[Function] = None, + **kwargs) -> None: self.ml = ml self.pcov = pcov # Covariances of the parameters if pcov is None: @@ -50,8 +61,14 @@ def __init__(self, ml, pcov=None, nfev=None, obj_func=None, **kwargs): self.obj_func = obj_func self.result = None # Object returned by the optimization method - def misfit(self, p, noise, weights=None, callback=None, - returnseparate=False): + def misfit( + self, + p: ArrayLike, + noise: bool, + weights: Optional[Series] = None, + callback: Optional[CallBack] = None, + returnseparate: bool = False ) -> Union[ArrayLike, Tuple[ArrayLike, ArrayLike, ArrayLike]]: """This method is called by all solvers to obtain a series that are minimized in the optimization process. It handles the application of the weights, a noisemodel and other optimization options.
@@ -73,12 +90,13 @@ def misfit(self, p, noise, weights=None, callback=None, Returns ------- - rv: - residuals series (if noise=False) or noise series (if noise=True) + rv: array_like + residuals array (if noise=False) or noise array (if noise=True) """ # Get the residuals or the noise if noise: rv = self.ml.noise(p) * self.ml.noise_weights(p) + else: rv = self.ml.residuals(p) @@ -98,7 +116,11 @@ def misfit(self, p, noise, weights=None, callback=None, return rv.values - def prediction_interval(self, n=1000, alpha=0.05, max_iter=10, **kwargs): + def prediction_interval(self, + n: int = 1000, + alpha: float = 0.05, + max_iter: int = 10, + **kwargs) -> DataFrame: """Method to calculate the prediction interval for the simulation. Returns @@ -124,7 +146,11 @@ def prediction_interval(self, n=1000, alpha=0.05, max_iter=10, **kwargs): rv = data.quantile(q, axis=1).transpose() return rv - def ci_simulation(self, n=1000, alpha=0.05, max_iter=10, **kwargs): + def ci_simulation(self, + n: int = 1000, + alpha: float = 0.05, + max_iter: int = 10, + **kwargs) -> DataFrame: """Method to calculate the confidence interval for the simulation. Returns @@ -145,8 +171,12 @@ def ci_simulation(self, n=1000, alpha=0.05, max_iter=10, **kwargs): alpha=alpha, max_iter=max_iter, **kwargs) - def ci_block_response(self, name, n=1000, alpha=0.05, max_iter=10, - **kwargs): + def ci_block_response(self, + name: str, + n: int = 1000, + alpha: float = 0.05, + max_iter: int = 10, + **kwargs) -> DataFrame: """Method to calculate the confidence interval for the block response. Returns @@ -169,8 +199,12 @@ def ci_block_response(self, name, n=1000, alpha=0.05, max_iter=10, max_iter=max_iter, dt=dt, **kwargs) - def ci_step_response(self, name, n=1000, alpha=0.05, max_iter=10, - **kwargs): + def ci_step_response(self, + name: str, + n: int = 1000, + alpha: float = 0.05, + max_iter: int = 10, + **kwargs) -> DataFrame: """Method to calculate the confidence interval for the step response. 
Returns @@ -193,7 +227,12 @@ def ci_step_response(self, name, n=1000, alpha=0.05, max_iter=10, max_iter=max_iter, dt=dt, **kwargs) - def ci_contribution(self, name, n=1000, alpha=0.05, max_iter=10, **kwargs): + def ci_contribution(self, + name: str, + n: int = 1000, + alpha: float = 0.05, + max_iter: int = 10, + **kwargs) -> DataFrame: """Method to calculate the confidence interval for the contribution. Returns @@ -215,17 +254,20 @@ def ci_contribution(self, name, n=1000, alpha=0.05, max_iter=10, **kwargs): max_iter=max_iter, **kwargs) - def get_parameter_sample(self, name=None, n=None, max_iter=10): + def get_parameter_sample(self, + name: Optional[str] = None, + n: Optional[int] = None, + max_iter: int = 10) -> ArrayLike: """Method to obtain a parameter sets for monte carlo analyses. Parameters ---------- - n: int, optional - Number of random samples drawn from the bivariate normal - distribution. name: str, optional Name of the stressmodel or model component to obtain the parameters for. + n: int, optional + Number of random samples drawn from the bivariate normal + distribution. max_iter : int, optional maximum number of iterations for truncated multivariate sampling, default is 10. Increase this value if number of @@ -233,8 +275,8 @@ def get_parameter_sample(self, name=None, n=None, max_iter=10): Returns ------- - numpy.ndarray - Numpy array with N parameter samples. + array_like + array with N parameter samples. """ p = self.ml.get_parameters(name=name) pcov = self._get_covariance_matrix(name=name) @@ -274,8 +316,8 @@ def get_parameter_sample(self, name=None, n=None, max_iter=10): f"{samples.shape[0]}/{n}.
Increase 'max_iter'.") return samples[:n, :] - def _get_realizations(self, func, n=None, name=None, max_iter=10, - **kwargs): + def _get_realizations(self, + func: Function, + n: Optional[int] = None, + name: Optional[str] = None, + max_iter: int = 10, + **kwargs) -> DataFrame: """Internal method to obtain n number of parameter realizations.""" if name: kwargs["name"] = name @@ -289,15 +335,20 @@ def _get_realizations(self, func, n=None, name=None, max_iter=10, return DataFrame.from_dict(data, orient="columns", dtype=float) - def _get_confidence_interval(self, func, n=None, name=None, alpha=0.05, - max_iter=10, **kwargs): + def _get_confidence_interval(self, + func: Function, + n: Optional[int] = None, + name: Optional[str] = None, + max_iter: int = 10, + alpha: float = 0.05, + **kwargs) -> DataFrame: """Internal method to obtain a confidence interval.""" q = [alpha / 2, 1 - alpha / 2] data = self._get_realizations(func=func, n=n, name=name, max_iter=max_iter, **kwargs) return data.quantile(q=q, axis=1).transpose() - def _get_covariance_matrix(self, name=None): + def _get_covariance_matrix(self, name: Optional[str] = None) -> DataFrame: """Internal method to obtain the covariance matrix from the model. Parameters @@ -322,7 +373,7 @@ def _get_covariance_matrix(self, name=None): return pcov @staticmethod - def _get_correlations(pcov): + def _get_correlations(pcov: DataFrame) -> DataFrame: """Internal method to obtain the parameter correlations from the covariance matrix. @@ -345,7 +396,7 @@ def _get_correlations(pcov): pcor = DataFrame(data=corr, index=index, columns=index) return pcor - def to_dict(self): + def to_dict(self) -> dict: data = { "name": self._name, "pcov": self.pcov, @@ -358,8 +409,12 @@ def to_dict(self): class LeastSquares(BaseSolver): _name = "LeastSquares" - def __init__(self, ml, pcov=None, nfev=None, **kwargs): - """Solver based on Scipy's least_squares method. 
+ def __init__(self, + ml: Model, + pcov: Optional[DataFrame] = None, + nfev: Optional[int] = None, + **kwargs) -> None: + """Solver based on Scipy's least_squares method [scipy_ref]_. Notes ----- @@ -380,7 +435,11 @@ def __init__(self, ml, pcov=None, nfev=None, **kwargs): """ BaseSolver.__init__(self, ml=ml, pcov=pcov, nfev=nfev, **kwargs) - def solve(self, noise=True, weights=None, callback=None, **kwargs): + def solve(self, + noise: bool = True, + weights: Optional[Series] = None, + callback: Optional[CallBack] = None, + **kwargs) -> Tuple[bool, ArrayLike, ArrayLike]: self.vary = self.ml.parameters.vary.values.astype(bool) self.initial = self.ml.parameters.initial.values.copy() parameters = self.ml.parameters.loc[self.vary] @@ -409,26 +468,31 @@ def solve(self, noise=True, weights=None, callback=None, **kwargs): return success, optimal, stderr - def objfunction(self, p, noise, weights, callback): + def objfunction(self, p: ArrayLike, noise: bool, weights: Series, + callback: CallBack) -> ArrayLike: par = self.initial par[self.vary] = p return self.misfit(p=par, noise=noise, weights=weights, callback=callback) - def _get_covariances(self, jacobian, cost, absolute_sigma=False): + def _get_covariances(self, + jacobian: ArrayLike, + cost: float, + absolute_sigma: bool = False) -> ArrayLike: """Internal method to get the covariance matrix from the jacobian. Parameters ---------- - jacobian: numpy.ndarray + jacobian: array_like cost: float absolute_sigma: bool Default is False Returns ------- - pcov: numpy.Array - numpy array with the covariance matrix. + pcov: array_like + array with the covariance matrix. + Notes ----- @@ -468,8 +532,12 @@ def _get_covariances(self, jacobian, cost, absolute_sigma=False): class LmfitSolve(BaseSolver): _name = "LmfitSolve" - def __init__(self, ml, pcov=None, nfev=None, **kwargs): - """Solving the model using the LmFit solver. 
+ def __init__(self, + ml: Model, + pcov: Optional[DataFrame] = None, + nfev: Optional[int] = None, + **kwargs) -> None: + """Solving the model using the LmFit solver [LM]_. This is basically a wrapper around the scipy solvers, adding some cool functionality for boundary conditions. @@ -487,8 +555,12 @@ def __init__(self, ml, pcov=None, nfev=None, **kwargs): raise ImportError(msg) BaseSolver.__init__(self, ml=ml, pcov=pcov, nfev=nfev, **kwargs) - def solve(self, noise=True, weights=None, callback=None, method="leastsq", - **kwargs): + def solve(self, + noise: bool = True, + weights: Optional[Series] = None, + callback: Optional[CallBack] = None, + method: str = "leastsq", + **kwargs) -> Tuple[bool, ArrayLike, ArrayLike]: # Deal with the parameters parameters = lmfit.Parameters() @@ -531,7 +603,7 @@ def solve(self, noise=True, weights=None, callback=None, method="leastsq", return success, optimal[:idx], stderr[:idx] - def objfunction(self, parameters, noise, weights, callback): + def objfunction(self, parameters: DataFrame, noise: bool, weights: Series, + callback: CallBack) -> ArrayLike: p = np.array([p.value for p in parameters.values()]) - return self.misfit(p=p, noise=noise, weights=weights, - callback=callback) + return self.misfit(p=p, noise=noise, weights=weights, callback=callback) diff --git a/pastas/stats/__init__.py b/pastas/stats/__init__.py index ca7be714..7fb880a7 100644 --- a/pastas/stats/__init__.py +++ b/pastas/stats/__init__.py @@ -14,8 +14,8 @@ from .core import acf, ccf, mean, std, var from .dutch import ghg, glg, gvg, q_ghg, q_glg, q_gvg -from .metrics import (aic, bic, evp, kge_2012, mae, nse, pearsonr, rmse, rsq, - sse) +from .metrics import (aic, bic, evp, kge_2012, mae, + nse, pearsonr, rmse, rsq, sse) from .sgi import sgi import pastas.stats.signatures as signatures import pastas.stats.metrics as metrics diff --git a/pastas/stats/core.py b/pastas/stats/core.py index f70928ea..f623de8c 100644 --- a/pastas/stats/core.py +++
b/pastas/stats/core.py @@ -7,15 +7,25 @@ from numpy import (append, arange, array, average, corrcoef, diff, empty_like, exp, inf, nan, ones, pi, sqrt) -from pandas import DataFrame, Timedelta, TimedeltaIndex +from pandas import Series, DataFrame, Timedelta, TimedeltaIndex from scipy.stats import norm from ..decorators import njit from ..utils import check_numba +# Type Hinting +from typing import Union, Tuple +from pastas.typing import ArrayLike -def acf(x, lags=365, bin_method='rectangle', bin_width=0.5, max_gap=inf, - min_obs=20, full_output=False, alpha=0.05): + +def acf(x: Series, + lags: ArrayLike = 365, + bin_method: str = 'rectangle', + bin_width: float = 0.5, + max_gap: float = inf, + min_obs: int = 20, + full_output: bool = False, + alpha: float = 0.05) -> Union[Series, DataFrame]: """Calculate the autocorrelation function for irregular time steps. Parameters @@ -85,8 +95,15 @@ def acf(x, lags=365, bin_method='rectangle', bin_width=0.5, max_gap=inf, return c -def ccf(x, y, lags=365, bin_method='rectangle', bin_width=0.5, - max_gap=inf, min_obs=20, full_output=False, alpha=0.05): +def ccf(x: Series, + y: Series, + lags: ArrayLike = 365, + bin_method: str = 'rectangle', + bin_width: float = 0.5, + max_gap: float = inf, + min_obs: int = 20, + full_output: bool = False, + alpha: float = 0.05) -> Union[Series, DataFrame]: """Method to compute the cross-correlation for irregular time series. 
Parameters @@ -177,7 +194,7 @@ def ccf(x, y, lags=365, bin_method='rectangle', bin_width=0.5, return result.ccf -def _preprocess(x, max_gap): +def _preprocess(x: Series, max_gap: float) -> Tuple[Series, float, float]: """Internal method to preprocess the time series.""" dt = x.index.to_series().diff().dropna().values / Timedelta(1, "D") dt_mu = dt[dt < max_gap].mean() # Deal with big gaps if present @@ -190,7 +207,13 @@ def _preprocess(x, max_gap): @njit -def _compute_ccf_rectangle(lags, t_x, x, t_y, y, bin_width=0.5): +def _compute_ccf_rectangle( + lags: ArrayLike, + t_x: ArrayLike, + x: ArrayLike, + t_y: ArrayLike, + y: ArrayLike, + bin_width: float = 0.5) -> Tuple[ArrayLike, ArrayLike]: """Internal numba-optimized method to compute the ccf.""" c = empty_like(lags) b = empty_like(lags) @@ -216,7 +239,13 @@ def _compute_ccf_rectangle(lags, t_x, x, t_y, y, bin_width=0.5): @njit -def _compute_ccf_gaussian(lags, t_x, x, t_y, y, bin_width=0.5): +def _compute_ccf_gaussian( + lags: ArrayLike, + t_x: ArrayLike, + x: ArrayLike, + t_y: ArrayLike, + y: ArrayLike, + bin_width: float = 0.5) -> Tuple[ArrayLike, ArrayLike]: """Internal numba-optimized method to compute the ccf.""" c = empty_like(lags) b = empty_like(lags) @@ -245,7 +274,8 @@ def _compute_ccf_gaussian(lags, t_x, x, t_y, y, bin_width=0.5): return c, b -def _compute_ccf_regular(lags, x, y): +def _compute_ccf_regular(lags: ArrayLike, x: ArrayLike, + y: ArrayLike) -> Tuple[ArrayLike, ArrayLike]: c = empty_like(lags) for i, lag in enumerate(lags): c[i] = corrcoef(x[:-int(lag)], y[int(lag):])[0, 1] @@ -253,7 +283,7 @@ def _compute_ccf_regular(lags, x, y): return c, b -def mean(x, weighted=True, max_gap=30): +def mean(x: Series, weighted: bool = True, max_gap: int = 30) -> ArrayLike: """Method to compute the (weighted) mean of a time series.
Parameters @@ -281,7 +311,7 @@ def mean(x, weighted=True, max_gap=30): return average(x.to_numpy(), weights=w) -def var(x, weighted=True, max_gap=30): +def var(x: Series, weighted: bool = True, max_gap: int = 30) -> float: """Method to compute the (weighted) variance of a time series. Parameters @@ -312,7 +342,7 @@ def var(x, weighted=True, max_gap=30): return sigma -def std(x, weighted=True, max_gap=30): +def std(x: Series, weighted: bool = True, max_gap: int = 30) -> float: """Method to compute the (weighted) variance of a time series. Parameters @@ -336,7 +366,7 @@ def std(x, weighted=True, max_gap=30): # Helper functions -def _get_weights(x, weighted, max_gap=30): +def _get_weights(x: Series, weighted: bool = True, max_gap: int = 30) -> ArrayLike: """Helper method to compute the weights as the time step between obs. Parameters diff --git a/pastas/stats/dutch.py b/pastas/stats/dutch.py index ea09b1c8..b9ab9dfd 100644 --- a/pastas/stats/dutch.py +++ b/pastas/stats/dutch.py @@ -8,8 +8,16 @@ from pandas import Series, Timedelta, concat, date_range from pastas.utils import get_sample +# Type Hinting +from typing import Optional, Union +from pastas.typing import Function, TimestampType -def q_ghg(series, tmin=None, tmax=None, q=0.94, by_year=True): + +def q_ghg(series: Series, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + q: float = 0.94, + by_year: bool = True) -> Series: """Gemiddeld Hoogste Grondwaterstand (GHG) also called MHGL (Mean High Groundwater Level).
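The `_get_weights` helper underpins the weighted statistics above: each observation is weighted by the time step since the previous one, so a burst of closely spaced measurements does not dominate the estimate. A minimal sketch of that idea on hypothetical data (the real helper also handles the `max_gap` clipping, which is omitted here):

```python
import numpy as np
import pandas as pd

# Hypothetical irregular series: many readings in January, one in July.
idx = pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03", "2020-07-01"])
s = pd.Series([1.0, 1.0, 1.0, 5.0], index=idx)

# Weight each observation by the time step (in days) since the previous one.
dt = idx.to_series().diff().dropna().dt.days.to_numpy()
w = np.append(dt[0], dt).astype(float)  # reuse the first step for the first obs
w /= w.sum()

unweighted = s.mean()  # dominated by the cluster of January readings
weighted = np.average(s.to_numpy(), weights=w)
```

Here the unweighted mean is 2.0, while the time-weighted mean moves toward the July value because it represents a much longer stretch of time.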
@@ -57,7 +69,10 @@ def q_glg(series, tmin=None, tmax=None, q=0.06, by_year=True): return _q_gxg(series, q, tmin=tmin, tmax=tmax, by_year=by_year) -def q_gvg(series, tmin=None, tmax=None, by_year=True): +def q_gvg(series: Series, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + by_year: bool = True) -> Series: """Gemiddeld Voorjaarsgrondwaterstand (GVG) also called MSGL (Mean Spring Groundwater Level). @@ -96,8 +111,15 @@ def q_gvg(series, tmin=None, tmax=None, by_year=True): return nan -def ghg(series, tmin=None, tmax=None, fill_method='nearest', limit=0, - output='mean', min_n_meas=16, min_n_years=8, year_offset='a-mar'): +def ghg(series: Series, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + fill_method: str = 'nearest', + limit: int = 0, + output: str = 'mean', + min_n_meas: int = 16, + min_n_years: int = 8, + year_offset: str = 'a-mar') -> Union[Series, float]: """Calculate the 'Gemiddelde Hoogste Grondwaterstand' (Average High Groundwater Level) @@ -167,8 +189,15 @@ def mean_high(s, min_n_meas): year_offset=year_offset) -def glg(series, tmin=None, tmax=None, fill_method='nearest', limit=0, - output='mean', min_n_meas=16, min_n_years=8, year_offset='a-mar'): +def glg(series: Series, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + fill_method: str = 'nearest', + limit: int = 0, + output: str = 'mean', + min_n_meas: int = 16, + min_n_years: int = 8, + year_offset: str = 'a-mar') -> Union[Series, float]: """Calculate the 'Gemiddelde Laagste Grondwaterstand' (Average Low Groundwater Level). 
@@ -239,8 +268,15 @@ def mean_low(s, min_n_meas): year_offset=year_offset) -def gvg(series, tmin=None, tmax=None, fill_method='linear', limit=8, - output='mean', min_n_meas=2, min_n_years=8, year_offset='a'): +def gvg(series: Series, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + fill_method: str = 'linear', + limit: int = 8, + output: str = 'mean', + min_n_meas: int = 2, + min_n_years: int = 8, + year_offset: str = 'a') -> Union[Series, float]: """Calculate the 'Gemiddelde Voorjaars Grondwaterstand' (Average Spring Groundwater Level). @@ -298,8 +334,15 @@ def _mean_spring(s, min_n_meas): year_offset=year_offset) -def gg(series, tmin=None, tmax=None, fill_method='nearest', limit=0, - output='mean', min_n_meas=16, min_n_years=8, year_offset='a-mar'): +def gg(series: Series, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + fill_method: str = 'nearest', + limit: int = 0, + output: str = 'mean', + min_n_meas: int = 16, + min_n_years: int = 8, + year_offset: str = 'a-mar') -> Union[Series, float]: """Calculate the 'Gemiddelde Grondwaterstand' (Average Groundwater Level) Parameters @@ -354,7 +397,7 @@ def mean_all(s, min_n_meas): # Helper functions -def _get_spring(series, min_n_meas): +def _get_spring(series: Series, min_n_meas: int) -> float: """Internal method to get values of timeseries values in spring. Part of year aggregator function for gvg method. @@ -376,7 +419,7 @@ def _get_spring(series, min_n_meas): return series.loc[inspring] -def _in_spring(series): +def _in_spring(series: Series) -> Series: """Internal method to test if timeseries index is between 14 March and 15 April. 
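The quantile-based variants (`q_ghg`, `q_glg`) being annotated here reduce to a short recipe: take a quantile of the head series per calendar year, then average over the years. A simplified sketch under the assumption of a plain pandas Series (Pastas' `_q_gxg` additionally handles `tmin`/`tmax` slicing):

```python
import numpy as np
import pandas as pd

def q_gxg(series, q, by_year=True):
    # Mean over years of the per-year q-quantile of the head series.
    if by_year:
        return series.groupby(series.index.year).quantile(q).mean()
    return series.quantile(q)

# Two years of synthetic daily heads with a seasonal swing around 1.0 m.
idx = pd.date_range("2019-01-01", "2020-12-31", freq="D")
head = pd.Series(1.0 + 0.5 * np.sin(2 * np.pi * idx.dayofyear / 365),
                 index=idx)

ghg = q_gxg(head, q=0.94)  # approximates the "mean highest" level
glg = q_gxg(head, q=0.06)  # approximates the "mean lowest" level
```

For this seasonal series the two quantile levels bracket the mean head, with GHG near the seasonal high and GLG near the seasonal low.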
@@ -395,8 +438,10 @@ def isinspring(x): return (((x.month == 3) and (x.day >= 14)) or return Series(series.index.map(isinspring), index=series.index) -def _gxg(series, year_agg, tmin, tmax, fill_method, limit, output, - min_n_meas, min_n_years, year_offset): +def _gxg(series: Series, year_agg: Function, tmin: Optional[TimestampType], + tmax: Optional[TimestampType], fill_method: str, + limit: Union[int, None], output: str, min_n_meas: int, + min_n_years: int, year_offset: str) -> Union[Series, float]: """Internal method for classic GXG statistics. Resampling the series to every 14th and 28th of the month. Taking the mean of aggregated values per year. @@ -538,7 +583,11 @@ def _gxg(series, year_agg, tmin, tmax, fill_method, limit, output, raise (ValueError(msg)) -def _q_gxg(series, q, tmin=None, tmax=None, by_year=True): +def _q_gxg(series: Series, + q: float, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + by_year: bool = True) -> Series: """Dutch groundwater statistics GHG and GLG approximated by taking quantiles of the timeseries values per year and taking the mean of the quantiles. diff --git a/pastas/stats/metrics.py b/pastas/stats/metrics.py index dcc92b1c..4bc99308 100644 --- a/pastas/stats/metrics.py +++ b/pastas/stats/metrics.py @@ -16,6 +16,9 @@ from numpy import abs, average, log, nan, sqrt from pastas.stats.core import _get_weights, mean, std, var +from pandas import Series +# Type Hinting +from typing import Optional __all__ = ["rmse", "sse", "mae", "nse", "evp", "rsq", "bic", "aic", "pearsonr", "kge_2012"] @@ -24,8 +27,13 @@ # Absolute Error Metrics -def mae(obs=None, sim=None, res=None, missing="drop", weighted=False, - max_gap=30): + +def mae(obs: Optional[Series] = None, + sim: Optional[Series] = None, + res: Optional[Series] = None, + missing: str = "drop", + weighted: bool = False, + max_gap: int = 30) -> float: """Compute the (weighted) Mean Absolute Error (MAE). 
Parameters @@ -72,8 +80,12 @@ def mae(obs=None, sim=None, res=None, missing="drop", weighted=False, return (w * abs(res.to_numpy())).sum() -def rmse(obs=None, sim=None, res=None, missing="drop", weighted=False, - max_gap=30): +def rmse(obs: Optional[Series] = None, + sim: Optional[Series] = None, + res: Optional[Series] = None, + missing: str = "drop", + weighted: bool = False, + max_gap: int = 30) -> float: """Compute the (weighted) Root Mean Squared Error (RMSE). Parameters @@ -119,7 +131,10 @@ def rmse(obs=None, sim=None, res=None, missing="drop", weighted=False, return sqrt((w * res.to_numpy() ** 2).sum()) -def sse(obs=None, sim=None, res=None, missing="drop"): +def sse(obs: Optional[Series] = None, + sim: Optional[Series] = None, + res: Optional[Series] = None, + missing: str = "drop") -> float: """Compute the Sum of the Squared Errors (SSE). Parameters @@ -159,7 +174,12 @@ def sse(obs=None, sim=None, res=None, missing="drop"): # Percentage Error Metrics -def pearsonr(obs, sim, missing="drop", weighted=False, max_gap=30): + +def pearsonr(obs: Series, + sim: Series, + missing: str = "drop", + weighted: bool = False, + max_gap: int = 30) -> float: """Compute the (weighted) Pearson correlation (r). Parameters @@ -211,7 +231,12 @@ def pearsonr(obs, sim, missing="drop", weighted=False, max_gap=30): return r -def evp(obs, sim=None, res=None, missing="drop", weighted=False, max_gap=30): +def evp(obs: Series, + sim: Optional[Series] = None, + res: Optional[Series] = None, + missing: str = "drop", + weighted: bool = False, + max_gap: int = 30) -> float: """Compute the (weighted) Explained Variance Percentage (EVP). 
Parameters @@ -265,7 +290,12 @@ def evp(obs, sim=None, res=None, missing="drop", weighted=False, max_gap=30): var(obs, weighted=weighted, max_gap=max_gap))) * 100 -def nse(obs, sim=None, res=None, missing="drop", weighted=False, max_gap=30): +def nse(obs: Series, + sim: Optional[Series] = None, + res: Optional[Series] = None, + missing: str = "drop", + weighted: bool = False, + max_gap: int = 30) -> float: """Compute the (weighted) Nash-Sutcliffe Efficiency (NSE). Parameters @@ -310,11 +340,16 @@ def nse(obs, sim=None, res=None, missing="drop", weighted=False, max_gap=30): mu = average(obs.to_numpy(), weights=w) return 1 - (w * res.to_numpy() ** 2).sum() / \ - (w * (obs.to_numpy() - mu) ** 2).sum() + (w * (obs.to_numpy() - mu) ** 2).sum() -def rsq(obs, sim=None, res=None, missing="drop", weighted=False, max_gap=30, - nparam=None): +def rsq(obs: Series, + sim: Optional[Series] = None, + res: Optional[Series] = None, + missing: str = "drop", + weighted: bool = False, + max_gap: int = 30, + nparam: Optional[int] = None) -> float: """Compute R-squared, possibly adjusted for the number of free parameters. Parameters @@ -373,7 +408,11 @@ def rsq(obs, sim=None, res=None, missing="drop", weighted=False, max_gap=30, return 1.0 - rss / tss -def bic(obs=None, sim=None, res=None, missing="drop", nparam=1): +def bic(obs: Optional[Series] = None, + sim: Optional[Series] = None, + res: Optional[Series] = None, + missing: str = "drop", + nparam: int = 1) -> float: """Compute the Bayesian Information Criterium (BIC). Parameters @@ -417,7 +456,11 @@ def bic(obs=None, sim=None, res=None, missing="drop", nparam=1): nparam * log(res.index.size)) -def aic(obs=None, sim=None, res=None, missing="drop", nparam=1): +def aic(obs: Optional[Series] = None, + sim: Optional[Series] = None, + res: Optional[Series] = None, + missing: str = "drop", + nparam: int = 1) -> float: """Compute the Akaike Information Criterium (AIC). 
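The unweighted core of the `nse` function annotated above is the classic formula NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2). A sketch of just that formula, without the `missing`, `weighted`, and `max_gap` machinery of the Pastas version:

```python
import numpy as np

def nse(obs, sim):
    # 1.0 is a perfect fit; 0.0 means the model does no better than
    # always predicting the observed mean.
    res = obs - sim
    return 1.0 - (res ** 2).sum() / ((obs - obs.mean()) ** 2).sum()

obs = np.array([1.0, 2.0, 3.0, 4.0])
perfect = nse(obs, obs)                        # perfect simulation
mean_model = nse(obs, np.full(4, obs.mean()))  # mean-only simulation
```

The two reference cases follow directly from the formula: simulating the observations exactly gives 1.0, and simulating the observed mean gives 0.0.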
Parameters @@ -462,8 +505,11 @@ def aic(obs=None, sim=None, res=None, missing="drop", nparam=1): # Forecast Error Metrics - -def kge_2012(obs, sim, missing="drop", weighted=False, max_gap=30): +def kge_2012(obs: Series, + sim: Series, + missing: str = "drop", + weighted: bool = False, + max_gap: int = 30) -> float: """Compute the (weighted) Kling-Gupta Efficiency (KGE). Parameters diff --git a/pastas/stats/sgi.py b/pastas/stats/sgi.py index cbcfee1f..08b29088 100644 --- a/pastas/stats/sgi.py +++ b/pastas/stats/sgi.py @@ -2,12 +2,12 @@ Index.""" from numpy import linspace +from pandas import Series from scipy.stats import norm -def sgi(series): - """Method to compute the Standardized Groundwater Index - :cite:t:`bloomfield_analysis_2013`. +def sgi(series: Series) -> Series: + """Method to compute the Standardized Groundwater Index [sgi_2013]_. Parameters ---------- diff --git a/pastas/stats/signatures.py b/pastas/stats/signatures.py index 4fe75fc3..fdb81bd6 100644 --- a/pastas/stats/signatures.py +++ b/pastas/stats/signatures.py @@ -1,10 +1,15 @@ """This module contains methods to compute the groundwater signatures.""" import pandas as pd -from pandas import Timedelta, DatetimeIndex, cut +from pandas import Timedelta, DatetimeIndex, cut, Series from numpy import diff, sqrt, log, arange, nan + import pastas as ps from scipy.stats import linregress +# Type Hinting +from typing import Optional, Tuple + + __all__ = ["cv_period_mean", "cv_date_min", "cv_fall_rate", "cv_rise_rate", "parde_seasonality", "avg_seasonal_fluctuation", "magnitude", "interannual_variation", "low_pulse_count", "high_pulse_count", @@ -16,12 +21,12 @@ "richards_pathlength", "richards_baker_index", "baseflow_stability"] -def _normalize(series): +def _normalize(series: Series) -> Series: series = (series - series.min()) / (series.max() - series.min()) return series -def cv_period_mean(series, freq="M"): +def cv_period_mean(series: Series, freq: str = "M") -> float: """Coefficient of variation of mean 
head over a period (default monthly). Parameters @@ -52,7 +57,7 @@ def cv_period_mean(series, freq="M"): return cv -def cv_date_min(series): +def cv_date_min(series: Series) -> float: """Coefficient of variation of the date of annual minimum head. Parameters @@ -83,7 +88,7 @@ def cv_date_min(series): return cv -def parde_seasonality(series, normalize=True): +def parde_seasonality(series: Series, normalize: bool = True) -> float: """Parde seasonality according to [parde_1933]_. Parameters @@ -113,7 +118,7 @@ def parde_seasonality(series, normalize=True): return coefficients.max() - coefficients.min() -def parde_coefficients(series, normalize=True): +def parde_coefficients(series: Series, normalize: bool = True) -> float: """Parde coefficients for each month [parde_1933]_. Parameters @@ -147,7 +152,7 @@ def parde_coefficients(series, normalize=True): return coefficients -def _martens(series, normalize=True): +def _martens(series: Series, normalize: bool = True) -> Tuple[Series, Series]: """Functions for the Martens average seasonal fluctuation and interanual fluctuation. @@ -178,7 +183,7 @@ def _martens(series, normalize=True): return hl, hw -def avg_seasonal_fluctuation(series, normalize=False): +def avg_seasonal_fluctuation(series: Series, normalize: bool = True) -> float: """Classification according to [martens_2013]_. Parameters @@ -217,7 +222,7 @@ def avg_seasonal_fluctuation(series, normalize=False): return hw.mean() - hl.mean() -def interannual_variation(series, normalize=False): +def interannual_variation(series: Series, normalize: bool = True) -> float: """Interannual variation after [martens_2013]_. 
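Parde coefficients, as used by the two functions above, divide each monthly mean head by the overall mean, and `parde_seasonality` is then the range of those twelve coefficients. A self-contained sketch on synthetic data (the normalization option of the real functions is omitted here):

```python
import numpy as np
import pandas as pd

def parde_coefficients(series):
    # Monthly mean head divided by the overall mean head.
    monthly = series.groupby(series.index.month).mean()
    return monthly / series.mean()

# One year of synthetic daily heads peaking in mid-January.
idx = pd.date_range("2020-01-01", "2020-12-31", freq="D")
head = pd.Series(10.0 + np.cos(2 * np.pi * (idx.dayofyear - 15) / 366),
                 index=idx)
coef = parde_coefficients(head)
seasonality = coef.max() - coef.min()  # the parde_seasonality signature
```

The coefficients peak in the month containing the seasonal high (January here) and bottom out half a year later, and the range grows with the amplitude of the seasonal cycle.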
Parameters @@ -258,8 +263,11 @@ return (hw.max() - hw.min()) + (hl.max() - hl.min()) / 2 -def colwell_components(series, bins=11, freq="M", method="mean", - normalize=True): +def colwell_components(series: Series, + bins: int = 11, + freq: str = "M", + method: str = "mean", + normalize: bool = True) -> Tuple[float, float, float]: """Colwell predictability, constant, and contingency [colwell_1974]_. Parameters @@ -319,7 +327,7 @@ def colwell_components(series, bins=11, freq="M", method="mean", hx = -(x / z * log(x / z)).sum() hy = - (y / z * log(y / z)).sum() - hxy = - (df / z * log(df / z, where=df!=0)).sum().sum() + hxy = - (df / z * log(df / z, where=df != 0)).sum().sum() # Compute final components p = 1 - (hxy - hy) / log(bins) # Predictability @@ -328,8 +336,11 @@ def colwell_components(series, bins=11, freq="M", method="mean", return p, c, m -def colwell_constancy(series, bins=11, freq="M", method="mean", - normalize=True): +def colwell_constancy(series: Series, + bins: int = 11, + freq: str = "M", + method: str = "mean", + normalize: bool = True) -> float: """Colwells constancy index after [colwell_1974]_. Parameters @@ -361,13 +372,18 @@ def colwell_constancy(series, bins=11, freq="M", method="mean", Contingency of periodic phenomena. Ecology, 55(5), 1148–1153.
""" - return \ - colwell_components(series=series, bins=bins, freq=freq, method=method, - normalize=normalize)[1] - - -def colwell_contingency(series, bins=11, freq="M", method="mean", - normalize=True): + return colwell_components(series=series, + bins=bins, + freq=freq, + method=method, + normalize=normalize)[1] + + +def colwell_contingency(series: Series, + bins: int = 11, + freq: str = "M", + method: str = "mean", + normalize: bool = True) -> Tuple[float, float, float]: """Colwell contingency [colwell_1974]_ Parameters @@ -402,12 +418,14 @@ def colwell_contingency(series, bins=11, freq="M", method="mean", Contingency of periodic phenomena. Ecology, 55(5), 1148–1153. """ - return \ - colwell_components(series=series, bins=bins, freq=freq, method=method, - normalize=normalize)[2] + return colwell_components(series=series, + bins=bins, + freq=freq, + method=method, + normalize=normalize)[2] -def low_pulse_count(series, quantile=0.2): +def low_pulse_count(series: Series, quantile: float = 0.2) -> int: """Number of times the series drops below a certain threshold. Parameters @@ -439,7 +457,7 @@ def low_pulse_count(series, quantile=0.2): return (h.astype(int).diff() > 0).sum() -def high_pulse_count(series, quantile=0.8): +def high_pulse_count(series: Series, quantile: float = 0.8) -> int: """Number of times the series exceeds a certain threshold. Parameters @@ -471,7 +489,7 @@ def high_pulse_count(series, quantile=0.8): return (h.astype(int).diff() > 0).sum() -def low_pulse_duration(series, quantile=0.2): +def low_pulse_duration(series: Series, quantile: float = 0.8) -> float: """Average duration of pulses where the head is below a certain threshold. 
Parameters @@ -502,7 +520,7 @@ def low_pulse_duration(series, quantile=0.2): return (diff(sel.to_numpy()) / Timedelta("1D"))[::2].mean() -def high_pulse_duration(series, quantile=0.8): +def high_pulse_duration(series: Series, quantile: float = 0.8) -> float: """Average duration of pulses where the head exceeds a certain threshold. Parameters @@ -534,7 +552,7 @@ def high_pulse_duration(series, quantile=0.8): return (diff(sel.to_numpy()) / Timedelta("1D"))[::2].mean() -def amplitude_range(series): +def amplitude_range(series: Series) -> float: """Range of unscaled groundwater head. Parameters @@ -554,7 +572,7 @@ def amplitude_range(series): return series.max() - series.min() -def rise_rate(series, normalize=False): +def rise_rate(series: Series, normalize: bool = False) -> float: """Mean of positive head changes from one day to the next. Parameters @@ -587,7 +605,7 @@ def rise_rate(series, normalize=False): return rises.mean() -def fall_rate(series, normalize=False): +def fall_rate(series: Series, normalize: bool = False) -> float: """Mean negative head changes from one day to the next. Parameters @@ -621,7 +639,7 @@ def fall_rate(series, normalize=False): return falls.mean() -def cv_rise_rate(series, normalize=False): +def cv_rise_rate(series: Series, normalize: bool = False) -> float: """Coefficient of Variation in rise rate. Parameters @@ -654,7 +672,7 @@ def cv_rise_rate(series, normalize=False): return rises.std() / rises.mean() -def cv_fall_rate(series, normalize=False): +def cv_fall_rate(series: Series, normalize: bool = False) -> float: """Coefficient of Variation in fall rate. Parameters @@ -687,7 +705,7 @@ def cv_fall_rate(series, normalize=False): return falls.std() / falls.mean() -def magnitude(series): +def magnitude(series: Series) -> float: """Difference of peak head to base head, divided by base head. 
Parameters @@ -716,7 +734,7 @@ def magnitude(series): return (series.max() - series.min()) / series.min() -def reversals_avg(series): +def reversals_avg(series: Series) -> float: """Average annual number of rises and falls in daily head. Parameters @@ -744,7 +762,7 @@ def reversals_avg(series): return reversals.resample("A").sum().mean() -def reversals_cv(series): +def reversals_cv(series: Series) -> float: """Coefficient of Variation in annual number of rises and falls. Parameters @@ -772,7 +790,7 @@ def reversals_cv(series): return reversals.std() / reversals.mean() -def mean_annual_maximum(series, normalize=False): +def mean_annual_maximum(series: Series, normalize: bool = False) -> float: """Mean of annual maximum. Parameters @@ -802,7 +820,7 @@ def mean_annual_maximum(series, normalize=False): return series.resample("A").max().mean() -def bimodality_coefficient(series, normalize=True): +def bimodality_coefficient(series: Series, normalize: bool = False) -> float: """Bimodality coefficient after [Ellison_1987]_. Parameters @@ -850,7 +868,9 @@ def bimodality_coefficient(series, normalize=True): (kurt + ((3 * ((n - 1) ** 2)) / ((n - 2) * (n - 3)))) -def recession_constant(series, bins=20, normalize=False): +def recession_constant(series: Series, + bins: int = 20, + normalize: bool = False) -> float: """Recession constant after [kirchner_2009]_. Parameters @@ -895,7 +915,9 @@ def recession_constant(series, bins=20, normalize=False): return fit.slope -def recovery_constant(series, bins=20, normalize=False): +def recovery_constant(series: Series, + bins: int = 20, + normalize: bool = False) -> float: """Recovery constant after [kirchner_2009]_. 
Parameters @@ -941,7 +963,10 @@ def recovery_constant(series, bins=20, normalize=False): return fit.slope -def duration_curve_slope(series, l=0.1, u=0.9, normalize=True): +def duration_curve_slope(series: Series, + l: float = 0.1, + u: float = 0.9, + normalize: bool = True) -> float: """Slope of the duration curve between percentile l and u. Parameters @@ -980,7 +1005,10 @@ def duration_curve_slope(series, l=0.1, u=0.9, normalize=True): return linregress(s.index, s.values).slope -def duration_curve_range(series, l=0.1, u=0.9, normalize=True): +def duration_curve_range(series: Series, + l: float = 0.1, + u: float = 0.9, + normalize: bool = True) -> float: """Range of the duration curve between the percentile l and u. Parameters @@ -1015,7 +1043,7 @@ def duration_curve_range(series, l=0.1, u=0.9, normalize=True): return series.quantile(u) - series.quantile(l) -def richards_pathlength(series, normalize=True): +def richards_pathlength(series: Series, normalize: bool = True) -> float: """The path length of the time series, standardized by time series length. Parameters @@ -1052,7 +1080,7 @@ def richards_pathlength(series, normalize=True): return sum(sqrt(dh ** 2 + dt ** 2)) / sum(dt) -def richards_baker_index(series, normalize=True): +def richards_baker_index(series: Series, normalize: bool = True) -> float: """Richards-Baker index according to [baker_2004]_. Parameters @@ -1086,7 +1114,7 @@ def richards_baker_index(series, normalize=True): return series.diff().dropna().abs().sum() / series.sum() -def _baseflow(series, normalize=True): +def _baseflow(series: Series, normalize: bool = True) -> Tuple[Series, Series]: """Baseflow function for the baseflow index and stability Parameters @@ -1128,7 +1156,7 @@ def _baseflow(series, normalize=True): return series, ht -def baseflow_index(series, normalize=True): +def baseflow_index(series: Series, normalize: bool = True) -> float: """Baseflow index according to [wmo_2008]_. 
Parameters @@ -1161,7 +1189,7 @@ def baseflow_index(series, normalize=True): return series.resample("D").mean().sum() / ht.sum() -def baseflow_stability(series, normalize=True): +def baseflow_stability(series: Series, normalize: bool = True) -> float: """Baseflow stability after [heudorfer_2019]_. Parameters @@ -1196,7 +1224,7 @@ def baseflow_stability(series, normalize=True): return ht.resample("A").mean().max() - ht.resample("A").mean().min() -def hurst_exponent(series): +def hurst_exponent(series: Series): """Hurst exponent according to [wang_2006]_. Returns @@ -1218,7 +1246,7 @@ def hurst_exponent(series): return NotImplementedError -def autocorr(series, freq="w"): +def autocorr(series: Series, freq: str = "w"): """Lag where the first peak in the autocorrelation function occurs. Returns @@ -1238,7 +1266,7 @@ def autocorr(series, freq="w"): return NotImplementedError -def lyapunov_exponent(series): +def lyapunov_exponent(series: Series): """The exponential rate of divergence of nearby data points [hilborn_1994]_. @@ -1261,7 +1289,7 @@ def lyapunov_exponent(series): return NotImplementedError -def peak_timescale(series): +def peak_timescale(series: Series): """Area under peak divided by difference of peak head to peak base. Returns @@ -1283,7 +1311,7 @@ def peak_timescale(series): return NotImplementedError -def excess_mass(series): +def excess_mass(series: Series): """Test statistic of the dip test, after [hartigan_1985]_. Returns @@ -1304,7 +1332,7 @@ def excess_mass(series): return NotImplementedError -def critical_bandwidth(series): +def critical_bandwidth(series: Series): """Test statistic of the Silverman test, after [silverman_1981]_. Returns @@ -1326,7 +1354,7 @@ def critical_bandwidth(series): return NotImplementedError -def peak_base_time(series): +def peak_base_time(series: Series): """Difference between peak and base head, standardized by duration of peak. 
Returns @@ -1347,7 +1375,7 @@ return NotImplementedError -def summary(series, signatures=None): +def summary(series: Series, signatures: Optional[list] = None) -> Series: """Method to get many signatures for a time series. Parameters diff --git a/pastas/stats/tests.py b/pastas/stats/tests.py index 88da67aa..32e7c980 100644 --- a/pastas/stats/tests.py +++ b/pastas/stats/tests.py @@ -7,17 +7,21 @@ from logging import getLogger from numpy import arange, cumsum, finfo, median, nan, sqrt, zeros -from pandas import DataFrame, date_range, infer_freq +from pandas import Series, DataFrame, date_range, infer_freq from pastas.stats.core import acf as get_acf from pastas.utils import _get_time_offset, get_equidistant_series from scipy.stats import chi2, norm, normaltest, shapiro +# Type Hinting +from typing import Tuple + + logger = getLogger(__name__) __all__ = ["durbin_watson", "ljung_box", "runs_test", "stoffer_toloi", "diagnostics", "plot_acf", "plot_diagnostics"] -def durbin_watson(series=None): +def durbin_watson(series: Series) -> Tuple[float, float]: """Durbin-Watson test for autocorrelation. Parameters @@ -86,7 +90,10 @@ def durbin_watson(series=None): return dw_stat, p -def ljung_box(series=None, lags=15, nparam=0, full_output=False): +def ljung_box(series: Series, + lags: int = 15, + nparam: int = 0, + full_output: bool = False) -> Tuple[float, float]: """Ljung-box test for autocorrelation. Parameters @@ -182,7 +189,7 @@ def ljung_box(series=None, lags=15, nparam=0, full_output=False): return q_stat[-1], pval[-1] -def runs_test(series, cutoff="median"): +def runs_test(series: Series, cutoff: str = "median") -> Tuple[float, float]: """Runs test for autocorrelation.
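The statistic behind `durbin_watson` is DW = sum((e_t - e_{t-1})^2) / sum(e_t^2), which is approximately 2(1 - r1) for lag-1 autocorrelation r1: values near 2 indicate uncorrelated residuals, values well below 2 indicate positive autocorrelation. A sketch of the statistic alone (the Pastas function also reports a p-value, which is omitted here):

```python
import numpy as np

def durbin_watson(residuals):
    # Close to 2 for white-noise residuals, below 2 for positively
    # autocorrelated residuals.
    e = np.asarray(residuals, dtype=float)
    return (np.diff(e) ** 2).sum() / (e ** 2).sum()

rng = np.random.default_rng(42)
white = rng.standard_normal(1000)
dw_white = durbin_watson(white)  # expected near 2

ar = np.zeros(1000)  # strongly autocorrelated residuals, AR(1) with phi=0.9
eps = rng.standard_normal(1000)
for t in range(1, 1000):
    ar[t] = 0.9 * ar[t - 1] + eps[t]
dw_ar = durbin_watson(ar)        # expected well below 2
```

For the AR(1) case, 2(1 - 0.9) suggests a value around 0.2, far from the white-noise benchmark of 2.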
Parameters @@ -274,8 +281,12 @@ def runs_test(series, cutoff="median"): return z_stat, pval -def stoffer_toloi(series, lags=15, nparam=0, freq="D", - snap_to_equidistant_timestamps=False): +def stoffer_toloi( + series: Series, + lags: int = 15, + nparam: int = 0, + freq: str = "D", + snap_to_equidistant_timestamps: bool = False) -> Tuple[float, float]: """Adapted Ljung-Box test to deal with missing data [stoffer_1992]_. Parameters @@ -290,7 +301,7 @@ def stoffer_toloi(series, lags=15, nparam=0, freq="D", freq: str, optional String with the frequency to resample the time series to. snap_to_equidistant_timestamps : bool, optional - if False (default), a sample is taken from series with equidistant + if False (default), a sample is taken from series with equidistant timesteps using pandas' reindex. Only values are kept that lie on those equidistant timestamps. If True, an equidistant timeseries is created taking as many values as possible from the original series @@ -411,8 +422,12 @@ def stoffer_toloi(series, lags=15, nparam=0, freq="D", return qm, pval -def diagnostics(series, alpha=0.05, nparam=0, lags=15, stats=(), - float_fmt="{0:.2f}"): +def diagnostics(series: Series, + alpha: float = 0.05, + nparam: int = 0, + lags: int = 15, + stats: tuple = (), + float_fmt: str = "{0:.2f}") -> DataFrame: """Methods to compute various diagnostics checks for a time series. Parameters @@ -426,8 +441,8 @@ def diagnostics(series, alpha=0.05, nparam=0, lags=15, stats=(), lags: int, optional Maximum number of lags (in days) to compute the autocorrelation tests for. - stats: list, optional - List with the diagnostic checks to perform. Not implemented yet. + stats: tuple, optional + Tuple with the diagnostic checks to perform. Not implemented yet. float_fmt: str String to use for formatting the floats in the returned DataFrame. 
@@ -461,6 +476,7 @@ def diagnostics(series, alpha=0.05, nparam=0, lags=15, stats=(), In this example, the Null-hypothesis is not rejected and the data may be assumed to be white noise. """ + cols = ["Checks", "Statistic", "P-value"] df = DataFrame(index=stats, columns=cols) diff --git a/pastas/stressmodels.py b/pastas/stressmodels.py index 7e8367e7..3dbd2ac1 100644 --- a/pastas/stressmodels.py +++ b/pastas/stressmodels.py @@ -28,6 +28,10 @@ from .timeseries import TimeSeries from .utils import check_numba, validate_name +# Type Hinting +from typing import Optional, Tuple, List, Union +from pastas.typing import ArrayLike, RFunc, Recharge, Reservoir, TimestampType, Model + logger = getLogger(__name__) __all__ = ["StressModel", "Constant", "StepModel", "LinearTrend", @@ -47,8 +51,14 @@ class StressModelBase: """ _name = "StressModelBase" - def __init__(self, name, tmin, tmax, rfunc=None, up=True, meanstress=1, - cutoff=0.999): + def __init__(self, + name: str, + tmin: TimestampType, + tmax: TimestampType, + rfunc: Optional[RFunc] = None, + up: bool = True, + meanstress: float = 1.0, + cutoff: float = 0.999) -> None: self.name = validate_name(name) self.tmin = tmin self.tmax = tmax @@ -70,14 +80,14 @@ def __init__(self, name, tmin, tmax, rfunc=None, up=True, meanstress=1, self.stress = [] @property - def nparam(self): + def nparam(self) -> int: return self.parameters.index.size - def set_init_parameters(self): + def set_init_parameters(self) -> None: """Set the initial parameters (back) to their default values.""" @set_parameter - def _set_initial(self, name, value): + def _set_initial(self, name: str, value: float) -> None: """Internal method to set the initial parameter value. Notes @@ -87,7 +97,7 @@ def _set_initial(self, name, value): self.parameters.loc[name, 'initial'] = value @set_parameter - def _set_pmin(self, name, value): + def _set_pmin(self, name: str, value: float) -> None: """Internal method to set the lower bound of the parameter value.
Notes @@ -97,7 +107,7 @@ def _set_pmin(self, name, value): self.parameters.loc[name, 'pmin'] = value @set_parameter - def _set_pmax(self, name, value): + def _set_pmax(self, name: str, value: float) -> None: """Internal method to set the upper bound of the parameter value. Notes @@ -107,7 +117,7 @@ def _set_pmax(self, name, value): self.parameters.loc[name, 'pmax'] = value @set_parameter - def _set_vary(self, name, value): + def _set_vary(self, name: str, value: float) -> None: """Internal method to set if the parameter is varied during optimization. @@ -117,7 +127,7 @@ def _set_vary(self, name, value): """ self.parameters.loc[name, 'vary'] = bool(value) - def update_stress(self, **kwargs): + def update_stress(self, **kwargs) -> None: """Method to update the settings of the individual TimeSeries. Notes @@ -135,7 +145,7 @@ def update_stress(self, **kwargs): if "freq" in kwargs: self.freq = kwargs["freq"] - def dump_stress(self, series=True): + def dump_stress(self, series: bool = True) -> dict: """Method to dump all stresses in the stresses list. Parameters @@ -157,8 +167,13 @@ def dump_stress(self, series=True): return data - def get_stress(self, p=None, tmin=None, tmax=None, freq=None, - istress=None, **kwargs): + def get_stress(self, + p: Optional[ArrayLike] = None, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + istress: Optional[int] = None, + **kwargs) -> DataFrame: """Returns the stress or stresses of the time series object as a pandas DataFrame. @@ -179,7 +194,7 @@ def get_stress(self, p=None, tmin=None, tmax=None, freq=None, return self.stress[0].series - def to_dict(self, series=True): + def to_dict(self, series: bool = True) -> dict: """Method to export the StressModel object. 
Returns @@ -195,14 +210,15 @@ def to_dict(self, series=True): } return data - def get_nsplit(self): + def get_nsplit(self) -> int: """Determine in how many timeseries the contribution can be split.""" if hasattr(self, 'nsplit'): return self.nsplit else: return len(self.stress) - def _get_block(self, p, dt, tmin, tmax): + def _get_block(self, p: ArrayLike, dt: float, tmin: TimestampType, + tmax: TimestampType) -> ArrayLike: """Internal method to get the block-response function.""" if tmin is not None and tmax is not None: day = Timedelta(1, 'D') @@ -256,8 +272,15 @@ class StressModel(StressModelBase): """ _name = "StressModel" - def __init__(self, stress, rfunc, name, up=True, cutoff=0.999, - settings=None, metadata=None, meanstress=None): + def __init__(self, + stress: Series, + rfunc: RFunc, + name: str, + up: bool = True, + cutoff: float = 0.999, + settings: Optional[Union[dict, str]] = None, + metadata: Optional[dict] = None, + meanstress: Optional[float] = None) -> None: if isinstance(stress, list): stress = stress[0] # TODO Temporary fix Raoul, 2017-10-24 @@ -275,11 +298,16 @@ def __init__(self, stress, rfunc, name, up=True, cutoff=0.999, self.stress = [stress] self.set_init_parameters() - def set_init_parameters(self): + def set_init_parameters(self) -> None: """Set the initial parameters (back) to their default values.""" self.parameters = self.rfunc.get_init_parameters(self.name) - def simulate(self, p, tmin=None, tmax=None, freq=None, dt=1.0): + def simulate(self, + p: ArrayLike, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + dt: float = 1.0) -> Series: """Simulates the head contribution. Parameters @@ -305,7 +333,7 @@ def simulate(self, p, tmin=None, tmax=None, freq=None, dt=1.0): index=stress.index, name=self.name, fastpath=True) return h - def to_dict(self, series=True): + def to_dict(self, series: bool = True) -> dict: """Method to export the StressModel object. 
Returns @@ -355,7 +383,12 @@ class StepModel(StressModelBase): """ _name = "StepModel" - def __init__(self, tstart, name, rfunc=None, up=True, cutoff=0.999): + def __init__(self, + tstart: TimestampType, + name: str, + rfunc: Optional[RFunc] = None, + up: bool = True, + cutoff: float = 0.999) -> None: if rfunc is None: rfunc = One() StressModelBase.__init__(self, name=name, tmin=Timestamp.min, @@ -364,7 +397,7 @@ def __init__(self, tstart, name, rfunc=None, up=True, cutoff=0.999): self.tstart = Timestamp(tstart) self.set_init_parameters() - def set_init_parameters(self): + def set_init_parameters(self) -> None: self.parameters = self.rfunc.get_init_parameters(self.name) tmin = Timestamp.min.toordinal() tmax = Timestamp.max.toordinal() @@ -373,7 +406,12 @@ def set_init_parameters(self): self.parameters.loc[self.name + "_tstart"] = (tinit, tmin, tmax, False, self.name) - def simulate(self, p, tmin=None, tmax=None, freq=None, dt=1): + def simulate(self, + p: ArrayLike, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + dt: float = 1.0) -> Series: tstart = Timestamp.fromordinal(int(p[-1])) tindex = date_range(tmin, tmax, freq=freq) h = Series(0, tindex, name=self.name) @@ -385,7 +423,7 @@ def simulate(self, p, tmin=None, tmax=None, freq=None, dt=1): index=h.index, name=self.name, fastpath=True) return h - def to_dict(self, series=True): + def to_dict(self, series: bool = True) -> dict: data = { "stressmodel": self._name, 'tstart': self.tstart, @@ -418,14 +456,19 @@ class LinearTrend(StressModelBase): """ _name = "LinearTrend" - def __init__(self, start, end, name="trend"): - StressModelBase.__init__(self, name=name, tmin=Timestamp.min, + def __init__(self, + start: TimestampType, + end: TimestampType, + name: str = "trend") -> None: + StressModelBase.__init__(self, + name=name, + tmin=Timestamp.min, tmax=Timestamp.max) self.start = start self.end = end self.set_init_parameters() - def 
set_init_parameters(self): + def set_init_parameters(self) -> None: """Set the initial parameters for the stress model.""" start = Timestamp(self.start).toordinal() end = Timestamp(self.end).toordinal() @@ -439,7 +482,12 @@ def set_init_parameters(self): self.parameters.loc[self.name + "_tend"] = (end, tmin, tmax, False, self.name) - def simulate(self, p, tmin=None, tmax=None, freq=None, dt=1): + def simulate(self, + p: ArrayLike, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + dt: float = 1.0) -> Series: """Simulate the trend.""" tindex = date_range(tmin, tmax, freq=freq) @@ -459,7 +507,7 @@ def simulate(self, p, tmin=None, tmax=None, freq=None, dt=1): trend = trend.cumsum() * p[0] return trend.rename(self.name) - def to_dict(self, series=None): + def to_dict(self, series: bool = True) -> dict: data = { "stressmodel": self._name, 'start': self.start, @@ -482,7 +530,7 @@ class Constant(StressModelBase): """ _name = "Constant" - def __init__(self, name="constant", initial=0.0): + def __init__(self, name: str = "constant", initial: float = 0.0) -> None: StressModelBase.__init__(self, name=name, tmin=Timestamp.min, tmax=Timestamp.max) self.initial = initial @@ -493,7 +541,7 @@ def set_init_parameters(self): self.initial, np.nan, np.nan, True, self.name) @staticmethod - def simulate(p=None): + def simulate(p: Optional[float] = None) -> float: return p @@ -508,7 +556,7 @@ class WellModel(StressModelBase): this model only works with the HantushWellModel response function. name: str Name of the stressmodel. - distances: list or list-like + distances: array_like list of distances to oseries, must be ordered the same as the stresses. 
up: bool, optional @@ -537,8 +585,15 @@ class WellModel(StressModelBase): """ _name = "WellModel" - def __init__(self, stress, rfunc, name, distances, up=False, cutoff=0.999, - settings="well", sort_wells=True): + def __init__(self, + stress: List[Series], + rfunc: RFunc, + name: str, + distances: ArrayLike, + up: bool = False, + cutoff: float = 0.999, + settings: str = "well", + sort_wells: bool = True) -> None: if not (isinstance(rfunc, HantushWellModel) or issubclass(rfunc, HantushWellModel)): raise NotImplementedError("WellModel only supports the rfunc " @@ -594,11 +649,17 @@ def __init__(self, stress, rfunc, name, distances, up=False, cutoff=0.999, self.freq = self.stress[0].settings["freq"] self.set_init_parameters() - def set_init_parameters(self): + def set_init_parameters(self) -> None: self.parameters = self.rfunc.get_init_parameters(self.name) - def simulate(self, p=None, tmin=None, tmax=None, freq=None, dt=1, - istress=None, **kwargs): + def simulate(self, + p: Optional[ArrayLike] = None, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + dt: float = 1.0, + istress: Optional[int] = None, + **kwargs) -> Series: distances = self.get_distances(istress=istress) stress_df = self.get_stress(p=p, tmin=tmin, tmax=tmax, freq=freq, istress=istress, squeeze=False) @@ -653,8 +714,14 @@ def handle_stress(stress, settings): "dict or list.") return data - def get_stress(self, p=None, tmin=None, tmax=None, freq=None, - istress=None, squeeze=True, **kwargs): + def get_stress(self, + p: Optional[ArrayLike] = None, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + istress: Optional[int] = None, + squeeze: bool = True, + **kwargs) -> DataFrame: if tmin is None: tmin = self.tmin if tmax is None: @@ -678,7 +745,7 @@ def get_stress(self, p=None, tmin=None, tmax=None, freq=None, else: return self.stress[istress].series.to_frame() - def get_distances(self, 
istress=None): + def get_distances(self, istress: Optional[int] = None) -> DataFrame: if istress is None: return self.distances elif isinstance(istress, list): @@ -686,7 +753,7 @@ def get_distances(self, istress=None): else: return self.distances.iloc[istress:istress + 1] - def get_parameters(self, model=None, istress=None): + def get_parameters(self, model: Optional[Model] = None, istress: Optional[int] = None) -> ArrayLike: """ Get parameters including distance to observation point and return as array (dimensions = (nstresses, 4)). @@ -701,7 +768,7 @@ Returns ------- - p : np.array + p : array_like parameters for each stress as row of array, if istress is used returns only one row. @@ -719,7 +786,7 @@ p_with_r = np.r_[p, distances] return p_with_r - def to_dict(self, series=True): + def to_dict(self, series: bool = True) -> dict: """Method to export the WellModel object. Returns @@ -741,7 +808,7 @@ def to_dict(self, series=True): } return data - def variance_gain(self, model, istress=None, r=None): + def variance_gain(self, + model: Model, + istress: Optional[int] = None, + r: Optional[ArrayLike] = None) -> float: """Calculate variance of the gain for WellModel. 
Variance of the gain is calculated based on propagation of uncertainty @@ -754,7 +824,7 @@ optimized model istress : int or list of int, optional index of stress(es) for which to calculate variance of gain - r : np.array, optional + r : array_like, optional radial distance(s) at which to calculate variance of the gain, only considered if istress is None @@ -862,10 +932,20 @@ class RechargeModel(StressModelBase): """ _name = "RechargeModel" - def __init__(self, prec, evap, rfunc=None, name="recharge", - recharge=None, temp=None, cutoff=0.999, - settings=("prec", "evap", "evap"), - metadata=(None, None, None)): + def __init__( + self, + prec: Series, + evap: Series, + rfunc: Optional[RFunc] = None, + name: str = "recharge", + recharge: Optional[Recharge] = None, + temp: Optional[Series] = None, + cutoff: float = 0.999, + settings: Tuple[Union[str, dict], Union[str, dict], + Union[str, dict]] = ("prec", "evap", "evap"), + metadata: Optional[Tuple[dict, dict, dict]] = (None, None, None) + ) -> None: + if rfunc is None: rfunc = Exponential() @@ -937,14 +1017,14 @@ def __init__(self, prec, evap, rfunc=None, name="recharge", else: self.nsplit = 1 - def set_init_parameters(self): + def set_init_parameters(self) -> None: """Internal method to set the initial parameters.""" self.parameters = concat( [self.rfunc.get_init_parameters(self.name), self.recharge.get_init_parameters(self.name) ]) - def update_stress(self, **kwargs): + def update_stress(self, **kwargs) -> None: """Method to update the settings of the individual TimeSeries. 
Notes @@ -964,8 +1044,14 @@ def update_stress(self, **kwargs): if "freq" in kwargs: self.freq = kwargs["freq"] - def simulate(self, p=None, tmin=None, tmax=None, freq=None, dt=1.0, - istress=None): + def simulate(self, + p: Optional[ArrayLike] = None, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + dt: float = 1.0, + istress: Optional[int] = None, + **kwargs) -> Series: """Method to simulate the contribution of recharge to the head. Parameters @@ -1002,8 +1088,13 @@ def simulate(self, p=None, tmin=None, tmax=None, freq=None, dt=1.0, return Series(data=fftconvolve(stress, b, 'full')[:stress.size], index=self.prec.series.index, name=name, fastpath=True) - def get_stress(self, p=None, tmin=None, tmax=None, freq=None, - istress=None, **kwargs): + def get_stress(self, + p: Optional[ArrayLike] = None, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + istress: Optional[int] = None, + **kwargs) -> Series: """Method to obtain the recharge stress calculated by the model. 
Parameters @@ -1036,14 +1127,13 @@ def get_stress(self, p=None, tmin=None, tmax=None, freq=None, if istress is None: prec = self.prec.series.values evap = self.evap.series.values + temp = None if self.temp is not None: temp = self.temp.series.values - else: - temp = None if p is None: p = self.parameters.initial.values - stress = self.recharge.simulate(prec=prec, evap=evap, temp=temp, - p=p[-self.recharge.nparam:]) + stress = self.recharge.simulate( + prec=prec, evap=evap, p=p[-self.recharge.nparam:], temp=temp) return Series(data=stress, index=self.prec.series.index, name="recharge", fastpath=True) elif istress == 0: @@ -1053,7 +1143,11 @@ def get_stress(self, p=None, tmin=None, tmax=None, freq=None, else: return self.temp.series - def get_water_balance(self, p=None, tmin=None, tmax=None, freq=None): + def get_water_balance(self, + p: Optional[ArrayLike] = None, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None) -> DataFrame: """Method to obtain the water balance components. 
Parameters @@ -1109,7 +1203,7 @@ def get_water_balance(self, p=None, tmin=None, tmax=None, freq=None): df.index = self.prec.series.index return df - def to_dict(self, series=True): + def to_dict(self, series: bool = True) -> dict: data = { "stressmodel": self._name, "prec": self.prec.to_dict(series=series), @@ -1167,8 +1261,14 @@ class TarsoModel(RechargeModel): """ _name = "TarsoModel" - def __init__(self, prec, evap, oseries=None, dmin=None, dmax=None, - rfunc=None, **kwargs): + def __init__(self, + prec: Series, + evap: Series, + oseries: Optional[Series] = None, + dmin: Optional[float] = None, + dmax: Optional[float] = None, + rfunc: Optional[RFunc] = Exponential, + **kwargs) -> None: check_numba() if oseries is not None: if dmin is not None or dmax is not None: @@ -1190,7 +1290,7 @@ def __init__(self, prec, evap, oseries=None, dmin=None, dmax=None, self.dmax = dmax super().__init__(prec=prec, evap=evap, rfunc=rfunc, **kwargs) - def set_init_parameters(self): + def set_init_parameters(self) -> None: # parameters for the first drainage level p0 = self.rfunc.get_init_parameters(self.name) one = One() @@ -1214,20 +1314,25 @@ def set_init_parameters(self): # combine all parameters self.parameters = concat([p0, p1, pr]) - def simulate(self, p=None, tmin=None, tmax=None, freq=None, dt=1): + def simulate(self, + p: Optional[ArrayLike] = None, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + dt: float = 1.0) -> Series: stress = self.get_stress(p=p, tmin=tmin, tmax=tmax, freq=freq) h = self.tarso(p[:-self.recharge.nparam], stress.values, dt) sim = Series(h, name=self.name, index=stress.index) return sim - def to_dict(self, series=True): + def to_dict(self, series: bool = True) -> dict: data = super().to_dict(series) data['dmin'] = self.dmin data['dmax'] = self.dmax return data @staticmethod - def _check_stressmodel_compatibility(ml): + def _check_stressmodel_compatibility(ml: Model) -> None: """Internal method to check if no 
other stressmodels, a constants or a transform is used.""" msg = "A TarsoModel cannot be combined with %s. Either remove the" \ @@ -1241,7 +1346,7 @@ def _check_stressmodel_compatibility(ml): @staticmethod @njit - def tarso(p, r, dt): + def tarso(p: ArrayLike, r: ArrayLike, dt: float) -> ArrayLike: """Calculates the head based on exponential decay of the previous timestep and recharge, using two thresholds.""" A0, a0, d0, A1, a1, d1 = p @@ -1326,8 +1431,16 @@ class ChangeModel(StressModelBase): """ _name = "ChangeModel" - def __init__(self, stress, rfunc1, rfunc2, name, tchange, up=True, - cutoff=0.999, settings=None, metadata=None): + def __init__(self, + stress: Series, + rfunc1: RFunc, + rfunc2: RFunc, + name: str, + tchange: str, + up: bool = True, + cutoff: float = 0.999, + settings: Optional[Union[dict, str]] = None, + metadata: Optional[dict] = None) -> None: if isinstance(stress, list): stress = stress[0] # TODO Temporary fix Raoul, 2017-10-24 @@ -1358,7 +1471,7 @@ def __init__(self, stress, rfunc1, rfunc2, name, tchange, up=True, self.stress = [stress] self.set_init_parameters() - def set_init_parameters(self): + def set_init_parameters(self) -> None: """Internal method to set the initial parameters.""" self.parameters = concat( [self.rfunc1.get_init_parameters("{}_1".format(self.name)), @@ -1374,7 +1487,12 @@ def set_init_parameters(self): False, self.name) self.parameters.name = self.name - def simulate(self, p, tmin=None, tmax=None, freq=None, dt=1.0): + def simulate(self, + p: ArrayLike, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + dt: float = 1.0) -> Series: self.update_stress(tmin=tmin, tmax=tmax, freq=freq) rfunc1 = self.rfunc1.block(p[:self.rfunc1.nparam]) rfunc2 = self.rfunc2.block( @@ -1432,9 +1550,15 @@ class ReservoirModel(StressModelBase): """ _name = "ReservoirModel" - def __init__(self, stress, reservoir, name, meanhead, - settings=("prec", "evap"), metadata=(None, None), - 
meanstress=None): + def __init__(self, + stress: Series, + reservoir: Reservoir, + name: str, + meanhead: float, + settings: Optional[Tuple[Union[str, dict], + Union[str, dict]]] = ("prec", "evap"), + metadata: Optional[Tuple[dict, dict]] = (None, None), + meanstress: Optional[float] = None) -> None: # Set resevoir object self.reservoir = reservoir(meanhead) @@ -1469,11 +1593,17 @@ def __init__(self, stress, reservoir, name, meanhead, self.freq = stress0.settings["freq"] self.set_init_parameters() - def set_init_parameters(self): + def set_init_parameters(self) -> None: """Set the initial parameters back to their default values.""" self.parameters = self.reservoir.get_init_parameters(self.name) - def simulate(self, p, tmin=None, tmax=None, freq=None, dt=1, istress=None): + def simulate(self, + p: ArrayLike, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + dt: float = 1.0, + istress: Optional[int] = None) -> Series: """Simulates the head contribution. Parameters @@ -1498,8 +1628,13 @@ def simulate(self, p, tmin=None, tmax=None, freq=None, dt=1, istress=None): index=stress[0].index, name=self.name, fastpath=True) return h - def get_stress(self, p=None, tmin=None, tmax=None, freq=None, - istress=0, **kwargs): + def get_stress(self, + p: Optional[ArrayLike] = None, + tmin: Optional[TimestampType] = None, + tmax: Optional[TimestampType] = None, + freq: Optional[str] = None, + istress: int = 0, + **kwargs) -> Tuple[Series, Series]: if tmin is None: tmin = self.tmin if tmax is None: @@ -1509,7 +1644,7 @@ def get_stress(self, p=None, tmin=None, tmax=None, freq=None, return self.stress[0].series, self.stress[1].series - def to_dict(self, series=True): + def to_dict(self, series: bool = True) -> dict: """Method to export the StressModel object. 
Returns diff --git a/pastas/timer.py b/pastas/timer.py index 281adf35..2a31eff8 100644 --- a/pastas/timer.py +++ b/pastas/timer.py @@ -3,6 +3,9 @@ except ModuleNotFoundError: raise ModuleNotFoundError("SolveTimer requires 'tqdm' to be installed.") +# Type Hinting +from typing import Optional + class ExceededMaxSolveTime(Exception): """Custom Exception when model optimization exceeds threshold. @@ -14,7 +17,7 @@ class SolveTimer(tqdm): """Progress indicator for model optimization. Usage - ----- + ----- Print timer and number of iterations in console while running `ml.solve()`:: @@ -25,7 +28,7 @@ class SolveTimer(tqdm): Optimization progress: 73it [00:01, 67.68it/s] - Set maximum allowable time (in seconds) for solve, otherwise raise + Set maximum allowable time (in seconds) for solve, otherwise raise ExceededMaxSolveTime exception:: with SolveTimer(max_time=60) as t: @@ -37,7 +40,10 @@ class SolveTimer(tqdm): be updated quite as nicely. """ - def __init__(self, max_time=None, *args, **kwargs): + def __init__(self, + max_time: Optional[float] = None, + *args, + **kwargs) -> None: """Initialize SolveTimer. Parameters @@ -54,7 +60,7 @@ def __init__(self, max_time=None, *args, **kwargs): self.max_time = max_time super(SolveTimer, self).__init__(*args, **kwargs) - def timer(self, _, n=1): + def timer(self, _, n: int = 1): """Callback method for ps.Model.solve(). 
""" displayed = super(SolveTimer, self).update(n) diff --git a/pastas/timeseries.py b/pastas/timeseries.py index e96c87b4..2cbe5c31 100644 --- a/pastas/timeseries.py +++ b/pastas/timeseries.py @@ -1,12 +1,18 @@ from logging import getLogger import pandas as pd +from pandas import Series from pandas.tseries.frequencies import to_offset from .rcparams import rcParams from .utils import (_get_dt, _get_stress_dt, _get_time_offset, timestep_weighted_resample, validate_name) +# Type Hinting +from typing import Optional, Union +from pastas.typing import Axes + + logger = getLogger(__name__) @@ -54,8 +60,13 @@ class TimeSeries: """ _predefined_settings = rcParams["timeseries"] - def __init__(self, series, name=None, settings=None, metadata=None, - freq_original=None, **kwargs): + def __init__(self, + series: Series, + name: Optional[str] = None, + settings: Optional[Union[str, dict]] = None, + metadata: Optional[dict] = None, + freq_original: str = None, + **kwargs) -> None: if isinstance(series, TimeSeries): # Copy all the series self._series_original = series.series_original.copy() @@ -137,7 +148,7 @@ def __init__(self, series, name=None, settings=None, metadata=None, if update: self.update_series(force_update=True, **self.settings) - def __repr__(self): + def __repr__(self) -> str: """Prints a simple string representation of the time series.""" return f"{self.__class__.__name__}" \ f"(name={self.name}, " \ @@ -147,11 +158,11 @@ def __repr__(self): f"tmax={self.settings['tmax']})" @property - def series_original(self): + def series_original(self) -> Series: return self._series_original @series_original.setter - def series_original(self, series): + def series_original(self, series: Series) -> None: """Sets a new freq_original for the TimeSeries.""" if not isinstance(series, pd.Series): raise TypeError(f"Expected a Pandas Series, got {type(series)}") @@ -169,7 +180,7 @@ def series_original(self, series): self.update_series(force_update=True, **self.settings) @property - 
def series(self): + def series(self) -> Series: return self._series @series.setter @@ -188,7 +199,7 @@ def series_validated(self, value): " it is calculated from series_original. Please" " set series_original to update the series.") - def update_series(self, force_update=False, **kwargs): + def update_series(self, force_update: bool = False, **kwargs) -> None: """Method to update the series with new options. Parameters @@ -220,11 +231,11 @@ def update_series(self, force_update=False, **kwargs): fill_after: str or float, optional Method used to extend a time series after any measurements are available. Possible values are: "mean" or a float value. - tmin: str or pandas.TimeStamp, optional - String that can be converted to, or a Pandas TimeStamp with the + tmin: str or pandas.Timestamp, optional + String that can be converted to, or a Pandas Timestamp with the minimum time of the series. - tmax: str or pandas.TimeStamp, optional - String that can be converted to, or a Pandas TimeStamp with the + tmax: str or pandas.Timestamp, optional + String that can be converted to, or a Pandas Timestamp with the maximum time of the series. norm: str or float, optional String with the method to normalize the time series with. @@ -254,7 +265,7 @@ def update_series(self, force_update=False, **kwargs): self._series = series - def multiply(self, other): + def multiply(self, other: float) -> None: """Method to multiply the original time series. Parameters @@ -265,7 +276,7 @@ def multiply(self, other): self._series_original = self.series_original.multiply(other) self.update_series(force_update=True) - def _validate_series(self, series): + def _validate_series(self, series: Series) -> Series: """Validate user provided time series. Parameters @@ -355,7 +366,7 @@ def _validate_series(self, series): return series - def _update_settings(self, **kwargs): + def _update_settings(self, **kwargs) -> bool: """Internal method that check if an update is actually necessary. 
Returns @@ -375,7 +386,7 @@ def _update_settings(self, **kwargs): update = True return update - def _change_frequency(self, series): + def _change_frequency(self, series: Series) -> Series: """Method to change the frequency of the time series.""" freq = self.settings["freq"] @@ -405,7 +416,7 @@ def _change_frequency(self, series): return series - def _sample_up(self, series): + def _sample_up(self, series: Series) -> Series: """Resample the time series when the frequency increases (e.g. from weekly to daily values).""" method = self.settings["sample_up"] @@ -439,7 +450,7 @@ def _sample_up(self, series): return series - def _sample_down(self, series): + def _sample_down(self, series: Series) -> Series: """Resample the time series when the frequency decreases (e.g. from daily to weekly values). @@ -490,7 +501,7 @@ def _sample_down(self, series): return series - def _sample_weighted(self, series): + def _sample_weighted(self, series: Series) -> Series: freq = self.settings["freq"] time_offset = self.settings['time_offset'] tindex = pd.date_range(series.index[0].ceil(freq) + time_offset, @@ -500,7 +511,7 @@ def _sample_weighted(self, series): "timestep_weighted_resample.", self.name, freq) return series - def _fill_nan(self, series): + def _fill_nan(self, series: Series) -> Series: """Fill up the nan-values when present and a constant frequency is required.""" @@ -534,7 +545,7 @@ def _fill_nan(self, series): return series - def _fill_before(self, series): + def _fill_before(self, series: Series) -> Series: """Method to add a period in front of the available time series.""" freq = self.settings["freq"] method = self.settings["fill_before"] @@ -567,7 +578,7 @@ def _fill_before(self, series): return series - def _fill_after(self, series): + def _fill_after(self, series: Series) -> Series: """Method to add a period in front of the available time series.""" freq = self.settings["freq"] method = self.settings["fill_after"] @@ -600,7 +611,7 @@ def _fill_after(self, series): 
return series - def _normalize(self, series): + def _normalize(self, series: Series) -> Series: """Method to normalize the time series.""" method = self.settings["norm"] @@ -626,7 +637,7 @@ def _normalize(self, series): return series - def to_dict(self, series=True): + def to_dict(self, series: bool = True) -> dict: """Method to export the Time Series to a json format. Parameters @@ -655,7 +666,7 @@ def to_dict(self, series=True): return data - def plot(self, original=False, **kwargs): + def plot(self, original: bool = False, **kwargs) -> Axes: """Method to plot the TimeSeries object. Plots the edited series by default. diff --git a/pastas/transform.py b/pastas/transform.py index 0f1536e6..5b4ccfb9 100644 --- a/pastas/transform.py +++ b/pastas/transform.py @@ -4,11 +4,14 @@ nonlinear effects. """ import numpy as np -from pandas import DataFrame +from pandas import DataFrame, Series from .decorators import set_parameter from .utils import validate_name +# Type Hinting +from pastas.typing import ArrayLike, Model + class ThresholdTransform: """ThresholdTransform lowers the simulation when it exceeds a certain @@ -38,8 +41,12 @@ class ThresholdTransform: """ _name = "ThresholdTransform" - def __init__(self, value=np.nan, vmin=np.nan, vmax=np.nan, - name='ThresholdTransform', nparam=2): + def __init__(self, + value: float = np.nan, + vmin: float = np.nan, + vmax: float = np.nan, + name: str = 'ThresholdTransform', + nparam: int = 2) -> None: self.value = value self.vmin = vmin self.vmax = vmax @@ -48,7 +55,7 @@ def __init__(self, value=np.nan, vmin=np.nan, vmax=np.nan, self.parameters = DataFrame( columns=['initial', 'pmin', 'pmax', 'vary', 'name']) - def set_model(self, ml): + def set_model(self, ml: Model) -> None: obs = ml.observations() if np.isnan(self.value): self.value = obs.min() + 0.75 * (obs.max() - obs.min()) @@ -58,7 +65,7 @@ def set_model(self, ml): self.vmax = obs.max() self.set_init_parameters() - def set_init_parameters(self): + def 
set_init_parameters(self) -> None: self.parameters.loc[self.name + '_1'] = ( self.value, self.vmin, self.vmax, True, self.name) if self.nparam == 2: @@ -66,7 +73,7 @@ def set_init_parameters(self): 0.5, 0., 1., True, self.name) @set_parameter - def _set_initial(self, name, value): + def _set_initial(self, name: str, value: float) -> None: """Internal method to set the initial parameter value. Notes @@ -76,7 +83,7 @@ def _set_initial(self, name, value): self.parameters.loc[name, 'initial'] = value @set_parameter - def _set_pmin(self, name, value): + def _set_pmin(self, name: str, value: float) -> None: """Internal method to set the lower bound of the parameter value. Notes @@ -86,7 +93,7 @@ def _set_pmin(self, name, value): self.parameters.loc[name, 'pmin'] = value @set_parameter - def _set_pmax(self, name, value): + def _set_pmax(self, name: str, value: float) -> None: """Internal method to set the upper bound of the parameter value. Notes @@ -96,7 +103,7 @@ def _set_pmax(self, name, value): self.parameters.loc[name, 'pmax'] = value @set_parameter - def _set_vary(self, name, value): + def _set_vary(self, name: str, value: float) -> None: """Internal method to set if the parameter is varied during optimization. 
@@ -106,7 +113,7 @@ def _set_vary(self, name, value): """ self.parameters.loc[name, 'vary'] = bool(value) - def simulate(self, h, p): + def simulate(self, h: Series, p: ArrayLike) -> Series: if self.nparam == 1: # value above a threshold p[0] are equal to the threshold h[h > p[0]] = p[0] @@ -118,7 +125,7 @@ def simulate(self, h, p): raise ValueError('Not yet implemented yet') return h - def to_dict(self): + def to_dict(self) -> dict: data = { "transform": self._name, "value": self.value, diff --git a/pastas/typing/__init__.py b/pastas/typing/__init__.py new file mode 100644 index 00000000..1bbe7202 --- /dev/null +++ b/pastas/typing/__init__.py @@ -0,0 +1,16 @@ +from .types import ( + Axes, + Figure, + TimestampType, + Model, + TimeSeries, + StressModel, + NoiseModel, + Solver, + Recharge, + Reservoir, + CallBack, + Function, + RFunc, + ArrayLike, +) diff --git a/pastas/typing/types.py b/pastas/typing/types.py new file mode 100644 index 00000000..c72bb21b --- /dev/null +++ b/pastas/typing/types.py @@ -0,0 +1,34 @@ +# Type hinting for Pastas library +# Typing +from typing import Union, Any, TypeVar, TYPE_CHECKING + +# External libraries +# Pandas +from pandas import Timestamp +# Matplotlib +from matplotlib.axes._base import _AxesBase +from matplotlib.figure import FigureBase +# Numpy +from numpy.typing import ArrayLike as NumpyArrayLike + +# External Types +Axes = TypeVar("Axes", bound=_AxesBase) # Matplotlib Axes +Figure = TypeVar("Figure", bound=FigureBase) # Matplotlib Figure +ArrayLike = TypeVar("ArrayLike", bound=NumpyArrayLike) # Array Like (NumPy based) + +# Internal library +if TYPE_CHECKING: # https://mypy.readthedocs.io/en/latest/runtime_troubles.html + import pastas as ps + +# Internal Types +TimestampType = TypeVar("TimestampType", bound=Union[str, Timestamp]) # Tmin or Tmax +Model = TypeVar("Model", bound="ps.Model") # Model +TimeSeries = TypeVar("TimeSeries", bound="ps.TimeSeries") # Time Series +StressModel = TypeVar("StressModel", 
bound="ps.stressmodels.StressModelBase") # Stress Model +NoiseModel = TypeVar("NoiseModel", bound="ps.noisemodels.NoiseModelBase") # Noise Model +Solver = TypeVar("Solver", bound="ps.solver.BaseSolver") # Base Solver +Recharge = TypeVar("Recharge", bound="ps.recharge.RechargeBase") # Recharge Base +Reservoir = TypeVar("Reservoir", bound="ps.reservoir.ReservoirBase") # Reservoir Base +CallBack = TypeVar("CallBack", bound=Any) # Callback +Function = TypeVar("Function", bound=Any) # Function (e.g. Objective Function) +RFunc = TypeVar("RFunc", bound="ps.rfunc.RfuncBase") # rFunc Base diff --git a/pastas/utils.py b/pastas/utils.py index d4799507..d788ab7e 100644 --- a/pastas/utils.py +++ b/pastas/utils.py @@ -7,15 +7,20 @@ import numpy as np from packaging import version -from pandas import Series, Timedelta, Timestamp, date_range, to_datetime +from pandas import Series, Timedelta, Timestamp, date_range, to_datetime, Index, DatetimeIndex from pandas.tseries.frequencies import to_offset from scipy import __version__ as sc_version from scipy import interpolate +# Type Hinting +from typing import Optional, Tuple, Any +from pastas.typing import ArrayLike, TimestampType +from pastas.typing import Model as ModelType + logger = logging.getLogger(__name__) -def frequency_is_supported(freq): +def frequency_is_supported(freq: str) -> str: """Method to determine if a frequency is supported for a Pastas model. Parameters @@ -59,7 +64,7 @@ def frequency_is_supported(freq): return freq -def _get_stress_dt(freq): +def _get_stress_dt(freq: str) -> float: """Internal method to obtain a timestep in days from a frequency string. Parameters @@ -113,7 +118,7 @@ def _get_stress_dt(freq): return dt -def _get_dt(freq): +def _get_dt(freq: str) -> float: """Internal method to obtain a timestep in DAYS from a frequency string. Parameters @@ -130,8 +135,8 @@ def _get_dt(freq): return dt -def _get_time_offset(t, freq): - """Internal method to calculate the time offset of a TimeStamp. 
+def _get_time_offset(t: Timestamp, freq: str) -> Timedelta:
+    """Internal method to calculate the time offset of a Timestamp.
 
     Parameters
     ----------
@@ -151,7 +156,7 @@
     return t - t.floor(freq)
 
 
-def get_sample(tindex, ref_tindex):
+def get_sample(tindex: Index, ref_tindex: Index) -> Index:
     """Sample the index so that the frequency is not higher than the
     frequency of ref_tindex.
 
@@ -181,7 +186,7 @@
     return tindex[ind]
 
 
-def timestep_weighted_resample(series0, tindex):
+def timestep_weighted_resample(series0: Series, tindex: Index) -> Series:
     """Resample a timeseries to a new tindex, using an overlapping period
     weighted average.
 
@@ -241,7 +246,7 @@
     return series
 
 
-def timestep_weighted_resample_fast(series0, freq):
+def timestep_weighted_resample_fast(series0: Series, freq: str) -> Series:
     """Resample a time series to a new frequency, using an overlapping period
     weighted average.
 
@@ -296,7 +301,7 @@
     return series
 
 
-def get_equidistant_series(series, freq, minimize_data_loss=False):
+def get_equidistant_series(series: Series,
+                           freq: str,
+                           minimize_data_loss: bool = False) -> Series:
     """Get equidistant timeseries using nearest reindexing.
 
     This method will shift observations to the nearest equidistant timestep to
@@ -397,7 +404,7 @@
     return s
 
 
-def to_daily_unit(series, method=True):
+def to_daily_unit(series: Series, method: bool = True) -> Series:
     """Experimental method, use wth caution!
 
     Recalculate a timeseries of a stress with a non-daily unit (e/g.
@@ -413,7 +420,7 @@
     return series
 
 
-def excel2datetime(tindex, freq="D"):
+def excel2datetime(tindex: DatetimeIndex, freq: str = "D") -> DatetimeIndex:
     """Method to convert excel datetime to pandas timetime objects.
     Parameters
@@ -430,7 +437,7 @@
     return datetimes
 
 
-def datenum_to_datetime(datenum):
+def datenum_to_datetime(datenum: float) -> datetime:
    """Convert Matlab datenum into Python datetime.
 
     Parameters
@@ -448,15 +455,15 @@
         + timedelta(days=days) - timedelta(days=366)
 
 
-def datetime2matlab(tindex):
+def datetime2matlab(tindex: DatetimeIndex) -> ArrayLike:
     mdn = tindex + Timedelta(days=366)
     frac = (tindex - tindex.round("D")).seconds / (24.0 * 60.0 * 60.0)
     return mdn.toordinal() + frac
 
 
-def get_stress_tmin_tmax(ml):
+def get_stress_tmin_tmax(ml: ModelType) -> Tuple[TimestampType, TimestampType]:
     """Get the minimum and maximum time that all of the stresses have data."""
-    from .model import Model
+    from pastas import Model
     tmin = Timestamp.min
     tmax = Timestamp.max
     if isinstance(ml, Model):
@@ -469,7 +476,8 @@
     return tmin, tmax
 
 
-def initialize_logger(logger=None, level=logging.INFO):
+def initialize_logger(logger: Optional[logging.Logger] = None,
+                      level: int = logging.INFO) -> None:
     """Internal method to create a logger instance to log program output.
 
     Parameters
@@ -487,8 +495,9 @@
     # add_file_handlers(logger)
 
 
-def set_console_handler(logger=None, level=logging.INFO,
-                        fmt="%(levelname)s: %(message)s"):
+def set_console_handler(logger: Optional[logging.Logger] = None,
+                        level: int = logging.INFO,
+                        fmt: str = "%(levelname)s: %(message)s") -> None:
     """Method to add a console handler to the logger of Pastas.
 
     Parameters
@@ -508,7 +517,7 @@
     logger.addHandler(ch)
 
 
-def set_log_level(level):
+def set_log_level(level: str) -> None:
     """Set the log-level of the console.
 
     This method is just a wrapper around set_console_handler.
@@ -527,7 +536,7 @@
     set_console_handler(level=level)
 
 
-def remove_console_handler(logger=None):
+def remove_console_handler(logger: Optional[logging.Logger] = None) -> None:
     """Method to remove the console handler to the logger of Pastas.
 
     Parameters
@@ -544,11 +553,15 @@
     logger.removeHandler(handler)
 
 
-def add_file_handlers(logger=None, filenames=('info.log', 'errors.log'),
-                      levels=(logging.INFO, logging.ERROR), maxBytes=10485760,
-                      backupCount=20, encoding='utf8',
-                      fmt='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
-                      datefmt='%y-%m-%d %H:%M'):
+def add_file_handlers(
+        logger: Optional[logging.Logger] = None,
+        filenames: Tuple[str, ...] = ('info.log', 'errors.log'),
+        levels: Tuple[int, ...] = (logging.INFO, logging.ERROR),
+        maxBytes: int = 10485760,
+        backupCount: int = 20,
+        encoding: str = 'utf8',
+        fmt: str = '%(asctime)s - %(name)s - %(levelname)s - %(message)s',
+        datefmt: str = '%y-%m-%d %H:%M') -> None:
     """Method to add file handlers in the logger of Pastas.
 
     Parameters
@@ -573,7 +586,7 @@
     logger.addHandler(fh)
 
 
-def remove_file_handlers(logger=None):
+def remove_file_handlers(logger: Optional[logging.Logger] = None) -> None:
     """Method to remove any file handlers in the logger of Pastas.
 
     Parameters
@@ -590,7 +603,7 @@
     logger.removeHandler(handler)
 
 
-def validate_name(name, raise_error=False):
+def validate_name(name: str, raise_error: bool = False) -> str:
     """Method to check user-provided names and log a warning if wrong.
 
     Parameters
@@ -615,7 +628,8 @@
     name = str(name)
     for char in ilchar:
         if char in name:
-            msg = f"User-provided name '{name}' contains illegal character. Please remove {char} from name."
+            msg = f"User-provided name '{name}' contains illegal character."
+            msg += f" Please remove {char} from name."
if raise_error: raise Exception(msg) else: @@ -624,7 +638,7 @@ def validate_name(name, raise_error=False): return name -def show_versions(lmfit=False, numba=False): +def show_versions(lmfit: bool = False, numba: bool = False) -> None: """Method to print the version of dependencies. Parameters @@ -644,8 +658,8 @@ def show_versions(lmfit=False, numba=False): msg = ( f"Python version: {os_version}\n" - f"Numpy version: {np_version}\n" - f"Scipy version: {sc_version}\n" + f"NumPy version: {np_version}\n" + f"SciPy version: {sc_version}\n" f"Pandas version: {pd_version}\n" f"Pastas version: {ps_version}\n" f"Matplotlib version: {mpl_version}" @@ -661,7 +675,7 @@ def show_versions(lmfit=False, numba=False): return print(msg) -def check_numba(): +def check_numba() -> None: try: from numba import njit except ImportError: diff --git a/requirements.ci.txt b/requirements.ci.txt index fe32b087..b169a345 100644 --- a/requirements.ci.txt +++ b/requirements.ci.txt @@ -14,4 +14,6 @@ lmfit>=1.0.0 corner -emcee \ No newline at end of file +emcee + +tqdm \ No newline at end of file
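Reviewer note: the new `pastas/typing/types.py` combines two idioms that are easy to get wrong, so a runnable sketch of the pattern may help future contributors. String-valued `bound=` arguments are forward references that only a static checker resolves, and the `if TYPE_CHECKING:` guard keeps the package from importing itself at runtime. The sketch below is a minimal, hedged illustration, not Pastas code: `describe` and `FakeModel` are hypothetical names, and `datetime` stands in for `pandas.Timestamp` so the example runs without Pastas or pandas installed.

```python
from datetime import datetime
from typing import TYPE_CHECKING, TypeVar, Union

if TYPE_CHECKING:
    # Evaluated only by static type checkers (e.g. mypy), never at
    # runtime, so no circular import occurs when the package imports
    # its own typing module.
    import pastas as ps

# String bounds are forward references: a checker resolves them via the
# guarded import above; at runtime they stay inert strings.
Model = TypeVar("Model", bound="ps.Model")
TimestampType = TypeVar("TimestampType", bound=Union[str, datetime])


def describe(ml: Model, tmin: TimestampType) -> str:
    # Annotations are inert at runtime; a checker would require ml to be
    # a ps.Model subclass and tmin to be a str or datetime.
    return f"{type(ml).__name__} starting at {tmin}"


class FakeModel:  # hypothetical stand-in; a checker would flag this call
    pass


print(describe(FakeModel(), "2020-01-01"))
```

Because the annotations never execute, this file imports cleanly even though `pastas` is absent; only tools that set `TYPE_CHECKING` to `True` ever follow the guarded import.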