Commit

Merge pull request scikit-learn#2 from scikit-learn/master
Merging changes from the main repository
arka204 committed May 10, 2020
2 parents 3b79637 + c36c104 commit 464dc37
Showing 270 changed files with 5,474 additions and 2,297 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -39,6 +39,7 @@ doc/samples
*.prof
.tox/
.coverage
pip-wheel-metadata

lfw_preprocessed/
nips2010_pdf/
22 changes: 22 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,22 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v2.3.0
hooks:
- id: check-yaml
- id: end-of-file-fixer
- id: trailing-whitespace
- repo: https://gitlab.com/pycqa/flake8
rev: 3.7.8
hooks:
- id: flake8
types: [file, python]
# only check for unused imports for now, as long as
# the code is not fully PEP8 compatible
args: [--select=F401]
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v0.730
hooks:
- id: mypy
args:
- --ignore-missing-imports
files: sklearn/
3 changes: 0 additions & 3 deletions build_tools/azure/install.sh
@@ -98,9 +98,6 @@ elif [[ "$DISTRIB" == "conda-pip-latest" ]]; then
python -m pip install -U pip
python -m pip install pytest==$PYTEST_VERSION pytest-cov pytest-xdist

# TODO: Remove pin when https://github.com/python-pillow/Pillow/issues/4518 gets fixed
python -m pip install "pillow>=4.3.0,!=7.1.0,!=7.1.1"

python -m pip install pandas matplotlib pyamg scikit-image
# do not install dependencies for lightgbm since it requires scikit-learn
python -m pip install lightgbm --no-deps
4 changes: 2 additions & 2 deletions build_tools/generate_authors_table.py
@@ -11,14 +11,15 @@
import getpass
import time
from pathlib import Path
from os import path

print("user:", file=sys.stderr)
user = input()
passwd = getpass.getpass("Password or access token:\n")
auth = (user, passwd)

LOGO_URL = 'https://avatars2.githubusercontent.com/u/365630?v=4'
REPO_FOLDER = Path(__file__).parent.parent
REPO_FOLDER = Path(path.abspath(__file__)).parent.parent


def get(url):
@@ -100,7 +101,6 @@ def get_profile(login):
'Duchesnay': 'Edouard Duchesnay',
'Lars': 'Lars Buitinck',
'MechCoder': 'Manoj Kumar',
'jeremiedbb': 'Jérémie Du Boisberranger',
}
if profile["name"] in missing_names:
profile["name"] = missing_names[profile["name"]]
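The `REPO_FOLDER` change in this file matters when the script is started with a relative path. Below is a minimal sketch of the difference, assuming `__file__` is a bare relative file name (as it can be when the script is launched from inside `build_tools/`). The helper names `naive_repo_folder` and `absolute_repo_folder` are hypothetical and not part of the commit.

```python
from os import path
from pathlib import Path


def naive_repo_folder(script_file):
    """Old behaviour: walk up two levels from a possibly relative path."""
    return Path(script_file).parent.parent


def absolute_repo_folder(script_file):
    """New behaviour: make the path absolute first, then walk up two levels."""
    return Path(path.abspath(script_file)).parent.parent


# Assumed value of __file__ when the script is launched from build_tools/.
script = "generate_authors_table.py"

# A bare file name has no real parents, so both steps stay at ".".
print(naive_repo_folder(script))

# The absolute variant is anchored at the current working directory and
# returns that directory's parent.
print(absolute_repo_folder(script))
```

With the old expression, the computed folder depends on where the script is invoked from; making the path absolute first removes that dependency.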
10 changes: 0 additions & 10 deletions conftest.py
@@ -99,16 +99,6 @@ def pytest_unconfigure(config):
del sys._is_pytest_session


def pytest_runtest_setup(item):
if isinstance(item, DoctestItem):
set_config(print_changed_only=True)


def pytest_runtest_teardown(item, nextitem):
if isinstance(item, DoctestItem):
set_config(print_changed_only=False)


# TODO: Remove when modules are deprecated in 0.24
# Configures pytest to ignore deprecated modules.
collect_ignore_glob = [
42 changes: 21 additions & 21 deletions doc/about.rst
@@ -271,82 +271,82 @@ July 2017.
</div>
</div>

............
Past Sponsors
.............

.. raw:: html

<div class="sk-sponsor-div">
<div class="sk-sponsor-div-box">

`Anaconda, Inc <https://www.anaconda.com/>`_ funds Adrin Jalali since 2019.
`INRIA <https://www.inria.fr>`_ actively supports this project. It has
provided funding for Fabian Pedregosa (2010-2012), Jaques Grobler
(2012-2013) and Olivier Grisel (2013-2017) to work on this project
full-time. It also hosts coding sprints and other events.

.. raw:: html

</div>

<div class="sk-sponsor-div-box">

.. image:: images/anaconda.png
.. image:: images/inria-logo.jpg
:width: 100pt
:align: center
:target: https://sydney.edu.au/
:target: https://www.inria.fr

.. raw:: html

</div>
</div>

Past Sponsors
.............
.....................

.. raw:: html

<div class="sk-sponsor-div">
<div class="sk-sponsor-div-box">

`INRIA <https://www.inria.fr>`_ actively supports this project. It has
provided funding for Fabian Pedregosa (2010-2012), Jaques Grobler
(2012-2013) and Olivier Grisel (2013-2017) to work on this project
full-time. It also hosts coding sprints and other events.
`Paris-Saclay Center for Data Science
<https://www.datascience-paris-saclay.fr/>`_
funded one year for a developer to work on the project full-time
(2014-2015), 50% of the time of Guillaume Lemaitre (2016-2017) and 50% of the
time of Joris van den Bossche (2017-2018).

.. raw:: html

</div>

<div class="sk-sponsor-div-box">

.. image:: images/inria-logo.jpg
.. image:: images/cds-logo.png
:width: 100pt
:align: center
:target: https://www.inria.fr
:target: https://www.datascience-paris-saclay.fr/

.. raw:: html

</div>
</div>

.....................
............

.. raw:: html

<div class="sk-sponsor-div">
<div class="sk-sponsor-div-box">

`Paris-Saclay Center for Data Science
<https://www.datascience-paris-saclay.fr/>`_
funded one year for a developer to work on the project full-time
(2014-2015), 50% of the time of Guillaume Lemaitre (2016-2017) and 50% of the
time of Joris van den Bossche (2017-2018).
`Anaconda, Inc <https://www.anaconda.com/>`_ funded Adrin Jalali in 2019.

.. raw:: html

</div>

<div class="sk-sponsor-div-box">

.. image:: images/cds-logo.png
.. image:: images/anaconda.png
:width: 100pt
:align: center
:target: https://www.datascience-paris-saclay.fr/
:target: https://www.anaconda.com/

.. raw:: html

2 changes: 1 addition & 1 deletion doc/authors.rst
@@ -7,7 +7,7 @@
</style>
<div>
<a href='https://github.com/jeremiedbb'><img src='https://avatars2.githubusercontent.com/u/34657725?v=4' class='avatar' /></a> <br />
<p>Jérémie Du Boisberranger</p>
<p>Jérémie du Boisberranger</p>
</div>
<div>
<a href='https://github.com/jorisvandenbossche'><img src='https://avatars2.githubusercontent.com/u/1020496?v=4' class='avatar' /></a> <br />
27 changes: 24 additions & 3 deletions doc/conf.py
@@ -17,6 +17,7 @@
import warnings
import re
from packaging.version import parse
from pathlib import Path

# If extensions (or modules to document with autodoc) are in another
# directory, add these directories to sys.path here. If the directory
@@ -208,6 +209,23 @@
# If true, the reST sources are included in the HTML build as _sources/name.
html_copy_source = True

# Adds variables into templates
html_context = {}
# finds latest release highlights and places it into HTML context for
# index.html
release_highlights_dir = Path("..") / "examples" / "release_highlights"
# Finds the highlight with the latest version number
latest_highlights = sorted(release_highlights_dir.glob(
"plot_release_highlights_*.py"))[-1]
latest_highlights = latest_highlights.with_suffix('').name
html_context["release_highlights"] = \
f"auto_examples/release_highlights/{latest_highlights}"

# get version from highlight name assuming highlights have the form
# plot_release_highlights_0_22_0
highlight_version = ".".join(latest_highlights.split("_")[-3:-1])
html_context["release_highlights_version"] = highlight_version

# -- Options for LaTeX output ------------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
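The release-highlights lookup added to `doc/conf.py` can be traced on a fixed list of file names. This is a minimal sketch, assuming the `plot_release_highlights_<major>_<minor>_<patch>.py` naming described in the comment above. The function name `latest_highlight` and the sample file names are hypothetical stand-ins for the `glob` result.

```python
from pathlib import Path


def latest_highlight(filenames):
    """Return (example name, version string) for the last file in sorted order."""
    # Same steps as the conf.py hunk: sort, take the last entry, drop ".py".
    latest = sorted(Path(name) for name in filenames)[-1]
    stem = latest.with_suffix("").name
    # Keep the major and minor components, dropping the trailing patch number.
    version = ".".join(stem.split("_")[-3:-1])
    return stem, version


# Hypothetical directory contents standing in for the glob() result.
stem, version = latest_highlight([
    "plot_release_highlights_0_23_0.py",
    "plot_release_highlights_0_22_0.py",
])
print(stem, version)
```

Note that the sort is lexicographic, so in this sketch a name ending in `0_9_0` would sort after one ending in `0_22_0`. The approach relies on version components that compare correctly as strings.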
@@ -281,6 +299,11 @@ def __repr__(self):

def __call__(self, directory):
src_path = os.path.normpath(os.path.join(self.src_dir, directory))

# Forces Release Highlights to the top
if os.path.basename(src_path) == "release_highlights":
return "0"

readme = os.path.join(src_path, "README.txt")

try:
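The early `return "0"` in this hunk appears to rely on gallery subsections being ordered by the string the key returns. This is a minimal sketch of that idea, assuming plain lexicographic sorting of the keys. The `titles` mapping and the directory names are hypothetical and stand in for the README titles the real class reads.

```python
import os

# Hypothetical section titles, keyed by example directory name.
titles = {
    "cluster": "Clustering",
    "linear_model": "Generalized Linear Models",
    "release_highlights": "Release Highlights",
}


def subsection_key(directory):
    """Sort key: pin release_highlights first, otherwise sort by title."""
    name = os.path.basename(os.path.normpath(directory))
    if name == "release_highlights":
        # "0" compares lower than any title starting with a letter.
        return "0"
    return titles[name]


ordered = sorted(
    ["../examples/linear_model", "../examples/release_highlights",
     "../examples/cluster"],
    key=subsection_key,
)
print([os.path.basename(p) for p in ordered])
```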
@@ -314,6 +337,7 @@ def __call__(self, directory):
},
# avoid generating too many cross links
'inspect_global_variables': False,
'remove_config_comments': True,
}


@@ -386,6 +410,3 @@ def setup(app):
warnings.filterwarnings("ignore", category=UserWarning,
message='Matplotlib is currently using agg, which is a'
' non-GUI backend, so cannot show the figure.')

# Reduces the output of estimators
sklearn.set_config(print_changed_only=True)
34 changes: 23 additions & 11 deletions doc/developers/contributing.rst
@@ -248,19 +248,28 @@ modifying code and submitting a PR:
and start making changes. Always use a feature branch. It's good
practice to never work on the ``master`` branch!

9. Develop the feature on your feature branch on your computer, using Git to
do the version control. When you're done editing, add changed files using
``git add`` and then ``git commit``::
9. (**Optional**) Install `pre-commit <https://pre-commit.com/#install>`_ to
run code style checks before each commit::

$ git add modified_files
$ git commit
$ pip install pre-commit
$ pre-commit install

to record your changes in Git, then push the changes to your GitHub
account with::
pre-commit checks can be disabled for a particular commit with
`git commit -n`.

10. Develop the feature on your feature branch on your computer, using Git to
do the version control. When you're done editing, add changed files using
``git add`` and then ``git commit``::
$ git add modified_files
$ git commit

to record your changes in Git, then push the changes to your GitHub
account with::

$ git push -u origin my_feature

10. Follow `these
11. Follow `these
<https://help.github.com/articles/creating-a-pull-request-from-a-fork>`_
instructions to create a pull request from your fork. This will send an
email to the committers. You may want to consider sending an email to the
@@ -422,9 +431,12 @@ You can check for common programming errors with the following tools:

mypy --ignore-missing-imports sklearn

must not produce new errors in your pull request. Using `# type: ignore` annotation can be a workaround for a few cases that are not supported by mypy, in particular,
- when importing C or Cython modules
- on properties with decorators
must not produce new errors in your pull request. Using `# type: ignore`
annotation can be a workaround for a few cases that are not supported by
mypy, in particular,

- when importing C or Cython modules
- on properties with decorators
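The two `# type: ignore` cases listed above might look like the following. This is a minimal sketch, not code from the commit: `_fast_ext` is a hypothetical compiled module, `deprecated` is a hypothetical no-op stand-in for a real decorator, and the placement of the comment on the decorator line is an assumption about where mypy reports the error.

```python
# Case 1: importing a C or Cython extension that mypy cannot analyse.
# `_fast_ext` is a hypothetical module name; the fallback keeps this runnable.
try:
    import _fast_ext  # type: ignore
except ImportError:
    _fast_ext = None


def deprecated(prop):
    """Hypothetical no-op decorator standing in for a real deprecation helper."""
    return prop


class Model:
    # Case 2: a property wrapped by another decorator, which mypy does not
    # support; the comment silences the error for this one construct.
    @deprecated  # type: ignore
    @property
    def coef_(self):
        return 1.0


print(Model().coef_)
```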

Bonus points for contributions that include a performance analysis with
a benchmark script and profiling output (please report on the mailing
49 changes: 20 additions & 29 deletions doc/developers/develop.rst
@@ -246,40 +246,19 @@ whether it is just for you or for contributing it to scikit-learn, there are
several internals of scikit-learn that you should be aware of in addition to
the scikit-learn API outlined above. You can check whether your estimator
adheres to the scikit-learn interface and standards by running
:func:`utils.estimator_checks.check_estimator` on the class::
:func:`~sklearn.utils.estimator_checks.check_estimator` on an instance. The
:func:`~sklearn.utils.estimator_checks.parametrize_with_checks` pytest decorator can also be
used (see its docstring for details and possible interactions with `pytest`)::

>>> from sklearn.utils.estimator_checks import check_estimator
>>> from sklearn.svm import LinearSVC
>>> check_estimator(LinearSVC) # passes
>>> check_estimator(LinearSVC()) # passes

The main motivation to make a class compatible with the scikit-learn estimator
interface might be that you want to use it together with model evaluation and
selection tools such as :class:`model_selection.GridSearchCV` and
:class:`pipeline.Pipeline`.

Setting `generate_only=True` returns a generator that yields (estimator, check)
tuples in which each check can be called independently of the others, i.e.
`check(estimator)`. This allows all checks to be run independently and the
failing ones to be reported. scikit-learn provides a pytest-specific decorator,
:func:`~sklearn.utils.estimator_checks.parametrize_with_checks`, making it easier to test
multiple estimators::

from sklearn.utils.estimator_checks import parametrize_with_checks
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeRegressor

@parametrize_with_checks([LogisticRegression, DecisionTreeRegressor])
def test_sklearn_compatible_estimator(estimator, check):
check(estimator)

This decorator sets the `id` keyword in `pytest.mark.parametrize`, exposing
the name of the underlying estimator and check in the test name. This allows
`pytest -k` to be used to specify which tests to run.

.. code-block:: bash
pytest test_check_estimators.py -k check_estimators_fit_returns_self
Before detailing the required interface below, we describe two ways to achieve
the correct interface more easily.

@@ -531,17 +510,29 @@ requires_fit (default=True)
requires_positive_X (default=False)
whether the estimator requires positive X.

requires_y (default=False)
whether the estimator requires y to be passed to `fit`, `fit_predict` or
`fit_transform` methods. The tag is True for estimators inheriting from
`~sklearn.base.RegressorMixin` and `~sklearn.base.ClassifierMixin`.

requires_positive_y (default=False)
whether the estimator requires a positive y (only applicable for regression).

_skip_test (default=False)
whether to skip common tests entirely. Don't use this unless you have a
*very good* reason.

_xfail_test (default=False)
dictionary ``{check_name : reason}`` of common checks to mark as a
known failure, with the associated reason. Don't use this unless you have a
*very good* reason.
_xfail_checks (default=False)
dictionary ``{check_name: reason}`` of common checks that will be marked
as `XFAIL` for pytest, when using
:func:`~sklearn.utils.estimator_checks.parametrize_with_checks`. This tag
currently has no effect on
:func:`~sklearn.utils.estimator_checks.check_estimator`.
Don't use this unless there is a *very good* reason for your estimator
not to pass the check.
Also note that the usage of this tag is highly subject to change because
we are trying to make it more flexible: be prepared for breaking changes
in the future.

stateless (default=False)
whether the estimator needs access to data for fitting. Even though an
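This is a minimal sketch of how an estimator might declare the `_xfail_checks` tag described in this hunk. It assumes tags are supplied through a `_more_tags` method, which this hunk does not show. `MyEstimator` and the reason string are hypothetical. A real estimator would inherit from `sklearn.base.BaseEstimator`, which is omitted here to keep the sketch dependency-free.

```python
class MyEstimator:
    """Hypothetical estimator that marks one common check as a known failure."""

    def fit(self, X, y=None):
        # Nothing is learned in this sketch; fit only returns self.
        return self

    def _more_tags(self):
        # Assumed tag hook: maps a check name to the reason it is expected
        # to fail, following the {check_name: reason} form documented above.
        return {
            "_xfail_checks": {
                "check_methods_subset_invariance":
                    "hypothetical reason: predictions depend on batch size",
            }
        }


tags = MyEstimator()._more_tags()
print(tags["_xfail_checks"])
```

As the tag description states, this only affects `parametrize_with_checks`; `check_estimator` currently ignores it.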
2 changes: 1 addition & 1 deletion doc/developers/plotting.rst
@@ -50,7 +50,7 @@ attributes::
estimator.__class__.__name__)
return viz.plot(ax=ax, name=name, **kwargs)

Read more in :ref:`sphx_glr_auto_examples_plot_roc_curve_visualization_api.py`
Read more in :ref:`sphx_glr_auto_examples_miscellaneous_plot_roc_curve_visualization_api.py`
and the :ref:`User Guide <visualizations>`.

Plotting with Multiple Axes
