Consolidate all requirements (#2597)
* Empty

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Avoid adding problematic lines to requirements.txt

kedro is preinstalled in the behave environment,
so this should not be needed.
In exchange, requirements.txt files can always be read
by setuptools automatic parsing.

See McK-Private/private-kedro#352 (comment)
for motivation of the original logic.

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Consolidate all requirements

Fix gh-2588.

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Fix dependencies

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Ignore import order in pylint in favour of isort

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Do not replace kedro plugins in e2e tests

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Fix dependabot config

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Add release notes

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

---------

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
astrojuanlu committed Jul 18, 2023
1 parent 151f9ac commit 4eb6d87
Showing 14 changed files with 118 additions and 123 deletions.
12 changes: 6 additions & 6 deletions .circleci/continue_config.yml
@@ -61,7 +61,7 @@ commands:
command: conda install -c conda-forge pytables -y
- run:
name: Install requirements and test requirements
command: pip install --upgrade -r test_requirements.txt
command: pip install --upgrade .[test]
- run:
# this is needed to fix java cacerts so
# spark can automatically download packages from mvn
@@ -146,7 +146,7 @@ commands:
steps:
- restore_cache:
name: Restore package cache
key: kedro-deps-v1-win-{{ checksum "dependency/requirements.txt" }}-{{ checksum "test_requirements.txt" }}
key: kedro-deps-v1-win-{{ checksum "pyproject.toml" }}-{{ checksum "setup.py" }}
# We don't restore the conda environment cache for python 3.10 as it conflicts with the
# 'Install GDAL, Fiona and pytables' step breaking the conda environment (missing zlib.dll).
- unless:
@@ -155,7 +155,7 @@ commands:
steps:
- restore_cache:
name: Restore conda environment cache
key: kedro-deps-v1-win-<<parameters.python_version>>-{{ checksum "dependency/requirements.txt" }}-{{ checksum "test_requirements.txt" }}
key: kedro-deps-v1-win-<<parameters.python_version>>-{{ checksum "pyproject.toml" }}-{{ checksum "setup.py" }}
# pytables and Fiona have a series of binary dependencies under Windows that
# are best handled by conda-installing instead of pip-installing them.
# Dependency resolution works best when installing these altogether in one
@@ -168,7 +168,7 @@ commands:
command: conda activate kedro_builder; pip debug --verbose
- run:
name: Install all requirements
command: conda activate kedro_builder; pip install -v -r test_requirements.txt -U
command: conda activate kedro_builder; pip install -v -U .[test]
- run:
name: Print Python environment
command: conda activate kedro_builder; make print-python-env
@@ -337,7 +337,7 @@ jobs:
steps:
- save_cache:
name: Save Python package cache
key: kedro-deps-v1-win-{{ checksum "dependency/requirements.txt" }}-{{ checksum "test_requirements.txt" }}
key: kedro-deps-v1-win-{{ checksum "pyproject.toml" }}-{{ checksum "setup.py" }}
paths:
# Cache pip cache and conda packages directories
- c:\tools\miniconda3\pkgs
@@ -350,7 +350,7 @@ jobs:
steps:
- save_cache:
name: Save conda environment cache
key: kedro-deps-v1-win-<<parameters.python_version>>-{{ checksum "dependency/requirements.txt" }}-{{ checksum "test_requirements.txt" }}
key: kedro-deps-v1-win-<<parameters.python_version>>-{{ checksum "pyproject.toml" }}-{{ checksum "setup.py" }}
paths:
- c:\tools\miniconda3\envs\kedro_builder
- run:
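
The cache keys in this file now hash `pyproject.toml` and `setup.py` instead of the two deleted requirements files, so a cache entry is only reused while both files that declare dependencies are unchanged. A minimal sketch of that idea follows. The `cache_key` helper and its hashing scheme are hypothetical illustrations, not CircleCI's actual `checksum` implementation.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_key(prefix: str, *files: Path) -> str:
    """Join a prefix with one truncated SHA-256 digest per file (hypothetical scheme)."""
    digests = [hashlib.sha256(f.read_bytes()).hexdigest()[:12] for f in files]
    return "-".join([prefix, *digests])


with tempfile.TemporaryDirectory() as tmp:
    pyproject = Path(tmp) / "pyproject.toml"
    setup_py = Path(tmp) / "setup.py"
    pyproject.write_text('dependencies = ["click<9.0"]\n')
    setup_py.write_text('extras_require = {"test": ["pytest~=7.2"]}\n')

    key_before = cache_key("kedro-deps-v1-win", pyproject, setup_py)

    # Editing a dependency declaration changes that file's digest, and so the key.
    pyproject.write_text('dependencies = ["click<9.0", "rich>=12.0, <14.0"]\n')
    key_after = cache_key("kedro-deps-v1-win", pyproject, setup_py)

print(key_before != key_after)
```

Only the digest of the edited file changes, which is why both files need to appear in the key.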
2 changes: 1 addition & 1 deletion .github/dependabot.yml
@@ -6,7 +6,7 @@
version: 2
updates:
- package-ecosystem: "pip" # See documentation for possible values
directory: "/dependency" # Location of package manifests
directory: "/" # Location of package manifests
schedule:
interval: "weekly"
labels:
3 changes: 1 addition & 2 deletions .gitpod.yml
@@ -5,10 +5,9 @@ tasks:

init: |
make sign-off
pip install -e /workspace/kedro
pip install -e /workspace/kedro[test]
cd /workspace
yes project | kedro new -s pandas-iris --checkout main
pip install -r /workspace/kedro/test_requirements.txt
cd /workspace/kedro
pre-commit install --install-hooks
2 changes: 1 addition & 1 deletion .readthedocs.yml
@@ -36,4 +36,4 @@ python:
path: .
extra_requirements:
- docs
- requirements: test_requirements.txt
- test
2 changes: 0 additions & 2 deletions MANIFEST.in
@@ -1,7 +1,5 @@
include README.md
include LICENSE.md
include dependency/requirements.txt
include test_requirements.txt
include kedro/framework/project/default_logging.yml
include kedro/ipython/*.png
include kedro/ipython/*.svg
2 changes: 1 addition & 1 deletion Makefile
@@ -48,7 +48,7 @@ package: clean install
python -m pip install build && python -m build

install-test-requirements:
pip install -r test_requirements.txt
pip install .[test]

install-pre-commit: install-test-requirements
pre-commit install --install-hooks
1 change: 1 addition & 0 deletions RELEASE.md
@@ -15,6 +15,7 @@
* Activated all built-in resolvers by default for `OmegaConfigLoader` except for `oc.env`.

## Bug fixes and other changes
* Consolidated dependencies and optional dependencies in `pyproject.toml`.

## Documentation changes

24 changes: 0 additions & 24 deletions dependency/requirements.txt

This file was deleted.

1 change: 0 additions & 1 deletion features/environment.py
@@ -56,7 +56,6 @@ def _setup_context_with_venv(context, venv_dir):
context.pip = str(bin_dir / "pip")
context.python = str(bin_dir / "python")
context.kedro = str(bin_dir / "kedro")
context.requirements_path = Path("dependency/requirements.txt").resolve()

# clone the environment, remove any condas and venvs and insert our venv
context.env = os.environ.copy()
11 changes: 5 additions & 6 deletions features/steps/cli_steps.py
@@ -11,6 +11,7 @@
import toml
import yaml
from behave import given, then, when
from packaging.requirements import Requirement

import kedro
from features.steps import util
@@ -407,18 +408,16 @@ def update_pyproject_toml(context: behave.runner.Context, new_source_dir):

@given("I have updated kedro requirements")
def update_kedro_req(context: behave.runner.Context):
"""Replace kedro as a standalone requirement with a line
that includes all of kedro's dependencies (-r kedro/requirements.txt)
"""
"""Remove kedro as a standalone requirement."""
reqs_path = context.root_project_dir / "src" / "requirements.txt"
kedro_reqs = f"-r {context.requirements_path.as_posix()}"

if reqs_path.is_file():
old_reqs = reqs_path.read_text().splitlines()
new_reqs = []
for req in old_reqs:
if req.startswith("kedro"):
new_reqs.append(kedro_reqs)
if req.startswith("kedro") and Requirement(req).name.lower() == "kedro":
# Do not include kedro as it's preinstalled in the environment
pass
else:
new_reqs.append(req)
new_reqs = "\n".join(new_reqs)
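
The new condition in `update_kedro_req` has to drop `kedro` itself while keeping plugins such as `kedro-telemetry`, which is why a bare `startswith("kedro")` test is not enough and the commit also compares the parsed requirement name. The sketch below illustrates the distinction using only the standard library. `requirement_name` is a hypothetical, simplified stand-in for `packaging.requirements.Requirement(...).name` and only handles simple specifiers.

```python
import re

# Simplified project-name pattern: starts with a letter or digit,
# then letters, digits, ".", "_" or "-".
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def requirement_name(line: str) -> str:
    """Return the lower-cased project name, or "" if the line is not a plain requirement."""
    match = _NAME_RE.match(line)
    return match.group(1).lower() if match else ""


def drop_kedro(lines):
    """Keep every line except those that require the `kedro` package itself."""
    return [line for line in lines if requirement_name(line) != "kedro"]


reqs = [
    "kedro~=0.18.11",
    "kedro-telemetry>=0.2.0",
    "kedro[pandas]>=0.18",
    "pytest~=7.2",
    "-r extra.txt",
]
filtered = drop_kedro(reqs)
print(filtered)
```

Lines that are not requirements at all, such as `-r extra.txt`, pass through untouched, which matches the intent of leaving everything but `kedro` in place.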
3 changes: 2 additions & 1 deletion features/windows_reqs.txt
@@ -1,4 +1,4 @@
# same versions as `test_requirements`
# same versions as [test] optional requirements
# e2e tests on Windows are slow but we don't need to install
# everything, so just this subset will be enough for CI
behave==1.2.6
@@ -7,3 +7,4 @@ psutil~=5.8
requests~=2.20
toml~=0.10.1
PyYAML>=4.2, <7.0
packaging>=20.0
32 changes: 29 additions & 3 deletions pyproject.toml
@@ -11,6 +11,32 @@ authors = [
]
description = "Kedro helps you build production-ready data and analytics pipelines"
requires-python = ">=3.7"
dependencies = [
"anyconfig~=0.10.0",
"attrs>=21.3",
"build",
"cachetools~=5.3",
"click<9.0",
"cookiecutter>=2.1.1, <3.0",
"dynaconf>=3.1.2, <4.0",
"fsspec>=2021.4, <2024.1", # Upper bound set arbitrarily, to be reassessed in early 2024
"gitpython~=3.0",
"importlib-metadata>=3.6; python_version >= '3.8'",
"importlib_metadata>=3.6, <5.0; python_version < '3.8'", # The "selectable" entry points were introduced in `importlib_metadata` 3.6 and Python 3.10. Bandit on Python 3.7 relies on a library with `importlib_metadata` < 5.0
"importlib_resources>=1.3", # The `files()` API was introduced in `importlib_resources` 1.3 and Python 3.9.
"jmespath>=0.9.5, <1.0",
"more_itertools~=9.0",
"omegaconf~=2.3",
"parse~=1.19.0",
"pip-tools~=6.5",
"pluggy~=1.0",
"PyYAML>=4.2, <7.0",
"rich>=12.0, <14.0",
"rope>=0.21, <2.0", # subject to LGPLv3 license
"setuptools>=65.5.1",
"toml~=0.10",
"toposort~=1.5", # Needs to be at least 1.5 to be able to raise CircularDependencyError
]
keywords = [
"pipelines",
"machine learning",
@@ -26,7 +52,7 @@ classifiers = [
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
]
dynamic = ["readme", "version", "dependencies", "optional-dependencies"]
dynamic = ["readme", "version", "optional-dependencies"]

[project.urls]
Homepage = "https://kedro.org"
@@ -46,7 +72,6 @@ include = ["kedro*"]
[tool.setuptools.dynamic]
readme = {file = "README.md", content-type = "text/markdown"}
version = {attr = "kedro.__version__"}
dependencies = {file = "dependency/requirements.txt"}

[tool.black]
exclude = "/templates/|^features/steps/test_starter"
@@ -67,7 +92,8 @@ unsafe-load-any-extension = false
[tool.pylint.messages_control]
disable = [
"ungrouped-imports",
"duplicate-code"
"duplicate-code",
"wrong-import-order", # taken care of by isort
]
enable = ["useless-suppression"]
[tool.pylint.refactoring]
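
Several entries in the `dependencies` list added to this file use the compatible-release operator `~=`. Under PEP 440, `~=X.Y` is shorthand for `>=X.Y, ==X.*`, so the number of version components matters: `cachetools~=5.3` accepts any 5.x release from 5.3 onwards, while `anyconfig~=0.10.0` only accepts 0.10.x releases. The helper below, `expand_compatible_release`, is a hypothetical illustration limited to plain numeric versions. It is not part of Kedro or of any packaging library.

```python
def expand_compatible_release(spec: str) -> str:
    """Rewrite 'name~=X.Y[.Z]' as the equivalent '>=' plus prefix-match clauses."""
    name, version = (part.strip() for part in spec.split("~="))
    parts = version.split(".")
    if len(parts) < 2:
        # PEP 440 does not allow '~=' with a single-component version such as '~=1'.
        raise ValueError(f"'~=' needs at least two version components: {spec!r}")
    prefix = ".".join(parts[:-1])
    return f"{name}>={version}, =={prefix}.*"


for spec in ["cachetools~=5.3", "anyconfig~=0.10.0", "toposort~=1.5"]:
    print(spec, "->", expand_compatible_release(spec))
```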
82 changes: 71 additions & 11 deletions setup.py
@@ -1,23 +1,14 @@
from codecs import open
from glob import glob
from itertools import chain
from os import path

from setuptools import setup

name = "kedro"
here = path.abspath(path.dirname(__file__))

# at least 1.3 to be able to use XMLDataSet and pandas integration with fsspec
PANDAS = "pandas~=1.3"
SPARK = "pyspark>=2.2, <4.0"
HDFS = "hdfs>=2.5.8, <3.0"
S3FS = "s3fs>=0.3.0, <0.5"

# get the dependencies and installs
with open("dependency/requirements.txt", encoding="utf-8") as f:
requires = [x.strip() for x in f if x.strip()]

template_files = []
for pattern in ["**/*", "**/.*", "**/.*/**", "**/.*/.**"]:
template_files.extend(
@@ -80,7 +71,9 @@ def _collect_requirements(requires):
"tensorflow.TensorflowModelDataset": [
# currently only TensorFlow V2 supported for saving and loading.
# V1 requires HDF5 and serialises differently
"tensorflow~=2.0"
"tensorflow~=2.0; platform_system != 'Darwin' or platform_machine != 'arm64'",
# https://developer.apple.com/metal/tensorflow-plugin/
"tensorflow-macos~=2.0; platform_system == 'Darwin' and platform_machine == 'arm64'",
]
}
yaml_require = {"yaml.YAMLDataSet": [PANDAS, "PyYAML>=4.2, <7.0"]}
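
The two TensorFlow markers in this hunk are meant to be complementary: on any given machine exactly one of `tensorflow` or `tensorflow-macos` should be selected. A small sketch of that logic follows, written as plain Python predicates rather than real PEP 508 marker evaluation. The function names and the list of sample platforms are hypothetical.

```python
from itertools import product


def wants_tensorflow(system: str, machine: str) -> bool:
    # Mirrors: platform_system != 'Darwin' or platform_machine != 'arm64'
    return system != "Darwin" or machine != "arm64"


def wants_tensorflow_macos(system: str, machine: str) -> bool:
    # Mirrors: platform_system == 'Darwin' and platform_machine == 'arm64'
    return system == "Darwin" and machine == "arm64"


# Sample (platform_system, platform_machine) combinations to check.
platforms = list(product(["Linux", "Windows", "Darwin"], ["x86_64", "aarch64", "arm64"]))

selected = {
    (system, machine): "tensorflow" if wants_tensorflow(system, machine) else "tensorflow-macos"
    for system, machine in platforms
}

# The second marker is the logical negation of the first (De Morgan),
# so exactly one of the two requirements applies on every platform.
exactly_one = all(
    wants_tensorflow(system, machine) != wants_tensorflow_macos(system, machine)
    for system, machine in platforms
)
print(exactly_one)
```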
@@ -139,10 +132,77 @@ def _collect_requirements(requires):
}

extras_require["all"] = _collect_requirements(extras_require)
extras_require["test"] = [
"adlfs>=2021.7.1, <=2022.2; python_version == '3.7'",
"adlfs~=2023.1; python_version >= '3.8'",
"bandit>=1.6.2, <2.0",
"behave==1.2.6",
"biopython~=1.73",
"blacken-docs==1.9.2",
"black~=22.0",
"compress-pickle[lz4]~=2.1.0",
"coverage[toml]",
"dask[complete]~=2021.10", # pinned by Snyk to avoid a vulnerability
"delta-spark~=1.2.1", # 1.2.0 has a bug that breaks some of our tests: https://github.com/delta-io/delta/issues/1070
"dill~=0.3.1",
"filelock>=3.4.0, <4.0",
"gcsfs>=2021.4, <=2023.1; python_version == '3.7'",
"gcsfs>=2023.1, <2023.3; python_version >= '3.8'",
"geopandas>=0.6.0, <1.0",
"hdfs>=2.5.8, <3.0",
"holoviews~=1.13.0",
"import-linter[toml]==1.8.0",
"ipython>=7.31.1, <8.0; python_version < '3.8'",
"ipython~=8.10; python_version >= '3.8'",
"isort~=5.0",
"Jinja2<3.1.0",
"joblib>=0.14",
"jupyterlab_server>=2.11.1, <2.16.0", # 2.16.0 requires importlib_metadata >= 4.8.3 which conflicts with flake8 requirement
"jupyterlab~=3.0, <3.6.0", # 3.6.0 requires jupyterlab_server~=2.19
"jupyter~=1.0",
"lxml~=4.6",
"matplotlib>=3.0.3, <3.4; python_version < '3.10'", # 3.4.0 breaks holoviews
"matplotlib>=3.5, <3.6; python_version == '3.10'",
"memory_profiler>=0.50.0, <1.0",
"moto==1.3.7; python_version < '3.10'",
"moto==3.0.4; python_version == '3.10'",
"networkx~=2.4",
"opencv-python~=4.5.5.64",
"openpyxl>=3.0.3, <4.0",
"pandas-gbq>=0.12.0, <0.18.0",
"pandas~=1.3", # 1.3 for read_xml/to_xml
"Pillow~=9.0",
"plotly>=4.8.0, <6.0",
"pre-commit>=2.9.2, <3.0", # The hook `mypy` requires pre-commit version 2.9.2.
"psutil~=5.8",
"pyarrow>=6.0",
"pylint>=2.17.0, <3.0",
"pyproj~=3.0",
"pyspark>=2.2, <4.0",
"pytest-cov~=3.0",
"pytest-mock>=1.7.1, <2.0",
"pytest-xdist[psutil]~=2.2.1",
"pytest~=7.2",
"redis~=4.1",
"requests-mock~=1.6",
"requests~=2.20",
"s3fs>=0.3.0, <0.5", # Needs to be at least 0.3.0 to make use of `cachable` attribute on S3FileSystem.
"scikit-learn~=1.0.2",
"scipy~=1.7.3",
"SQLAlchemy~=1.2",
"tables~=3.6.0; platform_system == 'Windows' and python_version<'3.9'",
"tables~=3.6; platform_system != 'Windows'",
"tensorflow~=2.0; platform_system != 'Darwin' or platform_machine != 'arm64'",
# https://developer.apple.com/metal/tensorflow-plugin/
"tensorflow-macos~=2.0; platform_system == 'Darwin' and platform_machine == 'arm64'",
"triad>=0.6.7, <1.0",
"trufflehog~=2.1",
"xlsxwriter~=1.0",
]

setup(
package_data={
name: ["py.typed", "test_requirements.txt"] + template_files
"kedro": ["py.typed"] + template_files
},
extras_require=extras_require,
)
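
Because `extras_require["test"]` is assigned after `extras_require["all"]` has been computed, the test dependencies stay out of the `all` extra and are only installed via `pip install .[test]`. The sketch below shows that ordering effect with toy data. The body of `_collect_requirements` is not visible in this diff, so the version here (flatten, de-duplicate, sort) is an assumption made for illustration, and the `pandas.CSVDataSet` entry is example data.

```python
from itertools import chain


def _collect_requirements(requires):
    # Assumed behaviour: merge every group's list into one sorted list without duplicates.
    return sorted(set(chain.from_iterable(requires.values())))


extras_require = {
    "pandas.CSVDataSet": ["pandas~=1.3"],
    "yaml.YAMLDataSet": ["pandas~=1.3", "PyYAML>=4.2, <7.0"],
}

# "all" is a snapshot of the groups that exist at this point.
extras_require["all"] = _collect_requirements(extras_require)

# Added afterwards, so none of these end up in "all".
extras_require["test"] = ["pytest~=7.2", "behave==1.2.6"]

print(extras_require["all"])
```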