Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request]: Leverage optional dependency groups to reduce dependency count #928

Open
rkingsbury opened this issue Feb 27, 2024 · 6 comments
Assignees

Comments

@rkingsbury
Copy link
Collaborator

Problem

As recently surfaced in a peer review of a package that depends on maggma, maggma introduces a LOT of dependencies (ultimately because it so versatile). However, any given user of maggma is probably only using it to connect 2 or 3 types of Store and might not make use of many of these.

├── maggma [required: >=0.57.5, installed: 0.63.3]
│   ├── aioitertools [required: >=0.5.1, installed: 0.11.0]
│   │   └── typing-extensions [required: >=4.0, installed: 4.10.0]
│   ├── boto3 [required: >=1.20.41, installed: 1.34.49]
│   │   ├── botocore [required: >=1.34.49,<1.35.0, installed: 1.34.49]
│   │   │   ├── jmespath [required: >=0.7.1,<2.0.0, installed: 1.0.1]
│   │   │   ├── python-dateutil [required: >=2.1,<3.0.0, installed: 2.8.2]
│   │   │   │   └── six [required: >=1.5, installed: 1.16.0]
│   │   │   └── urllib3 [required: >=1.25.4,<1.27, installed: 1.26.18]
│   │   ├── jmespath [required: >=0.7.1,<2.0.0, installed: 1.0.1]
│   │   └── s3transfer [required: >=0.10.0,<0.11.0, installed: 0.10.0]
│   │       └── botocore [required: >=1.33.2,<2.0a.0, installed: 1.34.49]
│   │           ├── jmespath [required: >=0.7.1,<2.0.0, installed: 1.0.1]
│   │           ├── python-dateutil [required: >=2.1,<3.0.0, installed: 2.8.2]
│   │           │   └── six [required: >=1.5, installed: 1.16.0]
│   │           └── urllib3 [required: >=1.25.4,<1.27, installed: 1.26.18]
│   ├── dnspython [required: >=1.16.0, installed: 2.6.1]
│   ├── fastapi [required: >=0.42.0, installed: 0.110.0]
│   │   ├── pydantic [required: >=1.7.4,<3.0.0,!=2.1.0,!=2.0.1,!=2.0.0,!=1.8.1,!=1.8, installed: 2.6.2]
│   │   │   ├── annotated-types [required: >=0.4.0, installed: 0.6.0]
│   │   │   ├── pydantic-core [required: ==2.16.3, installed: 2.16.3]
│   │   │   │   └── typing-extensions [required: >=4.6.0,!=4.7.0, installed: 4.10.0]
│   │   │   └── typing-extensions [required: >=4.6.1, installed: 4.10.0]
│   │   ├── starlette [required: >=0.36.3,<0.37.0, installed: 0.36.3]
│   │   │   ├── anyio [required: >=3.4.0,<5, installed: 4.3.0]
│   │   │   │   ├── exceptiongroup [required: >=1.0.2, installed: 1.2.0]
│   │   │   │   ├── idna [required: >=2.8, installed: 3.6]
│   │   │   │   ├── sniffio [required: >=1.1, installed: 1.3.1]
│   │   │   │   └── typing-extensions [required: >=4.1, installed: 4.10.0]
│   │   │   └── typing-extensions [required: >=3.10.0, installed: 4.10.0]
│   │   └── typing-extensions [required: >=4.8.0, installed: 4.10.0]
│   ├── jsonschema [required: >=3.1.1, installed: 4.21.1]
│   │   ├── attrs [required: >=22.2.0, installed: 23.2.0]
│   │   ├── jsonschema-specifications [required: >=2023.03.6, installed: 2023.12.1]
│   │   │   └── referencing [required: >=0.31.0, installed: 0.33.0]
│   │   │       ├── attrs [required: >=22.2.0, installed: 23.2.0]
│   │   │       └── rpds-py [required: >=0.7.0, installed: 0.18.0]
│   │   ├── referencing [required: >=0.28.4, installed: 0.33.0]
│   │   │   ├── attrs [required: >=22.2.0, installed: 23.2.0]
│   │   │   └── rpds-py [required: >=0.7.0, installed: 0.18.0]
│   │   └── rpds-py [required: >=0.7.1, installed: 0.18.0]
│   ├── mongogrant [required: >=0.3.1, installed: 0.3.3]
│   │   ├── click [required: Any, installed: 8.1.7]
│   │   ├── flask [required: >=1.0, installed: 3.0.2]
│   │   │   ├── blinker [required: >=1.6.2, installed: 1.7.0]
│   │   │   ├── click [required: >=8.1.3, installed: 8.1.7]
│   │   │   ├── importlib-metadata [required: >=3.6.0, installed: 7.0.1]
│   │   │   │   └── zipp [required: >=0.5, installed: 3.17.0]
│   │   │   ├── itsdangerous [required: >=2.1.2, installed: 2.1.2]
│   │   │   ├── Jinja2 [required: >=3.1.2, installed: 3.1.3]
│   │   │   │   └── MarkupSafe [required: >=2.0, installed: 2.1.5]
│   │   │   └── werkzeug [required: >=3.0.0, installed: 3.0.1]
│   │   │       └── MarkupSafe [required: >=2.1.1, installed: 2.1.5]
│   │   ├── pymongo [required: >=3.8, installed: 4.6.2]
│   │   │   └── dnspython [required: >=1.16.0,<3.0.0, installed: 2.6.1]
│   │   └── requests [required: Any, installed: 2.31.0]
│   │       ├── certifi [required: >=2017.4.17, installed: 2024.2.2]
│   │       ├── charset-normalizer [required: >=2,<4, installed: 3.3.2]
│   │       ├── idna [required: >=2.5,<4, installed: 3.6]
│   │       └── urllib3 [required: >=1.21.1,<3, installed: 1.26.18]
│   ├── mongomock [required: >=3.10.0, installed: 4.1.2]
│   │   ├── packaging [required: Any, installed: 23.2]
│   │   └── sentinels [required: Any, installed: 1.0.0]
│   ├── monty [required: >=2023.9.25, installed: 2024.2.26]
│   ├── msgpack [required: >=0.5.6, installed: 1.0.7]
│   ├── numpy [required: >=1.17.3, installed: 1.26.4]
│   ├── orjson [required: >=3.9.0, installed: 3.9.15]
│   ├── pydantic [required: >=2.0, installed: 2.6.2]
│   │   ├── annotated-types [required: >=0.4.0, installed: 0.6.0]
│   │   ├── pydantic-core [required: ==2.16.3, installed: 2.16.3]
│   │   │   └── typing-extensions [required: >=4.6.0,!=4.7.0, installed: 4.10.0]
│   │   └── typing-extensions [required: >=4.6.1, installed: 4.10.0]
│   ├── pydantic-settings [required: >=2.0.3, installed: 2.2.1]
│   │   ├── pydantic [required: >=2.3.0, installed: 2.6.2]
│   │   │   ├── annotated-types [required: >=0.4.0, installed: 0.6.0]
│   │   │   ├── pydantic-core [required: ==2.16.3, installed: 2.16.3]
│   │   │   │   └── typing-extensions [required: >=4.6.0,!=4.7.0, installed: 4.10.0]
│   │   │   └── typing-extensions [required: >=4.6.1, installed: 4.10.0]
│   │   └── python-dotenv [required: >=0.21.0, installed: 1.0.1]
│   ├── pydash [required: >=4.1.0, installed: 7.0.7]
│   │   └── typing-extensions [required: >=3.10,!=4.6.0, installed: 4.10.0]
│   ├── pymongo [required: >=4.2.0, installed: 4.6.2]
│   │   └── dnspython [required: >=1.16.0,<3.0.0, installed: 2.6.1]
│   ├── python-dateutil [required: >=2.8.2, installed: 2.8.2]
│   │   └── six [required: >=1.5, installed: 1.16.0]
│   ├── pyzmq [required: >=24.0.1, installed: 25.1.2]
│   ├── ruamel.yaml [required: <0.18, installed: 0.17.40]
│   │   └── ruamel.yaml.clib [required: >=0.2.7, installed: 0.2.8]
│   ├── setuptools [required: Any, installed: 69.1.1]
│   ├── sshtunnel [required: >=0.1.5, installed: 0.4.0]
│   │   └── paramiko [required: >=2.7.2, installed: 3.4.0]
│   │       ├── bcrypt [required: >=3.2, installed: 4.1.2]
│   │       ├── cryptography [required: >=3.3, installed: 42.0.5]
│   │       │   └── cffi [required: >=1.12, installed: 1.16.0]
│   │       │       └── pycparser [required: Any, installed: 2.21]
│   │       └── PyNaCl [required: >=1.5, installed: 1.5.0]
│   │           └── cffi [required: >=1.4.1, installed: 1.16.0]
│   │               └── pycparser [required: Any, installed: 2.21]
│   ├── tqdm [required: >=4.19.6, installed: 4.66.2]
│   └── uvicorn [required: >=0.18.3, installed: 0.27.1]
│       ├── click [required: >=7.0, installed: 8.1.7]
│       ├── h11 [required: >=0.8, installed: 0.14.0]
│       └── typing-extensions [required: >=4.0, installed: 4.10.0]

Proposed Solution

Our setup.py already leverages optional dependency groups for certain Store. My suggestion is to double-down on this concept so that by default, maggma is installed only with a mininmum set of depdendencies to enable

  • MemoryStore
  • JSONStore
  • MongoStore

With all other more specialized stores requiring optional dependency groups.

Our current setup.py

    extras_require={
        "vault": ["hvac>=0.9.5"],
        "memray": ["memray>=1.7.0"],
        "montydb": ["montydb>=2.3.12"],
        "notebook_runner": ["IPython>=8.11", "nbformat>=5.0", "regex>=2020.6"],
        "azure": ["azure-storage-blob>=12.16.0", "azure-identity>=1.12.0"],
        "open_data": ["pandas>=2.1.4", "jsonlines>=4.0.0"],
        "testing": [
            "pytest",
            "pytest-cov",
            "pytest-mock",
            "pytest-asyncio",
            "pytest-xdist",
            "pre-commit",
            "moto",
            "ruff",
            "responses<0.22.0",
            "types-pyYAML",
            "types-setuptools",
            "types-python-dateutil",
            "starlette[full]",
        ],
        "docs": [
            "mkdocs>=1.4.0",
            "mkdocs-material>=8.3.9",
            "mkdocs-minify-plugin>=0.5.0",
            "mkdocstrings[python]>=0.18.1",
            "jinja2<3.2.0",
        ],
    },

Related, the API components carry a whole separate set of dependencies. Perhaps those should be broken into their own dependency group.

Alternatives

An alternative would be to break maggma into namespace packages for Builder , Store, and the API components. I'm not a fan of this option as I find maintaining and installing these really confusing (see: emmet!)

@rkingsbury
Copy link
Collaborator Author

rkingsbury commented Feb 27, 2024

Interested in thoughts from @munrojm , @jmmshn , @Andrew-S-Rosen and others on this

@Andrew-S-Rosen
Copy link
Member

I think that your proposal makes sense, with updated documentation to reflect the "extras" needed for a given Store. From the list of dependencies shown though, it's not initially clear to me what further groups you would have. Any thoughts on that?

I definitely would avoid the namespace stuff for the reasons you stated.

@rkingsbury
Copy link
Collaborator Author

I think that your proposal makes sense, with updated documentation to reflect the "extras" needed for a given Store. From the list of dependencies shown though, it's not initially clear to me what further groups you would have. Any thoughts on that?

I definitely would avoid the namespace stuff for the reasons you stated.

Based on the graph fastapi, boto3, and mongogrant all introduce a lot of extras. Those are needed for, respectively, the API components, S3Store / OpenDataStore, MongoGrantStore (aside - is anyone still using this?). It seems like each of those could be flagged as requiring optional extras. What do you think?

@rkingsbury
Copy link
Collaborator Author

rkingsbury commented Feb 27, 2024

Also, it doesn't appear that dnspython or pyzmq are used anywhere in the code. Does anyone know if these have a function that isn't exposed via import statements? (I'm not too familiar with these)

@Andrew-S-Rosen
Copy link
Member

Andrew-S-Rosen commented Feb 27, 2024

Makes sense re: fastapi, boto3, and mogogrant. I'd support their use as optional dependencies.

Re: pyzmq, I think you may need to look at the zmq import, e.g. here.

@munrojm
Copy link
Member

munrojm commented Feb 27, 2024

I am all for this. Would just want to make sure it is appropriately scoped w.r.t to the default store types supported. My opinion is that It would be nice to support the S3Store / OpenDataStore out of the box. MongoGrantStore should most likely just be removed...

FYI, pyzmq is only used for the distributed builder running.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants