Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modernization: introduce poetry as a packaging and dependency tool #654

Closed
wants to merge 8 commits into from

Conversation

jayaddison
Copy link
Collaborator

@jayaddison jayaddison commented Oct 18, 2022

The changes here layer on top of #650; they're opened as a separate pull request so that GitHub Actions linting and testing runs (those workflows are filtered to the main branch).

To view only the changes between this and #650, visit: issue-617/tox-migration...issue-617/tox-plus-poetry

Relates to #617.

@jayaddison
Copy link
Collaborator Author

This wasn't too tricky to introduce; it seems worth comparing this approach to pyproject.toml without poetry, though. Moving into draft status.

@bfcarpio
Copy link
Collaborator

I'm a bit confused by what I'm seeing here regarding requirements-dev.txt. The project code itself doesn't depend on poetry. Really, the environment requires poetry.

Somewhat relevant and the reason I was looking at that file to begin with is I'd expect that you and hhursev would want to have the export so that people can optionally use poetry, but we still get all the nice hashing supply chain protection.

@jayaddison
Copy link
Collaborator Author

jayaddison commented Oct 19, 2022

The library's dependencies are moved into pyproject.toml (both in this pull request, and in the alternative #655)

requirements-dev.txt represents "the things you need to install-from-scratch to develop and run tests for the library if you only have Python and pip available" -- because poetry, tox and pre-commit generally aren't available by default

I'd vote against using poetry export to generate additional requirements files, because we should define the dependency set in a single place -- if we duplicate it, the different copies will fall out-of-sync.

@bfcarpio
Copy link
Collaborator

I think a potential use case for poetry export is that you should run it after each commit. At least, that's what the pre-commit plugin makes me think. The biggest reason for this I think is not needing poetry when setting up your docker container.

I'm just pointing out a potential use. I'm not opinionated on this. Could write some CI checks to make sure that someone isn't editing it without also changing the poetry files.

@jayaddison
Copy link
Collaborator Author

@bfcarpio good points!

Installation-wise: as I understand it, recent pip versions (19.0 onwards) should be able to read the pyproject.toml build information, allowing them to use the lockfile -- but it's fair to consider environments that might not have recent-enough versions of pip for that.

Development-wise: if we introduced continuous integration checks on the lockfile/dependencies per-pull-request, then developers could still use whatever tools they like, for the most part. An exception is that poetry would become a requirement in order to update the project's dependencies (not usually required for scraper development, only core framework changes).

I definitely get the sense that I could be introducing complexity for goals that aren't worthwhile (I have the online testing debacle in mind at the moment). When debugging it can be super useful to replicate the exact requirements/dependency versions -- but is that tradeoff worth it for more dependencies and potential confusion for developers/install sites? I don't know.

@jayaddison
Copy link
Collaborator Author

Note: pip-tools is a third option that also supports pyproject.toml, and that can produce backwards-compatible requirements.txt-format dependency definitions.

I think it sits somewhere between the setuptools approach in #655 and the poetry approach in this pull request.

To explain:

pip-tools v6.9.0 has fewer dependencies than poetry 1.2.2 does.

Unlike the setuptools approach, with pip-tools we could export a complete set of transitive dependencies -- providing a similar level of dependency transparency and consistency to poetry.lock.

This might sound weird, but: I use pip-tools in openculinary's Python projects and if anything that dissuades me slightly from suggesting it here. Why? So far I've had no problems with it and think it's a good system, but using the same software everywhere introduces a kind of ecosystem diversity risk -- a problem in one place can become a problem everywhere.

(perhaps I'll attempt a draft pull request with it anyway though, to compare what it might look like)

@jayaddison
Copy link
Collaborator Author

Installation-wise: as I understand it, recent pip versions (19.0 onwards) should be able to read the pyproject.toml build information, allowing them to use the lockfile -- but it's fair to consider environments that might not have recent-enough versions of pip for that.

Sorry, a possible correction: PEP-517 support does allow pip to read the pyproject.toml, but I'm not certain whether dependency versions are determined from poetry.lock when it does that.

Similarly: the lockfile is not used to pin dependency versions in the built/published package (either by pip with PEP-517, or poetry itself).

In practical terms, those would mean that the lockfile is useful to provide visibility into the addition/removal of dependencies, and to improve development environment consistency, but it wouldn't provide consistency between developer-install and user-install environments.

(I could be wrong here; continuing to learn)

@jayaddison
Copy link
Collaborator Author

Installation-wise: as I understand it, recent pip versions (19.0 onwards) should be able to read the pyproject.toml build information, allowing them to use the lockfile -- but it's fair to consider environments that might not have recent-enough versions of pip for that.

Sorry, a possible correction: PEP-517 support does allow pip to read the pyproject.toml, but I'm not certain whether dependency versions are determined from poetry.lock when it does that.

Yep, my mistake - the poetry documentation clears this up:

For your library, you may commit the poetry.lock file if you want to. This can help your team to always test against the same dependency versions. However, this lock file will not have any effect on other projects that depend on it. It only has an effect on the main project.

https://python-poetry.org/docs/libraries/#lock-file

@jayaddison
Copy link
Collaborator Author

Here's what I'm thinking currently:

  • The good: poetry.lock improves consistency in a development team, and provides some visibility into dependency set changes
  • The not-so-good: dependency set changes can also occur independently over time as the Python package ecosystem evolves, and poetry.lock doesn't necessarily provide visibility there (it is essentially providing point-in-time snapshots of a continuously-changing scene)
  • Also fine-I-guess-but-not-clearly-good: poetry would introduce additional development-time dependencies and workflow steps

Honestly I don't think we've had many problems development-wise on the project due to differing dependency versions.

So, to refer back to the goal of #617 -- improving developer experience -- the remaining question is whether the point-in-time visibility that poetry.lock would provide is worthwhile, compared to alternative approaches.

And my sense there is that any approach that attempts to commit the dependency set to source control is inherently going to be subject to the 'shifting sands' problem (dependency tree changes elsewhere in the stack), unless all versions are pinned to precise (integrity-hashed?) versions -- a valid albeit maintenance-heavy approach.

Overall, I think a lockfile is the wrong place to attempt to gain that visibility - instead it'd be better to start with the versions of a dependency you're interested in, tail the release history for the top-level dependencies of those (within their semver match constraints), and then emit events to a stream when dependencies are added/removed/changed.

But that's getting a bit off-topic. I don't think the poetry.lock approach is worth the overheads though, so I'm going to close this. I'll re-evaluate the setuptools approach next.

@jayaddison jayaddison closed this Oct 20, 2022
@jayaddison jayaddison deleted the issue-617/tox-plus-poetry branch October 20, 2022 15:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants