Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor gensim.doctest to work for gensim 4.0.0 and up #2914

Merged
merged 3 commits into from Dec 19, 2021

Conversation

tomaarsen
Copy link
Member

Hello!

Pull Request Overview

  • Update gensim.doctest allowing us to remove the <4.0.0 version lock of gensim.

Details

See #2913 (comment) for a short discussion on gensim. It is currently locked to gensim < 4.0.0, and we'll want to start supporting newer versions. However, in 4.0.0, gensim made some changes in syntax, meaning that you either support before 4.0.0, or after 4.0.0.
This is us moving to the latter.

Beyond this PR

Lastly, I would like to propose the following, which is not included in this PR:
Let's deprecate gensim from pip-req.txt. gensim is not used anywhere in the codebase, except for this doctest, so we should remove it from pip-req.txt, and keep it only in requirements-ci.txt/requirements-test.txt. I have some other issues with these files at the moment, which I've brought up before in #2881:

  • Perhaps we can merge requirements-test.txt and requirements-ci.txt. They're both for testing, and are already very similar.
  • It somewhat bothers me that pip-req.txt and requirements-ci.txt/requirements-test.txt don't have the same naming structure. Perhaps we can change this? (To be honest, I don't remember ever seeing a pip-req.txt file before NLTK - I only encounter requirements.txt)

cc: @purificant This PR might be relevant for you and your work to help with #2913.

  • Tom Aarsen

@purificant
Copy link
Member

requirements-ci.txt was created as a clean slate for CI when moving to GitHub Actions, since then this has become our primary CI pipeline, we could definitely clean up other requirement files and consolidate. 👍

@purificant
Copy link
Member

Amazing work @tomaarsen, thank you for doing this. One step closer to py 3.10

Copy link
Contributor

@dannysepler dannysepler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, great work!

@tomaarsen
Copy link
Member Author

I've decided it's best to remove the unused gensim dependency as it might mess with people who wish to use specific gensim versions alongside NLTK, and there is no need to potentially introduce unnecessary version conflict.

@tomaarsen tomaarsen merged commit 6b60213 into nltk:develop Dec 19, 2021
@tomaarsen tomaarsen deleted the feature/gensim-refactor branch December 19, 2021 19:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants