Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix doctests in dependencygraph.py #3048

Merged
merged 10 commits into from
Sep 19, 2022
Merged

Conversation

ekaf
Copy link
Contributor

@ekaf ekaf commented Sep 4, 2022

Reduce the list of failing doctests (Issue #2989), by fixing the tests in nltk/parse/dependencygraph.py.

pytest --doctest-modules --doctest-continue-on-failure ./nltk/parse/dependencygraph.py

With this PR, if the "dot" binary is present, all tests pass. Otherwise, one test is skipped:

============================= test session starts ==============================
platform linux -- Python 3.9.2, pytest-7.1.2, pluggy-1.0.0
rootdir: [...]
plugins: forked-1.4.0, xdist-2.5.0, mock-3.8.2, cov-3.0.0
collected 3 items

nltk/parse/dependencygraph.py s.. [100%]

========================= 2 passed, 1 skipped in 2.58s =========================

@tomaarsen
Copy link
Member

@Madnex perhaps this PR is useful for you, as it shows an example of how doctests can be conditionally skipped.
Looks good at a glance @ekaf. I'll try to get more time to actually have a good look soon.

@tomaarsen tomaarsen added the tests label Sep 4, 2022
@ekaf
Copy link
Contributor Author

ekaf commented Sep 5, 2022

@Madnex @tomaarsen, rather than having several fixtures to check the availability of binaries, it seems more rational to have just one generic function that takes the binary name as parameter.

Then, the new test/binary_fixt can be imported in any test that requires the availability of a particular binary.

For ex. in the updated nltk/tag/hunpos.py:

Check whether the required "hunpos-tag" binary is available:

    >>> from nltk.test.binary_fixt import check_binary
    >>> check_binary('hunpos-tag')

Then:

pytest --doctest-modules --doctest-continue-on-failure ./nltk/tag/hunpos.py

============================= test session starts ==============================
platform linux -- Python 3.9.2, pytest-7.1.2, pluggy-1.0.0
rootdir: [...]
plugins: forked-1.4.0, xdist-2.5.0, mock-3.8.2, cov-3.0.0
collected 1 item

nltk/tag/hunpos.py s [100%]

============================== 1 skipped in 2.25s ==============================

@tomaarsen
Copy link
Member

Seems like a good idea for sure

nltk/test/binary_fixt.py Outdated Show resolved Hide resolved
nltk/test/binary_fixt.py Outdated Show resolved Hide resolved
nltk/test/binary_fixt.py Outdated Show resolved Hide resolved
@ekaf
Copy link
Contributor Author

ekaf commented Sep 6, 2022

Thanks for the review and suggestions @tomaarsen: I guess that LookupError is what you refer to when you call for something more specific.
I had not yet seen the f"string output, because pytest needs to increase verbosity twice for the message to be output without truncation:
pytest -vv --doctest-modules --doctest-continue-on-failure ./nltk/tag/hunpos.py

nltk/tag/hunpos.py::nltk.tag.hunpos.HunposTagger SKIPPED (Skipping test because the hunpos-tag binary was not found) [100%]

At this time, there don't seem to be other cases of unavailable binaries in #2989. There are a number unavailable "jar" packs related to the Stanford parser, but these will probably require a different solution.

However, the new check_binary() function can also replace the discourse, inference, and nonmonotonic fixtures in test/. I think it is preferable to do that within the present PR, in order to check that it works as intended.

@ekaf
Copy link
Contributor Author

ekaf commented Sep 7, 2022

For ex. replace the redundant nltk/test/inference_fixt.py:

pytest -vv --doctest-modules --doctest-continue-on-failure ./nltk/test/inference.doctest

nltk/test/inference.doctest::inference.doctest SKIPPED (Skipping test because the mace4 binary was not found) [100%]

@tomaarsen
Copy link
Member

Indeed, a LookupError seems best. This is looking good!

@stevenbird
Copy link
Member

Looks like we're nearly ready to merge this... @ekaf are you intending to use check_binary() elsewhere in the test package?

@ekaf
Copy link
Contributor Author

ekaf commented Sep 11, 2022

@stevenbird yes, this PR seems complete. I don't intend to use check_binary elsewhere.

@tomaarsen
Copy link
Member

Thoughts on moving check_binary to https://github.com/nltk/nltk/blob/develop/nltk/internals.py, by the way?
It feels a bit odd to include "functionality" in the test folder, but that said, the functionality is exclusively for tests.

@ekaf
Copy link
Contributor Author

ekaf commented Sep 12, 2022

Since it is just a test fixture among many others in the test directory, the current place seems adequate. However, maybe the filename could be changed to something broader (like for ex. setup_fixt.py), in order to open the possibility to include other generic fixtures later.
In particular, it could be relevant to also add a general function to replace the following:

grep importorskip nltk/test/*fixt.py

nltk/test/classify_fixt.py: pytest.importorskip("numpy")
nltk/test/gensim_fixt.py: pytest.importorskip("gensim")
nltk/test/probability_fixt.py: pytest.importorskip("numpy")

@ekaf
Copy link
Contributor Author

ekaf commented Sep 17, 2022

Renamed fixture file with a broader filename, in order to accommodate the eventual addition of other generic fixtures.

The scope of this PR has already expanded beyond just handling dependencygraph.py, so I think it would be best to wait for a future PR to also replace pytest.importorskip(), because those cases are outside issue #2989.

@ekaf ekaf mentioned this pull request Sep 18, 2022
@tomaarsen tomaarsen merged commit 9851f74 into nltk:develop Sep 19, 2022
@tomaarsen
Copy link
Member

This is looking good. Thanks for this @ekaf!

tomaarsen pushed a commit to tomaarsen/nltk that referenced this pull request Dec 6, 2022
* Fix doctests in nltk/parse/dependencygraph.py

* Move pytest import inside setup_dot() function

* Generic fixture to skip doctest when a required binary is not available

* Import generic fixture in dependencygraph.py and hunpos.py

* Return False when binary is unavailable

* Implement suggestions by @tomaahsen

* Replace redundant text fixtures by check_binary

* Broaden fixture filename

* Delete old fixture file
@ekaf ekaf deleted the dependencygraph branch December 8, 2022 10:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants