New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix decode error for bllip parser #2897
Conversation
It looks like the I'm not sure what to do here. We could just not have these new tests run on CI, or perhaps we should only install the pip package on non-Windows environments? |
okay looking for a bit of advice here! nltk's The parser also isn't unit tested. adding tests to the CI build is tricky, since the
|
This is a bit of a tricky one. I'm not familiar with Regarding CI, we already have some third party tools which are downloaded for Mac and Linux only, but adding a package which must be installed via Perhaps all of this means that I'm leaning towards option 1. An interesting mix of everything would be to add the tests as quick unit tests to method/class bodies of That's my view, others might disagree. |
@dmcc do you have any advice for us please? Is there any point in us maintaining NLTK's BLLIP parser? |
@dannysepler |
Yikes, sorry this was broken for so long. BLLIP Parser does provide some functionality not available in the other NLTK parsers, so there's an argument for keeping it running. It's a bit surprising that nobody noticed/reported it was broken for so long, that's a concern. That said, I have recently received bug reports for the lower-level BLLIP Python interface, so it appears some people are still using it. Option 1 sounds good to me -- I don't have easy access to Windows or Mac OS X machines to debug the issues, I'm afraid. |
Thanks @dmcc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dannysepler So, I believe the consensus is to apply the fix, but don't add the test to CI.
I believe all that needs to happen at this point is to remove bliipparser
from requirements-ci.txt. Doing so will cause nltk/test/unit/test_bllip.py
to be skipped in the CI, but users who have bllipparser installed will still be able to run these tests as a part of tox
or pytest nltk/test
.
Beyond that, I have confidence that your changes indeed work, as the tests for macOS and Ubuntu passed.
a56ed42
to
c48c640
Compare
sorry for the delay @tomaarsen, and thank you for the guidance! thank you as well to @dmcc for confirming the parser's usefulness! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. I just have the one small nitpick, which isn't important at all.
Thanks Danny! Well done. |
Closes #1920
This was fairly trivial to fix, but this seems like it was broken since py3 was rolled out. Do we know why more people didn't complain about this? Is it possible that this parser isn't being used at the moment?
I also added a very simple unit test, to capture simple breakages in the
BllipParser
for next time