refactor: perform linting for punkt.py #2830
I like the look of this PR! I do have some comments though. Also, it seems that there was a small error in the PR somewhere causing the CI to fail. You can check the outputs for debugging. Perhaps you've renamed a variable to something that was already being used for an existing variable?
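A minimal sketch of the kind of rename collision suspected above. The function names and data here are hypothetical and are not taken from punkt.py; the point is only that renaming a loop variable to a name already in use silently overwrites the earlier value.

```python
def first_and_rest(tokens):
    """Return the first token and a list of the remaining tokens."""
    first = tokens[0]
    rest = []
    for tok in tokens[1:]:  # loop variable has its own name
        rest.append(tok)
    return first, rest


def first_and_rest_after_bad_rename(tokens):
    """Same logic, but the loop variable was renamed to an existing name."""
    first = tokens[0]
    rest = []
    for first in tokens[1:]:  # clobbers `first` on every iteration
        rest.append(first)
    return first, rest


print(first_and_rest(["a", "b", "c"]))                   # ('a', ['b', 'c'])
print(first_and_rest_after_bad_rename(["a", "b", "c"]))  # ('c', ['b', 'c'])
```

Python raises no error in the second version, so a mistake like this usually only shows up through failing tests, which would be consistent with a red CI run.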
nltk/tokenize/punkt.py
Outdated
@@ -570,8 +570,8 @@ def _tokenize_words(self, plaintext):
                 yield self._Token(tok, parastart=parastart, linestart=True)
                 parastart = False

-                for t in line_toks:
-                    yield self._Token(t)
+                for token in line_toks:
Prefer to be consistent with abbreviations, e.g. `for tok in line_toks` or `for token in line_tokens`.
(FYI @jnothman)
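The naming suggestion above can be sketched as follows. This is a simplified, self-contained stand-in for the loop under review, not the actual `_tokenize_words` implementation: `str.split` replaces the real word tokenizer, and plain tuples replace `self._Token`. Both function names are hypothetical.

```python
def tokenize_abbreviated(plaintext):
    """Yield (token, linestart) pairs using the abbreviated names throughout."""
    for line in plaintext.split("\n"):
        line_toks = iter(line.split())
        try:
            tok = next(line_toks)  # first token on the line
        except StopIteration:
            continue  # blank line: nothing to yield
        yield tok, True
        for tok in line_toks:  # `tok` matches `line_toks`
            yield tok, False


def tokenize_spelled_out(plaintext):
    """Same behavior, using the spelled-out names throughout."""
    for line in plaintext.split("\n"):
        line_tokens = iter(line.split())
        try:
            token = next(line_tokens)
        except StopIteration:
            continue
        yield token, True
        for token in line_tokens:  # `token` matches `line_tokens`
            yield token, False


print(list(tokenize_abbreviated("a b\n\nc")))
```

Either convention reads fine; the point of the review comment is to avoid mixing `token` with `line_toks` in the same loop.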
No issues on my side. Good to see the code receiving some TLC
Thanks @12mohaned
This pull request refactors the punkt.py file, which contains the PunktTokenizer class, by doing the following: