Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: refactor prettyprinter to be more readable #2893

Merged
merged 5 commits into from Nov 28, 2021
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
6 changes: 3 additions & 3 deletions nltk/tree/prettyprinter.py
Expand Up @@ -75,7 +75,7 @@ def __init__(self, tree, sentence=None, highlight=()):
leaves = tree.leaves()
if (
leaves
and not any(len(a) == 0 for a in tree.subtrees())
and any(len(a) > 0 for a in tree.subtrees())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The replacement contains a logical error. NB:

>>> not any(len(a) == 0 for a in [[1], []])
False
>>> any(len(a) > 0 for a in [[1], []])
True

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah right, my bad.

I believe it should be replaced with all instead of any.

Considering the below test cases:-

>>> not any(len(a) == 0 for a in [[1], []])
False
>>> all(len(a) > 0 for a in [[1], []])
False

>>> not any(len(a) == 0 for a in [[], []])
False
>>> all(len(a) > 0 for a in [[], []])
False

>>> not any(len(a) == 0 for a in [[1], [1]])
True
>>> all(len(a) > 0 for a in [[1], [1]])
True

and all(isinstance(a, int) for a in leaves)
):
sentence = [str(a) for a in leaves]
Expand Down Expand Up @@ -207,7 +207,7 @@ def dumpmatrix():
raise ValueError("All leaves must be integer indices.")
if len(leaves) != len(set(leaves)):
raise ValueError("Indices must occur at most once.")
if not all(0 <= n < len(sentence) for n in leaves):
if not all(n in range(0,len(sentence)) for n in leaves):
raise ValueError(
"All leaves must be in the interval 0..n "
"with n=len(sentence)\ntokens: %d indices: "
Expand Down Expand Up @@ -291,7 +291,7 @@ def dumpmatrix():
matrix[rowidx][i] = ids[m]
nodes[ids[m]] = tree[m]
# add column to the set of children for its parent
if m != ():
if len(m) > 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if len(m) > 0:
if len(m):

I believe len(m) would be equivalent to len(m) > 0 in truthy context. Both only evaluate to False if m is an empty collection. (That said, I understand the argument that len(m) > 0 could be simpler to understand)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally agree, but like you said len(m) > 0 is simpler to understand, so I would personally go with it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understandable!

childcols[m[:-1]].add((rowidx, i))
assert len(positions) == 0

Expand Down