Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: refactor prettyprinter to be more readable #2893

Merged
merged 5 commits into from Nov 28, 2021

Conversation

12mohaned
Copy link
Contributor

This pull request refactors the TreePrettyPrinter class to be more readable and clean. For example, line 78 can be simply expressed as:

if (... and len(a) > 0 and ..)
instead of

if (... and not any(len(a) == 0 and ...)

I believe it is more confusing to read the latter one.

Also, there is a lot of magic numbers and variables(i.e, scale = 2) without any comments explaining what they do, which is considered bad practice since it does not explain what it does represents.

Regards,
Mohaned Mashaly.

@@ -207,7 +207,7 @@ def dumpmatrix():
raise ValueError("All leaves must be integer indices.")
if len(leaves) != len(set(leaves)):
raise ValueError("Indices must occur at most once.")
if not all(0 <= n < len(sentence) for n in leaves):
if not all(n in range(0, len(sentence)) for n in leaves):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey mohaned! this actually felt a bit better before. plus now it's constructing a list for each element in leaves which feels a bit inefficient. what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right.

Thanks.

@@ -75,7 +75,7 @@ def __init__(self, tree, sentence=None, highlight=()):
leaves = tree.leaves()
if (
leaves
and not any(len(a) == 0 for a in tree.subtrees())
and any(len(a) > 0 for a in tree.subtrees())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The replacement contains a logical error. NB:

>>> not any(len(a) == 0 for a in [[1], []])
False
>>> any(len(a) > 0 for a in [[1], []])
True

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah right, my bad.

I believe it should be replaced with all instead of any.

Considering the below test cases:-

>>> not any(len(a) == 0 for a in [[1], []])
False
>>> all(len(a) > 0 for a in [[1], []])
False

>>> not any(len(a) == 0 for a in [[], []])
False
>>> all(len(a) > 0 for a in [[], []])
False

>>> not any(len(a) == 0 for a in [[1], [1]])
True
>>> all(len(a) > 0 for a in [[1], [1]])
True

@@ -291,7 +291,7 @@ def dumpmatrix():
matrix[rowidx][i] = ids[m]
nodes[ids[m]] = tree[m]
# add column to the set of children for its parent
if m != ():
if len(m) > 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if len(m) > 0:
if len(m):

I believe len(m) would be equivalent to len(m) > 0 in truthy context. Both only evaluate to False if m is an empty collection. (That said, I understand the argument that len(m) > 0 could be simpler to understand)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally agree, but like you said len(m) > 0 is simpler to understand, so I would personally go with it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understandable!

@stevenbird stevenbird merged commit de27a05 into nltk:develop Nov 28, 2021
@stevenbird
Copy link
Member

Thanks @12mohaned

@12mohaned 12mohaned deleted the prettyprinter_refactor branch December 1, 2021 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants