Implement Tree.pformat_latex_forest #2956
base: develop
:param quotes: True to quote leaf nodes as Python strings.
:type quotes: bool
:param quotes: Two-element iterable to surround leaf nodes,
    or True to quote leaf nodes as Python strings.
To the best of my knowledge we've avoided this kind of polymorphism in NLTK. I think it needs some consideration if we're going to start doing this.
> To the best of my knowledge we've avoided this kind of polymorphism in NLTK. I think it needs some consideration if we're going to start doing this.
Any alternative suggestions?
Sorry for the delay...
These changes by @tyomitch make `quotes` equivalent to `parens`, which is a logical move. Beyond that, he also preserves backwards compatibility, so `True` and `False` can still be used. I don't necessarily see an issue with that, especially because `True` means using `repr`, which cleverly picks the right quotes depending on the string.
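To illustrate the `repr` point, here is a quick sketch in plain Python. Nothing NLTK-specific is assumed, and the variable names are only for illustration:

```python
# repr() chooses the quote style that avoids escaping where it can.
plain = repr("dog")
with_apostrophe = repr("it's")
with_double = repr('say "hi"')

print(plain)            # 'dog'
print(with_apostrophe)  # "it's"
print(with_double)      # 'say "hi"'
```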
That said, at a glance the implementation looks buggy. On line 841, what if the child is a string and `quotes=False`? Then the if-statement is true, and `"\n" + indent_str + quotes[0] + str(child) + quotes[1]` is added to `s`. However, `quotes[0]` will raise a `TypeError`, because a bool is not subscriptable.
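A minimal reproduction of that failure mode, as a sketch. `add_leaf` is a hypothetical stand-in for the expression on line 841, not the PR's actual code:

```python
def add_leaf(child, quotes, indent_str="  "):
    # Hypothetical stand-in for the string built on line 841.
    return "\n" + indent_str + quotes[0] + str(child) + quotes[1]

# A two-element iterable works as intended.
ok = add_leaf("dog", ('"', '"'))

# A bool cannot be indexed, so quotes=False fails.
try:
    add_leaf("dog", False)
    error = None
except TypeError as exc:
    error = exc

print(repr(ok))
print(type(error).__name__)
```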
An alternative implementation that avoids the polymorphism simply reverts `quotes` to what it was and adds a new parameter, e.g. `quote_strings=None`. The last else branch is where the quotes should be added, so we could do something like:
    elif isinstance(child, str) and not quotes:
        s += "\n" + " " * (indent + 2) + "%s" % child
    else:
        if not quote_strings:
            s += "\n" + " " * (indent + 2) + repr(child)
        else:
            s += "\n" + " " * (indent + 2) + quote_strings[0] + str(child) + quote_strings[1]
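For anyone who wants to try this outside of `Tree.pformat`, the same branches can be sketched as a standalone function. `format_leaf` is a hypothetical helper written only for this illustration, and the surrounding branches of the real method are left out:

```python
def format_leaf(child, indent=0, quotes=False, quote_strings=None):
    """Hypothetical helper mirroring the branches above for one leaf."""
    prefix = "\n" + " " * (indent + 2)
    if isinstance(child, str) and not quotes:
        # Unquoted string leaf.
        return prefix + "%s" % child
    if not quote_strings:
        # Fall back to Python-style quoting via repr().
        return prefix + repr(child)
    # Custom delimiters around the leaf.
    return prefix + quote_strings[0] + str(child) + quote_strings[1]


unquoted = format_leaf("dog")
python_quoted = format_leaf("dog", quotes=True)
custom = format_leaf("dog", quotes=True, quote_strings=("{", "}"))
```

Here `unquoted` is the bare leaf, `python_quoted` uses `repr`, and `custom` wraps the leaf in the given pair.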
@tomaarsen thank you for the detailed comment.
My expectation is that pre-existing callers either pass `quotes=True` or don't pass `quotes` at all; that's why the updated docstring doesn't list `False` as an acceptable input.
With the code you suggest, when `quotes` is `False`, `quote_strings` is ignored, which may be confusing to the user. To apply custom quote characters, they would need to pass both parameters, which is less user-friendly. I have no problem with that; I'm just flagging it.
I agree that that would be confusing. I don't dislike your current implementation per se, although I would probably want to see support for `quotes=False` (i.e. have it be equivalent to `quotes=("", "")`).
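One way this could look, as a rough sketch only: normalize `quotes` once up front so the rest of the code never has to index a bool. `resolve_quotes` is a hypothetical name, and this is not necessarily how it should be done in `Tree.pformat`:

```python
def resolve_quotes(quotes):
    """Hypothetical normalizer: turn `quotes` into a leaf-rendering function."""
    if quotes is True:
        # Backwards-compatible behaviour: quote leaves as Python strings.
        return repr
    if quotes is False:
        # Treat False as "no delimiters at all".
        quotes = ("", "")
    opening, closing = quotes
    return lambda leaf: opening + str(leaf) + closing


render_plain = resolve_quotes(False)
render_repr = resolve_quotes(True)
render_braces = resolve_quotes(("{", "}"))
```

With this, `quotes=False` and `quotes=("", "")` give identical output, and any two-element iterable still works.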
@tomaarsen I've added a commit to support `quotes=False`.
Includes some refactoring to Tree.pformat