Implement Tree.pformat_latex_forest #2956

Open
wants to merge 4 commits into
base: develop

@tyomitch tyomitch commented Mar 5, 2022

Includes some refactoring to Tree.pformat

:param quotes: True to quote leaf nodes as Python strings.
:type quotes: bool
:param quotes: Two-element iterable to surround leaf nodes,
or True to quote leaf nodes as Python strings.
Member

To the best of my knowledge we've avoided this kind of polymorphism in NLTK. I think it needs some consideration if we're going to start doing this.

Author

> To the best of my knowledge we've avoided this kind of polymorphism in NLTK. I think it needs some consideration if we're going to start doing this.

Any alternative suggestions?

Member

Sorry for the delay...

Member

These changes by @tyomitch make quotes equivalent to parens, which is a logical move. Beyond that, he also preserves backwards compatibility, so True and False can still be used. I don't necessarily see an issue with that, especially because using True means the use of repr, which cleverly picks the right quotes depending on the string.
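
To illustrate the point about repr, here is a small standalone check of how Python's built-in repr picks the quote character from the string's contents. The sample strings are made up for illustration and are not taken from the PR.

```python
# repr() prefers single quotes, but switches to double quotes when the
# string contains a single quote and no double quote.
samples = ["dog", "it's", 'say "hi"']
quoted = [repr(s) for s in samples]

for q in quoted:
    print(q)
```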

That said, at a glance the implementation likely has a bug. On line 841: what if the child is a string and quotes=False? Then the if-statement is true, and "\n" + indent_str + quotes[0] + str(child) + quotes[1] is added to s. However, quotes[0] will raise a TypeError, because a bool cannot be indexed.
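
A minimal reproduction of that failure mode, assuming the expression quoted above is evaluated with quotes=False. The helper name format_leaf is hypothetical; it only mirrors the quoted expression and is not the PR's actual code.

```python
def format_leaf(child, indent_str, quotes):
    # Hypothetical helper mirroring the expression from the review comment.
    return "\n" + indent_str + quotes[0] + str(child) + quotes[1]


# A two-element iterable works as intended.
ok = format_leaf("dog", "  ", ("<", ">"))

# A bool is not subscriptable, so quotes=False fails on quotes[0].
try:
    format_leaf("dog", "  ", False)
    error = None
except TypeError as exc:
    error = exc

print(repr(ok))
print(type(error).__name__)
```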

An alternative implementation that avoids the polymorphism would revert quotes to what it was and add a separate parameter, e.g. quote_strings=None. The last else branch is where the quotes should be added, so we could do something like:

            elif isinstance(child, str) and not quotes:
                s += "\n" + " " * (indent + 2) + "%s" % child
            else:
                if not quote_strings:
                    s += "\n" + " " * (indent + 2) + repr(child)
                else:
                    s += "\n" + " " * (indent + 2) + quote_strings[0] + str(child) + quote_strings[1]
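
For experimenting outside of Tree.pformat, the branches above can be pulled into a standalone function. This is only a sketch: the function name format_leaf and its defaults are assumptions made for illustration, and the real method builds up a larger string s around these lines.

```python
def format_leaf(child, indent=0, quotes=False, quote_strings=None):
    """Format one leaf using separate quotes / quote_strings parameters.

    Hypothetical standalone version of the suggested branches.
    """
    prefix = "\n" + " " * (indent + 2)
    if isinstance(child, str) and not quotes:
        # Plain string leaf, no quoting requested.
        return prefix + child
    if not quote_strings:
        # Fall back to Python-style quoting via repr().
        return prefix + repr(child)
    # Surround the leaf with the custom pair.
    return prefix + quote_strings[0] + str(child) + quote_strings[1]


plain = format_leaf("dog")
pythonic = format_leaf("dog", quotes=True)
custom = format_leaf("dog", quotes=True, quote_strings=("{", "}"))
# quote_strings has no effect on a string leaf unless quotes is truthy.
ignored = format_leaf("dog", quote_strings=("{", "}"))
```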

Author

@tomaarsen thank you for the detailed comment.

My expectation is that pre-existing callers either pass quotes=True or don't pass quotes at all; that's why the updated docstring doesn't list False as an acceptable input.

With the code you suggest, when quotes is False, quote_strings gets ignored, which may be confusing to the user. To apply custom quote characters, they'd need to pass both parameters, which is less user-friendly. I have no problem with that -- just flagging it up.

Member

I agree that that would be confusing. I don't dislike your current implementation per se, although I would probably want to see support for quotes=False (i.e. have it be equivalent to quotes=("", "")).
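
One way the single-parameter behaviour described here could look is sketched below. The helper resolve_quotes is hypothetical and is not the code in the PR; it only assumes the three input shapes discussed in this thread (True, False, or a two-element iterable).

```python
def resolve_quotes(quotes):
    """Return a function that formats a single leaf (hypothetical helper)."""
    if quotes is True:
        # Python-style quoting, as in the pre-existing behaviour.
        return repr
    if quotes is False:
        # Treated as equivalent to quotes=("", "").
        left, right = "", ""
    else:
        # Any two-element iterable: (opening, closing).
        left, right = quotes
    return lambda leaf: left + str(leaf) + right


as_python = resolve_quotes(True)("it's")
unquoted = resolve_quotes(False)("dog")
empty_pair = resolve_quotes(("", ""))("dog")
latex_style = resolve_quotes(("``", "''"))("dog")
```

Normalizing the argument once up front keeps the formatting loop free of type checks, at the cost of accepting several input types for one parameter.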

Author

@tomaarsen I've added a commit to support quotes=False
