Integrate wordnet english #2977

Closed
LifeIsStrange opened this issue Apr 8, 2022 · 2 comments

@LifeIsStrange
LifeIsStrange commented Apr 8, 2022

English WordNet is a fork of Princeton WordNet that is developed in the open. As such it is much more actively maintained, and hence both more correct and more complete.
If NLTK integrated the English WordNet corpus as an option, it would become more discoverable and accessible.

@tomaarsen friendly ping

The complexity seems low: if English WordNet can be exported to the same format as Princeton WordNet, then NLTK would support it out of the box?
If so, what format does NLTK expect?

@tomaarsen
Member

I believe that NLTK can use the english-wordnet data. For example see globalwordnet/english-wordnet#771 and #2860 by @ekaf.
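
On the format question: NLTK's WordNet corpus reader loads the Princeton WordNet database files (`index.<pos>`, `data.<pos>`, and so on), so an export in that layout is what it can read. As a rough illustration only, here is a minimal sketch of parsing one line of a `data.<pos>` file. The `SynsetRecord` and `parse_data_line` names are hypothetical, the sample line is made up, and verb frames are ignored; this is not how NLTK itself implements it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SynsetRecord:
    offset: int
    lex_filenum: int
    pos: str
    lemmas: List[str]
    pointers: List[Tuple[str, int, str]]  # (symbol, target offset, target pos)
    gloss: str


def parse_data_line(line: str) -> SynsetRecord:
    """Parse one synset line of a data.<pos> file (hypothetical helper)."""
    fields_part, _, gloss = line.partition("|")
    fields = fields_part.split()

    offset = int(fields[0])
    lex_filenum = int(fields[1])
    pos = fields[2]
    word_count = int(fields[3], 16)  # the word count is hexadecimal

    i = 4
    lemmas = []
    for _ in range(word_count):
        lemmas.append(fields[i])  # fields[i + 1] is the lex_id, skipped here
        i += 2

    pointer_count = int(fields[i])  # the pointer count is decimal
    i += 1
    pointers = []
    for _ in range(pointer_count):
        symbol, target, target_pos = fields[i], fields[i + 1], fields[i + 2]
        pointers.append((symbol, int(target), target_pos))
        i += 4  # the fourth field (source/target word numbers) is skipped

    return SynsetRecord(offset, lex_filenum, pos, lemmas, pointers, gloss.strip())


# Made-up line in the data-file layout, for illustration only.
SAMPLE = (
    "00001740 03 n 02 thing 0 sample_word 0 002 "
    "@ 00001234 n 0000 ~ 00005678 n 0000 | a made-up gloss for illustration"
)
record = parse_data_line(SAMPLE)
print(record.lemmas, record.pointers)
```

A fork that keeps this layout, including the order of synsets and senses in the index files, should in principle be readable by the same reader.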

Alongside that open-source English WordNet, @goodmami is working on wn, which may also be worth checking out as an alternative to NLTK's WordNet functionality. As far as I can tell, there's some good work being done there.

However, to put this all in a broader context, the NLTK team has been looking to deprecate the existing NLTK WordNet corpus reader and opt for a separate one for quite a while now (e.g. #2423). That said, I don't believe there is consensus on how to move forward with this deprecation fully. WordNet is one of the big use cases of NLTK, so we need to be careful here.
In short, this is a difficult discussion, but I imagine we would eventually move to deprecate the NLTK Wordnet in favor of wn, if the latter can be maintained and improved more effectively. Or perhaps we would just leave the choice with our users. There is no consensus on this yet, I believe.

I hope that helps clarify everything somewhat.

@LifeIsStrange
Author

Thank you for this great answer :). While NLTK and wn both support English WordNet, it has 2 upstream issues (including one regarding sense ordering, as pointed out by ekaf) that are not yet fixed, so in the meantime I will stick with NLTK's classic WordNet 3.0/3.1.
