Using the latest Wordnet 2021 version #2885

Closed
ekaf opened this issue Nov 14, 2021 · 4 comments

ekaf commented Nov 14, 2021

Open English Wordnet 2021 (https://github.com/globalwordnet/english-wordnet) was recently released in a format compatible with NLTK:

https://john.mccr.ae/oewn2021/english-wordnet-2021.zip

It can work out of the box with NLTK's wordnet.py module, just by replacing the contents of nltk_data/corpora/wordnet. However, a few thousand problems have already been reported in three issues: globalwordnet/english-wordnet#773 (comment), globalwordnet/english-wordnet#774 (comment) and globalwordnet/english-wordnet#777 (comment). In particular, no parser would be able to separate the examples from the definitions.
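The replacement step could be scripted roughly as follows. This is only a sketch: the helper name `install_wordnet_package` and the `.bak` backup suffix are made up for illustration, and it assumes the zip has already been unpacked into a directory that directly contains the WNDB files (data.noun, index.noun and so on). If your nltk_data also holds a corpora/wordnet.zip, you may need to check which copy NLTK picks up.

```python
import shutil
from pathlib import Path


def install_wordnet_package(package_dir, nltk_data_dir, backup_suffix=".bak"):
    """Replace nltk_data/corpora/wordnet with the files in package_dir.

    package_dir is assumed to hold the unpacked WNDB files directly.
    Returns the path of the backup directory, or None if there was
    no existing wordnet directory to back up.
    """
    package_dir = Path(package_dir)
    target = Path(nltk_data_dir) / "corpora" / "wordnet"
    backup = None

    if target.exists():
        # Keep the previous database so the swap can be undone.
        backup = target.with_name(target.name + backup_suffix)
        if backup.exists():
            shutil.rmtree(backup)
        target.rename(backup)

    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(package_dir, target)
    return backup
```

To undo the swap, delete the new corpora/wordnet directory and rename the backup back to wordnet.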

An alternative package with fewer problems is available from the X-englishwordnet project (https://github.com/x-englishwordnet/wndb):

https://x-englishwordnet.github.io/wndb/xewn_compat.zip

This package has quoted examples and only half the sense-number instability of the official package. So the dilemma is which package to use: the official one or the alternative?
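To illustrate why quoted examples matter, here is a minimal sketch of splitting a gloss into a definition and its examples. It is not NLTK's parser. The function name `split_gloss` and the sample glosses are invented for illustration, and it assumes examples are wrapped in double quotes and follow the definition.

```python
import re

# Matches one double-quoted example and captures the text inside the quotes.
_QUOTED = re.compile(r'"([^"]*)"')


def split_gloss(gloss):
    """Split a gloss into (definition, examples).

    Assumes every example is double-quoted and comes after the definition.
    Without quotes there is nothing to tell the two apart, so the whole
    gloss is returned as the definition.
    """
    first_quote = gloss.find('"')
    if first_quote == -1:
        return gloss.strip().rstrip(";").strip(), []

    definition = gloss[:first_quote].strip().rstrip(";").strip()
    examples = _QUOTED.findall(gloss[first_quote:])
    return definition, examples
```

With a quoted gloss such as `'a motor vehicle; "he needs a car"'`, the example comes back separately. With an unquoted one such as `'a motor vehicle; he needs a car'`, the example stays glued to the definition, which is the problem described above.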

Open English Wordnet has announced that support for this legacy database format will be discontinued sometime in the future, and recommends using the XML format instead. However, the XML format appears to have the same sense stability problems as the WNDB format used by NLTK.

ekaf commented Nov 24, 2021

Since globalwordnet/english-wordnet#777 (comment) was recently resolved, the only remaining problem is the sense ordering. So the difference between the candidate packages is now small, and the safest choice might be to use the official package at https://john.mccr.ae/oewn2021/english-wordnet-2021.zip
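One way to get a rough picture of the sense ordering differences between two packages is sketched below. This is an assumption-laden illustration, not a tool used in the cited issues: the function name `sense_order_changes` is hypothetical, and each input is assumed to be a plain dict mapping a lemma to its sense identifiers in sense-number order, however you choose to extract them.

```python
def sense_order_changes(old_senses, new_senses):
    """Return the lemmas whose shared senses appear in a different order.

    Both arguments map a lemma to a list of sense identifiers, listed in
    sense-number order. Senses present in only one of the two lists are
    ignored, so pure additions or removals do not count as reordering.
    """
    changed = {}
    for lemma, old_order in old_senses.items():
        new_order = new_senses.get(lemma)
        if new_order is None:
            continue  # lemma missing from the other package

        shared = set(old_order) & set(new_order)
        old_shared = [s for s in old_order if s in shared]
        new_shared = [s for s in new_order if s in shared]

        if old_shared != new_shared:
            changed[lemma] = (old_shared, new_shared)
    return changed
```

Comparing the number of lemmas returned for each candidate package against a common baseline would give a crude instability count.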

ekaf commented Dec 7, 2021

OEWN 2021 is now supported through #2860

ekaf closed this as completed Dec 7, 2021

LifeIsStrange commented

@ekaf So are there any remaining problems with OEWN 2021 support?
Has this been solved: "the only problem remaining is the sense ordering"?

ekaf commented Apr 9, 2022

The present issue is closed because the problem is not in NLTK support, but in OEWN 2021 itself, as explained in the issues cited above (globalwordnet/english-wordnet#773 (comment), globalwordnet/english-wordnet#774 (comment)), which are still open.
