Support sense-to-synset relations #957

Open
jmccrae opened this issue Jul 4, 2023 · 2 comments
Labels
release format This issue refers to the WNDB or RDF export, so no changes will be made to this repository

Comments

@jmccrae
Member

jmccrae commented Jul 4, 2023

Sense-to-synset relations are useful and would help to solve issues such as #732.

This issue is to verify that we can support such relations and that all relevant tooling (validation, EWE, en-word.net) and release formats (esp. WNDB) work with such relations.

Note that there is no chance that WNDB will support this, as it is a legacy format. In that case, we will simply export these as sense relations referring to the first member of the target synset.
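A minimal sketch of that fallback, assuming a sense-to-synset relation is held as a source sense plus the ordered member list of the target synset. The `SenseRelation` type, the `downgrade_for_wndb` function, and the identifiers in the usage note are hypothetical and are not taken from the repository's tooling:

```python
from typing import NamedTuple, Sequence


class SenseRelation(NamedTuple):
    """Hypothetical container for a plain sense-to-sense relation."""
    rel_type: str
    source_sense: str
    target_sense: str


def downgrade_for_wndb(rel_type: str,
                       source_sense: str,
                       target_members: Sequence[str]) -> SenseRelation:
    """Rewrite a sense-to-synset relation as a sense-to-sense relation.

    The target becomes the first member of the target synset, as
    proposed for the WNDB export. `target_members` is assumed to be
    in synset member order.
    """
    if not target_members:
        raise ValueError("target synset has no members")
    return SenseRelation(rel_type, source_sense, target_members[0])
```

For example, `downgrade_for_wndb("example_rel", "scallion-1", ["united_states-1", "usa-1"])` would keep only `united_states-1` as the target. The information that the relation applied to the whole synset is lost in the WNDB export under this approach.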

@jmccrae jmccrae added the release format This issue refers to the WNDB or RDF export, so no changes will be made to this repository label Jul 4, 2023
@jmccrae jmccrae added this to the 2023 Release milestone Jul 4, 2023
@rhdunn

rhdunn commented Jul 6, 2023

The source/target word numbers in a WNDB pointer are 1-based. Technically, this allows nn00 for sense-to-synset and 00nn for synset-to-sense pointer relationships. The wndb docs mention 0000 being used for semantic relations (synset-to-synset). When discussing lexical relations (sense-to-sense), they state that word numbers start at 1.
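To make the four cases concrete, here is a rough sketch of how a tool might classify the source/target field, assuming the layout from the wndb docs: a four-character field holding two two-digit hexadecimal word numbers, source first and target second. The function name `classify_source_target` is hypothetical, and the nn00 and 00nn readings are the interpretation suggested in this comment, not something the wndb docs define:

```python
def classify_source_target(field: str) -> str:
    """Classify a WNDB pointer by its four-character source/target field."""
    if len(field) != 4:
        raise ValueError(f"expected 4 characters, got {field!r}")
    source = int(field[:2], 16)  # word number in the source synset, 0 = whole synset
    target = int(field[2:], 16)  # word number in the target synset, 0 = whole synset
    if source == 0 and target == 0:
        return "synset-to-synset"  # documented: semantic relation
    if source != 0 and target != 0:
        return "sense-to-sense"    # documented: lexical relation
    if target == 0:
        return "sense-to-synset"   # nn00: undocumented, assumed reading
    return "synset-to-sense"       # 00nn: undocumented, assumed reading
```

A reader written only against the documented cases may treat any non-0000 value as lexical and index into the target synset with word number 0, which is where existing tools could break.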

The question is then how WNDB tools will handle these.

It is interesting to note that WordNet Search shows lexical relationships as sense-to-synset relationships, not as sense-to-sense relationships (e.g. http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=1&o3=&o4=&s=peripherally&i=4&h=11000#c). This is displayed incorrectly according to the format definition. I don't know if there are any examples from WordNet 3.1 where sense-to-synset is intended.

@jmccrae
Member Author

jmccrae commented Jul 7, 2023

Yes, that is interesting. We could try to use codes like 00nn for this; however, my fear is that it would break a large number of tools that do not expect such a code.

There are certainly relations from Princeton WordNet that could be modelled as sense-to-synset. For example, 'scallion' is linked to 'United States', but the relation could equally apply to the other members of that synset (USA, America, etc.).

@jmccrae jmccrae modified the milestones: 2023 Release, 2024 Release Aug 23, 2023