Invalid lemma for `had` contraction #5373

pszpetkowski · 2020-04-28T20:45:18Z

I'm not sure whether this issue is in scope for this project, since as far as I know the only way to tell whether the 'd contraction stands for had or would is from the context of the sentence. Still, spaCy handles contractions as expected most of the time, and it would be nice to be able to rely on it.

How to reproduce the behaviour

import spacy
nlp = spacy.load("en_core_web_lg")
doc = nlp("I'd a dream")
print(doc[1].lemma_)  # prints: would

I'd expect this to print have instead of would.

Your Environment

spaCy version: 2.2.4
Platform: Linux-5.6.7-arch1-1-x86_64-with-glibc2.2.5
Python version: 3.8.2


adrianeboyd · 2020-04-29T08:14:48Z

Thanks for the report! This is coming from a rule (in the tokenizer exceptions) that assigns the lemma/tag would/MD to the contraction 'd. I think it would make sense to remove would/MD and let the tagger handle it instead. The tagger is still probably going to get this wrong a fair amount of the time (and the tagger will probably do better on 3rd person pronouns than 1st/2nd), but it doesn't make sense for a rule to say it's always would.
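
To illustrate why a fixed rule cannot be right here, below is a minimal sketch of context-based disambiguation. It is a toy heuristic, not spaCy's implementation and not the change made in the linked pull request: the function `guess_d_lemma`, the tag sets, and the example sentences are all hypothetical, and the Penn Treebank-style tags are supplied by hand rather than predicted by a tagger.

```python
# Toy illustration only: this is NOT how spaCy resolves 'd.
# All names below are hypothetical; tags are Penn Treebank-style
# and are written by hand instead of coming from a tagger.

PARTICIPLE_TAGS = {"VBN"}                          # "I'd seen it"  -> had (auxiliary)
BASE_VERB_TAGS = {"VB"}                            # "I'd go"       -> would
NOMINAL_TAGS = {"DT", "NN", "NNS", "PRP$", "CD"}   # "I'd a dream"  -> had (main verb)


def guess_d_lemma(next_tag, next_word=""):
    """Guess the lemma of 'd from the token that follows it.

    Returns "have", "would", or None when the context is not decisive.
    """
    if next_word.lower() == "better":  # fixed expression "I'd better go"
        return "have"
    if next_tag in PARTICIPLE_TAGS or next_tag in NOMINAL_TAGS:
        return "have"
    if next_tag in BASE_VERB_TAGS:
        return "would"
    return None  # e.g. an adverb follows; more context would be needed


# (sentence, tag of the token after 'd, the token after 'd)
examples = [
    ("I'd a dream", "DT", "a"),
    ("I'd go there", "VB", "go"),
    ("I'd seen it", "VBN", "seen"),
    ("I'd rather not", "RB", "rather"),
]

for sentence, tag, word in examples:
    print(sentence, "->", guess_d_lemma(tag, word))
```

Even this sketch depends on the following token being tagged correctly (a verb like "put" has the same form as VB and VBN), which is why leaving the decision to the statistical tagger will still produce some errors.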

github-actions · 2021-11-05T00:02:10Z

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

svlandeg added the labels feat / lemmatizer (Feature: Rule-based and lookup lemmatization), lang / en (English language data and models), and perf / accuracy (Performance: accuracy) on Apr 29, 2020

adrianeboyd mentioned this issue on Apr 29, 2020 in "Improve exceptions for 'd (would/had) in English" #5379 (merged)

svlandeg closed this as completed in #5379 May 8, 2020

github-actions bot locked as resolved and limited conversation to collaborators Nov 5, 2021
