Inaccurate Clause determination leads to wrong choice of verb for the sentence. #1035

ryancasburn-KAI · 2023-09-06T12:48:18Z

My overall goal is to identify PresentTense sentences with a #Person or #Pronoun subject.

While troubleshooting, I ran into an issue with the following sentence.

nlp("She led a diverse advisory committee throughout the process that included membership from schools, city planning departments, state engineering departments, transit agencies, and AARP.").sentences().json()

Which returns

{
  subject: "she",
  verb: "state engineering",
  predicate: "departments",
  grammar: { tense: "PresentTense" }
}

Clearly the verb should be “led”, not “state engineering”. I think the trouble here is that both “state” and “engineering” can be verbs, but in this sentence they act as modifiers of “departments.”

This does work though:

nlp("She led a diverse advisory committee throughout the process that included membership from city planning departments and state engineering departments.").sentences().json()

{
  subject: "she",
  verb: "led",
  predicate: "a diverse advisory committee throughout the proces…ing departments and state engineering departments",
  grammar: { tense: "PastTense" }
}

In both cases, the subject is properly identified (as “she”), but when there is a comma-separated list of items that includes potential verbs (“state” and “engineering”), it seems to skip over the sentence’s actual verb (“led”) and choose one of the potential verbs from the list (“state”).

Is there something I can do on my side to resolve this, or is there a fix that can be done within this package?
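For context on why this matters for the stated goal, here is a minimal sketch of a caller-side filter over records shaped like the outputs above. The helper names (sentenceInfo, presentTenseWithSubject, isPronoun) are hypothetical and not part of compromise, and the record shape is an assumption: the fields are read either from the top level, as printed here, or from a nested `sentence` key. The pronoun list is only a stand-in for a real tag check.

```javascript
// Hypothetical helpers, not part of compromise.
// Assumption: subject/verb/grammar sit either at the top level of a record
// (as printed in this issue) or under a nested `sentence` key.
function sentenceInfo(record) {
  return record.sentence ?? record
}

// Keep records whose reported tense is PresentTense and whose subject
// passes the caller-supplied test.
function presentTenseWithSubject(records, isWantedSubject) {
  return records.filter((record) => {
    const info = sentenceInfo(record)
    const tense = info.grammar ? info.grammar.tense : undefined
    return tense === 'PresentTense' && Boolean(info.subject) && isWantedSubject(info.subject)
  })
}

// Stand-in subject test for this demo: a tiny hard-coded pronoun list.
// With compromise loaded, a tag-based check on the subject text
// (for example a match on #Person or #Pronoun) could be passed instead.
const PRONOUNS = new Set(['i', 'you', 'he', 'she', 'it', 'we', 'they'])
const isPronoun = (subject) => PRONOUNS.has(subject.toLowerCase())

// Records copied from the two outputs above (second predicate omitted).
const records = [
  { subject: 'she', verb: 'state engineering', predicate: 'departments', grammar: { tense: 'PresentTense' } },
  { subject: 'she', verb: 'led', grammar: { tense: 'PastTense' } },
]

const matches = presentTenseWithSubject(records, isPronoun)
console.log(matches.map((record) => sentenceInfo(record).verb))
```

With the records above, the mis-parsed sentence is the one that passes the filter, so a past-tense sentence ends up counted as present tense.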

spencermountain · 2023-09-06T13:47:22Z

hey Ryan - yep, your intuitions are correct. The tagger gets borked on the ambiguous words and the long list.
I think the error is in .clauses() - it may be confusing those commas with sentence fragments, then interpreting '^state engineering ..' like '^keep engineering..'.

You can poke around at the tagger logs if you're interested, by adding nlp.verbose('tagger'). 'Fraid I don't see a quick solution on this sentence, other than passing nlp(txt, {'state engineering':'Noun'}) as a quick fix.
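If more than one phrase needs the same override, the lexicon object can be built from a list. This is only a sketch: buildNounLexicon is a hypothetical helper, not part of compromise, and lowercasing the keys and collapsing whitespace are assumptions about how lexicon keys should look.

```javascript
// Hypothetical helper, not part of compromise: build a lexicon object
// that maps each listed phrase to the Noun tag.
function buildNounLexicon(phrases) {
  const lexicon = {}
  for (const phrase of phrases) {
    // Assumption: keys should be lowercase with single spaces.
    const key = phrase.trim().toLowerCase().replace(/\s+/g, ' ')
    if (key) lexicon[key] = 'Noun'
  }
  return lexicon
}

const lexicon = buildNounLexicon(['state engineering', 'City  Planning'])
console.log(lexicon)

// With compromise loaded as `nlp`, the two suggestions above would be used as:
//   nlp.verbose('tagger')
//   nlp(txt, lexicon).sentences().json()
```

This still only patches phrases known in advance; it does not change how clauses are split.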

In all, I'd really like to improve the logic for parsing the SVO of a sentence. You can see the logic right now is really poor.
Let me know if you find anything else
cheers

ryancasburn-KAI · 2023-09-06T16:49:51Z

Thank you for the suggestion for the quick fix. That obviously fixes just that sentence. I'll keep poking around to see if I can solve it more generally. Thanks for your quick response!

ryancasburn-KAI · 2023-09-06T18:42:57Z

I'm an engineer, not a linguist, but I do agree that .clauses() is the origin and I think there may be other issues too:

Found this here:

My friend who lives in London looks like Homer Simpson.

This should have two clauses: "My friend looks like Homer Simpson" and "who lives in London".

Compromise only finds a single clause and identifies the sentence's verb as "lives" instead of "looks".

I'll keep working through how clause and main-clause identification could handle comma-separated lists and dependent clauses nested inside independent clauses.
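One way to track these cases while experimenting is a small regression table of sentences and the main verb a correct parse should report. The sketch below is hypothetical: findVerbMismatches and the `reported` map are illustration only, and the map simply replays the verbs quoted in this thread rather than calling compromise.

```javascript
// Hypothetical regression table: sentences from this thread and the main
// verb a correct parse should report.
const cases = [
  {
    text: 'She led a diverse advisory committee throughout the process that included membership from schools, city planning departments, state engineering departments, transit agencies, and AARP.',
    expected: 'led',
  },
  {
    text: 'My friend who lives in London looks like Homer Simpson.',
    expected: 'looks',
  },
]

// Run every case through getVerb(text) and collect the mismatches.
function findVerbMismatches(caseList, getVerb) {
  const found = []
  for (const { text, expected } of caseList) {
    const actual = getVerb(text)
    if (actual !== expected) found.push({ text, expected, actual })
  }
  return found
}

// Stand-in for the parser: replays the verbs reported in this thread.
// With compromise loaded, getVerb would instead read the verb out of
// nlp(text).sentences().json().
const reported = new Map([
  [cases[0].text, 'state engineering'],
  [cases[1].text, 'lives'],
])

const mismatches = findVerbMismatches(cases, (text) => reported.get(text))
for (const m of mismatches) {
  console.log(`expected "${m.expected}" but got "${m.actual}"`)
}
```

Swapping the stand-in for a real call makes the same table usable as a check against future changes to clause splitting.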

spencermountain added the bug label Sep 6, 2023

ryancasburn-KAI changed the title ~~Issue with subject and verb determination in .sentences() with comma separated list of items which include potential verbs~~ Inaccurate Clause determination leads to wrong choice of verb for the sentence. Sep 6, 2023

ryancasburn-KAI mentioned this issue Jan 2, 2024

"Throughout" within a main clause currently doesn't work. #1071

Closed
