[Research] /legal/search endpoint: Respondents need capability for exact string match #5830

Open
4 tasks
Tracked by #5832
patphongs opened this issue May 14, 2024 · 3 comments

@patphongs
Member

patphongs commented May 14, 2024

What we’re after

Currently, the respondents[ ] field matches on any word in the search term. We want to add support for exact string matching when the respondent is entered in quotes, so that results can be narrowed down, especially for common respondent names.

Ex: The search term "Salazar for Congress" currently returns many results because many respondent names contain the words "for Congress". Limiting the search to exactly that string would help narrow down the results.

Action items:

  • MURs
  • ADRs
  • Create future work ticket to consider how this can be applied for admin fines

https://api.open.fec.gov/v1/legal/search?case_respondents=%22Salazar+for+Congress%22&type=murs&hits_returned=20&from_hit=0&q=&api_key=NICAR16
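
For illustration, the request above could be built as in the following sketch. The parameter names are taken from the URL in this issue; the helper name `build_search_url`, the `exact` flag, and the `YOUR_API_KEY` placeholder are assumptions made for the example, and the quoted form only expresses the behavior this ticket asks for, not something the endpoint is confirmed to support today.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.open.fec.gov/v1/legal/search"


def build_search_url(respondent, api_key, exact=True, doc_type="murs"):
    """Build a /legal/search URL (hypothetical helper).

    When exact is True, the respondent is wrapped in double quotes,
    which is the syntax proposed in this issue for exact string match.
    """
    term = f'"{respondent}"' if exact else respondent
    params = {
        "case_respondents": term,
        "type": doc_type,
        "hits_returned": 20,
        "from_hit": 0,
        "api_key": api_key,
    }
    # urlencode percent-encodes the quotes and turns spaces into "+"
    return f"{BASE_URL}?{urlencode(params)}"


url = build_search_url("Salazar for Congress", api_key="YOUR_API_KEY")
```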

Completion criteria

  • respondents field has the capability for exact string match
@patphongs patphongs changed the title from "Legal endpoint: Respondents need capability for exact string match" to "/legal/search endpoint: Respondents need capability for exact string match" May 14, 2024
@patphongs patphongs added this to the 25.1 milestone May 14, 2024
@fec-jli
Contributor

fec-jli commented May 20, 2024

After researching and testing locally, I have included some notes here:

  1. To perform an exact match, the field's mapping type must be "keyword". A field of type "text" is analyzed into individual words, so it will not perform an exact match.
  2. "Keyword" search is always case-sensitive in ES 7.4.0.
  3. Our existing respondent data is title-cased in ES.
  4. To work around the case sensitivity, we have two options on the backend side:

a) Set case_insensitive on the keyword search. This parameter was added in ES 7.10.0, and we are currently using ES 7.4.0.
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html

b) Convert respondent data to upper case when uploading it to ES.
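
The two options could be sketched roughly as follows, using plain Python dictionaries for the request bodies. The field name `respondents_exact` and the helper names are hypothetical and do not come from the existing code base; the `case_insensitive` parameter on the term query is the one described in the linked documentation.

```python
FIELD = "respondents_exact"  # hypothetical keyword field name


def respondent_mapping():
    """Mapping fragment: a keyword field is indexed as one un-analyzed value."""
    return {"properties": {FIELD: {"type": "keyword"}}}


def to_indexed_value(name):
    """Option (b): upper-case the respondent name before uploading it to ES."""
    return name.upper()


def exact_term_query(value, es_version):
    """Build a term query for an exact respondent match.

    es_version is a (major, minor) tuple, e.g. (7, 4).
    """
    if es_version >= (7, 10):
        # Option (a): let ES ignore case (parameter available from 7.10.0).
        return {"term": {FIELD: {"value": value, "case_insensitive": True}}}
    # Option (b): data was indexed in upper case, so upper-case the query too.
    return {"term": {FIELD: {"value": to_indexed_value(value)}}}
```

With ES 7.4.0, `exact_term_query("Salazar for Congress", (7, 4))` takes the second branch and only matches documents whose stored value is `SALAZAR FOR CONGRESS`.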

@patphongs patphongs modified the milestones: 25.1, 25.2 May 21, 2024
@patphongs
Member Author

patphongs commented May 21, 2024

Proposed solution

To make the respondent name search more precise, we discussed doing the following:

  1. Create new Elasticsearch indexed fields for respondent first name, respondent last name, and respondent committee name. New field configurations:
    • Define these new fields with the "keyword" type so that we can get an exact match for each field.
    • Use AND logic between these filters so that we can match an exact name. Question: What happens when the first name and the last name both exist in the same case, but are not part of the same respondent detail?
    • Results syntax:
"respondent_details": [
   {
      "first_name": null,
      "last_name": null,
      "committee_name": "Cohn for Congress 2020"
   },
   {
      "first_name": "Thomas Charles",
      "last_name": "Datwyler",
      "committee_name": null
   },
   {
      "first_name": "Timothy",
      "last_name": "Scott",
      "committee_name": null
   }
]
  2. Make these 3 additional fields all UPPER CASE so that case sensitivity will not be an issue. The front end can upper-case the query it sends to the API while leaving the text the user typed unchanged.
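
On the question of a first name and last name that appear in the same case but on different respondents: one possible approach, not decided in this thread, is to map `respondent_details` as an Elasticsearch "nested" field and wrap the AND filters in a nested query. Arrays of plain objects are flattened by Elasticsearch, so without nesting each filter is checked against all respondents in the case. The sketch below shows the mapping and query as Python dictionaries, plus two small pure-Python helpers (hypothetical, for illustration only) that mimic the difference. Sample names are the ones from the results syntax above, upper-cased.

```python
RESPONDENT_MAPPING = {
    "respondent_details": {
        "type": "nested",  # keeps each respondent object separate
        "properties": {
            "first_name": {"type": "keyword"},
            "last_name": {"type": "keyword"},
            "committee_name": {"type": "keyword"},
        },
    }
}


def nested_name_query(first_name, last_name):
    """AND both filters inside one nested respondent object."""
    path = "respondent_details"
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "filter": [
                        {"term": {f"{path}.first_name": first_name.upper()}},
                        {"term": {f"{path}.last_name": last_name.upper()}},
                    ]
                }
            },
        }
    }


def matches_flattened(details, first, last):
    """Mimics a non-nested field: values from all respondents are pooled."""
    firsts = {d["first_name"] for d in details}
    lasts = {d["last_name"] for d in details}
    return first in firsts and last in lasts


def matches_nested(details, first, last):
    """Mimics a nested field: both values must be on the same respondent."""
    return any(
        d["first_name"] == first and d["last_name"] == last for d in details
    )


case = [
    {"first_name": "THOMAS CHARLES", "last_name": "DATWYLER", "committee_name": None},
    {"first_name": "TIMOTHY", "last_name": "SCOTT", "committee_name": None},
]
```

Here `matches_flattened(case, "TIMOTHY", "DATWYLER")` is true even though no such respondent exists, while `matches_nested` rejects it.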

Considerations:

When using the keyword method, the search is based on an exact match. We may run into complications when a middle initial or middle name is included in the first name column. For example, a respondent with first name = Lee Ann and last name = Elliott may not be returned if the user types only first name = Lee and last name = Elliott.
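
The complication can be reproduced with a minimal sketch that compares whole values the way a keyword field would, after upper-casing both sides. The function name `exact_name_match` is hypothetical and only stands in for the proposed keyword filters.

```python
def exact_name_match(stored_first, stored_last, query_first, query_last):
    """Keyword-style comparison: each whole value must match exactly."""
    return (
        stored_first.upper() == query_first.upper()
        and stored_last.upper() == query_last.upper()
    )


# Stored respondent: first name "Lee Ann", last name "Elliott"
partial = exact_name_match("Lee Ann", "Elliott", "Lee", "Elliott")
full = exact_name_match("Lee Ann", "Elliott", "lee ann", "elliott")
```

`partial` is false because "LEE" is not the whole stored value "LEE ANN", while `full` is true.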

@patphongs patphongs changed the title from "/legal/search endpoint: Respondents need capability for exact string match" to "[Research] /legal/search endpoint: Respondents need capability for exact string match" May 22, 2024
@cnlucas
Member

cnlucas commented May 22, 2024

We are getting more precise examples for this issue.
We should look into adding to our stop words, implementing exact string match (as in keyword search), re-indexing to add the f_name and l_name fields, and using match_phrase.
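
As a rough sketch of the match_phrase idea, a quoted search term could be routed to a `match_phrase` query and an unquoted one to a regular `match` query. The helper name `respondent_query` and the default field name `respondents` are assumptions made for this example, not existing code.

```python
def respondent_query(term, field="respondents"):
    """Pick the query type from the user's search term.

    A term wrapped in double quotes becomes a match_phrase query (words
    must appear together, in order); anything else stays a match query.
    """
    term = term.strip()
    if len(term) >= 2 and term.startswith('"') and term.endswith('"'):
        return {"match_phrase": {field: term[1:-1]}}
    return {"match": {field: term}}


quoted = respondent_query('"Salazar for Congress"')
unquoted = respondent_query("Salazar for Congress")
```

Unlike a keyword field, `match_phrase` runs on an analyzed text field, so the phrase can still match inside a longer respondent name.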

Projects
Status: 🔜 Sprint backlog
Development

No branches or pull requests

3 participants