Add option to ignore diacritics #723

Open
mgrabovsky opened this issue Jun 13, 2023 · 4 comments

@mgrabovsky

Description

I'd find it extremely useful if Fuse had the ability to ignore diacritics/accents both in the search string and in the indexed content. However, I don't want to lose the diacritics completely, i.e. I want the diacritics to appear in the search results for the matched items.

Based on the comments in a previous issue (#415), I have implemented a quick prototype (see below), but it falls short of being usable: the getter strips the diacritics and it's impossible to recover them from the results. (See below for an illustration.)

Describe the solution you'd like

Have a configuration option like ignoreAccents: boolean, disabled by default, that would toggle this functionality.

Describe alternatives you've considered

I have read through #415 and implemented a prototype in our app, but it has the problem of losing the diacritics. The implementation basically boils down to stripping diacritics from the search string and a custom getter which does the same to the content:

// Code by Mathieu TUDISCO via GitHub:
// https://github.com/krisk/Fuse/issues/415#issuecomment-634348136
const stripAccents = String.prototype.normalize
    ? ((str) => str.normalize('NFD').replace(/[\u0300-\u036F]/g, ''))
    : ((str) => str);

const options = {
    // ...
    getFn: (obj, path) => stripAccents(Fuse.config.getFn(obj, path)),
    includeMatches: true,
    // ...
};

const fuse = new Fuse(data, options);
const searchResults = fuse.search(stripAccents(this.searchString));

To illustrate the problem, see the screenshot below. The title of the first two items should be “Lesnatost v přírodních lesních oblastech” and “Proč dnes příroda tak rychle přichází o svou rozmanitost?”, respectively, but the diacritics are stripped by the getter and they cannot be recovered.
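
One possible workaround, sketched here under stated assumptions rather than as anything Fuse provides: keep a map from each position in the stripped string back to the original string, then project the match ranges onto the original text. The helper names stripAccentsWithMap and toOriginalRange are hypothetical. The sketch also assumes that the match indices returned with includeMatches are inclusive [start, end] pairs into the stripped value, and that the accented value can still be read from the original item in each result. For typical Latin text the stripped string should be the same as the one stripAccents produces above, although edge cases with unusual combining marks may differ.

```javascript
// Hypothetical helper: strip combining marks (U+0300 to U+036F) one code point
// at a time and record, for every UTF-16 code unit of the stripped string,
// the index in the original string of the code point it came from.
function stripAccentsWithMap(original) {
    let stripped = '';
    const map = [];
    let offset = 0;
    for (const ch of original) { // iterates by code point
        const base = ch.normalize('NFD').replace(/[\u0300-\u036F]/g, '');
        for (let i = 0; i < base.length; i++) {
            map.push(offset);
        }
        stripped += base;
        offset += ch.length;
    }
    return { stripped, map };
}

// Hypothetical helper: convert an inclusive [start, end] range in the stripped
// string into an inclusive range in the original string. The end is extended
// so that combining marks following the last matched character are included.
function toOriginalRange([start, end], map, originalLength) {
    let next = end + 1;
    while (next < map.length && map[next] === map[end]) {
        next++;
    }
    const originalEnd = (next < map.length ? map[next] : originalLength) - 1;
    return [map[start], originalEnd];
}

// Usage: find "priroda" in the stripped title, then slice the original title.
const title = 'Proč dnes příroda tak rychle přichází o svou rozmanitost?';
const { stripped, map } = stripAccentsWithMap(title);
const start = stripped.indexOf('priroda');
const [from, to] = toOriginalRange([start, start + 'priroda'.length - 1], map, title.length);
console.log(title.slice(from, to + 1)); // the accented substring of the original title
```

With Fuse, the same projection would presumably be applied to every range in a match's indices, with the original string looked up on the result's item by the match's key. This still leaves the bookkeeping to the caller, which is why a built-in option would be preferable.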

@mgrabovsky
Author

Is this a feature that the maintainers would welcome but lack the capacity to implement? If so, I would be happy to take a look and try to submit a patch myself.

@github-actions

This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 30 days

@github-actions github-actions bot added the Stale label Oct 12, 2023
@mgrabovsky
Author

This still appears to be a problem and I'm still open to collaboration.

@github-actions github-actions bot removed the Stale label Oct 20, 2023
@boydkelly

Would love to see this option...
