Add option to remove all transliteration #358

Open
lloydjatkinson opened this issue Oct 10, 2022 · 5 comments

@lloydjatkinson

Hi!

Could you please add an option to disable transliteration? It's not always appropriate or user-friendly for certain languages to have all their URLs converted "into English".

Thank you

@Trott
Owner

Trott commented Oct 10, 2022

I'm working on the next major release, which will be ASCII-only by default; you will then add other language sets as appropriate. Stay tuned...

@Trott
Owner

Trott commented Oct 10, 2022

In the meantime, if you have specific characters you want to preserve, you can do that by adding them to the charmap. For example, if I am using Spanish and don't want ñ to be transliterated to n, I can override it:

const slug = require('slug');

// Default: 'manana'
console.log(slug('mañana'));

// Preserve the character 'ñ': 'mañana'
slug.charmap['ñ'] = 'ñ';
console.log(slug('mañana'));

// Omit the character 'ñ': 'maana'
delete slug.charmap['ñ'];
console.log(slug('mañana'));

@regexj

regexj commented Jun 1, 2023

Hi, is there an option for this yet? After reading the docs and playing around, I can't find one.

We have a path pattern for our site that looks like this:
https://www.domain.com/profile/[firstname].[lastname]/

I need to ensure the name provided by the user is URL-safe, but anglicising the name is unnecessary and not user-friendly. For example, a random Arabic name such as سعيد الكلباني turns into saayd.alklbany when passed through slug.

Non-Latin characters are fine as long as they aren't part of the domain, according to Google: https://www.youtube.com/watch?v=74FiBesPkI4

Wikipedia doesn't bother with this outside the domain either: https://ar.wikipedia.org/wiki/مكة

So yeah, +1 to the request here: if possible, an option to disable transliteration entirely and only make the illegal URL special characters safe :)

@Trott
Owner

Trott commented Jun 3, 2023

@regexj I don't believe there is an easy way to do this within slug's current design. The way it works now, if it finds the character in the charmap, it uses that entry to indicate what the character should be changed (or not changed) to. Otherwise, it drops the character. This is sensible from a safe-design perspective. Having an explicit list of "these characters are OK" is going to be safer than "here are all the dangerous characters to remove, and you can safely leave everything else".

The work-in-progress branch probably won't fix this. Unknown characters will still be dropped. I don't intend to change that because doing so creates a footgun for developers. What characters are safe depends on your use case. (For example, a . is safe in a query string, but unsafe in a path component.) So this module is guaranteed to return a value with a very narrow range of allowable characters.
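
The allowlist behaviour described above (use the charmap entry if there is one, otherwise drop the character) can be illustrated with a minimal sketch. This is not slug's source: `allowlistSlug` and its parameters are hypothetical names, and the handling of ASCII letters, digits, and whitespace is a simplifying assumption.

```javascript
// Hypothetical sketch of an allowlist-style slugifier; NOT slug's actual code.
// Assumption: ASCII letters and digits are always allowed, whitespace becomes
// the separator, and every other character survives only via the charmap.
function allowlistSlug(input, charmap, replacement = '-') {
  let out = '';
  for (const ch of input) {
    if (/[A-Za-z0-9]/.test(ch)) {
      out += ch.toLowerCase();
    } else if (Object.prototype.hasOwnProperty.call(charmap, ch)) {
      out += charmap[ch]; // explicit entry: transliterate or preserve
    } else if (/\s/.test(ch)) {
      out += ' ';
    }
    // Anything else is unknown and is silently dropped.
  }
  return out.trim().split(/\s+/).filter(Boolean).join(replacement);
}

const word = 'ma\u00f1ana'; // 'mañana' with a precomposed ñ
console.log(allowlistSlug(word, { '\u00f1': 'n' }));      // transliterate
console.log(allowlistSlug(word, { '\u00f1': '\u00f1' })); // preserve
console.log(allowlistSlug(word, {}));                     // drop
```

The point of the design is that forgetting a charmap entry can only remove characters from the output; it can never let an unexpected character through.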

@Trott
Owner

Trott commented Jun 3, 2023

What you'll need to do in the future version (and in the current version) is explicitly add any characters you want to keep to the charmap or the multicharmap. The difference in the future version is that the default will be to remove the non-ASCII characters rather than transliterate them.
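
For a whole script rather than a single character, adding entries one by one gets tedious. A rough sketch of generating identity entries for a code point range follows; `identityEntries` is a hypothetical helper, the range U+0621 to U+064A (Arabic letters hamza through yeh) is only an example, and the final step of merging into `slug.charmap` is left commented out because it has not been tested against slug itself.

```javascript
// Hypothetical helper: build { char: char } entries for a code point range,
// so each character in the range maps to itself (i.e. is preserved).
function identityEntries(first, last) {
  const entries = {};
  for (let cp = first; cp <= last; cp++) {
    const ch = String.fromCodePoint(cp);
    entries[ch] = ch;
  }
  return entries;
}

// Example range (an assumption for illustration): U+0621..U+064A.
const arabicLetters = identityEntries(0x0621, 0x064a);
console.log(Object.keys(arabicLetters).length);

// Assumed usage, mirroring the single-character override shown earlier
// in this thread; untested against slug itself:
// const slug = require('slug');
// Object.assign(slug.charmap, arabicLetters);
```

Which ranges to include is a per-project decision, which matches the point above that the set of safe characters depends on the use case.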
