Suggestion lists are useful for addressing common mistakes, as noted by Wikipedia:Lists of common misspellings.
The idea is to make it easier for companies / projects to define a list of forbidden terms with a list of suggested replacements.
Below is a proposal on two ways to define suggestions.
The intention is to implement both. Since flagWords is easier to do, it might get done first.
The idea is to enhance the definition of flagWords to allow for suggestions.
Replace the definition of flagWords with the following:
type FlagWordNoSuggestions = string;
type FlagWordWithSuggestions = [forbidWord: string, suggestion: string, ...otherSuggestions: string[]];
type FlagWord = FlagWordNoSuggestions | FlagWordWithSuggestions;
type FlagWords = FlagWord[];
interface BaseSettings {
    // ... other fields
    flagWords?: FlagWords;
}
flagWords:
- crap
- [hte, the]
- [acadmic, academic]
- [accension, accession, ascension]
"flagWords": [
"crap",
["hte", "the"],
["acadmic", "academic"],
["accension", "accession", "ascension"]
]
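To illustrate how a consumer might handle the two entry forms, here is a minimal sketch that converts each FlagWord into a single shape. The `NormalizedFlagWord` interface and the `normalizeFlagWord` helper are hypothetical names used for illustration only; they are not part of the proposal.

```typescript
type FlagWordNoSuggestions = string;
type FlagWordWithSuggestions = [forbidWord: string, suggestion: string, ...otherSuggestions: string[]];
type FlagWord = FlagWordNoSuggestions | FlagWordWithSuggestions;

// Hypothetical normalized shape (an assumption, not part of the proposal).
interface NormalizedFlagWord {
    word: string;
    suggestions: string[];
}

// Hypothetical helper: accept either form of FlagWord and return one shape.
function normalizeFlagWord(entry: FlagWord): NormalizedFlagWord {
    if (typeof entry === 'string') {
        // Plain string: a forbidden word with no suggestions.
        return { word: entry, suggestions: [] };
    }
    // Tuple: the first element is the forbidden word, the rest are suggestions.
    const [word, ...suggestions] = entry;
    return { word, suggestions };
}

// The same entries as in the configuration examples above.
const flagWords: FlagWord[] = [
    'crap',
    ['hte', 'the'],
    ['acadmic', 'academic'],
    ['accension', 'accession', 'ascension'],
];

const normalized = flagWords.map(normalizeFlagWord);
```

Normalizing up front keeps later lookup code from having to branch on the entry form each time.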
The goal is to be able to leverage existing lists, such as the Wikipedia lists of common misspellings.
Using a suggestions dictionary provides several useful features:
- The word list is in a separate file
- Multiple formats can be supported
- Named dictionaries can be turned on, off, or even redefined
The file format is generally inferred from the file extension. Any file can be gzipped, in which case it has a final .gz extension.
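As a rough sketch of the extension rule, the helper below strips an optional final .gz and reports the extension that remains. The `describeDictionaryFile` function and the `DictionaryFileInfo` fields are hypothetical, and the mapping from extension to a concrete format is deliberately left out because it is not specified here.

```typescript
// Hypothetical description of a dictionary file; the field names are assumptions.
interface DictionaryFileInfo {
    compressed: boolean; // true when the final extension is .gz
    extension: string; // extension used to infer the format, e.g. ".txt"; empty if none
}

// Hypothetical helper: split a path into "is it gzipped" and "format extension".
function describeDictionaryFile(path: string): DictionaryFileInfo {
    // Keep only the file name, accepting both / and \ as separators.
    const fileName = path.split(/[\\/]/).pop() ?? '';
    const compressed = /\.gz$/i.test(fileName);
    // Drop the trailing ".gz" (3 characters) before looking for the format extension.
    const baseName = compressed ? fileName.slice(0, -3) : fileName;
    const dot = baseName.lastIndexOf('.');
    // A leading dot (as in ".hidden") is not treated as an extension.
    const extension = dot > 0 ? baseName.slice(dot).toLowerCase() : '';
    return { compressed, extension };
}

const info = describeDictionaryFile('./en-us-suggestions.txt.gz');
```

Here `info.compressed` is true and `info.extension` is ".txt", so a loader could decompress first and then choose a parser based on the remaining extension.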
One suggestion set per line.
Example:
againnst->against
agains->against
agaisnt -> against
aganist-> against
aggaravates->aggravates
alusion->allusion, illusion
alwasy->always
alwyas->always
amalgomated->amalgamated
amatuer->amateur
amature->armature, amateur
boaut->boat, bout, about
Validation:
/^((?:\p{L}\p{M}*)+)\s*->\s*((?:\p{L}\p{M}*)+)(?:,\s*((?:\p{L}\p{M}*)+))*$/gmu
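A minimal sketch of how the validation expression could drive a parser is shown below. The `parseSuggestions` helper is a hypothetical name, and skipping blank lines and throwing on invalid lines are assumptions rather than specified behavior. The expression is applied one line at a time, so the `g` and `m` flags are dropped.

```typescript
// Per-line form of the validation expression above (flags reduced to `u`).
const suggestionLine = /^((?:\p{L}\p{M}*)+)\s*->\s*((?:\p{L}\p{M}*)+)(?:,\s*((?:\p{L}\p{M}*)+))*$/u;

// Hypothetical helper: parse file content into a map of misspelling -> suggestions.
function parseSuggestions(content: string): Map<string, string[]> {
    const result = new Map<string, string[]>();
    for (const rawLine of content.split(/\r?\n/)) {
        const line = rawLine.trim();
        if (!line) continue; // assumption: blank lines are ignored
        if (!suggestionLine.test(line)) {
            // Assumption: an invalid line is an error rather than a warning.
            throw new Error(`Invalid suggestion line: "${line}"`);
        }
        // A repeated capture group keeps only its last match, so after
        // validation the line is split on "->" and "," instead.
        const [word, replacements] = line.split('->');
        result.set(word.trim(), replacements.split(',').map((s) => s.trim()));
    }
    return result;
}

const parsed = parseSuggestions('againnst->against\nagaisnt -> against\nboaut->boat, bout, about\n');
```

The regular expression is used only to accept or reject a line; extraction relies on simple splitting because the third capture group cannot return every suggestion in a list.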
dictionaryDefinitions:
  - name: en-us-suggestions
    path: ./en-us-suggestions.txt.gz
    type: suggestions