[Bug]: TypeError: Cannot read properties of undefined (reading '0') in cspell-trie-lib #5222

Open
1 task done
1j01 opened this issue Feb 3, 2024 · 3 comments · Fixed by #5233

Comments


1j01 commented Feb 3, 2024

Kind of Issue

Crash / Error

Tool or Library

cspell-trie

Version

8.3.2

Supporting Library

cspell-trie-lib

OS

All of them

OS Version

No response

Description

With a multilingual word list, the CSpell CLI throws an error while constructing a prefix tree from the dictionary.
See the repro repo (https://github.com/1j01/cspell-bug-repro) for more info.

Isaiah@Cardboard MINGW64 ~/Projects/cspell-bug-repro (main)
$ npx cspell-cli lint .
TypeError: Cannot read properties of undefined (reading '0')
    at new FastTrieBlobINode (file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-trie-lib/dist/lib/TrieBlob/FastTrieBlobIRoot.js:17:27)
    at FastTrieBlobINode.child (file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-trie-lib/dist/lib/TrieBlob/FastTrieBlobIRoot.js:77:16)
    at nodeWalker (file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-trie-lib/dist/lib/ITrieNode/walker/walker.js:58:30)
    at nodeWalker.next (<anonymous>)
    at get size [as size] (file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-dictionary/dist/SpellingDictionary/SpellingDictionaryFromTrie.js:43:51)      
    at file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-dictionary/dist/SpellingDictionary/SpellingDictionaryCollection.js:21:64
    at Array.sort (<anonymous>)
    at new SpellingDictionaryCollectionImpl (file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-dictionary/dist/SpellingDictionary/SpellingDictionaryCollection.js:21:47)
    at createCollection (file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-dictionary/dist/SpellingDictionary/SpellingDictionaryCollection.js:98:12)      
    at _getDictionaryInternal (file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-lib/dist/esm/SpellingDictionary/Dictionaries.js:50:12)

Steps to Reproduce

  • Run cspell-cli lint . with the given configuration file, and it throws an error.
  • Also, open the cspell.json file in VS Code, and it reports a misspelling for one of the words in the accepted words list. Before trimming the word list, it reported even more words within the word list as misspelled.

Expected Behavior

  • cspell-cli lint . should not error.
  • No words in the words array in cspell.json should be underlined in VS Code.

Additional Information

A much smaller reproduction is likely possible, but in the given configuration, removing any single word stops the bug from reproducing.
I have not tried simplifying the reproduction by modifying the words themselves, although that might be illuminating.

cspell.json

{
	"ignorePaths": [
		".history", // VS Code "Local History" extension
		"node_modules"
	],
	"words": [
		"Æвзаг",
		"ajeļ",
		"allowfullscreen",
		"apng",
		"APNGs",
		"appinstalled",
		"Aragonés",
		"Asụsụ",
		"Avañe'ẽ",
		"Azərbaycan",
		"bepis",
		"bgcolor",
		"Bokmål",
		"Český",
		"Čeština",
		"classid",
		"cmaps",
		"ctype",
		"Cueŋƅ",
		"d'Òc",
		"desaturated",
		"DIALOGEX",
		"Divehi",
		"draggable",
		"ellipticals",
		"endonym",
		"eqeqeq",
		"equivalize",
		"ertical",
		"esque",
		"Eʋegbe",
		"eyedrop",
		"focusring",
		"Føroyskt",
		"fudgedness",
		"fullscreen",
		"Gàidhlig",
		"gazemouse",
		"GIFs",
		"Gikuyu",
		"grayscale",
		"headmouse",
		"hilight",
		"Hrvatski",
		"icns",
		"IFDs",
		"Íslenska",
		"Język",
		"jnordberg",
		"jspaint",
		"Kreyòl",
		"Kurdî",
		"Latviešu",
		"Lëtzebuergesch",
		"libtess",
		"Lietuvių",
		"Lingála",
		"llpaper",
		"localdomain",
		"localforage",
		"localizable",
		"lookpath",
		"lors",
		"ltres",
		"Macromedia",
		"nomine",
		"nostri",
		"nowrap",
		"occluder",
		"octree",
		"Oʻzbek",
		"oleobject",
		"orizontal",
		"ovaloids",
		"oviforms",
		"pako",
		"palettized",
		"paypal",
		"pointermove",
		"pointerup",
		"Português",
		"proxied",
		"pseudorandomly",
		"psppalette",
		"rbaycan",
		"redoable",
		"reenable",
		"repurposable",
		"rerender",
		"retargeted",
		"Română",
		"rotologo",
		"roundrects",
		"royskt",
		"rrect",
		"sandboxed",
		"scrollable",
		"scrollbars",
		"sketchpalette",
		"slenska",
		"Slovenčina",
		"Slovenščina",
		"Slovenský",
		"sorthweast",
		"soundcloud",
		"subrepo",
		"tbody",
		"themeable",
		"themepack",
		"Tiếng",
		"tileable",
		"timespan",
		"tina",
		"titlebar",
		"Toçikī",
		"togglable",
		"Tshivenḓa",
		"ufeff",
		"undock",
		"unfocusing",
		"uniquify",
		"unmaximize",
		"upiatun",
		"ustom",
		"UTIF",
		"vaporwave",
		"verts",
		"Việt",
		"viewports",
		"Volapük",
		"webglcontextlost",
		"webglcontextrestored",
		"Wikang",
		"WINTRAP",
		"Yângâ",
		"Zhōngwén",
		"zoomable",
		"zoomer",
		"zyk",
		"Ελληνικά",
		"Аҧсшәа",
		"Башҡорт",
		"Беларуская",
		"Език",
		"Ирон",
		"Језик",
		"Коми",
		"Қазақ",
		"Македонски",
		"Нохчийн",
		"Русский",
		"Словѣньскъ",
		"Српски",
		"Тоҷикӣ",
		"Түркмен",
		"Ўзбек",
		"Українська",
		"Чӑваш",
		"Чӗлхи",
		"Ѩзыкъ",
		"Ӏарул",
		"ქართული",
		"Հայերեն",
		"עברית",
		"أۇزبېك",
		"ئۇيغۇرچە",
		"اردو",
		"العربية",
		"بهاس",
		"پنجابی",
		"تاجیکی",
		"سندھی",
		"سنڌي",
		"فارسی",
		"كشميري",
		"ትግርኛ",
		"አማርኛ",
		"ພາສາລາວ",
		"ꦧꦱꦗꦮ",
		"ᐃᓄᒃᑎᑐᑦ",
		"ᐊᓂᔑᓈᐯᒧᐎᓐ",
		"ᓀᐦᐃᔭᐍᐏᐣ"
	]
}

cspell.config.yaml

No response

Example Repository

https://github.com/1j01/cspell-bug-repro

Code of Conduct

  • I agree to follow this project's Code of Conduct
Collaborator

Jason3S commented Feb 6, 2024

@1j01,

Thank you! I'll look into it. It should not error like that.

Collaborator

Jason3S commented Feb 6, 2024

@1j01,

The error given by cspell is not very informative. The spell checker failed to create an internal dictionary based upon the words found in the config. There is a limit on the number of unique characters in a dictionary. I'll look into a fix to make the limit much larger, but it might take a while.
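To check whether a word list is likely to run into this limit, one could count its distinct characters. The following is a minimal sketch, not cspell's own logic: the figure of 250 is taken from the commit message later in this thread, and the helper names (`uniqueChars`, `exceedsAssumedLimit`) are hypothetical. How cspell counts characters internally (for example, whether case-folded or accent-stripped variants are added) is an assumption, so treat the result as a rough indicator only.

```typescript
// Limit quoted in the commit message in this thread. How cspell counts
// characters internally is an assumption here.
const ASSUMED_UNIQUE_CHAR_LIMIT = 250;

/** Collect the distinct characters used across a word list (hypothetical helper). */
function uniqueChars(words: string[]): Set<string> {
    const chars = new Set<string>();
    for (const word of words) {
        // Iterate the word character by character and record each one once.
        for (const ch of word) {
            chars.add(ch);
        }
    }
    return chars;
}

/** True when the list uses more distinct characters than the assumed limit. */
function exceedsAssumedLimit(words: string[]): boolean {
    return uniqueChars(words).size > ASSUMED_UNIQUE_CHAR_LIMIT;
}

// Usage: pass the "words" array from cspell.json.
const sample = ["Æвзаг", "ajeļ", "Ελληνικά"];
console.log(uniqueChars(sample).size, exceedsAssumedLimit(sample));
```

Running this over the full `words` array from the report would show roughly how far the list is from the limit, which may help when deciding how to split it.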

The workaround is to have multiple dictionaries:

cspell.json

{
    "dictionaryDefinitions": [
        {
            "name": "words-latin",
            "path": "words-latin.txt"
        },
        {
            "name": "words-greek",
            "path": "words-greek.txt"
        },
        {
            "name": "words-cyrillic",
            "path": "words-cyrillic.txt"
        },
        {
            "name": "words-arabic",
            "path": "words-arabic.txt"
        },
        {
            "name": "words-inline",
            "words": [
                "DIALOGEX",
                "GIFs",
                "WINTRAP"
            ]
        }
    ],
    "dictionaries": [
        "words-latin",
        "words-greek",
        "words-cyrillic",
        "words-arabic",
        "words-inline"
    ]
}
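
One way to produce the per-script word files for this workaround is to bucket the existing `words` array by Unicode script. This is only a sketch under stated assumptions: the bucket names mirror the dictionary names above, `words-other` is a hypothetical catch-all for scripts not listed (Georgian, Hebrew, Ethiopic, and so on), and a word is assigned to the first script it contains a character from. Each resulting list could then be written to a plain-text file with one word per line.

```typescript
// Bucket names mirror the dictionary names in the workaround above.
// The patterns are built at runtime with Unicode property escapes.
const SCRIPT_BUCKETS: Array<[string, RegExp]> = [
    ["words-latin", new RegExp("\\p{Script=Latin}", "u")],
    ["words-greek", new RegExp("\\p{Script=Greek}", "u")],
    ["words-cyrillic", new RegExp("\\p{Script=Cyrillic}", "u")],
    ["words-arabic", new RegExp("\\p{Script=Arabic}", "u")],
];

// Hypothetical catch-all for scripts without a bucket of their own.
const FALLBACK_BUCKET = "words-other";

/** Pick the first bucket whose script appears anywhere in the word. */
function bucketFor(word: string): string {
    for (const [name, pattern] of SCRIPT_BUCKETS) {
        if (pattern.test(word)) return name;
    }
    return FALLBACK_BUCKET;
}

/** Group words by bucket name, preserving input order within each group. */
function groupByScript(words: string[]): Map<string, string[]> {
    const groups = new Map<string, string[]>();
    for (const word of words) {
        const name = bucketFor(word);
        const list = groups.get(name) ?? [];
        list.push(word);
        groups.set(name, list);
    }
    return groups;
}

// Usage: print the contents each word file would receive.
groupByScript(["Bokmål", "Ελληνικά", "Русский", "اردو", "ქართული"]).forEach((list, name) => {
    console.log(`${name}.txt`, list);
});
```

Whether splitting by script alone keeps every dictionary under the limit depends on the word list; the Latin bucket in the report carries many diacritics, so it may need checking on its own.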

Jason3S added a commit that referenced this issue Feb 8, 2024
fixes: #5222

The binary dictionary builder (TrieBlob) only supported word lists with 250 unique characters.

This was not an issue with the object based trie dictionaries used with the compiled dictionaries.
Jason3S added four more commits with the same message that referenced this issue on Feb 10, Feb 12, Feb 16, and Feb 18, 2024.
Collaborator

Jason3S commented Feb 20, 2024

I'm re-opening this issue since I had to revert the changes in #5233 with #5281.

@Jason3S Jason3S reopened this Feb 20, 2024