
Question: cannot search anything by i18n zh.yaml #465

Open
hitzhangjie opened this issue Jul 19, 2022 · 2 comments

Comments

@hitzhangjie

Could you help solve this search problem?
I wrote an ebook here: https://hitzhangjie.pro/go-internals/, which uses the hugo-book theme. I don't know why search isn't working.
I used Chrome to debug the problem, and I can see that the documents are indexed.

Maybe the problem is related to the i18n zh bookSearchConfig. I changed the tokenize function:
from str.replace(/[\x00-\x7F]/g, '').split('');
to str.split(/\W+/).concat(str.replace(/[\x00-\x7F]/g, '').split('')).filter(e => !!e);

And it worked.
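To make the difference concrete, here is a small standalone sketch of the two tokenizers (plain Node.js, nothing theme-specific) run against a sample mixed Chinese/English string I made up for illustration:

```javascript
// The original zh tokenizer: strips every ASCII character (\x00-\x7F),
// then splits the remaining CJK text into single characters. Any
// pure-ASCII term like "AST" is deleted before tokenization, so it
// can never be indexed or matched.
const cjkOnly = (str) =>
  str.replace(/[\x00-\x7F]/g, '').split('');

// The combined tokenizer: splits on non-word characters to keep ASCII
// word tokens, concatenates the per-character CJK tokens, and filters
// out empty strings left over by the split.
const combined = (str) =>
  str.split(/\W+/)
    .concat(str.replace(/[\x00-\x7F]/g, '').split(''))
    .filter((e) => !!e);

const text = '码农 writes AST parsers';
console.log(cjkOnly(text));   // [ '码', '农' ]  — "AST" is lost
console.log(combined(text));  // [ 'writes', 'AST', 'parsers', '码', '农' ]
```

This matches the symptom below: with the CJK-only tokenizer, '码农' is findable but 'AST' is not, because the ASCII tokens never reach the index.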

@alex-shpak
Owner

Hi!
Nice to see the theme being used :) that's a lot of content.
It is possible that the search config tokenization needs an update; I don't really know how to tokenize Chinese properly and relied on Google for help.

However, when I naively put some Chinese 'Lorem ipsum' text on a page and search for fragments of it, it works for me in the zh locale.
Can you send an example of the content you have and what you are trying to search for?

@hitzhangjie
Author

hitzhangjie commented Jul 20, 2022

For example, I want to search for both Chinese and English words, like '码农' or 'AST', which appear in the markdown files.

Let me show an example here to reproduce the problem.

Case 1: the tokenize function is str.replace(/[\x00-\x7F]/g, '').split('');

Searching for '码农', the result looks OK; it returns a document.

Then searching for 'AST', the result is empty, though it should return 2 documents.

Case 2: the tokenize function is str.split(/\W+/).concat(str.replace(/[\x00-\x7F]/g, '').split('')).filter(e => !!e);

Searching for both words again, it works:

searching for '码农' returns results;

searching for 'ast' returns results.


PS: I don't actually know the internals of the tokenize function; I ran into this problem before and found the working tokenizer settings via Google. Hope this helps the 'hugo-book' theme.
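For anyone else hitting this: assuming the theme reads the tokenizer from a bookSearchConfig entry in i18n/zh.yaml (as the issue title suggests), the fix would go roughly as below. I'm writing the key layout from memory, so treat this as a sketch and check it against your theme version's actual i18n file:

```yaml
# i18n/zh.yaml — sketch only; the exact key layout may differ by theme version
bookSearchConfig:
  other: |
    {
      tokenize: function (str) {
        // keep ASCII word tokens AND per-character CJK tokens
        return str.split(/\W+/)
          .concat(str.replace(/[\x00-\x7F]/g, '').split(''))
          .filter(function (e) { return !!e; });
      }
    }
```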

Here is the ebook address in case you want to test it: https://hitzhangjie.pro/go-internals/
