Chinese searching highlighting does not show up in some scenario #3990

Closed
5 tasks done
blueswen opened this issue Jun 5, 2022 · 3 comments
Labels
bug, resolved

Comments

@blueswen
Sponsor Contributor

blueswen commented Jun 5, 2022

Contribution guidelines

I've found a bug and checked that ...

  • ... the problem doesn't occur with the mkdocs or readthedocs themes
  • ... the problem persists when all overrides are removed, i.e. custom_dir, extra_javascript and extra_css
  • ... the documentation does not mention anything about my problem
  • ... there are no open or closed issues that are related to my problem

Description

There is a bug in search highlighting for Chinese text. If the query string directly follows other Chinese text, the highlighting mark does not show up. I think this is because the match expression in highlighter/index.ts cannot handle this scenario.
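A minimal sketch of why a word boundary may fail in this case, assuming the expression is built without the `u` flag, so that `\b` relies on the ASCII-only `\w` class. The sample strings are illustrative and are not taken from the theme's code.

```typescript
// In JavaScript, \b matches between a \w character ([A-Za-z0-9_]) and a
// non-\w character. CJK characters are not in \w, so there is no boundary
// between two adjacent Chinese characters.
const boundaryThenTerm = /\b公司/;

// Preceded by another Chinese character: no boundary, so no match.
const afterChinese: boolean = boundaryThenTerm.test("代工的公司");

// Preceded by an ASCII letter: a boundary exists between "C" and "公".
const afterAscii: boolean = boundaryThenTerm.test("TSMC公司");

console.log(afterChinese, afterAscii);
```

This would explain why the same term can be highlighted after Latin text but not after Chinese text.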

I am not good at regex, but I found that it can be solved by adding another | to the first group of the expression (replacing (^|${config.separator}|\\b) with (^|${config.separator}|\\b|)). Although it seems to work, I am not sure whether it could cause any side effects.
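The two variants of the first group can be compared in isolation. This is only a sketch: the `separator` value stands in for `config.separator` and is an assumption, the query term is inserted without escaping, and the surrounding highlighter code is not reproduced.

```typescript
// Assumed stand-in for config.separator (whitespace and hyphens).
const separator = "[\\s\\-]+";
const term = "公司";

// First group as described in the issue, and with the extra empty alternative.
const original = new RegExp(`(^|${separator}|\\b)(${term})`, "img");
const proposed = new RegExp(`(^|${separator}|\\b|)(${term})`, "img");

// Fragment of the sample paragraph: the term directly follows Chinese text.
const text = "臺灣一家從事晶圓代工的公司";

// With the g flag, match() returns all full matches, or null if there are none.
const originalMatches = text.match(original);
const proposedMatches = text.match(proposed);

console.log(originalMatches, proposedMatches);
```

The trailing `|` adds an empty alternative, so the first group can match at any position and the term is found wherever it occurs. That is also the possible side effect: the term is no longer required to start at a separator or boundary.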

Expected behaviour

With the query parameter h=公司 and the first group of the match expression changed to (^|${config.separator}|\\b|), search highlighting works as expected.

image

Actual behaviour

With the query parameter h=公司, search highlighting does not work.

image

Steps to reproduce

  1. Add Markdown content about TSMC, taken from Wikipedia:

    ## TSMC
    
    台灣積體電路製造股份有限公司(英語:Taiwan Semiconductor Manufacturing Co., Ltd.),簡稱台積電、台積、台積公司或TSMC[4],與旗下公司合稱時則稱作台積電集團[5][6],是臺灣一家從事晶圓代工的公司,為全球規模最大的半導體製造廠,總部位於新竹科學園區,主要廠房則分布於新竹市、臺中市、臺南市。
    
  2. Search for 公司 (meaning "company")

Package versions

  • Python: Python 3.9.10
  • MkDocs: version 1.3.0
  • Material: 8.3.2+insiders.4.17.2

Configuration

site_name: My Docs

theme:
  name: material

System information

  • Operating system: macOS 12.3.1
  • Browser: Chrome Version 102
squidfunk added the "needs investigation" label on Jun 5, 2022
@squidfunk
Owner

Fixed in a60ac14. I've tested your proposed solution and it seems that we can just remove the word boundary character, which solves the problem at hand. I've checked several words with and without the word boundary and came to the conclusion that it should work reasonably well. As already noted on Gitter, I'll have to revisit highlighting in the future again, but since the change fixes the issue now without seemingly introducing new problems, we'll go with that for now.
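The comment does not show the resulting expression. One possible reading, which is an assumption and not the content of commit a60ac14, is that once the empty alternative is present the `\b` alternative becomes redundant and can be dropped. The sketch below compares both forms; `separator` and the `mark` helper are hypothetical stand-ins, not the theme's code.

```typescript
// Assumed stand-in for config.separator (whitespace and hyphens).
const separator = "[\\s\\-]+";
const term = "公司";

// First group with and without the word boundary alternative.
const withBoundary = new RegExp(`(^|${separator}|\\b|)(${term})`, "img");
const withoutBoundary = new RegExp(`(^|${separator}|)(${term})`, "img");

// Hypothetical highlighter: keep the leading group, wrap the term in <mark>.
const mark = (_: string, lead: string, hit: string): string =>
  `${lead}<mark>${hit}</mark>`;

// One occurrence after Chinese text, one after a separator.
const text = "台積公司或TSMC, a 公司";

const highlightedWith = text.replace(withBoundary, mark);
const highlightedWithout = text.replace(withoutBoundary, mark);

console.log(highlightedWith === highlightedWithout, highlightedWithout);
```

On this input both expressions produce the same markup, which is consistent with the observation that dropping the word boundary did not seem to introduce new problems.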

squidfunk added the "bug" and "resolved" labels and removed the "needs investigation" label on Jun 11, 2022
@blueswen
Sponsor Contributor Author

It looks great! Thanks for helping.

@squidfunk
Owner

Released as part of 8.3.5+insiders-4.18.1.
