Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot search headers containing colon (:) #4884

Closed
4 tasks done
TWiStErRob opened this issue Jan 18, 2023 · 6 comments
Closed
4 tasks done

Cannot search headers containing colon (:) #4884

TWiStErRob opened this issue Jan 18, 2023 · 6 comments
Labels
bug Issue reports a bug resolved Issue is resolved, yet unreleased if open

Comments

@TWiStErRob
Copy link

TWiStErRob commented Jan 18, 2023

Context

No response

Bug description

When a symbol is defined as a separator, we expect that symbol to searchable, that is the search string being tokenized exactly the same as the docs contents. Based on observations it looks like this is not the case with the built-in plugin.

Related links

Reproduction

Created by following

  1. https://squidfunk.github.io/mkdocs-material/bug-report/reproduction/
  2. https://squidfunk.github.io/mkdocs-material/setup/setting-up-site-search/#built-in-search-plugin
  3. Configure separator to include spaces, dash and colon

example.zip (browse same sources on GitHub)

Steps to reproduce

  1. remove - info from plugins:
  2. mkdocs serve
  3. Search for dash-separated and observe match:
    image
  4. Search for colon:separated and be baffled:
    image
  5. For control, search for separated and observe tokenization works:
    image
  6. Try "colon:separated" 🤞 it works because that's the usual for "exact match":
    image

Browser

No response

Before submitting

@squidfunk
Copy link
Owner

Thanks for reporting. This is a feature and not a bug 😉 : is considered a control character that allows to specify which field to search on. For example title:foo will only search the title of documents for foo. What we might be able to do is to allow for : as long as the word before the colon is not the name of a field, but I'm not sure this is a good idea, because then behaviour for title, text and tags differs from the rest of cases.

Normally, this isn't a problem because people don't search for :. I'll check if we can somehow limit to active fields, but if that doesn't prove to be viable, you might still fork the theme and adjust the query lexing stage:

/* Split query terms with tokenizer */
return transform(query, part => {
const terms: string[] = []
/* Initialize lexer and analyze part */
const lexer = new lunr.QueryLexer(part)
lexer.run()
/* Extract and tokenize term from lexeme */
for (const { type, str: term, start, end } of lexer.lexemes)
if (type === "TERM")
split(term, lunr.tokenizer.separator, (...range) => {
terms.push([
part.slice(0, start),
term.slice(...range),
part.slice(end)
].join(""))
})
/* Return terms */
return terms
})

@squidfunk squidfunk added the needs investigation Issue must be investigated by the maintainers label Jan 18, 2023
@squidfunk squidfunk added bug Issue reports a bug and removed needs investigation Issue must be investigated by the maintainers labels Jan 28, 2023
@squidfunk
Copy link
Owner

Fixed in a6436bd. When the query lexer returns an unknown field, we transform the query to replace the : with whitespace, so the search pipeline doesn't consider this as a field name. Note that the query lexing approach is very new and needs to be better incorporated into the search worker. I'll tackle this when I work on search the next time, which will be very soon given how many open change requests concerning search there currently are.

For the time being, this fix should make the search in your docs functional again 😊

@squidfunk squidfunk added the resolved Issue is resolved, yet unreleased if open label Jan 28, 2023
@squidfunk
Copy link
Owner

I closed this too early. We need to keep this open until the fix is released.

@squidfunk squidfunk reopened this Jan 28, 2023
@viceice
Copy link

viceice commented Jan 28, 2023

thanks for working on this. ❤️

@squidfunk
Copy link
Owner

Released as part of 9.0.7

@HonkingGoose
Copy link
Contributor

I can confirm that 9.0.7 fixed this problem for us in production as well.

Thank you very much for working on this, and getting it fixed so quickly! ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issue reports a bug resolved Issue is resolved, yet unreleased if open
Projects
None yet
Development

No branches or pull requests

4 participants