Search not only in .md's #795

Closed
Zuijdam opened this issue Mar 19, 2022 · 19 comments
Labels
jekyll-version Issues relating to Jekyll versions

Comments

@Zuijdam

Zuijdam commented Mar 19, 2022

Is your feature request related to a problem? Please describe.
We have a site with Markdown pages, but also some JSON files whose data we display. The search does not cover the JSON content.

Describe the solution you'd like
A way to alter the search scope so it will find everything we write/make.

Describe alternatives you've considered
no clue how to fix this

@dcchambers

dcchambers commented Mar 24, 2022

The search is powered by lunr.js, which is a client-side search tool that works by building a search index based on the provided content.

By default, lunr only searches html pages that include the lunr javascript. When you enable search in the config, just-the-docs includes this js in all of your markdown pages (rendered to html) unless you opt the specific page out of search.

also some json files we show with data

How are you actually displaying the json to the user? Is it rendered in an html page? Are you letting them download the json directly?

Regardless, what you want to do may be possible by writing your own javascript that uses the lunr api to add content from the json files to the search index.

@reithose

reithose commented Mar 30, 2022

It looks like lunr.js crawls the pages before the json files are rendered in a html page. See for example:
https://www.dedigitaletuin.nl/docs/lijsten/lijsten-weetjes.html
The list is from a json file. The intro ("wist je dat we") of the page can be found by lunr. The list itself cannot.

search-data.json looks like this for this page:
"content": "# Weetjes Wist je dat we al {{site.data.weetjes | size}} weetjes hebben! --- {% assign i = 1 %} {% for weetje in site.data.weetjes %} {% increment i %}. {{ weetje | newline_to_br }} {% endfor %} "

No rendered json in the content variable.

@Zuijdam
Author

Zuijdam commented Mar 30, 2022

Yes, that seems correct. (reithose and I both work on the same site.) Can we somehow prerender the content, or make the crawler run at a different time?

@mattxwang
Member

Going to mark this as needs investigation. A PR is certainly welcome.

@pdmosses
Contributor

pdmosses commented Jul 2, 2022

This issue is due to a Jekyll regression in v4.2.0.

To check, run jekyll build -V; the file assets/js/zzzz-search-data.json should be processed after all html files have been written.
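
A complementary check on the symptom reported above: after a build, the generated search index should contain rendered text, not raw Liquid. The sketch below is only an illustration. The helper name `has_unrendered_liquid` is hypothetical, and the path assumes Jekyll's default `_site` output directory and the `assets/js/search-data.json` location mentioned in this thread.

```shell
# Hypothetical helper: succeed (exit 0) if the given file still
# contains raw Liquid delimiters such as "{%" or "{{".
has_unrendered_liquid() {
  # -F treats the patterns as fixed strings; -q suppresses output.
  grep -qF -e '{%' -e '{{' "$1"
}

# Assumed location of the built index (default Jekyll destination).
index="_site/assets/js/search-data.json"

# Only report when the built file actually exists.
if [ -f "$index" ]; then
  if has_unrendered_liquid "$index"; then
    echo "raw Liquid found in $index: index was built from unrendered content"
  else
    echo "no raw Liquid delimiters found in $index"
  fi
fi
```

Note that a site whose pages legitimately display Liquid syntax (for example, documentation about Liquid) would also trigger this check, so treat a match as a hint rather than proof.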

@MilesFarber

MilesFarber commented Aug 15, 2022

I can confirm it's not just JSON files, but any type of content that has been retrieved from site.data or _data, even CSV files. This is on the latest version with GitHub Pages.

@pdmosses
Contributor

@FlarosOverfield Thanks for following up on this issue. Looking at @Zuijdam and @reithose's site, https://www.dedigitaletuin.nl/assets/js/search-data.json contains entirely unrendered content; as @dcchambers explained, this is not the expected behaviour!

To investigate this further, we'll need the url of the source file repository or (better) a minimal working example.

@pdmosses
Contributor

pdmosses commented Aug 18, 2022

@FlarosOverfield thanks, that helps:

Jekyll converts Markdown+Liquid files to HTML if and only if they start with front matter.

Unless you want to use a special layout, you should start all[1] *.md files that are to produce HTML pages with this:

---
layout: default
---

Moreover, links to pages appear in the navigation panel only if the pages also have a title setting in the front matter.

Footnotes

  1. A Markdown file that is included in another file should not start with front matter.

@MilesFarber

By adding:

---
layout: default
title: PMD Items
permalink: PMDItems
---

Search seems to work, albeit quite slowly.

@pdmosses
Contributor

pdmosses commented Aug 20, 2022

@Zuijdam @reithose The issue you reported is due to a regression in Jekyll v4.2.0 that causes generation of /assets/js/search-data.json before all the html pages have been rendered.

In your Gemfile you've specified:

gem "jekyll", "~> 4.2.0"

Your checked-in Gemfile.lock also specifies jekyll (4.2.0). AFAIK, GitHub Pages runs bundle install, which respects the settings in Gemfile.lock. Running bundle update ignores Gemfile.lock, and should (currently) update Jekyll to v4.2.2.

Running bundle update works for me, testing locally on a download of your repository. You should commit the changes to Gemfile.lock after running bundle update. If that doesn't fix the search data on your published site, I'll be happy to take another look at your issue.
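
To see whether a checked-in lock file still pins the affected release before updating, something like the following could be used. It is a minimal sketch: the helper name `locked_jekyll_version` is hypothetical, and it assumes the usual Gemfile.lock layout, where the resolved gem appears on its own line as `jekyll (4.2.0)`. It handles plain numeric versions only.

```shell
# Hypothetical helper: print the Jekyll version resolved in a Gemfile.lock.
locked_jekyll_version() {
  # Match lines like "    jekyll (4.2.0)". Constraint lines such as
  # "  jekyll (~> 4.2.0)" do not match, because the version must
  # start with a digit immediately after the opening parenthesis.
  sed -n 's/^ *jekyll (\([0-9][0-9.]*\))$/\1/p' "$1" | head -n 1
}

lock="Gemfile.lock"

# Only report when a lock file is present in the current directory.
if [ -f "$lock" ] && [ "$(locked_jekyll_version "$lock")" = "4.2.0" ]; then
  echo "Gemfile.lock pins Jekyll 4.2.0; consider running: bundle update"
fi
```

The script only reports; it does not run bundle update itself. After updating, the modified Gemfile.lock still has to be committed, as described above.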

@Zuijdam
Author

Zuijdam commented Aug 20, 2022 via email

@reithose

Your solution fixed the problem! geensnor/DigitaleTuin@a725a7d

Thanks for your help!

@pdmosses
Contributor

Good to know that using bundle update fixed the problem. I'll now remove the bug label and close this issue.

However, I wrote:

AFAIK, GitHub Pages runs bundle install, which respects the settings in Gemfile.lock.

That's inaccurate: GitHub Pages uses Jekyll 3 when it builds a website directly from a branch, regardless of the jekyll setting in Gemfile.lock. But GitHub Pages can also build websites using Actions, taking account of the settings in the Gemfile.

pdmosses added the jekyll-version label (Issues relating to Jekyll versions) and removed the bug label on Aug 21, 2022
@pdmosses
Contributor

BTW, v0.4.0.rc1 of the theme supports a (not yet documented) include: search_placeholder_custom.html.

See #925 for how to start using v0.4.0.rc1. Run bundle update to overwrite Gemfile.lock. Then create _includes/search_placeholder_custom.html with contents (for example):

Zoek {{site.title}}

I've tested it locally – I hope it works for you too.

@melat0nin

A related issue: is it possible to index YAML frontmatter for lunr.js? Because of complicated layouts and the use of Netlify CMS, I store a lot of page content in the frontmatter. I'd love for this to be indexed.

@pdmosses
Contributor

A related issue: is it possible to index YAML frontmatter for lunr.js?

The Liquid code that constructs the search index is at /assets/js/zzzz-search-data.json. It refers to specific items of front matter, such as page.title and page.content (where page is a variable ranging over all built HTML pages).

It should be possible to extend the code to construct lunr.js index entries from all other items of front matter. How easy it would be depends on what kind of strings you use – and on how familiar you are with Liquid…

I suggest submitting this as an enhancement proposal, indicating whether you're planning to implement it yourself.

@melat0nin

I actually managed this soon after I posted my comment -- it was fairly straightforward to add yaml values to page_content, and it works great :)

@diablodale
Contributor

Frontmatter content related #1067
