Skip to content

[metascraper-helpers] jsonld caching #225

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 24, 2019
Merged

[metascraper-helpers] jsonld caching #225

merged 1 commit into from
Oct 24, 2019

Conversation

wyattjoh
Copy link
Contributor

The current implementation of the $jsonld uses jsonld which is wrapped by mem.

Unfortunately, because the use of mem does not actually specify any cache limitations, all cache entries are just persisted forever. This means that if at any point you scrape the same URL twice, updates to the script[type="application/ld+json"] on the page will never be parsed, forcing you to restart the server.

I don't think this was intentional behaviour, so this PR replaces use of mem with memoize-one which is used elsewhere in the codebase already. This will only cache the result for the last invoked call, which for the purposes of what caching the parse results of jsonld achieves, is the same, without the same drawbacks for subsequent calls with the same URL, but different content.

Unverified

This commit is not signed, but one or more authors requires that any commit attributed to them is signed.
@wyattjoh
Copy link
Contributor Author

As a note as well, because this is also affecting a helper package, if you bumped the version of @metascraper/helpers, you'd also need to update the version of the helper package across all the existing rules that use $jsonld.

wyattjoh added a commit to coralproject/talk that referenced this pull request Oct 24, 2019

Unverified

This commit is not signed, but one or more authors requires that any commit attributed to them is signed.
- upgraded scraper packages to use new jsonld parser
- waiting on microlinkhq/metascraper#225
  for merging
- fixes #2670
@coveralls
Copy link
Collaborator

Coverage Status

Coverage remained the same at 75.904% when pulling 3135564 on wyattjoh:master into 81f9921 on microlinkhq:master.

1 similar comment
@coveralls
Copy link
Collaborator

Coverage Status

Coverage remained the same at 75.904% when pulling 3135564 on wyattjoh:master into 81f9921 on microlinkhq:master.

@Kikobeats Kikobeats merged commit 862db51 into microlinkhq:master Oct 24, 2019
wyattjoh added a commit to coralproject/talk that referenced this pull request Oct 24, 2019

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
* feat: add support for application/json+ld parsing

- upgraded scraper packages to use new jsonld parser
- waiting on microlinkhq/metascraper#225
  for merging
- fixes #2670

* chore: metascraper upgrades
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants