[#8132] Update actions and pages which set "noindex", "nofollow" crawler directives #8223

Open
wants to merge 7 commits into
base: develop

gbp
Member

@gbp gbp commented Apr 30, 2024

Relevant issue(s)

Fixes #8132

What does this do?

Update actions and pages which set "noindex", "nofollow" crawler directives

Why was this needed?

Snippets of request content often appear on list pages, and create a whack-a-mole situation when unhappy users find that external search engines have indexed a list page (e.g. /body/foo?page=12) that contains a cached snippet of PII that we've removed from the request page itself.

Implementation notes

@garethrees are you happy with changing the number of paginated pages which are indexed? I'm concerned this might impact search ranking due to newer request pages not being indexed at all.

gbp added 7 commits April 30, 2024 09:05
Extract out into a common concern included in `ApplicationController` so
this hook/helper is available to all controllers.
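
A minimal sketch of what such a shared helper could look like, written in plain Ruby so it runs without Rails. The names `RobotsHeadersSketch`, `set_robots_header`, and `FakeController` are hypothetical and are not taken from this PR; the only real-world detail assumed is the `X-Robots-Tag` response header name.

```ruby
# Hypothetical sketch: names are illustrative, not the PR's actual code.
module RobotsHeadersSketch
  # Build the directive string, e.g. "noindex, nofollow".
  def robots_directives(noindex: true, nofollow: false)
    directives = []
    directives << "noindex" if noindex
    directives << "nofollow" if nofollow
    directives.join(", ")
  end

  # Write the directives to the X-Robots-Tag response header.
  # Nothing is written when no directive is requested.
  def set_robots_header(noindex: true, nofollow: false)
    value = robots_directives(noindex: noindex, nofollow: nofollow)
    response_headers["X-Robots-Tag"] = value unless value.empty?
    value
  end
end

# Stand-in for a controller: it only exposes a headers hash.
class FakeController
  include RobotsHeadersSketch

  def response_headers
    @response_headers ||= {}
  end
end

controller = FakeController.new
controller.set_robots_header(noindex: true, nofollow: true)
puts controller.response_headers["X-Robots-Tag"] # prints: noindex, nofollow
```

In a Rails app the equivalent module would presumably be an `ActiveSupport::Concern` and would write to `response.headers` instead of the stand-in hash.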
Don't allow indexing of:
- New citation page
- New request page
- Similar requests page
- Request details page
- User profile wall
These actions either require a user to be logged in or link to actions which
we don't allow to be indexed, so there is no reason for search indexers to
follow them.
Even though we're setting the response header, ensure we set a consistent
value in the meta tag.
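
One way to keep the two in step, sketched in plain Ruby: derive the meta tag from the same string that went into the header. The helper name `robots_meta_tag` and the `ALLOWED_DIRECTIVES` list are assumptions for illustration only; the PR's actual view code is not shown here.

```ruby
# Hypothetical sketch: derive the meta tag from the header value so the
# two cannot drift apart. Only known directives are emitted.
ALLOWED_DIRECTIVES = %w[noindex nofollow].freeze

def robots_meta_tag(header_value)
  # Split "noindex, nofollow" into tokens and keep only the known ones.
  directives = header_value.to_s.split(",").map(&:strip) & ALLOWED_DIRECTIVES
  return "" if directives.empty?

  content = directives.join(", ")
  %(<meta name="robots" content="#{content}">)
end

puts robots_meta_tag("noindex, nofollow")
# prints: <meta name="robots" content="noindex, nofollow">
```

Filtering against a fixed list also means nothing unexpected is interpolated into the HTML.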
Pages after the first shouldn't be crawled; this will also help with site
performance.
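
The pagination rule could be sketched roughly as below. This assumes the page number arrives as a `page` query parameter (as in `/body/foo?page=12`) and that later pages get `noindex`; the method name is hypothetical and the exact directive the PR sends for these pages is not shown in this conversation.

```ruby
# Hypothetical sketch: only the first page of a paginated list is indexable.
# Returns the directive string for later pages, or nil for the first page.
def robots_value_for_page(page_param)
  page = page_param.to_i      # nil and non-numeric strings become 0
  page = 1 if page < 1        # treat missing/invalid values as page 1
  page > 1 ? "noindex" : nil
end

p robots_value_for_page(nil)  # prints: nil
p robots_value_for_page("1")  # prints: nil
p robots_value_for_page("12") # prints: "noindex"
```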
@gbp gbp added this to the Reduce Admin Burden milestone Apr 30, 2024
@gbp gbp requested a review from garethrees April 30, 2024 08:22
@gbp gbp added the on-staging label Apr 30, 2024
Member

@garethrees garethrees left a comment

Worth making the dependency on RobotsHeaders explicit in the concerns that use it:

  module ProminenceHeaders
    extend ActiveSupport::Concern
+    include RobotsHeaders

  module PublicTokenable
    extend ActiveSupport::Concern
+    include RobotsHeaders

https://api.rubyonrails.org/classes/ActiveSupport/Concern.html says you can do this (in the intro section of the documentation)
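
A plain-Ruby illustration of the idea, without ActiveSupport: when the dependent module includes the one it relies on, a class only has to include the dependent module. All names here (`SketchRobots`, `SketchProminence`, `SketchController`) are made up for the example. Real concerns would additionally `extend ActiveSupport::Concern`, which the linked documentation describes as resolving module dependencies, including their `included` blocks.

```ruby
# Hypothetical stand-in for the shared robots helper.
module SketchRobots
  def robots_header_value
    "noindex"
  end
end

# Hypothetical dependent module: it declares its dependency explicitly.
module SketchProminence
  include SketchRobots

  # Uses the helper from SketchRobots when the record is hidden.
  def header_for(hidden)
    hidden ? robots_header_value : nil
  end
end

# The class includes only the dependent module.
class SketchController
  include SketchProminence
end

puts SketchController.include?(SketchRobots) # prints: true
puts SketchController.new.header_for(true)   # prints: noindex
```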

@garethrees
Member

Reassigning as discussed to make a few tweaks.

Development

Successfully merging this pull request may close these issues.

Reduce external search indexing of request list pages
2 participants