People detect Lighthouse to cheat its performance score #15829

Open
krzksz opened this issue Feb 22, 2024 · 5 comments

@krzksz

krzksz commented Feb 22, 2024

Summary

Hey! I work as a performance technical architect at Shopify. Lighthouse (PSI) has always been one of the go-to tools for our merchants when it comes to checking the performance of their stores. Unfortunately, this has become an opportunity for some bad third-party actors to offer "speed optimisation services" that heavily strip down the page content when a Lighthouse test is detected. This creates a false picture of the experience real users are getting and prevents store owners from even learning that their store's performance needs improving.

Most popular techniques
Here are the most popular techniques that I found during the investigation:

  1. Checking for the Lighthouse string in the user agent.
  2. Checking for the Linux x86_64 value of navigator.platform.
  3. Checking for the moto g power string in the user agent.
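For anyone auditing a theme or app script, the three checks above boil down to something like the following. This is a minimal sketch rather than code taken from any real plugin; the function name and the signal labels are made up for illustration.

```javascript
// Hypothetical helper: reproduces the three checks listed above so they
// are easier to recognise when reviewing (de-obfuscated) script code.
function matchedLighthouseSignals(userAgent, platform) {
  const signals = [];
  // 1. "Lighthouse" anywhere in the UA string (also matches "Chrome-Lighthouse").
  if (userAgent.includes('Lighthouse')) signals.push('ua-lighthouse');
  // 2. The platform value reported by the machine running the test.
  if (platform === 'Linux x86_64') signals.push('platform-linux-x86_64');
  // 3. The emulated device name in the UA string.
  if (userAgent.toLowerCase().includes('moto g power')) signals.push('ua-moto-g-power');
  return signals;
}
```

Note that the second check also matches a regular desktop Linux visitor, so on its own it cannot tell a Lighthouse run apart from a real user.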

After Lighthouse is detected, there are usually scripts in place that do a mix of the following:

  • Append an LCP hack.
  • Prevent all/3rd party JavaScript from loading.
  • Prevent all/lazyloaded images from loading.
  • Prevent all/lazyloaded iframes from loading.
  • Stop parsing of the HTML using document.close().
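One way to spot this kind of stripping is to load the page twice, once as a regular visitor and once with Lighthouse-like signals, and compare what actually ended up on the page. The sketch below only covers the comparison step. It assumes the caller has already collected the counts, and the function name, the shape of the summaries, and the 50% threshold are all arbitrary choices for illustration.

```javascript
// Hypothetical audit helper. `normal` and `underTest` are plain objects
// such as {scripts: 40, images: 30, iframes: 2}, collected by the caller.
function findStrippedResources(normal, underTest, maxDropRatio = 0.5) {
  const flagged = [];
  for (const kind of Object.keys(normal)) {
    const before = normal[kind];
    const after = underTest[kind] ?? 0;
    // Flag a resource type when more than maxDropRatio of it disappears.
    if (before > 0 && (before - after) / before > maxDropRatio) {
      flagged.push(kind);
    }
  }
  return flagged;
}
```

A large drop is only a hint that something is being withheld from the test run, so flagged pages would still need a manual look.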

Proposed solution
While we're working internally to address this issue, we probably won't be able to find and fix all of the affected pages. The malicious JavaScript is usually heavily obfuscated so it's not always as simple as looking for one of the strings mentioned above. That's why I thought it would be good to fix the problem at its root instead.

When it comes to the UA, I can see it was already removed by @paulirish in this PR. That said, it seems like PSI still appends the Chrome-Lighthouse string. The information also still leaks through the Sec-CH-UA header, via this brand entry:

{brand: 'Lighthouse', version: lighthouseVersion},

My first proposal would be to remain consistent and remove those completely.
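To make the proposal concrete, here is a rough sketch of what removing both signals amounts to. It is not the actual Lighthouse change; the helper names are hypothetical and the brand versions in the usage example are placeholders.

```javascript
// Hypothetical helper: drops the Lighthouse entry from a client-hints
// brand list of the form [{brand, version}, ...].
function withoutLighthouseBrand(brands) {
  return brands.filter(({brand}) => brand !== 'Lighthouse');
}

// Hypothetical helper: removes the Chrome-Lighthouse token from a UA string.
function withoutLighthouseToken(userAgent) {
  return userAgent.replace(/\s*Chrome-Lighthouse\b/g, '').trim();
}

// Usage with placeholder values:
const cleanedBrands = withoutLighthouseBrand([
  {brand: 'Chromium', version: '122'},
  {brand: 'Lighthouse', version: '0.0.0'},
]);
const cleanedUa = withoutLighthouseToken('Mozilla/5.0 Chrome/122.0.0.0 Safari/537.36 Chrome-Lighthouse');
```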

I think that the navigator.platform issue can be addressed by adding the following logic somewhere inside the emulate function:

await session.sendCommand('Page.addScriptToEvaluateOnNewDocument', {
  source: `Object.defineProperty(
    navigator,
    'platform',
    { value: '${formFactor === 'mobile' ? 'Android' : 'MacIntel'}' }
  );`,
});

This should keep its value consistent with the rest of the user agent.
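The effect of the injected script can be tried outside of a browser with a stand-in object. The sketch below is only an illustration: `buildPlatformOverride` and `applyToFakeNavigator` are hypothetical names, the fake navigator is a plain object, and the platform values are simply the ones proposed above.

```javascript
// Builds the source string that would be passed as `source` to
// Page.addScriptToEvaluateOnNewDocument.
function buildPlatformOverride(formFactor) {
  const platform = formFactor === 'mobile' ? 'Android' : 'MacIntel';
  return `Object.defineProperty(navigator, 'platform', { value: ${JSON.stringify(platform)} });`;
}

// Runs the generated source against a plain object standing in for the
// page's navigator, starting from the value a Linux host would report.
function applyToFakeNavigator(formFactor) {
  const fakeNavigator = {platform: 'Linux x86_64'};
  new Function('navigator', buildPlatformOverride(formFactor))(fakeNavigator);
  return fakeNavigator.platform;
}
```

In a real page the script runs before any page script, so code that reads navigator.platform later would only ever see the overridden value.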

For the moto g power I don't think there are any elegant ways to address it. If someone wants to risk breaking the page for the actual users of this device then so be it.

Let me know what you think. I'm happy to prepare a PR with the changes mentioned above so we can make sure it's not so easy to cheat people with those shady practices in the future.

@connorjclark
Collaborator

I will take care of making our client hints not signal Lighthouse, and changing the PSI UA header.

@connorjclark
Collaborator

connorjclark commented Mar 6, 2024

Next Lighthouse release will address the brand client hint, and next PSI release (~a day) will address its UA.

> Checking for the Linux x86_64 value of navigator.platform.

I'm not convinced we should do this for purposes of evading detection. This is the same value that an actual Linux user would have. Are you seeing this used to detect Lighthouse? It'd be a pretty big false signal.

But it may be worth doing to match the fake UA we set for mobile/desktop. @paulirish wdyt?

@benschwarz
Contributor

benschwarz commented Mar 7, 2024

I am generally not a fan of removing the ability to identify LH traffic. It is important to be able to identify Lighthouse traffic in monitoring and firewall contexts.

As we know, cheating Lighthouse doesn't improve Core Web Vitals for visitors.

Yes, there are bad actors selling webperf plugins that intentionally mislead site owners.

My take: Abuse reports should be delivered to the platforms that host problematic plugins. Services like WordPress and Shopify must maintain quality within their own marketplaces.

Identification of traffic is important to software teams, agencies, consultants and all the folks who are focused on making positive webperf outcomes.
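For context, the kind of identification used in monitoring or firewall rules could look roughly like this. It is a sketch under assumptions: `isLikelyLighthouse` is a hypothetical name, the headers object is assumed to use lower-case keys, and the check only works for as long as the Chrome-Lighthouse UA token or the Lighthouse client-hint brand is still being sent.

```javascript
// Hypothetical server-side check on incoming request headers.
function isLikelyLighthouse(headers) {
  const userAgent = headers['user-agent'] ?? '';
  const brands = headers['sec-ch-ua'] ?? '';
  // Sec-CH-UA lists brands as quoted strings, e.g. "Chromium";v="122".
  return userAgent.includes('Chrome-Lighthouse') || brands.includes('"Lighthouse"');
}
```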

@krzksz
Author

krzksz commented Mar 12, 2024

> > Checking for the Linux x86_64 value of navigator.platform.
>
> I'm not convinced we should do this for purposes of evading detection. This is the same value that an actual Linux user would have. Are you seeing this used to detect Lighthouse? It'd be a pretty big false signal.
>
> but, maybe worth doing to match the fake UA we set for mobile/desktop. @paulirish wdyt

I figured it may be more appropriate for DevTools to handle it as a part of the device emulation. I created a bug against Chromium: https://issues.chromium.org/issues/326791407

@benschwarz In principle I agree with your statement. This is yet another case of "this is why we can't have nice things". We spent months uncovering as many cases as possible and I believe there are still more that we didn't identify. In reality, most platforms don't have enough resources to address this at all.

At the same time, there are real non-technical people paying $ to "performance experts" taking advantage of this.

@benschwarz
Contributor

@krzksz I'm not sure I agree this is a case of "this is why we can't have nice things".

From my perspective it's a case of "Shopify's job is difficult because they have a large addressable market." or "Shopify must protect customers who use their marketplace".
