[Feature]: Request Initiator Chain #12331

Open
ojaswa1942 opened this issue Apr 25, 2024 · 8 comments

@ojaswa1942

ojaswa1942 commented Apr 25, 2024

Feature description

Hi,

This is a request to expose the "Initiator Chain" for each request, as currently shown in the Initiator section of the Chrome DevTools Network tab:
[Screenshot: Network tab]

Currently, it is possible to get the immediate initiator of a request via request events, but not the complete initiator chain. It is further limited by initiator type: only "Parser" and "Script" type requests carry initiator information, which makes it difficult to generate this tree reliably. (Related discussions here)

While this is being discussed and implemented, I would be happy to try out some alternatives (maybe with CDP, which exposes a little more information in some cases).
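
For readers following along, here is a minimal sketch of how the immediate initiator's URL might be read from an initiator object. It assumes the object has the shape of CDP's `Network.Initiator` (which Puppeteer's `HTTPRequest.initiator()` returns), where `url` is typically set for parser-initiated requests and script-initiated requests carry a `stack` instead. The `Cdp*` interfaces are trimmed local mirrors and `initiatorUrl` is a hypothetical helper, not part of Puppeteer.

```typescript
// Trimmed local mirrors of the CDP types (assumption: only the fields used here).
interface CdpCallFrame {
  url: string;
  lineNumber: number;
  columnNumber: number;
}
interface CdpStackTrace {
  callFrames: CdpCallFrame[];
  parent?: CdpStackTrace; // async parent stack, if any
}
interface CdpInitiator {
  type: string; // e.g. "parser", "script", "other"
  url?: string;
  stack?: CdpStackTrace;
}

// Hypothetical helper: best-effort URL of whatever triggered the request.
function initiatorUrl(initiator: CdpInitiator | undefined): string | undefined {
  if (!initiator) return undefined;
  // Direct URL, when the initiator carries one.
  if (initiator.url) return initiator.url;
  // Otherwise take the first call frame that has a URL, walking async parents.
  for (let s: CdpStackTrace | undefined = initiator.stack; s; s = s.parent) {
    for (const frame of s.callFrames) {
      if (frame.url) return frame.url;
    }
  }
  return undefined; // e.g. type "other": nothing to follow
}

// Usage with hand-written sample data.
const fromParser = initiatorUrl({ type: "parser", url: "https://example.test/index.html" });
const fromScript = initiatorUrl({
  type: "script",
  stack: { callFrames: [{ url: "https://example.test/a.js", lineNumber: 10, columnNumber: 2 }] },
});
const fromOther = initiatorUrl({ type: "other" });
```

This only yields one hop; requests whose initiator has neither a URL nor a stack cannot be followed further with this approach.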

@OrKoN
Collaborator

OrKoN commented Apr 25, 2024

The DevTools feature is built on the same CDP data so it should be possible to re-construct the chain based on CDP responses. Could you describe your use case for needing this data in Puppeteer?

@ojaswa1942
Author

@OrKoN Good to know. Any leads to derive this will help too!

Currently, we do have request events in Puppeteer. However, the complete network initiator chains for them are missing. As for our use case: given some network calls, we intend to trace them back to their source on the page.

Consider a simple case such as ScriptA -> ScriptB -> ScriptC -> Target Network Call, where the target call can be another script, an XHR, or a static resource. An initiator chain will help identify:

  • The "Original Source" of a particular network call, in this case ScriptA which was an original asset present on the page
  • Looking at all initiators, we can identify all scripts loaded or calls invoked as a result of ScriptA being loaded on page
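
To make the two goals above concrete, here is a small sketch that assumes the chain has already been resolved into parent pointers. `ChainNode`, `originalSource`, and `descendants` are hypothetical names used for illustration only; nothing here is a Puppeteer API.

```typescript
// One resolved request; parentId points at the request that initiated it.
interface ChainNode {
  id: string;
  url: string;
  parentId?: string;
}

// Follow parent pointers up to the root ("Original Source").
function originalSource(nodes: Map<string, ChainNode>, id: string): ChainNode | undefined {
  let current = nodes.get(id);
  const seen = new Set<string>(); // guards against accidental cycles
  while (current && current.parentId !== undefined && !seen.has(current.id)) {
    seen.add(current.id);
    const parent = nodes.get(current.parentId);
    if (!parent) break;
    current = parent;
  }
  return current;
}

// Everything loaded or invoked, directly or transitively, because of rootId.
function descendants(nodes: Map<string, ChainNode>, rootId: string): ChainNode[] {
  const children = new Map<string, ChainNode[]>();
  nodes.forEach((n) => {
    if (n.parentId === undefined) return;
    const list = children.get(n.parentId) ?? [];
    list.push(n);
    children.set(n.parentId, list);
  });
  const out: ChainNode[] = [];
  const seen = new Set<string>([rootId]);
  const queue: string[] = [rootId];
  while (queue.length > 0) {
    const id = queue.shift()!;
    for (const child of children.get(id) ?? []) {
      if (seen.has(child.id)) continue;
      seen.add(child.id);
      out.push(child);
      queue.push(child.id);
    }
  }
  return out;
}

// Usage: ScriptA -> ScriptB -> ScriptC -> target call.
const nodes = new Map<string, ChainNode>();
nodes.set("1", { id: "1", url: "ScriptA" });
nodes.set("2", { id: "2", url: "ScriptB", parentId: "1" });
nodes.set("3", { id: "3", url: "ScriptC", parentId: "2" });
nodes.set("4", { id: "4", url: "target", parentId: "3" });
const source = originalSource(nodes, "4");
const loadedByScriptA = descendants(nodes, "1");
```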

@OrKoN
Collaborator

OrKoN commented Apr 25, 2024

I think DevTools just finds requests for the initiator script in the same frame by URL: https://source.chromium.org/chromium/chromium/src/+/main:third_party/devtools-frontend/src/front_end/models/logs/NetworkLog.ts;l=182;drc=ac291c8142f2481d814f3e13ad6138f70d4666b8 See usages of that function in that file.

@OrKoN
Collaborator

OrKoN commented Apr 25, 2024

So basically the algorithm for a request is:

  1. While request.initiator.url is available (it should be for scripts, but it might require some extra CDP domains to be enabled):
  2. Set request to a request that happened before the current request and whose url === request.initiator.url

DevTools also has some caching to avoid computing parts of the chain multiple times.
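
The loop described above, together with a simple cache, could be sketched roughly as follows. This is not the DevTools implementation: `LoggedRequest` and `buildChainResolver` are hypothetical, the log is assumed to have been collected from request events beforehand, and picking the most recent earlier request with a matching URL is an assumption of this sketch.

```typescript
// A request as recorded by the caller (assumed shape).
interface LoggedRequest {
  id: string;
  url: string;
  timestamp: number;     // when the request was sent
  initiatorUrl?: string; // URL of the immediate initiator, if known
}

function buildChainResolver(log: LoggedRequest[]): (r: LoggedRequest) => LoggedRequest[] {
  const sorted = log.slice().sort((a, b) => a.timestamp - b.timestamp);
  const cache = new Map<string, LoggedRequest[]>(); // request id -> resolved chain

  // Most recent request for `url` sent strictly before `before`.
  function findEarlier(url: string, before: number): LoggedRequest | undefined {
    let match: LoggedRequest | undefined;
    for (const r of sorted) {
      if (r.timestamp >= before) break;
      if (r.url === url) match = r;
    }
    return match;
  }

  // Chain from the request itself back to the earliest initiator found.
  function chainFor(request: LoggedRequest): LoggedRequest[] {
    const cached = cache.get(request.id);
    if (cached) return cached;
    let chain: LoggedRequest[] = [request];
    if (request.initiatorUrl !== undefined) {
      const parent = findEarlier(request.initiatorUrl, request.timestamp);
      // The parent is strictly earlier, so the recursion always terminates.
      if (parent) chain = [request, ...chainFor(parent)];
    }
    cache.set(request.id, chain);
    return chain;
  }

  return chainFor;
}

// Usage with hand-written sample data.
const page: LoggedRequest = { id: "r1", url: "https://example.test/", timestamp: 1 };
const scriptA: LoggedRequest = { id: "r2", url: "https://example.test/a.js", timestamp: 2, initiatorUrl: page.url };
const scriptB: LoggedRequest = { id: "r3", url: "https://example.test/b.js", timestamp: 3, initiatorUrl: scriptA.url };
const target: LoggedRequest = { id: "r4", url: "https://example.test/api", timestamp: 4, initiatorUrl: scriptB.url };
const chainFor = buildChainResolver([target, scriptB, page, scriptA]);
const targetChain = chainFor(target).map((r) => r.id);
```

Because shared prefixes are cached per request, resolving many requests that descend from the same script does not recompute the common part of the chain.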

@ojaswa1942
Author

Hi @OrKoN,
Thanks for this lead!

  1. Is it possible to share the extra CDP domains that we may need to enable?
  2. We currently have something like this implemented on our end. A common problem case is when requests with the same URL are sent from different initiators (for example, different script files). For a given request we only have request.initiator.url, so relying on the URL alone produces an incorrect chain whenever the request from a different initiator is picked. I'm not sure whether there is an identifier to tell them apart, and I have yet to understand how Chromium handles this.
    Perhaps there could be a way to get a request.initiator.request object or request.initiator.requestId, instead of depending only on URL comparison?
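
The ambiguity described in point 2 can at least be detected rather than silently resolved. The sketch below is a hypothetical helper (`lookupInitiator`) over an assumed log shape: it returns every earlier request whose URL matches, prefers candidates from the same frame when frame information is available, and flags the result as ambiguous when more than one remains. It does not resolve the ambiguity.

```typescript
// A request as recorded by the caller (assumed shape).
interface LoggedRequest {
  id: string;
  url: string;
  timestamp: number;
  frameId?: string;      // frame that issued the request, if recorded
  initiatorUrl?: string; // URL of the immediate initiator, if known
}

interface InitiatorLookup {
  candidates: LoggedRequest[];
  ambiguous: boolean; // true when URL matching cannot pick a single request
}

function lookupInitiator(log: LoggedRequest[], request: LoggedRequest): InitiatorLookup {
  if (request.initiatorUrl === undefined) {
    return { candidates: [], ambiguous: false };
  }
  // All earlier requests for the initiator URL.
  const earlier = log.filter(
    (r) => r.url === request.initiatorUrl && r.timestamp < request.timestamp,
  );
  // Narrow to the same frame when that information exists on both sides.
  const sameFrame = earlier.filter(
    (r) => r.frameId !== undefined && r.frameId === request.frameId,
  );
  const candidates = sameFrame.length > 0 ? sameFrame : earlier;
  return { candidates, ambiguous: candidates.length > 1 };
}

// Usage: lib.js was requested twice (by a.js and by b.js) in the same frame.
const libFromA: LoggedRequest = { id: "r3", url: "https://example.test/lib.js", timestamp: 3, frameId: "main", initiatorUrl: "https://example.test/a.js" };
const libFromB: LoggedRequest = { id: "r4", url: "https://example.test/lib.js", timestamp: 4, frameId: "main", initiatorUrl: "https://example.test/b.js" };
const apiCall: LoggedRequest = { id: "r5", url: "https://example.test/api", timestamp: 5, frameId: "main", initiatorUrl: "https://example.test/lib.js" };
const lookup = lookupInitiator([libFromA, libFromB, apiCall], apiCall);
```

In this sample both `lib.js` requests remain candidates, which is exactly the situation where a chain built from URLs alone may attribute the call to the wrong branch.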

@OrKoN
Collaborator

OrKoN commented Apr 26, 2024

@ojaswa1942

  1. I do not know off the top of my head; you would need to read the Chromium codebase.

  2. "Perhaps if there was a way to get request.initiator.request object or request.initiator.requestId instead of just depending on url comparison here?"

That does not exist in Chromium AFAIK; I do not think it keeps track of the request IDs that loaded a given script/resource. As you can see, DevTools is doing the same thing.

@ojaswa1942
Author

There must be some way it handles this, though: when I recreate this case, the chains shown in the Initiator section in Chromium are indeed as expected. Let me dig in and make more sense of that code first.

Thanks for the help again @OrKoN. Does it make sense to have this as a feature addition in Puppeteer? I think it is a reasonable use case, considering the data is readily available in the browser along with other request metadata.

@OrKoN
Collaborator

OrKoN commented Apr 26, 2024

We can keep the feature request to see how many users have the same use case.
