Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Handle multiple/duplicate Fetch.requestPaused events #7802

Merged
merged 11 commits into from Dec 3, 2021

Conversation

benallfree
Copy link
Contributor

@benallfree benallfree commented Nov 26, 2021

What kind of change does this PR introduce?

bugfix

Did you add tests for your changes?

TBD

If relevant, did you update the documentation?

N/A

Summary

CDP will sometimes fire multiple Fetch.requestPaused events for a single Network.requestWillBeSent event. Puppeteer incorrectly assumes a 1:1 relationship between these events which causes request interception to hang because the second Fetch.requestPaused event is never continue()'d.

Does this PR introduce a breaking change?

No, but it's really hard to test this.

Other information

Related issues:
https://github.com/puppeteer/puppeteer/pull/7060/files#diff-3280fbae5b6189c9b87a76ea21a5da3e10140467e29532e0b7b72608036f5070R290
#7475 (comment)
#7475
#7288
#6696
#6735
#7225
https://bugs.chromium.org/p/chromium/issues/detail?id=1196004

Here is a test case:

test.js

const puppeteer = require('.');

(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.setCacheEnabled(false);

  await page.setRequestInterception(true);
  page.on('request', (interceptedRequest) => {
    interceptedRequest.continue();
  });
  await page.goto('https://www.google.com');
  await browser.close();
})();

To run:

npm install
npm run build
node ./test.js

@google-cla google-cla bot added the cla: yes label Nov 26, 2021
@benallfree benallfree changed the title [bug] CDP sometimes fires Fetch.requestPaused twice fix: Handle multiple/duplicate Fetch.requestPaused events Nov 26, 2021
@benallfree benallfree marked this pull request as draft November 26, 2021 19:36
@benallfree
Copy link
Contributor Author

Requesting reviews from: @jschfflr @Androbin @starrify @marvinhagemeister @dekelev @hemanthreddyk

@benallfree benallfree marked this pull request as ready for review November 26, 2021 20:22
@hemanthreddyk
Copy link

@benallfree Looks like your fix resolved the request stalling issue in my case. With caching enabled I no longer see the timeout issue, but I see a lot of unhandled rejections

Error 1 of 2: [UNHANDLED_REJECTION] Unhandled rejection and potential memory leak!
    at process.<anonymous> (/srv/package/node_modules/pdf-generator/lib/index.js:195:5)
    at process.emit (events.js:400:28)
    at process.emit (domain.js:475:12)
    at processPromiseRejections (internal/process/promises.js:245:33)
    at processTicksAndRejections (internal/process/task_queues.js:96:32)
    with {
           "message": "Unhandled rejection and potential memory leak!",
           "code": "UNHANDLED_REJECTION",
           "source": "pdf-generator"
         }

Error 2 of 2: Unexpected extraInfo events for request 51.126
    at NetworkManager._emitResponseEvent (/srv/package/node_modules/puppeteer/lib/cjs/puppeteer/common/NetworkManager.js:354:19)
    at NetworkManager._onResponseReceived (/srv/package/node_modules/puppeteer/lib/cjs/puppeteer/common/NetworkManager.js:377:14)
    at /srv/package/node_modules/puppeteer/lib/cjs/vendor/mitt/src/index.js:51:62
    at Array.map (<anonymous>)
    at Object.emit (/srv/package/node_modules/puppeteer/lib/cjs/vendor/mitt/src/index.js:51:43)
    at CDPSession.emit (/srv/package/node_modules/puppeteer/lib/cjs/puppeteer/common/EventEmitter.js:72:22)
    at CDPSession._onMessage (/srv/package/node_modules/puppeteer/lib/cjs/puppeteer/common/Connection.js:245:18)
    at Connection._onMessage (/srv/package/node_modules/puppeteer/lib/cjs/puppeteer/common/Connection.js:117:25)
    at WebSocket.<anonymous> (/srv/package/node_modules/puppeteer/lib/cjs/puppeteer/node/NodeWebSocketTransport.js:13:32)
    at WebSocket.onMessage (/srv/package/node_modules/ws/lib/event-target.js:199:18)
    at WebSocket.emit (events.js:400:28)
    at WebSocket.emit (domain.js:475:12)
    at Receiver.receiverOnMessage (/srv/package/node_modules/ws/lib/websocket.js:1022:20)
    at Receiver.emit (events.js:400:28)
    at Receiver.emit (domain.js:475:12)
    at Receiver.dataMessage (/srv/package/node_modules/ws/lib/receiver.js:522:14)
    at Receiver.getData (/srv/package/node_modules/ws/lib/receiver.js:440:17)
    at Receiver.startLoop (/srv/package/node_modules/ws/lib/receiver.js:148:22)
    at Receiver._write (/srv/package/node_modules/ws/lib/receiver.js:83:10)
    at writeOrBuffer (internal/streams/writable.js:358:12)
    at Receiver.Writable.write (internal/streams/writable.js:303:10)
    at Socket.socketOnData (/srv/package/node_modules/ws/lib/websocket.js:1116:35)
    at Socket.emit (events.js:400:28)
    at Socket.emit (domain.js:475:12)
    at addChunk (internal/streams/readable.js:293:12)
    at readableAddChunk (internal/streams/readable.js:267:9)
    at Socket.Readable.push (internal/streams/readable.js:206:10)
    at TCP.onStreamRead (internal/stream_base_commons.js:188:23)
    at TCP.callbackTrampoline (internal/async_hooks.js:130:17)
    with {
           "message": "Unexpected extraInfo events for request 51.126"
         }

if (extraInfos.length) {
throw new Error(
'Unexpected extraInfo events for request ' + responseReceived.requestId
);
}

@benallfree
Copy link
Contributor Author

@hemanthreddyk Thank you for testing, that helps a lot!

UNHANDLED_REJECTION] Unhandled rejection and potential memory leak! is coming from /srv/package/node_modules/pdf-generator/lib/index.js:195:5, the pdf-generator package. That is unlikely to be related to anything in Puppeteer. https://www.npmjs.com/package/pdf-generator appears to be abandoned and has no repository. I installed the package and there is no index.js:195. Sorry I can't be of more help on this one, my best suggestion is to move to a different package.

For the Unexpected extraInfo events error, is there a specific URL I can test that produces this error? Does it happen on the main branch or just this PR?

@hemanthreddyk
Copy link

hemanthreddyk commented Nov 27, 2021

@benallfree Sry for the confusion here, I wasn't actually using any such extra packages like pdf-generator, I just replaced the name of our internal package with some random name (my bad I used pdf-generator). it is just listening to 'unhandledRejection' event and logging the error.

Actual error is being originated from here
https://github.com/puppeteer/puppeteer/blob/main/src/common/NetworkManager.ts#L525-L529

I think this has nothing to do with your PR, it is happening even in main branch.

I could see a same error even when I load google.

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    // executablePath: '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
    args: ['--window-size=2400,1239']
  })
  const page = await browser.newPage()
  await page.goto('https://www.google.com', {
    waitUntil: 'networkidle0',
    timeout: 60000
  })
  await browser.close()
})()

@benallfree
Copy link
Contributor Author

@hemanthreddyk It seems to be the waitUntil: 'networkidle0'. If I remove it, the error stops.

@benallfree
Copy link
Contributor Author

benallfree commented Nov 28, 2021

@josepharhar This touches your work in #7764 and #7760 and appears to expose an incorrect extraInfo exception. Can you PTAL?

@OrKoN
Copy link
Collaborator

OrKoN commented Nov 29, 2021

@benallfree I think the extraInfo exception is this issue https://bugs.chromium.org/p/chromium/issues/detail?id=1274813 It might still turn out that Puppeteer is doing smth wrong but it's also like that it's a Chromium bug.

@benallfree
Copy link
Contributor Author

I think the extraInfo exception is this issue https://bugs.chromium.org/p/chromium/issues/detail?id=1274813 It might still turn out that Puppeteer is doing smth wrong but it's also like that it's a Chromium bug.

@OrKoN Def a CDP bug. Network.responseReceived says about the hasExtraInfo field:

Indicates whether requestWillBeSentExtraInfo and responseReceivedExtraInfo events will be or were emitted for this request.

I think the issue is that CDP only flags hasExtraInfo: true if Network.responseReceived is sent first.

But in all reality that field can be ignored if we already have an extraInfo event. I think the bug in Puppeteer is that it doesn't expect Network.responseReceivedExtraInfo to ever come first.

@josepharhar
Copy link
Collaborator

I think the issue is that CDP only flags hasExtraInfo: true if Network.responseReceived is sent first.

I don't believe that chromium has a race condition, but I also haven't spent the time to look into the chromium bug I filed yet.

@benallfree
Copy link
Contributor Author

I don't believe that chromium has a race condition, but I also haven't spent the time to look into the chromium bug I filed yet.

Yep I'm just guessing too. Now I'm curious though :)

@benallfree benallfree requested a review from OrKoN December 3, 2021 16:26
@benallfree
Copy link
Contributor Author

@OrKoN @Androbin requesting another review to see if we can get this one merged. Any further thoughts?

@OrKoN
Copy link
Collaborator

OrKoN commented Dec 3, 2021

Looks good to me. Which of the related issues will be resolved by this PR?

@OrKoN OrKoN merged commit 636b086 into puppeteer:main Dec 3, 2021
@OrKoN
Copy link
Collaborator

OrKoN commented Dec 3, 2021

Thanks for the details and the contribution! I will try to find time for the other PR next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants