feat: Cooperative request intercepts #6733

Closed

Contributor

@benallfree benallfree commented Jan 6, 2021

Hello, I wanted to open a discussion via this PR about a solution for cooperative request intercepts. I needed this for a recent project and thought it might help to share my fork.

If this PR seems too aggressive to be a core change, my other thought is #6734, which is just the modification to allow Puppeteer to accept pluggable NetworkManager and HTTPResponse classes. Then I can maintain a separate package with these more extensive changes.

The Problems

Puppeteer's core expects only one page.on('request') handler

Puppeteer anticipates only one request intercept handler (page.on('request', ...)) to perform continue(), abort(), or respond(). As the community has pointed out in multiple issues, there are often cases where multiple request handlers make sense. For example, puppeteer-extra attempts a plugin architecture. The current core design makes it impossible for multiple request handlers to cooperatively determine what to do with a request.

Here is a sample of issues for background:

berstend/puppeteer-extra#364
berstend/puppeteer-extra#156
argos-ci/jest-puppeteer#308
#5334
#3853 (comment)
#2687
#4524

Puppeteer's request intercept handler does not wait for async handlers to finish

Request intercept handling currently does not support async intercept handlers. The underlying mitt event system does not await async events, and NetworkManager just fires the events and moves on:

    this.emit(NetworkManagerEmittedEvents.Request, request);
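To make the timing problem concrete, here is a minimal sketch using a simplified mitt-style emitter. `createEmitter` and `emitWithoutAwait` are hypothetical names written for this illustration, not mitt's or Puppeteer's actual code. The emitter calls each handler synchronously and ignores any promise it returns, so `emit()` returns while the async handler is still pending.

```javascript
// Hypothetical, simplified mitt-style emitter (not the real mitt source).
function createEmitter() {
  const handlers = new Map();
  return {
    on(type, handler) {
      const list = handlers.get(type) || [];
      list.push(handler);
      handlers.set(type, list);
    },
    emit(type, event) {
      // Handlers are invoked synchronously; any returned promise is ignored.
      (handlers.get(type) || []).slice().forEach((handler) => handler(event));
    },
  };
}

// Records what has happened by the time emit() returns.
function emitWithoutAwait() {
  const log = [];
  const emitter = createEmitter();

  emitter.on('request', async () => {
    log.push('handler start');
    await Promise.resolve(); // stand-in for async work such as a database lookup
    log.push('handler end');
  });

  emitter.emit('request', {});
  log.push('emit returned');

  // Snapshot taken before the async handler has had a chance to resume.
  return log.slice();
}

console.log(emitWithoutAwait()); // [ 'handler start', 'emit returned' ]
```

The handler's second half runs only after the caller has already moved on, which is why a decision made after an `await` arrives too late.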

The solutions

This PR proposes a cooperative interception strategy and also makes NetworkManager wait for async handlers to finish before finalizing the request interception.

One caveat of a cooperative interception strategy is deciding which directive should 'win'. I decided on abort > respond > continue:

    // This handler will 'win' because it asks for an abort(). The request will be aborted.
    page.on('request', (req) => {
      req.abort();
    });

    // Had there not been an abort(), this handler would 'win' because a respond() outranks a continue().
    page.on('request', (req) => {
      req.respond({...});
    });

    // This is the lowest priority. The request will be continued only if there is no abort() or respond().
    page.on('request', (req) => {
      req.continue(); // Not even necessary; NetworkManager will fall through to continue() by default
    });

    // Async example. NetworkManager will not fulfill the request until this deferred operation has completed.
    page.on('request', (req) => {
      req.defer(async () => {
        // do something async, like a database lookup
      });
    });
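The priority rule can be sketched without Puppeteer at all. The following is a rough illustration under stated assumptions, not the code in this PR: `CooperativeRequest`, `recommend`, and `resolveIntercept` are hypothetical names, deferred callbacks are assumed to run one at a time after every handler has been called, and when several handlers call respond() the latest payload is assumed to be kept.

```javascript
// Hypothetical sketch; none of these names come from Puppeteer or this PR.
const PRIORITY = { continue: 0, respond: 1, abort: 2 };

class CooperativeRequest {
  constructor() {
    this.action = 'continue'; // default when no handler asks for more
    this.response = null;
    this.deferred = [];
  }

  // Keep a recommendation only if it outranks the current one.
  recommend(action) {
    if (PRIORITY[action] > PRIORITY[this.action]) this.action = action;
  }

  continue() {
    this.recommend('continue');
  }

  abort() {
    this.recommend('abort');
  }

  respond(response) {
    this.response = response; // assumption: the latest respond() payload is kept
    this.recommend('respond');
  }

  defer(fn) {
    this.deferred.push(fn);
  }
}

async function resolveIntercept(handlers) {
  const request = new CooperativeRequest();
  // Every handler gets to make its recommendation.
  for (const handler of handlers) handler(request);
  // Assumption: deferred callbacks run one at a time after all handlers were called.
  for (const fn of request.deferred) await fn();
  return request.action;
}
```

With this shape, `resolveIntercept([(req) => req.continue(), (req) => req.respond({ status: 200 })])` resolves to `'respond'`, and adding any handler that calls `req.abort()`, even inside a deferred callback, changes the result to `'abort'`.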

Possible breaking changes

HTTPRequest.abort(), HTTPRequest.continue() and HTTPRequest.respond() no longer return promises. Their behavior has changed because they no longer await resolution of the underlying CDP Fetch events.

I don't think they should anymore, because they are merely cooperative 'recommendations' about what to do. We could make these still return a promise, but then we'd have to decide what to do if the request interception ultimately resolved differently. For example, if you did await continue() but the request actually aborted, what should happen? Probably an exception? I think it is cleaner to make these return nothing, and if needed, introduce a separate HTTPRequest.fulfilled()=>Promise<ResolutionType> signal.
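As a rough sketch of what such a signal might look like (the class `InterceptedRequest` and its `finalize()` method are hypothetical and exist only for this illustration; `fulfilled()` is merely the idea floated above, not an existing Puppeteer API):

```javascript
// Hypothetical sketch of a separate "fulfilled" signal.
class InterceptedRequest {
  constructor() {
    this._recommended = 'continue';
    this._fulfilled = new Promise((resolve) => {
      this._resolve = resolve;
    });
  }

  // Recommendations return nothing; they only record intent.
  abort() {
    this._recommended = 'abort';
  }

  continue() {
    // Lowest priority, so it never overrides an earlier recommendation.
  }

  // Resolves with the action that was actually taken.
  fulfilled() {
    return this._fulfilled;
  }

  // Assumed to be called by the network layer once all handlers have run.
  finalize() {
    this._resolve(this._recommended);
  }
}
```

A handler that cares about the outcome could then write `const outcome = await req.fulfilled()` and branch on the result, instead of relying on the return value of `continue()`.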


google-cla bot commented Jan 6, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added the cla: no label Jan 6, 2021
@benallfree
Contributor Author

@googlebot I signed it!

@benallfree
Contributor Author

Note: moved to #6735
