Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] fix: add missing await to catch errors thrown in parsingDidEnd() #6559

Merged
merged 1 commit into from Jun 13, 2022
Merged

Conversation

fubhy
Copy link
Contributor

@fubhy fubhy commented Jun 10, 2022

Currently, errors throwing during the post-parsing phase hook fly by this try/catch block because of the missing await. They then bubble up into the outer scope and are treated as INTERNAL_SERVER_ERRORS. This took me quite a while to figure out (hard to spot that missing await there).

There are also other incidents in this code base where errors are incorrectly thrown as INTERNAL_SERVER_ERROR where they should instead be BAD_USER_INPUT (e.g. missing query). Maybe something you'd want to fix too?

Thanks!

@netlify
Copy link

netlify bot commented Jun 10, 2022

Deploy Preview for apollo-server-docs canceled.

Built without sensitive environment variables

Name Link
🔨 Latest commit 06c258f
🔍 Latest deploy log https://app.netlify.com/sites/apollo-server-docs/deploys/62a33983fc42f1000861d687

@fubhy fubhy changed the title fix: add missing await to catch errors thrown in parsingDidEnd() [BUG] fix: add missing await to catch errors thrown in parsingDidEnd() Jun 10, 2022
@codesandbox-ci
Copy link

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

Latest deployment of this branch, based on commit 06c258f:

Sandbox Source
Apollo Server Typescript Configuration
Apollo Server Configuration

@glasser
Copy link
Member

glasser commented Jun 13, 2022

Ooh, great catch.

The larger issue is hopefully already improved on the version-4 branch, where we've done a lot of error handling improvements, including in #6514 explicitly adding an "unexpected error processing request" for random throws (though I suppose not for random unhandled rejections!).

@glasser glasser merged commit 54416e2 into apollographql:main Jun 13, 2022
@trevor-scheer
Copy link
Member

I don't think this change is harmful in any way (so I won't push to revert), but I also don't think it solves the issue. If that hook throws, the catch block should still hit whether the function call is awaited or not (somebody please correct me if I'm wrong, this would be very surprising to me).

What I think is actually happening is this:
#6567

I suspect the parsingDidEnd hook is being called again, and throwing again (without being caught this time from inside the catch block).

@glasser
Copy link
Member

glasser commented Jun 13, 2022

I think you're both right:

If the non-error call to parsingDidEnd rejects, then you get an unfulfilled promise rejection error (which in some configurations crashes a server). Your PR fixes this. Trevor seems to be correct that this might not be the INTERNAL_SERVER_ERROR issue in question.

#6567 would fix the odd behavior of "hook can be called twice", and version-4 already handles unexpected errors being thrown from plugins better.

That said, it sounds like you are intentionally throwing from parsingDidEnd, and expecting that to render as a particular kind of error. I'm not sure that our code attempts to make that be a supported thing, and version-4 will make that worse by changing the error to be an "unexpected error got thrown" error.

ie, it seems like you're using this as some sort of validation stage? Perhaps you want to specify custom validationRules and hook directly into the validation stage of the pipeline?

@fubhy
Copy link
Contributor Author

fubhy commented Jun 15, 2022

@glasser There's a particular category of query errors (invalid queries) where a operation name (operationName) is passed with the request that is not actually present as a named operation in the query.

I am sure you know this but adding it here for context regardless: When sending multiple named operations in a single document , you have to also pass a operationName alongside the query to choose which of the passed operations in the document to actually execute.

A few days ago our backend api was under attack with someone trying to penetrate the graphql api with arbitrary (invalid) requests. We normally "ignore" invalid user input (4xx errors) in our alerting stack (these types of attack attempts happen regularly) but because this one actually made it past the query validation into the plugin stage and some of the plugins expected a valid operation to already be inferred from the request, we suddenly saw a spike of INTERNAL_SERVER_ERROR in our monitoring.

The logic in requestPipeline.ts already accounts for the multiple operation scenario (and so does graphql-js in buildExecutionContext) but it does so AFTER the apollo plugin hooks are called:

// TODO: If we want to guarantee an operation has been set when invoking
// `willExecuteOperation` and executionDidStart`, we need to throw an
// error here and not leave this to `buildExecutionContext` in
// `graphql-js`.
const operation = getOperationAST(
requestContext.document,
request.operationName,
);
requestContext.operation = operation || undefined;
// We'll set `operationName` to `null` for anonymous operations.
requestContext.operationName = operation?.name?.value || null;

We have a plugin (custom query complexity) that requires additional context from the requests and therefore can't be used with the bare validationRule (only validates the gql query syntax) config.

Hence why we chose to do this in the didResolveOperation stage which works well for us.

However, during that (while debugging requestPipeline.ts, we noticed this missing await). Additionally, we noticed that these kind of query errors (invalid user input) are only thrown in buildExecutionContext in graphql-js and then bubble up as INTERNAL_SERVER_ERROR due to how sendErrorResponse (re-)wraps and formats these errors.

In order to prevent that, we also added another Apollo plugin to capture these errors (invalid operation) and throw them as SyntaxErrors at the didResolveOperation stage already.

That resolves it for us, but the better approach (imho this is currently a bug) would be to actually throw a SyntaxError in requestPipeline.ts already (as the TODO comment there also suggests) before:

// TODO: If we want to guarantee an operation has been set when invoking
// `willExecuteOperation` and executionDidStart`, we need to throw an
// error here and not leave this to `buildExecutionContext` in
// `graphql-js`.

@fubhy
Copy link
Contributor Author

fubhy commented Jun 15, 2022

Note that the TypeScript types for didResolveOperation are consequently also wrong / not in sync with reality: The operation might very well be undefined in didResolveOperation (we check that with our custom plugin and throw a Syntax error in this case).

So yeah... It all comes down to resolving the TODO and actually throwing a SyntaxError early there!

@fubhy
Copy link
Contributor Author

fubhy commented Jun 15, 2022

Related to that: The GraphQL operations must contain a non-empty queryor apersistedQuery extension. should be thrown as a BAD_USER_INPUT error too.

@glasser
Copy link
Member

glasser commented Jun 15, 2022

Thanks for all the details! Here are some thoughts. Note that "AS4" means "Apollo Server 4, in active development on the version-4 branch". We've done a fair amount of work towards improving error handling there.

  • For better or for worse, much of the plugin interface is designed around enabling observability features rather than enabling changes to the flow of the logic. It probably would be better to focus more on the latter and some improvements in that direction can probably be made in a backwards-compatible way! But we're not there yet. In AS4 we've at least made some things more consistent by ensuring that any thrown error in requestPipeline.ts that isn't explicitly handled turns into an (observable via plugin!) "unexpected error processing request". Making hooks like didResolveOperation have really nice outcomes when they throw is an improvement definitely worth considering — it's just not a huge surprise that it doesn't now.
  • You're right that issues involving the operation name can't be caught at the validation phase: validation is a function of operation and schema (and thus is cacheable when these don't change), not explicitly-provided-operation-name. (I've actually been hoping that graphql-js can factor out operation name resolution from the rest of execution, which might happen soon.) So yeah, didResolveOperation is a good place to do this check... except that, as described above, the pipeline doesn't really support dealing with errors from it in a particularly nice way.
  • We did actually already fix the TS types on version-4 so that GraphQLRequestContextDidResolveOperation no longer asserts that operation is required.
  • Explicitly throwing our own "unknown operation" error (probably not "syntax" though) rather than relying on execute() to do it for us is probably a good idea! I'll file that in the v4 milestone as Error handling: make sure unknown operation is a 4xx error #6577. (In that case we may revert the change in the previous bullet.)
  • In AS4 we've already made the "must contain a non-empty" error return a BAD_REQUEST error code (which is new in AS4 and used for a lot of similar sorts of errors), with a 400 (as it does in AS3).

@fubhy
Copy link
Contributor Author

fubhy commented Jun 15, 2022

Well that all sounds great :-). So I consider this "resolved" for us. I added a TODO in our codebase to remove our "hotfix" once AS4 lands. Thanks for the detailed response.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 20, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants