Support for multiple wasm query planner objects? #210

dkapadia · 2020-10-06T18:07:51Z

In our application, we instantiate multiple different Apollo Gateway objects, each with slightly different schemas. We do this because we're currently migrating code from one service to another and we want to do side-by-side testing of the old and new implementations. Having two gateway objects allows us to compare the query plans that each produces and to make a side-by-side test whenever we notice a difference between those plans.

After the wasm query planner was introduced @apollo/gateway in v0.20.1, we found that we could no longer do this - it seemed that we always got query plans for whichever gateway was initialized last (see reproduction code below).

I am not a rust expert, but I suspect the issue is here in this file where there currently is just a global SCHEMA and DATA objects that are always reused whenever a gateway object updates its schema.

Given that these variables are setup right now as vectors rather than just global pointers, I'm hoping that y'all are planning to support multiple query planner objects in the future. Is that the case, or is that something you'd be open to accepting a PR for? We'd like to update to the latest query planner since it seems to fix some bugs we're experiencing, but this issue will block us from doing so.

To reproduce the issue here:

const { ApolloGateway } = require("@apollo/gateway");
const { parse } = require("graphql");

const gatewayA = new ApolloGateway({
  localServiceList: [
    { name: "accounts", url: "http://localhost:4001/graphql", typeDefs: parse("type Query { foo: String, bar: String}")},
    { name: "reviews", url: "http://localhost:4002/graphql", typeDefs: parse("type Query { baz: String }")},
  ],
});

const gatewayB = new ApolloGateway({
  localServiceList: [
    { name: "accounts", url: "http://localhost:4001/graphql", typeDefs: parse("type Query { foo: String }")},
    { name: "reviews", url: "http://localhost:4002/graphql", typeDefs: parse("type Query { baz: String, bar: String }")},
  ],
});

// The two schemas are different
console.log(JSON.stringify(gatewayA.schema) == JSON.stringify(gatewayB.schema)) // expected: 'false', actual: 'false'

// However the two gateway objects share a pointer to the same query
// planner wasm object, which means they will always return the same
// query plans (those of gatewayB)
console.log(gatewayA.queryPlannerPointer == gatewayB.queryPlannerPointer) // expected: 'false', actual: 'true'

The text was updated successfully, but these errors were encountered:

KoltesDigital · 2020-10-08T14:14:07Z

After upgrading dependencies I can't run my dev env anymore, in which I start multiple services and gateways in the same process. I think the problem may be the same: my system design is not compatible with a global shared schema. I'll try to run them in different processes to prevent this state sharing.

FYI here's the error:

{
  "error": {
    "errors": [
      {
        "message": "unreachable",
        "extensions": {
          "code": "INTERNAL_SERVER_ERROR",
          "exception": {
            "stacktrace": [
              "RuntimeError: unreachable",
              "    at <anonymous>:wasm-function[535]:0xa3fec",
              "    at <anonymous>:wasm-function[703]:0xa94fe",
              "    at <anonymous>:wasm-function[723]:0xa9961",
              "    at <anonymous>:wasm-function[657]:0xa8692",
              "    at <anonymous>:wasm-function[264]:0x8c14b",
              "    at <anonymous>:wasm-function[28]:0x464db",
              "    at <anonymous>:wasm-function[335]:0x94b7d",
              "    at <anonymous>:wasm-function[33]:0x4ab21",
              "    at <anonymous>:wasm-function[575]:0xa5e02",
              "    at getQueryPlan (<anonymous>:wasm-function[391]:0x9a112)",
              "    at Object.module.exports.getQueryPlan (...\\node_modules\\@apollo\\query-planner-wasm\\dist\\index.js:107:24)",
              "    at Object.buildQueryPlan (...\\node_modules\\@apollo\\gateway\\dist\\buildQueryPlan.js:7:33)",
              "    at Object.ApolloGateway.executor (...\\node_modules\\@apollo\\gateway\\dist\\index.js:90:46)",
              "    at processTicksAndRejections (internal/process/task_queues.js:93:5)"
            ]
          }
        }
      }
    ]
  }
}

There are several unreachable!() invocations in the source code, I can't tell which one panics. Although the root cause is the shared state, consider adding a message to each of the invocations in order to have meaningful error messages.

I've modified the federation demo code here to reproduce an issue we're seeing in our monolith. Specifically we have the `accounts` service own an interface (`Task`) and an implementing type (`TestSectionTask`) and we have the `reviews` service extend both the interface and the implemementing type. The issue we're seeing shows up when we make a query like [1] below: [1]``` { me { tasksFromReviews { reviewsField accountsField } } } ``` This returns: ``` { "data": { "me": { "tasksFromReviews": [ null ] } } } ``` But we expect it to return: ``` { "data": { "me": { "tasksFromReviews": [ { "reviewsField": "reviews", "accountsField": "accounts" } ] } } } ``` which _is_ the data we get when we make a query like [2], which fetches first from `accounts` and then gets data from `reviews` [2] ``` { me { tasks { reviewsField accountsField } } } ``` When we run query [1], I expect: 1. The gateway to fetch a list of Tasks from `reviews`, including the reviewsField 2. The gateway to send a list of representations to the `_entities` query to accounts, in order to fetch the `accountsField` for those entities 3. The gateway to merge the data from the accounts with reviews and then return to the end user. But, what we actually see is that step 2 fails - the gateway sends an empty list of representations to the `accounts` service. As best as I can tell, the issue is in the query planner (or in the interaction between query planner and executor). For query [1], step 1, it outputs a plan like: ``` Flatten(path: "me") { Fetch(service: "reviews") { { ... on User { __typename id } } => { ... on User { tasksFromReviews { __typename ... on TestSectionTask { reviewsField id } } } } }, }, Flatten(path: "me.tasksFromReviews.@") { Fetch(service: "accounts") { { ... on Task { __typename } ... on TestSectionTask { id } } => { ... on Task { accountsField } } }, }, ``` The issue (i think) is confusingly the lack of `__typename` selected from the TestSectionTask type - this [3] bit of execute query plan will only merge in data if the typename matches. More confusingly, the issue seems to go away when we upgrade to the latest version of apollo/gateway, but since that version rewrote the query planner in rust it's not immediately obvious to me what changed between the JS and rust implementations. We also cannot upgrade to that version until the issue described in [4] is fixed. [3] https://github.com/apollographql/federation/blob/88cb9164341363d908c8a414f879ecb143017688/gateway-js/src/executeQueryPlan.ts#L423-L434 [4] apollographql/federation#210

Fixes apollographql#210 I considered using a HashMap, but it's a little more complicated, and I'm assuming we won't have people creating many hundreds of query plans 🙃 if that's a use case we want to support, we could move to HashMaps.

Enrico2 · 2020-10-13T16:41:33Z

@KoltesDigital If you're running multiple gateways in the same process, then yes, it's the same root cause. The issue brought up here is an interesting use case, and we're thinking about the right way to solve this. Stay tuned.

This PR makes it so that getting a query planner creates a new planner every time and never overrides it. We've added a new function named `updatePlannerSchema` that overrides a specific planner. This hopefully fixes #210 This implementation is suggested instead of #226 to keep things simple.

Enrico2 · 2020-11-02T17:43:25Z

@dkapadia Can you see if #261 solves your issue? There are instructions in the PR for how to test it.

abernix · 2021-04-16T10:54:49Z

This should be fixed via #261 (comment), copy-pasted here for completeness:

This should no longer be necessary with the landing of #622 (specifically, the changing of the query planner back to TypeScript). Put another way, the code this changes is no longer present and the query planners are now scoped within the Gateways that generate and execute them.

jaredly mentioned this issue Oct 12, 2020

Add support for multiple wasm QueryPlanner objects #226

Closed

Enrico2 added the 🏓 waiting-response label Oct 13, 2020

Enrico2 mentioned this issue Oct 28, 2020

Support multiple gateways in the same process #261

Closed

glasser mentioned this issue Mar 12, 2021

feat(gateway): Update managed federation to fetch "cloud config" #458

Merged

3 tasks

abernix closed this as completed Apr 16, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support for multiple wasm query planner objects? #210

Support for multiple wasm query planner objects? #210

dkapadia commented Oct 6, 2020

KoltesDigital commented Oct 8, 2020 •

edited

Enrico2 commented Oct 13, 2020

Enrico2 commented Nov 2, 2020

abernix commented Apr 16, 2021

Support for multiple wasm query planner objects? #210

Support for multiple wasm query planner objects? #210

Comments

dkapadia commented Oct 6, 2020

KoltesDigital commented Oct 8, 2020 • edited

Enrico2 commented Oct 13, 2020

Enrico2 commented Nov 2, 2020

abernix commented Apr 16, 2021

KoltesDigital commented Oct 8, 2020 •

edited