Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variable deduplication in query planner #872

Merged
merged 17 commits into from Jun 7, 2022
Merged

Variable deduplication in query planner #872

merged 17 commits into from Jun 7, 2022

Conversation

bnjjj
Copy link
Contributor

@bnjjj bnjjj commented Apr 19, 2022

close #87

Before this change we had this __entities:

entities.len() 8
entities [
    Object({
        "name": String(
            "Table",
        ),
    }),
    Object({
        "name": String(
            "Couch",
        ),
    }),
    Object({
        "name": String(
            "Chair",
        ),
    }),
    Object({
        "name": String(
            "Table",
        ),
    }),
    Object({
        "name": String(
            "Table",
        ),
    }),
    Object({
        "name": String(
            "Couch",
        ),
    }),
    Object({
        "name": String(
            "Chair",
        ),
    }),
    Object({
        "name": String(
            "Table",
        ),
    }),
]

Now we have:

entities len 3
entities [
    Object({
        "name": String(
            "Table",
        ),
    }),
    Object({
        "name": String(
            "Couch",
        ),
    }),
    Object({
        "name": String(
            "Chair",
        ),
    }),
]

in responses, but I also deduplicate in the requests so it's less traffic either on the request and the response.

Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@bnjjj bnjjj self-assigned this Apr 19, 2022
@netlify
Copy link

netlify bot commented Apr 19, 2022

Deploy Preview for apollo-router-docs canceled.

Name Link
🔨 Latest commit 54b11d2
🔍 Latest deploy log https://app.netlify.com/sites/apollo-router-docs/deploys/629f1bab4f715b000aabc08c

@github-actions

This comment has been minimized.

@BrynCooke
Copy link
Contributor

Let's not merge this before GA.

Copy link
Member

@abernix abernix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this, with a timing catch. 😉

As long as we're doing deep comparisons on the representations, this is safe and desirable. This is also something that we've wanted to do for a very long time in the Gateway and something users have wanted!

With that in mind, my suggestion is that we either:

  1. Punting this until after GA; or
  2. Making this a configuration option that is off by default

I suspect my preference could be option 2.

My primary motivation for this suggestion is that I believe this could make it more challenging for adopters (e.g., SREs analyzing such a deployment) to accurately analyze the differences in behavior between the Gateway and the Router, which we have tried to keep largely the same. It would be easier if the shapes of the subgraph requests were as similar as possible — this might tie into whatever tool/diffs that come out of #765 if someone was comparing Router vs Gateway.

As a secondary motivation, because this is an optimization while we're in a push to GA (which I'd claim should primarily on stabilizing what we have, rather than new features that differentiate things from the Gateway — even if desirable!).

There are, I think, a number of users who are interested in this that have probably opened issues on the federation repository that we might want to have try it out, so a configuration option could enable that. It's also just a great fast-follow to GA.

My one technical note on this PR is that I think it should introduce some tests, and administratively, looking for a CHANGELOG.md.

Thoughts?

@Geal
Copy link
Contributor

Geal commented Apr 22, 2022

Deep comparisons should not be a huge issue, as the Hash implementation does it. If there's a case where a representation is similar but does not hash the same way (example: fields in different order), it should be safe, because it would only result in more data requested.
I agree though, that needs more tests. But still, great work 😃

Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@bnjjj bnjjj added the performance Performance or scalability issues label Apr 25, 2022
@bnjjj
Copy link
Contributor Author

bnjjj commented Apr 25, 2022

Discussed with @BrynCooke, let's merge it POST-GA

Copy link
Contributor

@BrynCooke BrynCooke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still believe that this can be activated via the traffic shaping plugin.

QueryPlannerRequest should have QueryPlanOptions as a field and the traffic shaping plugin should mutate the options to enable dedup.

@bnjjj
Copy link
Contributor Author

bnjjj commented May 15, 2022

Ok if we're ok to expose the queryplan options then indeed I think it could work. But all logics about dedup will stay in core and not in the plugin itself, we're just talking about moving the configuration to the plugin.

@bnjjj bnjjj requested a review from StephenBarlow as a code owner May 24, 2022 14:08
@bnjjj bnjjj requested a review from BrynCooke May 24, 2022 14:10
bnjjj added 2 commits May 30, 2022 11:57
Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@bnjjj bnjjj enabled auto-merge (squash) June 3, 2022 10:04
bnjjj added 2 commits June 7, 2022 11:21
Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
@bnjjj bnjjj merged commit 31ff873 into main Jun 7, 2022
@bnjjj bnjjj deleted the bnjjj/feat_87 branch June 7, 2022 09:44
@bnjjj bnjjj added this to the v0.9.4 milestone Jun 14, 2022
This was referenced Jun 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or scalability issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

variable deduplication
4 participants