variable deduplication #87
Comments
This sounds related to one of the concerns raised in apollographql/federation#180 (just noting for the sake of posterity). I think right now this would be potentially solved in that repo's query planner, right?
(Well, I guess it's execution of the plan, but I guess this exists in both implementations right now, yea?)
I am not sure this could be solved in the query planner; it depends heavily on runtime data. If you query:
query {
topProducts {
reviews {
author { name }
}
}
}
on this data:
[
{
"id": 1,
"name": "chair",
"reviews": [
{
"author": {
"id": 1234,
"name": "Alice"
}
},
{
"author": {
"id": 5678,
"name": "Bob"
}
}
]
},
{
"id": 2,
"name": "couch",
"reviews": [
{
"author": {
"id": 1234,
"name": "Alice"
}
}
]
}
]
we could get a query to the users subgraph with the following representations:
On the other hand, it is easily solvable with #250: in the phase that generates the variables for a subgraph query, it creates for each representation a "path" into the final response, to know where to store the result.
Signed-off-by: Benjamin Coenen <5719034+bnjjj@users.noreply.github.com>
When doing a query like this:
we will do a series of requests: first to products to get the topProducts, then to reviews, then to products again. The second request to products aggregates the queries but does not deduplicate them, so we get variables like this:
The router should be able to recognize that we are asking for the same data multiple times and request each item only once, so there should be a mapping between the reviews and the products query. With indexes, reviews should reference
[0, 1, 2, 0, 0, 1, 2, 0]
from the variables array
[ { "__typename": "Product", "upc": "1" }, { "__typename": "Product", "upc": "2" }, { "__typename": "Product", "upc": "3" } ]
To go further, the first request to products returned this:
So for each of these products we already knew the name, and we could have avoided the second call to products entirely.
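A minimal sketch of the index mapping described above, assuming the representations can be converted into a key type that implements Hash and Eq. ProductKey, product and deduplicate are hypothetical names for illustration, not types or functions from the router:

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical hashable form of a `{ "__typename": ..., "upc": ... }` representation.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ProductKey {
    pub typename: String,
    pub upc: String,
}

/// Hypothetical helper that builds a Product representation key.
pub fn product(upc: &str) -> ProductKey {
    ProductKey {
        typename: "Product".to_string(),
        upc: upc.to_string(),
    }
}

/// Returns the unique items in first-seen order, plus one index per input
/// item pointing at its entry in the unique list.
pub fn deduplicate<T: Clone + Eq + Hash>(items: &[T]) -> (Vec<T>, Vec<usize>) {
    let mut unique: Vec<T> = Vec::new();
    let mut seen: HashMap<T, usize> = HashMap::new();
    let mut indexes = Vec::with_capacity(items.len());

    for item in items {
        // Reuse the index if this item was already seen, otherwise register it.
        let index = *seen.entry(item.clone()).or_insert_with(|| {
            unique.push(item.clone());
            unique.len() - 1
        });
        indexes.push(index);
    }

    (unique, indexes)
}
```

Only the unique list would be sent as variables, and the index list would be used to copy each returned entity back to every review that referenced it. For eight references with upc 1, 2, 3, 1, 1, 2, 3, 1 (a sequence inferred from the index list above), this yields three representations and the indexes [0, 1, 2, 0, 0, 1, 2, 0].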
Implementation details
Variables in representations are generated here:
router/crates/apollo-router-core/src/federated.rs
Lines 314 to 315 in 46b0191
select in router/crates/apollo-router-core/src/response.rs
Lines 59 to 85 in 46b0191
The annoying part is that since we are working with serde_json::Value elements, which do not implement Hash, they are hard to deduplicate (for now; there is work happening on that in serde-rs/json#747, serde-rs/json#720 and serde-rs/json#814).
What could be done here? Should we use another structure than serde_json::Value? Should the response contain a kind of repository of values to look up and avoid some queries?
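One possible stopgap, sketched here under assumptions: serde_json::Value does implement PartialEq, so values can be deduplicated by linear comparison without Hash, at the cost of quadratic work in the worst case. To keep the sketch self-contained, Json below is a stand-in enum rather than serde_json::Value, and user_ref and deduplicate_by_eq are hypothetical names; the shape of the representation is also an assumption.

```rust
/// Stand-in for a JSON value type that implements `PartialEq` but not `Hash`.
/// This is NOT `serde_json::Value`; it only mimics the relevant property.
#[derive(Clone, Debug, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

/// Hypothetical helper: a representation carrying only an `id` field.
pub fn user_ref(id: f64) -> Json {
    Json::Object(vec![("id".to_string(), Json::Number(id))])
}

/// Deduplicates using only `PartialEq`. Each item is compared against the
/// unique values found so far, so the cost is O(items * unique values).
pub fn deduplicate_by_eq<T: Clone + PartialEq>(items: &[T]) -> (Vec<T>, Vec<usize>) {
    let mut unique: Vec<T> = Vec::new();
    let mut indexes = Vec::with_capacity(items.len());

    for item in items {
        let found = unique.iter().position(|seen| seen == item);
        let index = match found {
            Some(index) => index,
            None => {
                unique.push(item.clone());
                unique.len() - 1
            }
        };
        indexes.push(index);
    }

    (unique, indexes)
}
```

For the small batches of a single subgraph request this may be acceptable; for large lists, a hashable key derived from each value (or a Hash implementation, if the linked serde work lands) would likely scale better.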