
Pre-parsed queries and/or Query caching #227

Open
matthewmueller opened this issue Jun 18, 2018 · 8 comments

Comments

@matthewmueller
Contributor

matthewmueller commented Jun 18, 2018

Right now when you execute against the schema, it parses the queries each time. We could speed up these executions by either parsing some queries beforehand or caching previously seen queries.

The use case I have in mind is responding to a webhook or exposing a REST API that uses your GraphQL resolvers internally; having pre-parsed server-side queries would speed up those operations.

Something like this:

func (s *Schema) ExecDocument(
  ctx context.Context,
  query *graphql.Document, 
  operationName string, 
  variables map[string]interface{},
) *Response {
    // ...
}

An alternative is to lazily cache parsed query results: when we encounter the same query twice, we use the cached result. The cache would probably need to be bounded (e.g. an LRU cache), but I think it'd be a nice addition to schema.Exec(...).
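Such a bounded lazy cache could be sketched like this (plain Go, standard library only; `document` and `parseQuery` are stand-ins for this library's actual parsed-query type and parser, and the locking and eviction details are assumptions, not existing API):

```go
package main

import (
	"container/list"
	"sync"
)

// document stands in for the library's parsed query type.
type document struct{ source string }

// parseQuery stands in for the real parser (e.g. query.Parse).
func parseQuery(q string) *document { return &document{source: q} }

// queryCache is a size-bounded LRU cache of parsed query documents.
type queryCache struct {
	mu      sync.Mutex
	max     int
	order   *list.List               // front = most recently used
	entries map[string]*list.Element // query text -> list element
}

type cacheEntry struct {
	key string
	doc *document
}

func newQueryCache(max int) *queryCache {
	return &queryCache{
		max:     max,
		order:   list.New(),
		entries: make(map[string]*list.Element),
	}
}

// Get returns the cached document for q, parsing and caching it on a miss.
// When the cache exceeds its bound, the least recently used entry is evicted.
func (c *queryCache) Get(q string) *document {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.entries[q]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*cacheEntry).doc
	}
	doc := parseQuery(q)
	el := c.order.PushFront(&cacheEntry{key: q, doc: doc})
	c.entries[q] = el
	if c.order.Len() > c.max {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.entries, oldest.Value.(*cacheEntry).key)
	}
	return doc
}
```

schema.Exec could consult such a cache before parsing, so repeated queries skip the parse step entirely.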

@matthewmueller matthewmueller changed the title Pre-parsed queries and Query caching Pre-parsed queries and/or Query caching Jun 18, 2018
@pavelnikolov
Member

pavelnikolov commented Jun 20, 2018

If you are using GraphQL over HTTP (which I guess is the case for most users of this library), then the ideal solution would be to just use old-school Cache-Control headers.
The Apollo team has done a great job on caching: they have developed a cache control extension. I do believe that the Cache-Control header is the right approach (i.e. CDN/infrastructure agnostic) and extensions are the way to go. This library would definitely benefit from extensions. Unfortunately, extensions are not supported yet. 😞
To work around this, my team has implemented imperative caching instead of the much better declarative approach. To do this we had to do two things:

  1. Allow GET requests (I'll send a PR for that very soon)
  2. Allow each resolver to declare its caching preferences (maxAge, a TTL in seconds, and scope, either PUBLIC or PRIVATE). At the end of each request we set the appropriate Cache-Control header on the response. This is not an ideal solution, but it's a good workaround until this library supports extensions.
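Combining per-resolver hints into one response header could be sketched like this (a minimal sketch; the hint type and merge rules are assumptions about how such imperative caching might work, not this library's API):

```go
package main

import "fmt"

// cacheHint is what a resolver might declare.
type cacheHint struct {
	maxAge  int  // TTL in seconds
	private bool // scope: true = PRIVATE, false = PUBLIC
}

// mergeHints combines the hints collected during a request:
// the smallest maxAge wins, and any PRIVATE hint makes the
// whole response PRIVATE.
func mergeHints(hints []cacheHint) cacheHint {
	if len(hints) == 0 {
		// No hints at all: default to uncacheable.
		return cacheHint{maxAge: 0, private: true}
	}
	merged := hints[0]
	for _, h := range hints[1:] {
		if h.maxAge < merged.maxAge {
			merged.maxAge = h.maxAge
		}
		if h.private {
			merged.private = true
		}
	}
	return merged
}

// header renders the merged hint as a Cache-Control header value.
func (h cacheHint) header() string {
	scope := "public"
	if h.private {
		scope = "private"
	}
	return fmt.Sprintf("%s, max-age=%d", scope, h.maxAge)
}
```

At the end of a request the handler would call `w.Header().Set("Cache-Control", mergeHints(collected).header())`.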

Regardless of the caching implementation (imperative/declarative), there is another thing that usually goes hand in hand with caching - persisted queries. With persisted queries it would be nice to validate each query only once, and then skip the validation step every time the persisted query is executed (unless the code has changed).

@matthewmueller
Contributor Author

matthewmueller commented Jun 20, 2018

Hi @pavelnikolov, thanks for the background on apollo's caching system! It's interesting to read what they did to address the stale data problem.

This issue is not exactly about stale data. It's about caching the static queries, so if we see a query come in, we already have the parsed query document ready to go.

In code, something like this:

cache := map[string]*query.Document{}

op := `
  query {
    hero {
      name
      appearsIn
    }
  }
`

doc, err := query.Parse(op)
if err != nil {
    // handle the parse error
}
cache[op] = doc

So each time we see those static queries again, we have this cache ready with the parsed query.

@pavelnikolov
Member

My point is that with cache hints/imperative caching you'll not have to see the same query again. If you set the cache-control headers, the response will be cached by your infrastructure (Nginx, Varnish, a CDN etc) or in the user's browser.

What you are suggesting would definitely benefit persisted queries, though. With persisted queries, the client application doesn't have to send the entire query; instead it can send a pre-defined query_id. Also, since the request doesn't need a body, it can be sent via GET. For example:

GET https://your-domain.com/graphql?query_id=5&variables={"key":"value"}

Then the handler can load the query from some cache store (e.g. Redis, Memcached or even an embedded datastore). In this case we don't need to validate/parse the query again - we can do it only once, when the query is inserted into the datastore.
This is one of the things I would like to work on.

It is possible to use this idea even without persisted queries or caching. The application would probably store the query+variables+operationName combination in a key/value store, keyed by hash. This way each unique combination of query+variables+operationName will be parsed/validated at most once during the lifetime of the GraphQL application. Is this what you mean?
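Deriving such a key could be as simple as hashing the parts with the standard library (a sketch; the zero-byte separator and the decision to include serialized variables in the key are assumptions - for parse/validate caching alone, the query text and operation name would suffice, since variables don't affect parsing):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// cacheKey derives a stable key for a query + operationName + serialized
// variables combination. A zero byte separates the parts so that different
// splits of the same bytes can't produce the same key.
func cacheKey(query, operationName, variablesJSON string) string {
	h := sha256.New()
	h.Write([]byte(query))
	h.Write([]byte{0})
	h.Write([]byte(operationName))
	h.Write([]byte{0})
	h.Write([]byte(variablesJSON))
	return hex.EncodeToString(h.Sum(nil))
}
```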

@matthewmueller
Contributor Author

matthewmueller commented Jun 20, 2018

@pavelnikolov Ahh yep, I see your point now. I guess that's less of an issue with this library and more about how you set up your GraphQL infrastructure.

I think you're right that cache-control would probably be a better way to architect your app, but I think this is low-hanging fruit to improve the performance of graphql servers out of the box.

This is off-topic, but I've also run into the use case for persisted queries, though in my opinion the other literature online doesn't really do a good job of selling them. From what I've read, their main benefit is to avoid sending full queries each time. I think an even more important role for them is interacting with existing services. Using your example, rather than:

GET https://your-domain.com/graphql?query_id=5&variables={"key":"value"}

You can expose:

GET https://your-domain.com/user?variables={"key":"value"}

Which would call a predefined server-side "user" query. That's what I was getting at with wanting the ability to use pre-parsed queries. Then you can migrate your existing REST API to using GraphQL under the hood without changing your existing clients.

@pavelnikolov
Member

I guess that's less of an issue with this library and more about how you setup your graphql infrastructure

Exactly!

But do you really need this optimisation? Is this actually a bottleneck for your API? I haven't seen the metrics/benchmarks, to be honest, but I think that the time spent there would be relatively insignificant. Maybe there are apps that need this optimisation, though. Just not my team's use case. I'll think a little bit more about it.

@matthewmueller
Contributor Author

matthewmueller commented Jun 20, 2018

But do you really need this optimisation?

Maybe not, my use case is more about supporting static server-side queries. Seems wasteful to have to parse these kinds of queries since I have access to the schema and query right on the server anyway. Then it got me thinking about lazily caching since that would also solve the problem.

For example, exposing this would work great:

users := query.MustParse(`...`)
response := schema.ExecQuery(users, variables)

@pavelnikolov
Member

pavelnikolov commented Jun 20, 2018 via email

@matthewmueller
Contributor Author

Totally understand! And thanks for the discussion earlier – I learned a lot from this issue :-)
