Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memoize type.getFields in ObjectTypeComposer.js? #319

Closed
KyleAMathews opened this issue Feb 26, 2021 · 17 comments
Closed

Memoize type.getFields in ObjectTypeComposer.js? #319

KyleAMathews opened this issue Feb 26, 2021 · 17 comments

Comments

@KyleAMathews
Copy link

KyleAMathews commented Feb 26, 2021

I'm doing some profiling on Gatsby query running for a site & the hottest function is getting the fields for a type. I assume this could be memoized? It gets called 10s of thousands of times for a small-ish site.

Screen Shot 2021-02-26 at 10 44 06 AM

@nodkz
Copy link
Member

nodkz commented Feb 26, 2021

Wow, it sounds very bad.

Let's try to investigate what's wrong. I start to add my thought in the free form here. Maybe we can found the root of the problem together.

Step 1

defineFieldMap() is important method - it constructs the proper astNode property inside graphql type (so we cannot get rid of this method). This method is not expensive - it just traverses two times fields with all args for graphql type construction; does not have any recursions.

So the problem somewhere higher - something calls this method many times.

Step 2

If we take a look at simplified ObjectTypeComposer.getType() method implementation:

  getType(): GraphQLObjectType { // <------------------------------------------------ POINT_0
    this._gqType.astNode = getObjectTypeDefinitionNode(this);
    this._gqType._fields = () => {
      return defineFieldMap( // <---------------------------------------------------- POINT_1
        this._gqType,
        mapEachKey(this._gqcFields, (fc, name) => this.getFieldConfig(name)),  // <-- POINT_2
        this._gqType.astNode
      );
    };
    this._gqType._interfaces = () => this.getInterfacesTypes();
    return this._gqType;
  }
  • Somwhere we call ObjectTypeComposer.getType() (POINT_0) and it return wrapped in array function defineFieldMap call (POINT_1). And it will be resolved by graphql-js on schema construction.
  • We can notice that before calling defineFieldMap (POINT_1) we call getFieldConfig() for every field (POINT_2).
  • Under the hood getFieldConfig() (POINT_2) calls ObjectTypeComposer.getType() method (POINT_0). 💣💣💣 So here is recursion and defineFieldMap may be called several times for the one type (unnecessary excessive computations).

Definitely, we need to add memorization for ObjectTypeComposer.getType() method.

@nodkz
Copy link
Member

nodkz commented Feb 26, 2021

Now is the good question: How we can add memorization to ObjectTypeComposer.getType() method? What properties can be used for tracking changes inside ObjectTypeComposer?

cc @vladar maybe you have any thoughts?

@KyleAMathews
Copy link
Author

KyleAMathews commented Feb 26, 2021

EDIT: reading #319 (comment) again I get it now — CPU sampling is only capturing where time is being spent not that actual stack trace so yeah, memoizing getType could be enough.

I think getFields is the main problem as it gets called over and over again by graphql-js & gatsby as you can see here:

Screen Shot 2021-02-26 at 11 51 25 AM

getFields has ~5.3s of CPU time in my sample & getType has 0.4s — though that's also pretty expensive so probably both need memoized.

@KyleAMathews
Copy link
Author

How do you change the type? Are they immutable? Or have easily trackable ways of mutation? If so, we could just always return the cached version right?

@KyleAMathews
Copy link
Author

Or perhaps if it's not easy to tell when a type is mutated, perhaps we could manually "freeze" the types since we're controlling changing them.

@vladar
Copy link
Contributor

vladar commented Feb 26, 2021

For me this code path looks problematic:

getType(): GraphQLObjectType {
this._gqType.astNode = getObjectTypeDefinitionNode(this);
if (graphqlVersion >= 14) {
this._gqType._fields = () =>
defineFieldMap(
this._gqType,
mapEachKey(this._gqcFields, (fc, name) => this.getFieldConfig(name)),
this._gqType.astNode
);
this._gqType._interfaces = () => this.getInterfacesTypes();

Anytime we call composer.getType() we basically reset _fields to the new thunk. Even if it was already defined (thunk was already called and replaced with the result by graphql-js).

So graphql-js always re-runs it here:

  getFields(): GraphQLFieldMap<any, any> {
    if (typeof this._fields === 'function') {
      this._fields = this._fields();
    }
    return this._fields;
  }

So maybe if we skip this code path when this._fields is object (so was previously defined) - it should be enough? But I am not sure if it will interfere with something else (maybe there are cases when we want to override previously defined fields).

@vladar
Copy link
Contributor

vladar commented Feb 26, 2021

Probably a bad idea - 53 tests are failing 😅

EDIT: actually only 2 tests are failing with this change:

  getType(): GraphQLObjectType {
    this._gqType.astNode = getObjectTypeDefinitionNode(this);
    if (graphqlVersion >= 14) {
      if (typeof this._gqType._fields !== `object`) {
        this._gqType._fields = () =>
          defineFieldMap(
            this._gqType,
            mapEachKey(this._gqcFields, (fc, name) => this.getFieldConfig(name)),
            this._gqType.astNode
          );
      }

@vladar
Copy link
Contributor

vladar commented Feb 27, 2021

I am actually not sure what would be the correct fix for it. With my suggestion, you won't be able to override type fields (and args) after the schema was built. I think the good fix requires a way to "freeze" types explicitly (and maybe freeze them automatically once the schema is built).

When frozen - it will simply return the this._gqType without mutations. But this sounds like a rather big change. We were able to do exactly this in a hacky way in this workaround in Gatsby: gatsbyjs/gatsby#29822

But the fix for general case is probably more involved 😒

@nodkz
Copy link
Member

nodkz commented Feb 27, 2021

@vladar made exactly what I suppose to do to fix this problem 👍👍👍

@KyleAMathews can you make profiling again from the gatsby master branch and provide new timings?
And if this change improves timings dramatically then I'll start to implement memoization for getType() method. Thanks!

@nodkz
Copy link
Member

nodkz commented Feb 27, 2021

About asked questions and suggestions:

How do you change the type? Are they immutable? Or have easily trackable ways of mutation? If so, we could just always return the cached version right?

Nope, types are not immutable. And it's a good feature request but a solution will be very complicated - need to track fileds, args in fields, extensions & deprecationReason & astNode in args and fields and in type itself, defaultValue in args, interfaces in type. So it can be a nightmare in the code which provides a small worth.

Or have easily trackable ways of mutation? If so, we could just always return the cached version right?

For now, there are no trackable ways of mutation. Your issue is the first request in this field. So I'm very glad that we found this problem. But I'm sorry that this blind point brought inconvenience.

Or perhaps if it's not easy to tell when a type is mutated, perhaps we could manually "freeze" the types since we're controlling changing them.

Yep, we definitely should freeze types when calling schemaComposer.buildSchema(). BUT I want to keep type changing in runtime which unlocks the ability to extend & modify & swap graphql schema online without restarting the server.

This feature will be highly demanded in graphql-compose-modules in the future. BTW take some time to look at this new way of schema construction. I have been using it for more than a year and very happy with it. It uses directory structure for schema definition (like next.js uses files in the pages/ folder). Moreover, this structure helps apply different middlewares on schema entrypoints.

So maybe if we skip field generation if this._fields is object. EDIT: actually only 2 tests are failing with this change.

No. It's a bad idea for the following reasons:

  • Hoisting problems. You definitely may get conditions when some types have references for each other and placed in different js modules and you will get undefined types for some fields. For solving this problem graphql-js uses code which you provided above - on the top level when you create a schema instance for the first run they deferred initialize all fieldConfigs from thunks (all js modules already initialized) and after that uses them without hoisting problem.
  • ObjectTypeComposer contains it's own fieldConfig map. So if you call getType() and after that modify some field then this field will be changed in ObjectTypeComposer._gqcFields but not in generated GraphQLObjectType. The reason for having own fieldConfig
    • It keeps TypeComposers as type property and it saves time when you need to modify schemas via astVisitors.
    • Solves hoisting problem better that it allows graphql-js - you may get a list of fields while some types inside are not yet initialized. With graphql-js all fieldConfig map is thunked, but in graphql-compose just type property thunked for every field in map.

@KyleAMathews
Copy link
Author

Profiled the same site again with @vladar's fix and time spent in graphql-compose dropped from 1578ms to 5ms 😆 so that's good change.

QPS went from 349 to 1527 so 4.4x speedup!

Screen Shot 2021-02-27 at 10 15 32 AM

@nodkz
Copy link
Member

nodkz commented Feb 27, 2021

Great news! 🎉

I'm curious. Do you create graphql schema for every request?

@nodkz
Copy link
Member

nodkz commented Feb 27, 2021

I considered different scenarios how to fix this problem and chose the following plan:

  • Convert codebase on TypeScript. I want to get rid of Flowtype because it extremely slow and introducing new changes to the codebase brings a lot of pain. Also, Flowtype blocks to gather new contributors.
  • Bump 8.0.0 version. API should not be changed.
  • Construct astNode immediately on type change. Make changes focused, e.g., if arg updated, then change just its astNode. This will update GraphQL type immediately on any change and make getType() method cheap. Now every TypeComposer has directives property like extensions. AstNode is still assembling on getType() call but it's recalculated only if changed new flag TC._gqcIsModified. _gqcIsModified becomes true if you change fields, args and other properties of GraphQL type.
  • Remove directives from extensions. Directives will be stored in astNode only. This is my old mistake which was added when astNode didn't exist.
  • Bump 9.0.0 version. API should not be changed.

Sorry, but these changes may take several months. I am happy that @vladar's fix works fine for you. This gives me time to fix this problem properly and remove old hacky things.

@vladar, if you have any thoughts or additions, please tell me ;)

@vladar
Copy link
Contributor

vladar commented Feb 28, 2021

Hey @nodkz Thank you so much for your prompt replies 💜

The plan sounds cool to me! I am not sure about extensions though - we use them interchangeably with directives. But I think we will figure something out. In general, separating the two makes sense to do (as for me).

nodkz added a commit that referenced this issue May 27, 2021
…ctives are rewritten from scratch.

relates #319
closes #306

BREAKING CHANGE: `directives` are no more available under `extensions.directives` property.  Now `directives` are stored under it's own property in TypeComposers.
nodkz added a commit that referenced this issue May 30, 2021
…ctives are rewritten from scratch.

relates #319
closes #306

BREAKING CHANGE: `directives` are no more available under `extensions.directives` property.  Now `directives` are stored under it's own property in TypeComposers.
@nodkz
Copy link
Member

nodkz commented May 30, 2021

🎉 This issue has been resolved in version 9.0.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@nodkz nodkz closed this as completed May 30, 2021
@nodkz
Copy link
Member

nodkz commented May 30, 2021

@vladar now all TC.getType() methods have memorization.

You may upgrade to v9.0.0 and remove the following hotfix gatsbyjs/gatsby#29822

@vladar
Copy link
Contributor

vladar commented May 31, 2021

Great news, thanks! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants