Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce duplicated identifier strings in C parser and Ruby parser #4899

Merged
merged 10 commits into from
Apr 15, 2024

Conversation

rmosolgo
Copy link
Owner

@rmosolgo rmosolgo commented Apr 4, 2024

This PR dedups identifiers during parsing when parsing schemas.

A previous investigation (#4897) tried de-duping these in Schema members, but because AST nodes are retained along with the things that they define, these strings should be de-duplicated sooner, or else the originally-parsed strings will still be retained.

This greatly reduces retained strings in the benchmark:

  retained objects by class
  -----------------------------------
-     9897  String
      4978  Array
      4965  GraphQL::Language::Nodes::TypeName
      2745  Hash
      2340  GraphQL::Language::Nodes::FieldDefinition
      2340  GraphQL::Schema::Field
      2025  GraphQL::Language::Nodes::InputValueDefinition
      2025  GraphQL::Language::Nodes::NonNullType
      2025  GraphQL::Schema::Argument
       600  GraphQL::Schema::TypeMembership
       234  Class
+      164  String
       111  GraphQL::Schema::LateBoundType
       101  GraphQL::Language::Nodes::ObjectTypeDefinition
        10  GraphQL::Language::Nodes::UnionTypeDefinition
         7  Symbol
  allocated memory by gem
  -----------------------------------
-   22885432  graphql-ruby/lib
+   22879592  graphql-ruby/lib
-    2384584  graphql-c_parser/lib
+    2012304  graphql-c_parser/lib
      2080  set

I also updated the Ruby parser. It still allocates duplicate strings, but it doesn't retain them anymore:

- Total retained:  4756648 bytes (34386 objects)
+ Total retained:  4377104 bytes (24939 objects)

TODO:

  • Address GC question with interned strings
  • Make it so interning is on for schema parsing but off for query parsing
  • Consider a similar patch for the Ruby parser?
  • Using rb_enc_interned_str breaks Ruby 2.7 compat. Is this OK? (2.7 is EOL)

@rmosolgo rmosolgo marked this pull request as ready for review April 12, 2024 17:33
@rmosolgo rmosolgo added this to the 2.3.1 milestone Apr 12, 2024
@rmosolgo rmosolgo added the c-ext label Apr 12, 2024
@rmosolgo rmosolgo merged commit 5afb3b5 into master Apr 15, 2024
12 checks passed
@rmosolgo rmosolgo deleted the ast-strings branch April 15, 2024 18:43
@rmosolgo rmosolgo changed the title Reduce duplicated identifier strings in C parser Reduce duplicated identifier strings in C parser and Ruby parser Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant