✨ NEW: Initial graphql implementation #17

chrisjsewell · 2021-06-03T07:25:43Z

FYI @ltalirz @NinadBhat this is an initial implementation of a /graphql endpoint.

You should give it a go; this is what the endpoint looks like in the browser, you get autocompletion and hover over tooltip etc, it is super intuitive I think:

ltalirz

thanks @chrisjsewell , looking nice!

The first question that comes to mind is whether we can easily reuse the models between the graphql/restapi approach (I see you've reimplemented them here).

In terms of giving this project a scope, how about you try to make it work for the information you need for your react app to see how difficult that would be?

Another (related) question I would have is how easy it would be to use for mutating/creating data.
Should one try to support this in graphql or does this become tricky?
I can imagine that it can be handy to be able to create multiple entities in one request; on the other hand it could be complex to use in practice and we may prefer the simpler REST approach for it.

aiida_restapi/routers/graphql.py

ltalirz · 2021-06-03T10:39:52Z

aiida_restapi/routers/graphql.py

+    entities = (
+        orm.QueryBuilder()
+        .append(entity, tag="result", filters={"id": pk}, project=project)
+        .dict()


we also stumbled upon this during the coding session with @NinadBhat

it seems to me that we would like to have a "dict" version of the qb.one() function that makes sure exactly one result is returned.
as a quick fix we could simply subclass the querybuilder in aiida-restapi here and add it; it could then later be added to aiida-core

yep I can see the use case 👍

aiida_restapi/routers/graphql.py

chrisjsewell · 2021-06-03T11:44:11Z

oops sorry @ltalirz I've just made some updates that might be worth looking at; gradually refining the models, then absolutely it would be good to thing how we can reduce dupliction with pydantic and the aiida orm models

chrisjsewell · 2021-06-03T11:47:06Z

So now, this is quite nice, I've started to expose filtering and nested querying, e.g.

{
  User(id: 1) {
    id
    email
    first_name
    last_name
    nodes(before: "2021-06-02T18:01:41.129976+02:00") {
      count
      rows(limit: 2, offset: 1) {
        id
        uuid
        label
        mtime
        attributes(filter: ["a", "b", "c"])
      }
    }
  }
}

so you filter on the "nodes" object, then on the "rows" you do the pagination, and finally select what fields you want. I think again this makes it very intuitive, and puts the logic "where it should be"
(you want to filter before you count, but then paginate the returned nodes)

on my dummy database this gives:

{
  "data": {
    "User": {
      "id": 1,
      "email": "a@b.com",
      "first_name": "a",
      "last_name": "d",
      "nodes": {
        "count": 7,
        "rows": [
          {
            "id": 2,
            "uuid": "2bee3941-d851-42f6-9109-efa5d60be5e9",
            "label": "",
            "mtime": "2021-06-02T18:01:41.093479+02:00",
            "attributes": {
              "a": 1,
              "b": null,
              "c": null
            }
          },
          {
            "id": 3,
            "uuid": "42587d68-098b-467e-88cc-7af6c76caaf0",
            "label": "",
            "mtime": "2021-06-02T18:01:41.099625+02:00",
            "attributes": {
              "a": 1,
              "b": null,
              "c": null
            }
          }
        ]
      }
    }
  }
}

chrisjsewell · 2021-06-03T11:50:26Z

The first question that comes to mind is whether we can easily reuse the models between the graphql/restapi approach (I see you've reimplemented them here).

In terms of giving this project a scope, how about you try to make it work for the information you need for your react app to see how difficult that would be?

Another (related) question I would have is how easy it would be to use for mutating/creating data.
Should one try to support this in graphql or does this become tricky?

yep I will gradually start to explore these 👍

ltalirz · 2021-06-03T11:54:40Z

Since you mentioned performance, one question that comes to mind here is how many queries it is doing under the hood.

In your example, is it doing one query per node or is it more clever?

chrisjsewell · 2021-06-03T11:57:53Z

In your example, is it doing one query per node or is it more clever?

Oh no its more clever than that 😉 its bulk querying, e.g. the nodes entity

class NodesEntity(gr.ObjectType):
    count = gr.Int(description="Total number of nodes")
    rows = gr.List(
        NodeEntity,
        limit=gr.Int(default_value=100, description="Maximum number of rows to return"),
        offset=gr.Int(default_value=0, description="Skip the first n rows"),
    )

    @with_dbenv()
    @staticmethod
    def resolve_count(parent: Any, info: gr.ResolveInfo) -> int:
        try:
            filters = parent.get("filters")
        except AttributeError:
            filters = None
        query = orm.QueryBuilder().append(orm.Node, filters=filters)
        return query.count()

    @with_dbenv()
    @staticmethod
    def resolve_rows(
        parent: Any,
        info: gr.ResolveInfo,
        limit: int,
        offset: int,
    ) -> List[dict]:
        project = get_projection(info)
        try:
            filters = parent.get("filters")
        except AttributeError:
            filters = None

        query = orm.QueryBuilder().append(
            orm.Node, tag="fields", filters=filters, project=project
        )
        query.offset(offset)
        query.limit(limit)
        return [d["fields"] for d in query.dict()]

so essentially you pass "down" the filters from the parent, and pass "up" the fields to project from the "child"

which then means you are only querying (a) for a filtered set of nodes, (b) for a certain amount (which will be limited) and (c) for only the fields you need

ltalirz · 2021-06-03T12:00:08Z

I see, so if I were to count the number of queries for your request it would be

one for the user
one for counting the nodes
one for the filtered nodes

is that correct?
That's very reasonable.

chrisjsewell · 2021-06-03T12:01:31Z

is that correct?

yep 👍

chrisjsewell · 2021-06-03T12:02:43Z

and yeh for example, I want to put in rate limiting like github does, e.g. if you use commitContributionsByRepository(maxRepositories: 200) it returns:

{
  "data": {
    "user": null
  },
  "errors": [
    {
      "type": "ARGUMENT_LIMIT",
      "path": [
        "user",
        "contributionsCollection",
        "commitContributionsByRepository"
      ],
      "locations": [
        {
          "line": 8,
          "column": 7
        }
      ],
      "message": "Only up to 100 repositories is supported."
    }
  ]
}

ltalirz · 2021-06-03T12:09:08Z

This brings up another point which is the top-level schema for the responses.
It seems your graphql implementation follows json api, which is widely used elsewhere as well and I think it would be sensible to adopt here (e.g. it specifies that data should be communicated back in the data field, errors in the errors field, etc.).

I'll have a stab at seeing how to do this on the rest api side.

chrisjsewell · 2021-06-03T12:11:14Z

I'll have a stab at seeing how to do this on the rest api side.

sounds good 👍

chrisjsewell · 2021-06-03T12:31:51Z

and yeh for example, I want to put in rate limiting like github does

Added for nodes query in 41f42c9

chrisjsewell · 2021-06-03T22:57:55Z

Ok so this is quite nice @ltalirz; with a9130c5 I have introduced make_entities_cls which basically "auto-generates" entity specific classes with fields for querying multiple entities.
so you have:

{
  Users {
    count
    rows(limit: 2, offset: 1) {
      id
    }
  }
  Groups {
    count
  }
  Computers {
    count
  }
  Nodes(after: "2021-06-02T18:01:41.117683+02:00", before: "2021-06-02T18:01:41.147405+02:00") {
    count
  }
  Comments {
    count
  }
}

and they all work the same; with count and rows fields, an upper limit of returning 100 rows (could change this), and optionally take (entity specific) filters, which are parsed down to the count and row queries.

chrisjsewell · 2021-06-03T23:42:51Z

Note there is also https://github.com/graphql-python/graphene-django, but I have chosen not to use it in the first instance, since (a) we are technically not "meant" to use the django backend directly, and instead go through the QueryBuilder, and (b) I anyhow wanted to learn the "basics" first and understand what is possible, before relying on something more high-level.
Even if we don't use it directly, though, there may well be things we can learn from it.

ltalirz · 2021-06-04T00:16:14Z

looks great! have you already tried adapting the react app to it?

By the way, it looks to me that allows to treat the id and uuid fields on equal footing (?)
That is something I quite like - i.e. one doesn't have to e.g. overload the /nodes/<id> parameter like in the REST API to be able to deal with both ids and uuids.

P.S. On second check, the graphql response isn't compliant with JSON API after all. The top-level fields seem to be, and I got a bit confused by the attributes field which is part of JSON API but in this case comes from the node I guess.

E.g. for JSON API instead of

          {
            "id": 2,
            "uuid": "2bee3941-d851-42f6-9109-efa5d60be5e9",
            "label": "",
            "mtime": "2021-06-02T18:01:41.093479+02:00",
            "attributes": {
              "a": 1,
              "b": null,
              "c": null
            }

a resource would need to be provided as

          {
            "id": 2,
            "type": "node",
            "attributes": {
                 "uuid": "2bee3941-d851-42f6-9109-efa5d60be5e9",
                "label": "",
                "mtime": "2021-06-02T18:01:41.093479+02:00",
                "attributes": {
                      "a": 1,
                     "b": null,
                     "c": null
                }
            }

However, there is no reason for the GraphQL endpoint to return JSON API compliant responses; this is just to correct my initial thought.

.pre-commit-config.yaml

chrisjsewell · 2021-06-04T00:20:01Z

looks great! have you already tried adapting the react app to it?

No not yet, wanted to get in all the necessary features first, but as explained here it should be very easy: https://react-query.tanstack.com/graphql

ltalirz · 2021-06-04T00:28:11Z

aiida_restapi/routers/graphql/utils.py

+) -> Type[gr.ObjectType]:
+    """Return a class with standard fields/resolvers for querying multiple rows of the same entity."""
+
+    # TODO is this the best way to achieve this?


I'm not 100% sure whether that is what you're looking for here - just checking whether you're aware of the option of subclassing from Generic and thereby introducing a template variable into the class itself (not just the type checks) as done e.g. here

e.g. something like

OrmT = TypeVar("OrmT", bind=...) class Entities(gr.ObjectType, Generic[OrmT]): ... orm.QueryBuilder().append(OrmT, filters=filters) ... # later instantiate template with Entities[orm.User]

oh yeh thanks, I'm semi-uptodate with all this templating business. Haven't gone through yet and properly addressed all the typing and linting, etc

Actually this might be the canonical way to deal with such subclassing: https://stackoverflow.com/a/56570594/5033292, using class Meta: as all graphene-pydantic/django/sqlalchemy do

chrisjsewell · 2021-06-04T00:28:39Z

By the way, it looks to me that allows to treat the id and uuid fields on equal footing (?)
That is something I quite like - i.e. one doesn't have to e.g. overload the /nodes/ parameter like in the REST API to be able to deal with both ids and uuids.

Yep indeed, currently you can only use id (for simplicity), but it was in my plans to also allow uuid

codecov-commenter · 2021-06-04T01:08:36Z

Codecov Report

Merging #17 (5700827) into master (e5b377d) will decrease coverage by 1.71%.
The diff coverage is 91.37%.

@@            Coverage Diff             @@
##           master      #17      +/-   ##
==========================================
- Coverage   93.80%   92.09%   -1.72%     
==========================================
  Files           8       25      +17     
  Lines         226      746     +520     
==========================================
+ Hits          212      687     +475     
- Misses         14       59      +45

Flag	Coverage Δ
pytests	`92.09% <91.37%> (-1.72%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
aiida_restapi/graphql/sphinx_ext.py	`0.00% <0.00%> (ø)`
aiida_restapi/graphql/utils.py	`80.00% <80.00%> (ø)`
aiida_restapi/graphql/users.py	`86.95% <86.95%> (ø)`
aiida_restapi/filter_syntax.py	`88.88% <88.88%> (ø)`
aiida_restapi/graphql/plugins.py	`91.30% <91.30%> (ø)`
aiida_restapi/graphql/orm_factories.py	`92.52% <92.52%> (ø)`
aiida_restapi/aiida_db_mappings.py	`96.34% <96.34%> (ø)`
aiida_restapi/graphql/nodes.py	`96.92% <96.92%> (ø)`
aiida_restapi/graphql/basic.py	`100.00% <100.00%> (ø)`
aiida_restapi/graphql/comments.py	`100.00% <100.00%> (ø)`
... and 25 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e5b377d...5700827. Read the comment docs.

chrisjsewell · 2021-06-04T16:27:32Z

Just to record an additional note here:
I was looking into adding middleware (for handling authentification and rate limiting etc), and found here encode/starlette#740, that actually Starlette (and hence fastapi) is moving graphql support to third-party packages: encode/starlette#619

This should hopefully not be a problem: basically when these versions of starlette/fastapi are released, you would change:

from starlette.graphql import GraphQLApp
app = GraphQLApp(schema=gr.Schema(query=Query))

to e.g. using https://github.com/ciscorn/starlette-graphene3:

from starlette_graphene3 import GraphQLApp
app = GraphQLApp(gr.Schema(query=Query), on_get=make_graphiql_handler())

Note also, this would actually remove one of my reasons for choosing graphene over alternatives like strawberry (#17 (comment)), although (a) this would still not be possible until the new version of Starlette, (b) graphene still looks to be the most popular graphql implementation (6.6K stars), (c) I haven't personally found any issues with it yet, and (d) anyhow this should not (in theory) change the conceptual design of my graphql schema and resolvers (should be able to swap later).

ltalirz · 2021-06-10T14:13:59Z

I've replied to all comments with questions

Co-authored-by: Leopold Talirz <leopold.talirz@gmail.com>

chrisjsewell · 2021-06-10T15:08:54Z

I've replied to all comments with questions

Thanks! I should probably complete my responses/changes tomorrow or over the weekend

ltalirz · 2021-06-14T21:01:18Z

You let me know when I should have another look at this, right?

chrisjsewell · 2021-06-14T21:59:12Z

You let me know when I should have another look at this, right?

ready

ltalirz

thanks @chrisjsewell

There are still two open comments that should not get lost
#17 (comment)
#17 (comment)

but for me this is good to squash & merge (I'll leave it up to you to prepare the commit message)

ltalirz · 2021-06-14T22:24:53Z

conftest.py

@@ -14,7 +14,7 @@
 def clear_database_auto(clear_database_before_test):  # pylint: disable=unused-argument
    """Automatically clear database in between tests."""

-    # TODO: Somehow this does not reset the user id counter, which causes the /users/id test to fail  # pylint: disable=fixme
+    # TODO: Somehow this does not reset the user id counter, which causes the /users/id test to fail


we can probably remove this

chrisjsewell · 2021-06-14T23:40:14Z

cheers for the review @ltalirz 👍

add initial graphql implementation

12cc5a7

ltalirz reviewed Jun 3, 2021

View reviewed changes

update

849f4a3

ltalirz mentioned this pull request Jun 3, 2021

bubble up python exceptions to response? #11

Open

add limit to number of nodes that can be returned from a query

41f42c9

Add plural fields

a9130c5

Make node filter functions methods of NodesEntity

0b63fdd

ltalirz reviewed Jun 4, 2021

View reviewed changes

.pre-commit-config.yaml Show resolved Hide resolved

ltalirz reviewed Jun 4, 2021

View reviewed changes

fix requirements

6e72a08

chrisjsewell added 2 commits June 4, 2021 04:46

Add entry point querying and entity row ordering

0f17a99

Add logs entity

6464e8d

chrisjsewell and others added 2 commits June 10, 2021 16:18

Apply suggestions from code review

6288024

Co-authored-by: Leopold Talirz <leopold.talirz@gmail.com>

Add introduction to graphql (what/why)

0d99673

chrisjsewell added 6 commits June 14, 2021 08:44

add filters to logs and comments

8664280

change to lowercase

49bbd43

Merge branch 'master' into graphql

388ffbe

add full test

55de97c

move grammar to static file

1c76030

allow multiple filters of same key

989d549

chrisjsewell added 3 commits June 14, 2021 23:04

improve comments

ba6775d

updates

d683254

minor update

7779ac3

chrisjsewell mentioned this pull request Jun 14, 2021

Address TODOs in code #22

Open

chrisjsewell requested a review from ltalirz June 14, 2021 21:58

chrisjsewell added 2 commits June 15, 2021 00:19

Add initial example to docs

84f9ef9

Update graphql.md

5e20e9b

ltalirz approved these changes Jun 14, 2021

View reviewed changes

chrisjsewell added 3 commits June 15, 2021 00:36

remove todo

924f612

improve filter documentation

ee81233

Add plugins documentation

5700827

chrisjsewell changed the title ~~add initial graphql implementation~~ ✨ NEW: Initial graphql implementation Jun 14, 2021

chrisjsewell merged commit d115033 into master Jun 14, 2021

chrisjsewell deleted the graphql branch June 14, 2021 23:31

chrisjsewell mentioned this pull request Jun 14, 2021

Change package name #23

Open

chrisjsewell mentioned this pull request Jun 15, 2021

Incorrect input field specification for POST methods #26

Closed

chrisjsewell mentioned this pull request Dec 23, 2022

How does this compare to GraphQL #13

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

✨ NEW: Initial graphql implementation #17

✨ NEW: Initial graphql implementation #17

chrisjsewell commented Jun 3, 2021

ltalirz left a comment •

edited

ltalirz Jun 3, 2021

chrisjsewell Jun 3, 2021 •

edited

chrisjsewell commented Jun 3, 2021

chrisjsewell commented Jun 3, 2021 •

edited

chrisjsewell commented Jun 3, 2021

ltalirz commented Jun 3, 2021

chrisjsewell commented Jun 3, 2021 •

edited

ltalirz commented Jun 3, 2021

chrisjsewell commented Jun 3, 2021

chrisjsewell commented Jun 3, 2021 •

edited

ltalirz commented Jun 3, 2021 •

edited

chrisjsewell commented Jun 3, 2021

chrisjsewell commented Jun 3, 2021 •

edited

chrisjsewell commented Jun 3, 2021 •

edited

chrisjsewell commented Jun 3, 2021

ltalirz commented Jun 4, 2021 •

edited

chrisjsewell commented Jun 4, 2021

ltalirz Jun 4, 2021 •

edited

chrisjsewell Jun 4, 2021

chrisjsewell Jun 4, 2021

chrisjsewell commented Jun 4, 2021

codecov-commenter commented Jun 4, 2021 •

edited

chrisjsewell commented Jun 4, 2021 •

edited

ltalirz commented Jun 10, 2021 •

edited

chrisjsewell commented Jun 10, 2021

ltalirz commented Jun 14, 2021

chrisjsewell commented Jun 14, 2021

ltalirz left a comment •

edited

ltalirz Jun 14, 2021

chrisjsewell Jun 14, 2021

chrisjsewell commented Jun 14, 2021

✨ NEW: Initial graphql implementation #17

✨ NEW: Initial graphql implementation #17

Conversation

chrisjsewell commented Jun 3, 2021

ltalirz left a comment • edited

Choose a reason for hiding this comment

ltalirz Jun 3, 2021

Choose a reason for hiding this comment

chrisjsewell Jun 3, 2021 • edited

Choose a reason for hiding this comment

chrisjsewell commented Jun 3, 2021

chrisjsewell commented Jun 3, 2021 • edited

chrisjsewell commented Jun 3, 2021

ltalirz commented Jun 3, 2021

chrisjsewell commented Jun 3, 2021 • edited

ltalirz commented Jun 3, 2021

chrisjsewell commented Jun 3, 2021

chrisjsewell commented Jun 3, 2021 • edited

ltalirz commented Jun 3, 2021 • edited

chrisjsewell commented Jun 3, 2021

chrisjsewell commented Jun 3, 2021 • edited

chrisjsewell commented Jun 3, 2021 • edited

chrisjsewell commented Jun 3, 2021

ltalirz commented Jun 4, 2021 • edited

chrisjsewell commented Jun 4, 2021

ltalirz Jun 4, 2021 • edited

Choose a reason for hiding this comment

chrisjsewell Jun 4, 2021

Choose a reason for hiding this comment

chrisjsewell Jun 4, 2021

Choose a reason for hiding this comment

chrisjsewell commented Jun 4, 2021

codecov-commenter commented Jun 4, 2021 • edited

Codecov Report

chrisjsewell commented Jun 4, 2021 • edited

ltalirz commented Jun 10, 2021 • edited

chrisjsewell commented Jun 10, 2021

ltalirz commented Jun 14, 2021

chrisjsewell commented Jun 14, 2021

ltalirz left a comment • edited

Choose a reason for hiding this comment

ltalirz Jun 14, 2021

Choose a reason for hiding this comment

chrisjsewell Jun 14, 2021

Choose a reason for hiding this comment

chrisjsewell commented Jun 14, 2021

ltalirz left a comment •

edited

chrisjsewell Jun 3, 2021 •

edited

chrisjsewell commented Jun 3, 2021 •

edited

chrisjsewell commented Jun 3, 2021 •

edited

chrisjsewell commented Jun 3, 2021 •

edited

ltalirz commented Jun 3, 2021 •

edited

chrisjsewell commented Jun 3, 2021 •

edited

chrisjsewell commented Jun 3, 2021 •

edited

ltalirz commented Jun 4, 2021 •

edited

ltalirz Jun 4, 2021 •

edited

codecov-commenter commented Jun 4, 2021 •

edited

chrisjsewell commented Jun 4, 2021 •

edited

ltalirz commented Jun 10, 2021 •

edited

ltalirz left a comment •

edited