Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ NEW: Initial graphql implementation #17

Merged
merged 51 commits into from Jun 14, 2021
Merged

✨ NEW: Initial graphql implementation #17

merged 51 commits into from Jun 14, 2021

Conversation

chrisjsewell
Copy link
Member

FYI @ltalirz @NinadBhat this is an initial implementation of a /graphql endpoint.

You should give it a go; this is what the endpoint looks like in the browser, you get autocompletion and hover over tooltip etc, it is super intuitive I think:

image

Copy link
Member

@ltalirz ltalirz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @chrisjsewell , looking nice!

The first question that comes to mind is whether we can easily reuse the models between the graphql/restapi approach (I see you've reimplemented them here).

In terms of giving this project a scope, how about you try to make it work for the information you need for your react app to see how difficult that would be?

Another (related) question I would have is how easy it would be to use for mutating/creating data.
Should one try to support this in graphql or does this become tricky?
I can imagine that it can be handy to be able to create multiple entities in one request; on the other hand it could be complex to use in practice and we may prefer the simpler REST approach for it.

aiida_restapi/routers/graphql.py Outdated Show resolved Hide resolved
entities = (
orm.QueryBuilder()
.append(entity, tag="result", filters={"id": pk}, project=project)
.dict()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also stumbled upon this during the coding session with @NinadBhat

it seems to me that we would like to have a "dict" version of the qb.one() function that makes sure exactly one result is returned.
as a quick fix we could simply subclass the querybuilder in aiida-restapi here and add it; it could then later be added to aiida-core

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep I can see the use case 👍

aiida_restapi/routers/graphql.py Outdated Show resolved Hide resolved
@chrisjsewell
Copy link
Member Author

oops sorry @ltalirz I've just made some updates that might be worth looking at; gradually refining the models, then absolutely it would be good to thing how we can reduce dupliction with pydantic and the aiida orm models

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Jun 3, 2021

So now, this is quite nice, I've started to expose filtering and nested querying, e.g.

{
  User(id: 1) {
    id
    email
    first_name
    last_name
    nodes(before: "2021-06-02T18:01:41.129976+02:00") {
      count
      rows(limit: 2, offset: 1) {
        id
        uuid
        label
        mtime
        attributes(filter: ["a", "b", "c"])
      }
    }
  }
}

so you filter on the "nodes" object, then on the "rows" you do the pagination, and finally select what fields you want. I think again this makes it very intuitive, and puts the logic "where it should be"
(you want to filter before you count, but then paginate the returned nodes)

on my dummy database this gives:

{
  "data": {
    "User": {
      "id": 1,
      "email": "a@b.com",
      "first_name": "a",
      "last_name": "d",
      "nodes": {
        "count": 7,
        "rows": [
          {
            "id": 2,
            "uuid": "2bee3941-d851-42f6-9109-efa5d60be5e9",
            "label": "",
            "mtime": "2021-06-02T18:01:41.093479+02:00",
            "attributes": {
              "a": 1,
              "b": null,
              "c": null
            }
          },
          {
            "id": 3,
            "uuid": "42587d68-098b-467e-88cc-7af6c76caaf0",
            "label": "",
            "mtime": "2021-06-02T18:01:41.099625+02:00",
            "attributes": {
              "a": 1,
              "b": null,
              "c": null
            }
          }
        ]
      }
    }
  }
}

@chrisjsewell
Copy link
Member Author

The first question that comes to mind is whether we can easily reuse the models between the graphql/restapi approach (I see you've reimplemented them here).

In terms of giving this project a scope, how about you try to make it work for the information you need for your react app to see how difficult that would be?

Another (related) question I would have is how easy it would be to use for mutating/creating data.
Should one try to support this in graphql or does this become tricky?

yep I will gradually start to explore these 👍

@ltalirz
Copy link
Member

ltalirz commented Jun 3, 2021

Since you mentioned performance, one question that comes to mind here is how many queries it is doing under the hood.

In your example, is it doing one query per node or is it more clever?

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Jun 3, 2021

In your example, is it doing one query per node or is it more clever?

Oh no its more clever than that 😉 its bulk querying, e.g. the nodes entity

class NodesEntity(gr.ObjectType):
    count = gr.Int(description="Total number of nodes")
    rows = gr.List(
        NodeEntity,
        limit=gr.Int(default_value=100, description="Maximum number of rows to return"),
        offset=gr.Int(default_value=0, description="Skip the first n rows"),
    )

    @with_dbenv()
    @staticmethod
    def resolve_count(parent: Any, info: gr.ResolveInfo) -> int:
        try:
            filters = parent.get("filters")
        except AttributeError:
            filters = None
        query = orm.QueryBuilder().append(orm.Node, filters=filters)
        return query.count()

    @with_dbenv()
    @staticmethod
    def resolve_rows(
        parent: Any,
        info: gr.ResolveInfo,
        limit: int,
        offset: int,
    ) -> List[dict]:
        project = get_projection(info)
        try:
            filters = parent.get("filters")
        except AttributeError:
            filters = None

        query = orm.QueryBuilder().append(
            orm.Node, tag="fields", filters=filters, project=project
        )
        query.offset(offset)
        query.limit(limit)
        return [d["fields"] for d in query.dict()]

so essentially you pass "down" the filters from the parent, and pass "up" the fields to project from the "child"

which then means you are only querying (a) for a filtered set of nodes, (b) for a certain amount (which will be limited) and (c) for only the fields you need

@ltalirz
Copy link
Member

ltalirz commented Jun 3, 2021

I see, so if I were to count the number of queries for your request it would be

  • one for the user
  • one for counting the nodes
  • one for the filtered nodes

is that correct?
That's very reasonable.

@chrisjsewell
Copy link
Member Author

is that correct?

yep 👍

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Jun 3, 2021

and yeh for example, I want to put in rate limiting like github does, e.g. if you use commitContributionsByRepository(maxRepositories: 200) it returns:

{
  "data": {
    "user": null
  },
  "errors": [
    {
      "type": "ARGUMENT_LIMIT",
      "path": [
        "user",
        "contributionsCollection",
        "commitContributionsByRepository"
      ],
      "locations": [
        {
          "line": 8,
          "column": 7
        }
      ],
      "message": "Only up to 100 repositories is supported."
    }
  ]
}

@ltalirz
Copy link
Member

ltalirz commented Jun 3, 2021

This brings up another point which is the top-level schema for the responses.
It seems your graphql implementation follows json api, which is widely used elsewhere as well and I think it would be sensible to adopt here (e.g. it specifies that data should be communicated back in the data field, errors in the errors field, etc.).

I'll have a stab at seeing how to do this on the rest api side.

@chrisjsewell
Copy link
Member Author

I'll have a stab at seeing how to do this on the rest api side.

sounds good 👍

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Jun 3, 2021

and yeh for example, I want to put in rate limiting like github does

Added for nodes query in 41f42c9

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Jun 3, 2021

Ok so this is quite nice @ltalirz; with a9130c5 I have introduced make_entities_cls which basically "auto-generates" entity specific classes with fields for querying multiple entities.
so you have:

{
  Users {
    count
    rows(limit: 2, offset: 1) {
      id
    }
  }
  Groups {
    count
  }
  Computers {
    count
  }
  Nodes(after: "2021-06-02T18:01:41.117683+02:00", before: "2021-06-02T18:01:41.147405+02:00") {
    count
  }
  Comments {
    count
  }
}

and they all work the same; with count and rows fields, an upper limit of returning 100 rows (could change this), and optionally take (entity specific) filters, which are parsed down to the count and row queries.

@chrisjsewell
Copy link
Member Author

Note there is also https://github.com/graphql-python/graphene-django, but I have chosen not to use it in the first instance, since (a) we are technically not "meant" to use the django backend directly, and instead go through the QueryBuilder, and (b) I anyhow wanted to learn the "basics" first and understand what is possible, before relying on something more high-level.
Even if we don't use it directly, though, there may well be things we can learn from it.

@ltalirz
Copy link
Member

ltalirz commented Jun 4, 2021

looks great! have you already tried adapting the react app to it?

By the way, it looks to me that allows to treat the id and uuid fields on equal footing (?)
That is something I quite like - i.e. one doesn't have to e.g. overload the /nodes/<id> parameter like in the REST API to be able to deal with both ids and uuids.

P.S. On second check, the graphql response isn't compliant with JSON API after all. The top-level fields seem to be, and I got a bit confused by the attributes field which is part of JSON API but in this case comes from the node I guess.

E.g. for JSON API instead of

          {
            "id": 2,
            "uuid": "2bee3941-d851-42f6-9109-efa5d60be5e9",
            "label": "",
            "mtime": "2021-06-02T18:01:41.093479+02:00",
            "attributes": {
              "a": 1,
              "b": null,
              "c": null
            }

a resource would need to be provided as

          {
            "id": 2,
            "type": "node",
            "attributes": {
                 "uuid": "2bee3941-d851-42f6-9109-efa5d60be5e9",
                "label": "",
                "mtime": "2021-06-02T18:01:41.093479+02:00",
                "attributes": {
                      "a": 1,
                     "b": null,
                     "c": null
                }
            }

However, there is no reason for the GraphQL endpoint to return JSON API compliant responses; this is just to correct my initial thought.

@chrisjsewell
Copy link
Member Author

looks great! have you already tried adapting the react app to it?

No not yet, wanted to get in all the necessary features first, but as explained here it should be very easy: https://react-query.tanstack.com/graphql

) -> Type[gr.ObjectType]:
"""Return a class with standard fields/resolvers for querying multiple rows of the same entity."""

# TODO is this the best way to achieve this?
Copy link
Member

@ltalirz ltalirz Jun 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure whether that is what you're looking for here - just checking whether you're aware of the option of subclassing from Generic and thereby introducing a template variable into the class itself (not just the type checks) as done e.g. here

e.g. something like

OrmT = TypeVar("OrmT", bind=...)

class Entities(gr.ObjectType, Generic[OrmT]):
   ...
   orm.QueryBuilder().append(OrmT, filters=filters) ...

# later instantiate template with
Entities[orm.User]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yeh thanks, I'm semi-uptodate with all this templating business. Haven't gone through yet and properly addressed all the typing and linting, etc

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this might be the canonical way to deal with such subclassing: https://stackoverflow.com/a/56570594/5033292, using class Meta: as all graphene-pydantic/django/sqlalchemy do

@chrisjsewell
Copy link
Member Author

By the way, it looks to me that allows to treat the id and uuid fields on equal footing (?)
That is something I quite like - i.e. one doesn't have to e.g. overload the /nodes/ parameter like in the REST API to be able to deal with both ids and uuids.

Yep indeed, currently you can only use id (for simplicity), but it was in my plans to also allow uuid

@codecov-commenter
Copy link

codecov-commenter commented Jun 4, 2021

Codecov Report

Merging #17 (5700827) into master (e5b377d) will decrease coverage by 1.71%.
The diff coverage is 91.37%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #17      +/-   ##
==========================================
- Coverage   93.80%   92.09%   -1.72%     
==========================================
  Files           8       25      +17     
  Lines         226      746     +520     
==========================================
+ Hits          212      687     +475     
- Misses         14       59      +45     
Flag Coverage Δ
pytests 92.09% <91.37%> (-1.72%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aiida_restapi/graphql/sphinx_ext.py 0.00% <0.00%> (ø)
aiida_restapi/graphql/utils.py 80.00% <80.00%> (ø)
aiida_restapi/graphql/users.py 86.95% <86.95%> (ø)
aiida_restapi/filter_syntax.py 88.88% <88.88%> (ø)
aiida_restapi/graphql/plugins.py 91.30% <91.30%> (ø)
aiida_restapi/graphql/orm_factories.py 92.52% <92.52%> (ø)
aiida_restapi/aiida_db_mappings.py 96.34% <96.34%> (ø)
aiida_restapi/graphql/nodes.py 96.92% <96.92%> (ø)
aiida_restapi/graphql/basic.py 100.00% <100.00%> (ø)
aiida_restapi/graphql/comments.py 100.00% <100.00%> (ø)
... and 25 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e5b377d...5700827. Read the comment docs.

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Jun 4, 2021

Just to record an additional note here:
I was looking into adding middleware (for handling authentification and rate limiting etc), and found here encode/starlette#740, that actually Starlette (and hence fastapi) is moving graphql support to third-party packages: encode/starlette#619

This should hopefully not be a problem: basically when these versions of starlette/fastapi are released, you would change:

from starlette.graphql import GraphQLApp
app = GraphQLApp(schema=gr.Schema(query=Query))

to e.g. using https://github.com/ciscorn/starlette-graphene3:

from starlette_graphene3 import GraphQLApp
app = GraphQLApp(gr.Schema(query=Query), on_get=make_graphiql_handler())

Note also, this would actually remove one of my reasons for choosing graphene over alternatives like strawberry (#17 (comment)), although (a) this would still not be possible until the new version of Starlette, (b) graphene still looks to be the most popular graphql implementation (6.6K stars), (c) I haven't personally found any issues with it yet, and (d) anyhow this should not (in theory) change the conceptual design of my graphql schema and resolvers (should be able to swap later).

@ltalirz
Copy link
Member

ltalirz commented Jun 10, 2021

I've replied to all comments with questions

chrisjsewell and others added 2 commits June 10, 2021 16:18
Co-authored-by: Leopold Talirz <leopold.talirz@gmail.com>
@chrisjsewell
Copy link
Member Author

I've replied to all comments with questions

Thanks! I should probably complete my responses/changes tomorrow or over the weekend

@ltalirz
Copy link
Member

ltalirz commented Jun 14, 2021

You let me know when I should have another look at this, right?

@chrisjsewell
Copy link
Member Author

You let me know when I should have another look at this, right?

ready

Copy link
Member

@ltalirz ltalirz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @chrisjsewell

There are still two open comments that should not get lost
#17 (comment)
#17 (comment)

but for me this is good to squash & merge (I'll leave it up to you to prepare the commit message)

conftest.py Outdated
@@ -14,7 +14,7 @@
def clear_database_auto(clear_database_before_test): # pylint: disable=unused-argument
"""Automatically clear database in between tests."""

# TODO: Somehow this does not reset the user id counter, which causes the /users/id test to fail # pylint: disable=fixme
# TODO: Somehow this does not reset the user id counter, which causes the /users/id test to fail
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can probably remove this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@chrisjsewell chrisjsewell changed the title add initial graphql implementation ✨ NEW: Initial graphql implementation Jun 14, 2021
@chrisjsewell chrisjsewell merged commit d115033 into master Jun 14, 2021
@chrisjsewell chrisjsewell deleted the graphql branch June 14, 2021 23:31
@chrisjsewell
Copy link
Member Author

cheers for the review @ltalirz 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants