Protection against malicious queries #907

Arfey · 2019-02-21T11:07:31Z

Hey. I have been looking for information on how to protect against malicious requests, and I found several common approaches:

query cost or resource limitations (https://developer.github.com/v4/guides/resource-limitations/)
limiting query depth (https://sangria-graphql.org/learn/#limiting-query-depth)
query whitelisting
hiding introspection in production mode (this is not a common solution, but it sounds good)

All these approaches can be implemented with middleware and a custom backend. But it would be cool if a solution were available out of the box.

Also, the Graphene documentation has no information about security.

PS: I can try to help, unless you have a reason why this is a bad idea.

jmichalicek · 2019-03-07T13:55:41Z

I have recently been looking into exactly these same things. It would be great to have something built in available, and if not, at least some examples of how someone might best implement these with graphene.

jkimbo · 2019-03-16T11:48:22Z

@Arfey this is an area where there aren't any official answers in Graphene at the moment but I'll give you my opinion on the approaches that you've listed:

1 and 2: query cost or resource limitations + limiting query depth

In another issue I've written up some sample code (not thoroughly tested) that will implement a basic max query depth check: #772 (comment). Something similar could be used for query cost calculations, but it's unclear to me how you would determine the cost of particular fields.

I think there is opportunity for experimentation in user space for this.
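As an illustration of the depth-limiting idea (this is not the sample code from #772), here is a minimal user-space sketch that estimates depth by counting nested selection sets directly in the query string. The names `max_selection_depth` and `check_depth` are hypothetical. It is deliberately crude: it ignores fragments, comments, and quotes inside block strings, so a real implementation would probably walk the parsed AST instead.

```python
def max_selection_depth(query: str) -> int:
    """Return the deepest nesting level of selection sets in a query string."""
    depth = 0
    deepest = 0
    paren_depth = 0  # greater than 0 while inside an argument list
    in_string = False
    escaped = False
    for ch in query:
        if in_string:
            # Skip everything inside a string literal, honouring backslash escapes.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "(":
            paren_depth += 1
        elif ch == ")":
            paren_depth -= 1
        elif paren_depth == 0 and ch == "{":
            # Braces inside argument lists are input objects, not selection sets.
            depth += 1
            deepest = max(deepest, depth)
        elif paren_depth == 0 and ch == "}":
            depth -= 1
    return deepest


def check_depth(query: str, limit: int) -> None:
    """Raise ValueError when the query nests deeper than the limit."""
    found = max_selection_depth(query)
    if found > limit:
        raise ValueError(f"Query depth {found} exceeds the limit of {limit}")
```

Calling `check_depth(query, limit)` before executing the query would reject anything nested deeper than the chosen limit.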

3. query whitelisting

The backend functionality is also the place to implement any kind of persisted/query whitelisting. I actually don't think this feature is within the scope of the Graphene project though because it's going to be tightly coupled to the rest of your application. You would need to define where the persisted queries are stored, how the front end part of your app adds new queries, how your server interprets static queries etc. Again this is a place where experimentation can happen in userland and Graphene already exposes the right hooks through backends to implement it.
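To make the whitelisting idea concrete, here is a minimal sketch of an allowlist keyed by a hash of the query text. `QueryAllowlist` and `query_hash` are hypothetical names, and where the allowed queries are stored and how the front end registers them are left open, as described above. Note that collapsing whitespace is a simplification that also affects whitespace inside string literals.

```python
import hashlib


def query_hash(query: str) -> str:
    # Collapse whitespace so formatting differences do not change the hash.
    normalized = " ".join(query.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class QueryAllowlist:
    """Hypothetical in-memory store of the queries a server agrees to run."""

    def __init__(self, queries):
        self._allowed = {query_hash(q) for q in queries}

    def is_allowed(self, query: str) -> bool:
        return query_hash(query) in self._allowed


allowlist = QueryAllowlist(["{ me { name } }"])
```

A server would call `allowlist.is_allowed(incoming_query)` before execution and reject anything that is not registered.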

4. hide introspection for production mode

This sounds like a reasonable suggestion and I think it could be implemented. Any thoughts @graphql-python/core @graphql-python/governors ?

mvanlonden · 2019-03-19T16:10:20Z

@jkimbo @Arfey Solution 4 sounds good. Easy to implement and will address this vulnerability in a timely manner. I don't see a reason why we would need introspection in production. @Arfey mind submitting a PR with this approach?

Arfey · 2019-03-19T16:12:44Z

Yes, I will make a pull request some time later.

etandel · 2019-03-22T20:09:59Z

Just my completely unasked for $.02:

About 3, I think it would help if the types that are automatically generated (from SQLAlchemy or Django ORM models) had their fields explicitly included instead of explicitly excluded. Much like how Django's own ModelForm works.

Instead of

class MyType(SQLAlchemyObjectType):
    class Meta:
        model = MyModel
        exclude_fields = ('scary_field',)

this:

class MyType(SQLAlchemyObjectType):
    class Meta:
        model = MyModel
        include_fields = ('safe_field',)

while making the absence of both include_fields and exclude_fields an error.
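The "error when neither is declared" rule could be sketched in plain Python as below. This is not how Graphene or graphene-sqlalchemy process `Meta` options; `ExplicitFieldsMeta`, `MyTypeOptions`, and `exposed_fields` are hypothetical names used only to illustrate the check.

```python
class ExplicitFieldsMeta:
    """Hypothetical base class: subclasses must declare a field list."""

    include_fields = None
    exclude_fields = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Fail at class-definition time if neither option was declared.
        if cls.include_fields is None and cls.exclude_fields is None:
            raise TypeError(
                f"{cls.__name__} must set include_fields or exclude_fields"
            )


def exposed_fields(all_fields, options):
    """Return the model fields that the options class chooses to expose."""
    if options.include_fields is not None:
        return [f for f in all_fields if f in options.include_fields]
    return [f for f in all_fields if f not in options.exclude_fields]


class MyTypeOptions(ExplicitFieldsMeta):
    include_fields = ("safe_field",)
```

With this in place, a subclass that declares neither option fails as soon as the class is defined, so a newly added model column cannot be exposed by accident.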

ktosiek · 2019-04-17T05:34:12Z

Hiding introspection won't help much if the attacker has access to a copy of a client application, as she would be able to guess enough of the schema to write custom queries - just one cycle in the graph is enough for an attack.
But it would be useful for hiding upcoming features from competitors :-)

Arfey · 2019-04-17T09:44:15Z

I use middleware for that right now

class HideIntrospectMiddleware:
    """
    This middleware should be used in production mode. It hides the
    introspection by resolving the `__schema` field to None.
    """
    def resolve(self, next, root, info, **args):
        if info.field_name == '__schema':
            return None
        return next(root, info, **args)

sandwichsudo · 2019-06-04T13:49:51Z

Hiya @jkimbo, thanks for the sample code, but how can I plug this into Graphene, please? Is it a setting I can add, something like:

GRAPHENE = {
    "SCHEMA": "path.to.my.schema",
    "BACKENDS": ["path.to.my.backends.DepthAnalysisBackend"],
}

?

I've also tried adding it to my GraphQL view, which gives me an error when I run a query through graphiql (using the core backend)

graphql_views.GraphQLView.as_view(graphiql=True, backend=GraphQLCoreBackend)

the error is:

document_from_string() missing 1 required positional argument: 'document_string'

Update - sorted it out, needed to provide an instance not the class. I'll leave this here in case it helps someone.
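For anyone puzzled by that error message, here is a small plain-Python reproduction. `Backend` is a stand-in class, and the assumption is that the view calls `backend.document_from_string(schema, query)` with two arguments. When the class itself is passed, nothing is bound to `self`, so the first argument fills `self` and the last parameter is reported as missing.

```python
class Backend:
    # Stand-in for a backend class; only the method signature matters here.
    def document_from_string(self, schema, document_string):
        return (schema, document_string)


backend = Backend  # the class itself, not an instance
try:
    # "schema" lands in `self`, "{ me }" in `schema`, nothing in `document_string`.
    backend.document_from_string("schema", "{ me }")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With an instance, `self` is bound automatically and the call succeeds.
result = Backend().document_from_string("schema", "{ me }")
```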

so to add the backend, I've done:

from .graphql import backends
....
urlpatterns = [
    path(
        "v1/graphql",
        graphql_views.GraphQLView.as_view(
            graphiql=True, backend=backends.DepthAnalysisBackend()
        ),
    )
]

thejcannon · 2019-06-05T18:34:46Z

My $.02 for how I handle this:

I have different types in the Schema with different related-node-information based on where the type is being used. This artificially sets a query depth which is dependent on the query.

E.g. if an author has one or more books:

Author has field books of type AuthorsBook
AuthorsBook doesn't have field author, as that would create a cycle

This works for small to medium sized applications and doesn't take much thought/code to get working.
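One way to keep a schema honest about this is to check the type graph for cycles, for example in a unit test. The sketch below assumes the schema has already been reduced to a plain dict that maps each type name to the type names its fields return. That dict format and the name `has_cycle` are illustrative only, not part of Graphene.

```python
def has_cycle(schema):
    """Return True if the type graph contains a cycle (depth-first search)."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {name: WHITE for name in schema}

    def visit(name):
        colour[name] = GREY
        for target in schema.get(name, ()):
            state = colour.get(target, WHITE)
            if state == GREY:
                # We reached a type that is already on the current path.
                return True
            if state == WHITE and visit(target):
                return True
        colour[name] = BLACK
        return False

    return any(colour[name] == WHITE and visit(name) for name in list(schema))


# Author -> AuthorsBook, and AuthorsBook has no field leading back to Author.
acyclic = {"Query": ["Author"], "Author": ["AuthorsBook"], "AuthorsBook": []}
# A Book type with an author field would reintroduce the cycle.
cyclic = {"Query": ["Author"], "Author": ["Book"], "Book": ["Author"]}
```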

sandwichsudo · 2019-06-06T15:17:51Z

Thanks @thejcannon - agreed, it makes a lot of sense to just design the API not to have cycles in it. Still, it's nice to know no one is able to run queries that join many distinct tables on my server.

@jkimbo - can I check something on the code sample you provided please:

...
# We are only interested in queries
if definition.operation != 'query':
    continue
...

This suggests we're only interested in queries, but since a mutation could return an object (like an authentication mutation might return a user object) does it not make sense to run this on both? Wondering if there is some subtlety I'm missing.
Thanks

jkimbo · 2019-06-06T16:56:02Z

@sandwichsudo that is a good point, I can't remember why I added that line but I think you could probably just omit it and it would still work.

stale · 2019-08-05T17:17:36Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

sboisson · 2019-08-16T11:31:47Z

I am using this validation rule to disable schema introspection:

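# Imports assume graphql-core 2.x (the version used by Graphene 2).
from graphql import GraphQLError
from graphql.validation.rules.base import ValidationRule


class NoIntrospection(ValidationRule):
    def enter_Field(self, node, key, parent, path, ancestors):
        # Reject the two introspection entry points; __typename stays allowed.
        field_name = node.name.value
        if field_name == "__schema" or field_name == "__type":
            self.context.report_error(
                GraphQLError(u"GraphQL introspection is not allowed", [node])
            )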

flowirtz · 2019-09-06T14:37:20Z

@sboisson I found this snippet:

export const introspectionTypes = Object.freeze([
  __Schema,
  __Directive,
  __DirectiveLocation,
  __Type,
  __Field,
  __InputValue,
  __EnumValue,
  __TypeKind,
]);

Seems like there are more introspection keywords - I think you should be checking for those too.
I think it might be easier to just check for startsWith("__").

On a more general note: More security tools for this would be nice.
Either included or as attachable middleware.

Adding to the initial list:

Input Validation (e.g. is the string I'm getting actually a string? Also, sanitise it.).

sboisson · 2019-09-10T08:36:34Z

@fwirtz Nice find!
But checking just for startsWith("__") might be too much.
It seems that __typename is used a lot, by Apollo client for example.

flowirtz · 2019-09-13T13:57:28Z

@sboisson yeah I found that catch afterwards as well haha.
I think it would be best to rather check the query name (i.e. root node) whether it starts with __ instead of the fields. Afaik __typename is always just a nested field. Not entirely sure how to do this though.
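A rough sketch of that root-level check, assuming the top-level field names of the operation have already been extracted from the parsed query (that extraction step is not shown). `blocked_root_fields` is a hypothetical helper name.

```python
def blocked_root_fields(root_field_names):
    """Return the root fields that look like introspection entry points."""
    return [
        name
        for name in root_field_names
        # Anything starting with "__" is introspection, except __typename,
        # which clients such as Apollo request routinely.
        if name.startswith("__") and name != "__typename"
    ]


offending = blocked_root_fields(["__schema", "user", "__typename"])
```

If the returned list is not empty, the query can be rejected before execution.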

cglacet · 2019-12-03T11:42:52Z

Any update on this matter? I found this repo of someone trying to work on this; for now it implements only the least interesting part (limiting query depth). I'm really new to this, so I don't know if this could help.

By the way, is it possible to add a middleware directly to a schema instead of passing it to schema.execute? I ask because in some cases (for example when using Graphene with Starlette) we don't have access to schema.execute, as it is called internally. I opened an issue about this on the Starlette side; I'm not sure what the best solution is here.

melvinkcx · 2019-12-23T04:57:22Z

Can anyone enlighten me why schema introspection is a flaw?

I use graphene-django, and I only expose non-private fields using include_fields. For us, our React frontend dynamically generates Forms using schema introspection.

ktosiek · 2019-12-23T08:08:11Z

@melvinkcx it's not strictly a flaw, it's a risk:

for security, by exposing fields not used by your client applications, fields that might be less tested or even might only be exposed by accident;
for business, by exposing new fields (often with some documentation) before the feature that uses them is released (or even finished).

stale · 2020-03-22T09:02:52Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

iamareebjamal · 2020-03-24T16:29:22Z

Not stale

devkral · 2020-10-28T00:42:42Z

there are two projects which aim to solve this problem:

secure-graphene
graphene-protector (untested as it is late, but with better django integration)

aryaniyaps · 2021-08-04T06:52:48Z

@jkimbo backends are being removed in Graphene v3.
What could we do instead to make our code future proof?
https://github.com/graphql-python/graphene/wiki/v3-release-notes#backends-support-removed

devkral · 2021-08-08T06:41:57Z

We could either implement a middleware or, which I think is the better way, a custom validation rule:
https://github.com/graphql-python/graphql-core/tree/main/src/graphql/validation/rules
Currently I have no time, so help wanted (I am the author of graphene-protector)

jkimbo · 2021-08-08T07:56:45Z

@codebyaryan: @devkral is right, using validation rules is the best way of doing this. I've added support for that in Strawberry here: strawberry-graphql/strawberry#1021

Feel free to reuse the code for a Graphene project. It should work with any graphql-core based library.

aryaniyaps · 2021-08-09T01:52:36Z

thanks for the replies @jkimbo @devkral
It turns out that no integration provided by the graphql-server repository supports validation rules.
I think that this might be a problem with the repo and all of the provided packages.

I've raised an issue here.
graphql-python/graphql-server#82

edit: I've also submitted a PR. Please review it!
graphql-python/graphql-server#83

edit 2: guys, jkimbo merged the PR!
We can finally implement custom validation rules!

ProjectCheshire added the question label Mar 11, 2019

jkimbo changed the title ~~The native security api from malicious queries~~ Protection against malicious queries Mar 16, 2019

jkimbo mentioned this issue Mar 16, 2019

Offer a way to calculate and limit the total cost of a query #772

Closed

mvanlonden added the 🙋 help wanted label Mar 19, 2019

stale bot added the wontfix label Aug 5, 2019

stale bot removed the wontfix label Aug 16, 2019

iamareebjamal mentioned this issue Dec 5, 2019

Add query cost analysis and limiting yezyilomo/django-restql#101

Closed

stale bot added the wontfix label Mar 22, 2020

stale bot removed the wontfix label Mar 24, 2020

aryaniyaps mentioned this issue Aug 13, 2021

add support for query validation #1357

Merged

2 tasks

syrusakbary closed this as completed in #1357 Aug 21, 2021

Cito mentioned this issue Mar 8, 2024

I use middleware for that right now graphql-python/graphql-core#215

Open
