Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protection against malicious queries #907

Closed
Arfey opened this issue Feb 21, 2019 · 26 comments · Fixed by #1357
Closed

Protection against malicious queries #907

Arfey opened this issue Feb 21, 2019 · 26 comments · Fixed by #1357

Comments

@Arfey
Copy link

Arfey commented Feb 21, 2019

Hey. I was looking for a lot of information on how to protect against malicious requests, and as a result I found several common approaches:

All this approaches we can implement by meddleware and custom backend. But it will be cool if this solution is out of the box.

Also, graphene don't have information about security into the documentation.

ps: I can try to help, but if u have no reason why this is a bad idea.

@jmichalicek
Copy link

I have recently been looking into exactly these same things. It would be great to have something built in available, and if not, at least some examples of how someone might best implement these with graphene.

@jkimbo
Copy link
Member

jkimbo commented Mar 16, 2019

@Arfey this is an area where there aren't any official answers in Graphene at the moment but I'll give you my opinion on the approaches that you've listed:

1 and 2: query cost or resource limitations + limiting query depth

In another issue I've written up some sample code (not thoroughly tested) that will implement a basic max query depth check: #772 (comment) Something similar could be used for query cost calculations but it's unclear to mean how you would determine the cost of particular fields.

I think there is opportunity for experimentation in user space for this.

3. query whitelisting

The backend functionality is also the place to implement any kind of persisted/query whitelisting. I actually don't think this feature is within the scope of the Graphene project though because it's going to be tightly coupled to the rest of your application. You would need to define where the persisted queries are stored, how the front end part of your app adds new queries, how your server interprets static queries etc. Again this is a place where experimentation can happen in userland and Graphene already exposes the right hooks through backends to implement it.

4. hide introspection for production mode

This sounds like a reasonable suggestion and I think it could be implemented. Any thoughts @graphql-python/core @graphql-python/governors ?

@jkimbo jkimbo changed the title The native security api from malicious queries Protection against malicious queries Mar 16, 2019
@mvanlonden
Copy link
Member

@jkimbo @Arfey Solution 4 sounds good. Easy to implement and will address this vulnerability in a timely manner. I don't see a reason why we would need introspection in production. @Arfey mind submitting a PR with this approach?

@Arfey
Copy link
Author

Arfey commented Mar 19, 2019

yes, i will make pull request some time later

@etandel
Copy link
Contributor

etandel commented Mar 22, 2019

Just my completely unasked for $.02:

About 3, I think it would help if the types that are automatically generated (from SQLAlchemy or Django ORM models) had their fields explicitly included instead of explicitly excluded. Much like how Django's own ModeField works.

Instead of

class MyType(SQLAlchemyObjectType):
    class Meta:
        model = MyModel
        exclude_fields = ('scary_field',)

this:

class MyType(SQLAlchemyObjectType):
    class Meta:
        model = MyModel
        include_fields = ('safe_field',)

while making the absence of both include_fields or exclude_fields an error.

@ktosiek
Copy link

ktosiek commented Apr 17, 2019

Hiding introspection won't help much if the attacker has access to a copy of a client application, as she would be able to guess enough of the schema to write custom queries - just one cycle in the graph is enough for an attack.
But it would be useful for hiding upcoming features from competitors :-)

@Arfey
Copy link
Author

Arfey commented Apr 17, 2019

I use middleware for that right now

class HideIntrospectMiddleware:
    """
    This middleware should use for production mode. This class hide the
    introspection.
    """
    def resolve(self, next, root, info, **args):
        if info.field_name == '__schema':
            return None
        return next(root, info, **args)

@sandwichsudo
Copy link

sandwichsudo commented Jun 4, 2019

@Arfey this is an area where there aren't any official answers in Graphene at the moment but I'll give you my opinion on the approaches that you've listed:

1 and 2: query cost or resource limitations + limiting query depth

In another issue I've written up some sample code (not thoroughly tested) that will implement a basic max query depth check: #772 (comment) Something similar could be used for query cost calculations but it's unclear to mean how you would determine the cost of particular fields.

I think there is opportunity for experimentation in user space for this.

3. query whitelisting

The backend functionality is also the place to implement any kind of persisted/query whitelisting. I actually don't think this feature is within the scope of the Graphene project though because it's going to be tightly coupled to the rest of your application. You would need to define where the persisted queries are stored, how the front end part of your app adds new queries, how your server interprets static queries etc. Again this is a place where experimentation can happen in userland and Graphene already exposes the right hooks through backends to implement it.

4. hide introspection for production mode

This sounds like a reasonable suggestion and I think it could be implemented. Any thoughts @graphql-python/core @graphql-python/governors ?

Hiya, thanks for the sample code but how can I plug this into graphene please? Is it a setting I can add, something like:

GRAPHENE = {
    "SCHEMA": "path.to.my.schema",
    "BACKENDS": ["path.to.my.backends.DepthAnalysisBackend"],
}

?

I've also tried adding it to my GraphQL view, which gives me an error when I run a query through graphiql (using the core backend)

graphql_views.GraphQLView.as_view(graphiql=True, backend=GraphQLCoreBackend)

the error is:

document_from_string() missing 1 required positional argument: 'document_string'

Update - sorted it out, needed to provide an instance not the class. I'll leave this here in case it helps someone.

so to add the backend, I've done:

from .graphql import backends
....
urlpatterns = [
    path(
          "v1/graphql",
            graphql_views.GraphQLView.as_view(
                graphiql=True, backend=backends.DepthAnalysisBackend()
            )
     )
]
```

@thejcannon
Copy link

thejcannon commented Jun 5, 2019

My $.02 for how I handle this:

I have different types in the Schema with different related-node-information based on where the type is being used. This artificially sets a query depth which is dependent on the query.

E.g. If authors have one more books:

  • Author has field books of type AuthorsBook
  • AuthorsBook doesn't have field author, as that would create a cycle

This works for small to medium sized applications and doesn't take much thought/code to get working.

@sandwichsudo
Copy link

Thanks @thejcannon - agreed, it makes a lot of sense just design the api not to have cycles in it. Still nice to know no one is able to run queries that join many distinct tables on my server.

@jkimbo - can I check something on the code sample you provided please:

...
# We are only interested in queries
            if definition.operation != 'query':
                continue
...

This suggests we're only interested in queries, but since a mutation could return an object (like an authentication mutation might return a user object) does it not make sense to run this on both? Wondering if there is some subtlety I'm missing.
Thanks

@jkimbo
Copy link
Member

jkimbo commented Jun 6, 2019

@sandwichsudo that is a good point, I can't remember why I added that line but I think you could probably just omit it and it would still work.

@stale
Copy link

stale bot commented Aug 5, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Aug 5, 2019
@sboisson
Copy link

I am using this validation rule to disable schema introspection:

class NoIntrospection(ValidationRule):
    def enter_Field(self, node, key, parent, path, ancestors):
        field_name = node.name.value
        if field_name == "__schema" or field_name == "__type":
            self.context.report_error(
                GraphQLError(u"GraphQL introspection is not allowed", [node])
            )

@stale stale bot removed the wontfix label Aug 16, 2019
@flowirtz
Copy link

flowirtz commented Sep 6, 2019

@sboisson I found this snippet:

export const introspectionTypes = Object.freeze([
  __Schema,
  __Directive,
  __DirectiveLocation,
  __Type,
  __Field,
  __InputValue,
  __EnumValue,
  __TypeKind,
]);

Seems like there is more introspection keywords - I think you should be checking for those too.
I think it might be easier to just check for startsWith("__").


On a more general note: More security tools for this would be nice.
Either included or as attachable middleware.

Adding to the initial list:

  • Input Validation (e.g. is the string I'm getting actually a string? Also, sanitise it.).

@sboisson
Copy link

@fwirtz Nice find!
But checking just for startsWith("__") might be too much.
It seems that__typename is used a lot, by Apollo client for example.

@flowirtz
Copy link

@sboisson yeah I found that catch afterwards as well haha.
I think it would be best to rather check the query name (i.e. root node) whether it starts with __ instead of the fields. Afaik __typename is always just a nested field. Not entirely sure how to do this though.

@cglacet
Copy link

cglacet commented Dec 3, 2019

Any update on this matter? I found this repo of someone trying to work on this, for now it implements the least interesting part of it (limiting query depth). I'm really new to this so I don't know if this could help.

By the way, is it possible to add a middleware directly to a schema instead of having it plugged to schema.execute, I ask this because in some cases (for example when using Graphene with Starlette), we don't have access to schema.execute as it is performed internally. I opened an issue on that question on Starlette side, not sure what's the best solution here.

@melvinkcx
Copy link

Can anyone enlighten me why schema introspection is a flaw?

I use graphene-django, and I only expose non-private fields using include_fields. For us, our React frontend dynamically generates Forms using schema introspection.

@ktosiek
Copy link

ktosiek commented Dec 23, 2019

@melvinkcx it's not strictly a flaw, it's a risk:

  1. for security, by exposing fields not used by your client applications, fields that might be less tested or even might only be exposed by accident;
  2. for business, by exposing new fields (often with some documentation) before the feature that uses them is released (or even finished).

@stale
Copy link

stale bot commented Mar 22, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Mar 22, 2020
@iamareebjamal
Copy link

Not stale

@stale stale bot removed the wontfix label Mar 24, 2020
@devkral
Copy link

devkral commented Oct 28, 2020

there are two projects which aim to solve this problem:

  • secure-graphene
  • graphene-protector (untested as it is late, but with better django integration)

@aryaniyaps
Copy link
Contributor

aryaniyaps commented Aug 4, 2021

@Arfey this is an area where there aren't any official answers in Graphene at the moment but I'll give you my opinion on the approaches that you've listed:

1 and 2: query cost or resource limitations + limiting query depth

In another issue I've written up some sample code (not thoroughly tested) that will implement a basic max query depth check: #772 (comment) Something similar could be used for query cost calculations but it's unclear to mean how you would determine the cost of particular fields.

I think there is opportunity for experimentation in user space for this.

3. query whitelisting

The backend functionality is also the place to implement any kind of persisted/query whitelisting. I actually don't think this feature is within the scope of the Graphene project though because it's going to be tightly coupled to the rest of your application. You would need to define where the persisted queries are stored, how the front end part of your app adds new queries, how your server interprets static queries etc. Again this is a place where experimentation can happen in userland and Graphene already exposes the right hooks through backends to implement it.

4. hide introspection for production mode

This sounds like a reasonable suggestion and I think it could be implemented. Any thoughts @graphql-python/core @graphql-python/governors ?

@jkimbo backends are being removed in Graphene v3.
What could we do instead to make our code future proof?
https://github.com/graphql-python/graphene/wiki/v3-release-notes#backends-support-removed

@devkral
Copy link

devkral commented Aug 8, 2021

We could either implement a middleware or, I think better it is a better way, a custom validation rule:
https://github.com/graphql-python/graphql-core/tree/main/src/graphql/validation/rules
Currently I have no time, so help wanted (I am the author of graphene-protector)

@jkimbo
Copy link
Member

jkimbo commented Aug 8, 2021

@codebyaryan : @devkral is right, using validation rules is the best way of doing this. I've add support for that in Strawberry here: strawberry-graphql/strawberry#1021

Feel free to reuse the code for a Graphene project. It should work with any graphql-core based library.

@aryaniyaps
Copy link
Contributor

aryaniyaps commented Aug 9, 2021

thanks for the replies @jkimbo @devkral
It turns out that no integration provided by the graphql-server repository supports validation rules.
I think that this might be a problem with the repo and all of the provided packages.

I've raised an issue here.
graphql-python/graphql-server#82

edit: I've also submitted a PR. Please review it!
graphql-python/graphql-server#83

edit 2: guys, jkimbo merged the PR!
We can finally implement custom validation rules!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

17 participants