Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

working with webargs meta-location #548

Closed
rocambolesque opened this issue Apr 1, 2020 · 15 comments
Closed

working with webargs meta-location #548

rocambolesque opened this issue Apr 1, 2020 · 15 comments

Comments

@rocambolesque
Copy link

rocambolesque commented Apr 1, 2020

In my api, in order to isolate validation code, I created a meta-location (following this example from the documentation https://webargs.readthedocs.io/en/latest/advanced.html#meta-locations):

@parser.location_loader("view_args_and_json")
def parse_view_args_and_json(request, schema):
    newdata = deepcopy(request.view_args)
    newdata.update(request.json)
    return MultiDictProxy(newdata, schema)

And I'm using it as followed in my api:

@bp.route("/<attr_id:int>")
class MyView(MethodView):
    @bp.arguments(PostMySchema, location="view_args_and_json")
    @bp.response(MySchema, code=201)
    def post(self, payload, attr_id):
        my_model = MyModel(**payload)
        db_session.add(my_model)
        db_session.commit()
        return my_model, 201

Here's the schema:

class MySchema(Schema):
    id = fields.Int(dump_only=True)
    attr_id = fields.Int(load_only=True)
    type = fields.Int()

class PostMySchema(RealtorFilterSchema):
    @validates_schema
    def validate_already_exists(self, data, **kwargs):  # here's the validation bit
        my_model = MyModel.query.filter(
            MyModel.attr_id == data["attr_id"],
            MyModel.type == data["type"],
        ).first()
        if realtor_filter:
            raise ValidationError("Already exists")

And here's the generated openapi.json:

{
        "parameters": [
          {
            "in": "view_args_and_json",
            "name": "attr_id",
            "required": false,
            "schema": {
              "type": "integer",
              "format": "int32"
            }
          },
          {
            "in": "view_args_and_json",
            "name": "type",
            "required": false,
            "schema": {
              "type": "integer",
              "format": "int32"
            }
          }
        ]
}

The swagger interface is unable to post data using this json.
My goal is to make openapi understand that view_args_and_json is supposed to be a requestBody and that attr_id is not supposed to be part of the body because it is a load-only field.

I've tried to add my meta-location to __location_map__ (https://github.com/marshmallow-code/apispec/blob/dev/src/apispec/ext/marshmallow/openapi.py#L30) without success.

@lafrech
Copy link
Member

lafrech commented Apr 2, 2020

I don't know your DB but generally it is better to have a unique constraint in the DB and catch the duplicated ID exception when writing. Unless you're in a transaction, it is dangerous to check first then write because another item could be created in between the check and the write (race condition).

Did you really mean load_only? In my apps, the ID is dump_only. In your case, I suppose you want to dump it but not allow the user to pass it in the payload.

I don't see an obvious solution to your issue. You'd need to do that check somewhere else, I guess. Or not do it at all (see above).

@rocambolesque
Copy link
Author

rocambolesque commented Apr 2, 2020

I agree with you on the race condition and I do have a unique constraint in the DB.
The id is dump_only (see schema).

My question is not really about my use case though, it's about webargs meta-locations and their compatibility with openapi and thus the swagger interface. Endpoints using meta-locations simply aren't callable in the swagger interface.
There does not seem to be a way to describe meta-locations to openapi. The idea would be to tell openapi where to find parameters.

@lafrech
Copy link
Member

lafrech commented Apr 2, 2020

I agree exposing meta-locations is bad.

Meta-locations mean data can be found in several locations. I don't see how this fits in standard OpenAPI.

In fact, the rework in webargs 6 makes it closer to OpenAPI and makes meta-location an exception.

@rocambolesque
Copy link
Author

Alright.

Is there any other way to pass view_args parameters to the schema in order to have these parameters available in the validation process?

@lafrech
Copy link
Member

lafrech commented Apr 2, 2020

AFAIU, the problem here is that the schema contains a field (the ID field) that you don't want to expose in the json payload in the spec.

You can change the mapping to make view_args_and_json point to json but then you'll have the ID field exposed in the spec as body parameter (as well as path parameter).

Unless you make that field dump_only but then your validation won't work.

@rocambolesque
Copy link
Author

You can change the mapping to make view_args_and_json point to json but then you'll have the ID field exposed in the spec as body parameter (as well as path parameter).
Yes that is exactly what I tried to do but I ran into further problems

At this point, I'm going to change the way my endpoint works in order to have all the data I need in the body the request.
So instead of /<attr_id:int> I'll have a simple / and attr_id in the body.

Thanks for your help!

I suggest mentioning meta-locations don't play well with openapi in the docs: https://webargs.readthedocs.io/en/latest/advanced.html#meta-locations.

@lafrech
Copy link
Member

lafrech commented Apr 2, 2020

Not sure this fits in webargs docs as webargs is not "apispec-aware".

I'd keep attr_id as path param and I wouldn't do that verification at schema validation. Well, in fact I wouldn't do that verification at all, as said above. Perhaps you could do it in a custom decorator to get all arguments.

Anyway, I think it should work if you complete the mapping as said above. However, I'm surprised your schema load and validation work while the id field is dump_only.

@rocambolesque
Copy link
Author

I understand your point about not doing the verification and leaving it to the database.
My goal is to provide the API consumer with a standard validation error response 99% of the time and consider the other 1% as exceptional. For example, if caught at validation-level you get a standard 400 else a 500.
The problem with not doing the verification during the validation is that if I have a dozen constraints like these it will be very hard to consume the API because of obscure error messages or it will be very hard to maintain my code because I formatted them correctly but I now have database concerns leaking into my API code.

@lafrech
Copy link
Member

lafrech commented Apr 2, 2020

I understand.

What I do is catch DB error and use abort to return a consistent 422 error.

I understand the leakage issue. Depending on the size of your app, you may want to just live with it (easy and fast) or add an abstraction layer to encapsulate DB operations.

@rocambolesque
Copy link
Author

When catching database errors do you then try to mimic a standard 422 error with a message linking that error back to a field or do you just use abort(422)?
Also, aren't you concerned about the noise generated at the database-level for things you could have caught (in most cases) at the validation-level e.g unique constraints and foreign keys?

@lafrech
Copy link
Member

lafrech commented Apr 2, 2020

Also, aren't you concerned about the noise generated at the database-level for things you could have caught (in most cases) at the validation-level e.g unique constraints and foreign keys?

No. Maybe I should.

I don't feel the need to optimize this. It shouldn't happen in the first place. I mean an app must manage client errors. But optimize client error processing?

When catching database errors do you then try to mimic a standard 422 error with a message linking that error back to a field or do you just use abort(422)?

Up to you to use a simple string message or link to a field (or several fields if the error spans on several fields, like unique constraint on two fields).

See marshmallow-code/flask-smorest#132 (comment).

It can depend on how your client actually parses the error response. Error structuring is nice when debugging, on to help UI providing errors to client. But some errors might not fit in this structure.

Catching database connection errors might not be necessary: ultimately, return a 500 might be enough from client perspective client. I include traceback in the logs.

@rocambolesque
Copy link
Author

I don't feel the need to optimize this. It shouldn't happen in the first place. I mean an app must manage client errors. But optimize client error processing?
Catching database connection errors might not be necessary: ultimately, return a 500 might be enough from client perspective client. I include traceback in the logs.

You must have really nice clients :)

See marshmallow-code/flask-smorest#132 (comment).

I will definitely use this, thanks

@lafrech
Copy link
Member

lafrech commented Apr 2, 2020

Yeah, or not client at all... My point in this example is that for the client, it does not matter that much if the error is a database server error, a network error, whatever. That's internal details. The server can't answer, it is a server error.

@rocambolesque
Copy link
Author

I agree that the client should not be exposed to the underlying complexity of the database model, that is one of the reasons to use APIs.
But it is much better for the client to get a message saying "hey, here is your problem" than "not working". Making the API consumer life easy is part of the job in my opinion.

@lafrech
Copy link
Member

lafrech commented Apr 2, 2020

I think we agree.

Ideally, a 500 is logged and perhaps triggers a notification to the maintainer somehow. And it may contain a message like "please tell the admin". Just saying the user shouldn't need to know it's a DB error or a network error since in any case he can't do anything about it. But yes, it may be slightly better in terms of communication to provide unneeded details so that the backend code doesn't appears as just a broken blob.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants