Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Experiment a high-level HTTPMiddleware #1691

Closed
wants to merge 13 commits into from
Closed

Conversation

florimondmanca
Copy link
Member

@florimondmanca florimondmanca commented Jun 14, 2022

Refs #1678, cc @adriangb

I experimented with what a helper middleware to do some simple request/response peeks and edits might look like, and yet be pure ASGI -- no Starlette dependencies at all, accept any ASGI app as a parent.

I was inspired by what we did for HTTPX auth flows. Turns out we can use the same generator-based API to provide a somewhat clean interface that goes like this:

class MyMiddleware(HTTPMiddleware):
    async def dispatch(self, request):
        # Access or tweak the `request`...

        try:
            response = yield  # Call into the underlying app.
        except Exception:
            # Handle any exception that occurred, and optionally
            # respond on behalf of the failed app.
            yield Response(status_code=500)

        # Tweak the response
        response.headers["Custom"] = "Example"

The structure of this API does NOT allow:

  • Accessing or modifying the request body. The request is built with scope only (request method, URL, headers, etc), so we haven't got any access to receive().

I copy-pasted the BaseHTTPMiddleware tests, switching to HTTPMiddleware, and got everything passing.

I also looked up some example usage on https://grep.app and found this fairly complex usage from Netflix's Dispatch project. Here's the diff showing how they might upgrade:

- class MetricsMiddleware(BaseHTTPMiddleware):
-     async def dispatch(self, request: Request, call_next: RequestResponseEndpoint) -> Response:
+ class MetricsMiddleware(HTTPMiddleware):
+     async def dispatch(self, request: Request) -> AsyncGenerator[None, Response]:
          path_template = get_path_template(request)

          method = request.method
          tags = {"method": method, "endpoint": path_template}

          try:
              start = time.perf_counter()
-             response = await call_next(request)
+             response = yield
              elapsed_time = time.perf_counter() - start
          except Exception as e:
              metric_provider.counter("server.call.exception.counter", tags=tags)
              raise e from None
          else:
              tags.update({"status_code": response.status_code})
              metric_provider.timer("server.call.elapsed", value=elapsed_time, tags=tags)
              metric_provider.counter("server.call.counter", tags=tags)

-         return response

Some missing pieces might be:

  • Refine the 'response patching' API -- I've reused the Response class, but we might want to use a specific 'stub' that clearly limits what's allowed within the "on-response-headers" part of the middleware. E.g. it should be clear that it's not possible to patch response.body.

@adriangb
Copy link
Member

Just curious, did you look at my branch (linked in #1678 (comment))? I ask because if you did not it seems we arrived at the exact same solution independently, which is a good sign.

@florimondmanca
Copy link
Member Author

@adriangb Woah, I did not look at that branch — a very good sign indeed!

@adriangb
Copy link
Member

Woah, I did not look at that branch — a very good sign indeed!

Agreed. It seems like we even did the same thing with early_response.

async support

I think we should just make it an async generator and always an async generator.

it should be clear that it's not possible to patch response.body

There is one alternative: if response.body is set / changed we can return that instead. But maybe we want to allow replacing the entire response?

response = yield
yield Response()

We can support this by checking if the generator stops or is still available.

Another issue to think about: do we want to allow (safely) reading the request body?

  • If the user calls request.body() we store the entire body in memory and replace the receive stream with that body
  • If the user calls request.stream() we replace receive with a single message with an empty body

starlette/middleware/http.py Outdated Show resolved Hide resolved
starlette/middleware/http.py Outdated Show resolved Hide resolved
tests/middleware/test_http.py Outdated Show resolved Hide resolved
tests/middleware/test_http.py Outdated Show resolved Hide resolved
@florimondmanca florimondmanca force-pushed the fm/http-middleware branch 3 times, most recently from a0aa64a to 91fbe42 Compare June 14, 2022 21:33
Copy link
Member

@adriangb adriangb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aclosing is only a couple LOC, we can just vendor it

setup.py Outdated Show resolved Hide resolved
starlette/_compat.py Outdated Show resolved Hide resolved
@florimondmanca
Copy link
Member Author

@adriangb As for bodies -- I'm tempted we explicitly don't allow a helper HTTPMiddleware to read or alter them. Buffering like that might get into issues with improper streaming support again. The rationale would be, "if you do need that, then go pure ASGI".

@adriangb
Copy link
Member

Yeah I think that's a fair position to have. I think it can be done without issues (ref adriangb@0bf1f3f). But we can always add that feature later, it wouldn't be breaking this API.

@florimondmanca florimondmanca requested a review from a team June 14, 2022 22:19
@florimondmanca
Copy link
Member Author

I switched things so that .dispatch() accepts an unbound request. Saves users from manually building the components they need from scope, eg Headers(raw=scope["headers"]). Instead they can do request.headers, and .body() or .stream() will raise errors because receive wasn't passed to the Request.

@florimondmanca florimondmanca force-pushed the fm/http-middleware branch 2 times, most recently from b70f13d to 9090aa7 Compare June 14, 2022 22:33
@erewok
Copy link
Contributor

erewok commented Jun 14, 2022

This looks great! FWIW, I support not allowing accessing or modifying request or response body. It's a fair limitation, in my opinion.

starlette/_compat.py Outdated Show resolved Hide resolved
starlette/_compat.py Outdated Show resolved Hide resolved
starlette/_compat.py Show resolved Hide resolved
starlette/_compat.py Outdated Show resolved Hide resolved
starlette/middleware/http.py Outdated Show resolved Hide resolved
starlette/middleware/http.py Outdated Show resolved Hide resolved
starlette/middleware/http.py Outdated Show resolved Hide resolved
starlette/middleware/http.py Outdated Show resolved Hide resolved
starlette/middleware/http.py Outdated Show resolved Hide resolved
@adriangb
Copy link
Member

This is looking great. I think the last issue we have to carry over from #1694 is what to do if a user tries to set Response.media_type or Response.body? I think the safest path for now is to subclass Response and raise exceptions if they try to set either of those attributes.

@florimondmanca
Copy link
Member Author

florimondmanca commented Jun 15, 2022

@adriangb I made an attempt at controlling what attributes can be read/written in the context of the middleware. It works, but I'm far from satisfied by it. Overall that seems overly complex and defensive...

The core problem is, Starlette's Response wasn't designed to be built-and-swapped like this. On the other hand that's a central principle in httpcore for example.

Maybe, for now (? how 'temporary' would that actually be?), we should do status_code, headers = yield? That would be structurally more appropriate: you only get a read-only access to status_code, and you can access or tweak the headers, is all. No access given to a full-fledged Response instance that wasn't designed for setter usage. The signature would become: AsyncGenerator[None, Tuple[int, (Mutable)Headers]].

@adriangb
Copy link
Member

adriangb commented Jun 15, 2022

I think your attempt is quite reasonable. I share your same concerns, but as a counter point this API (a Response object that can't really be modified in some ways) has existed in BaseHTTPMiddleware for a long time and I do not recall anyone complaining about that part of BaseHTTPMiddleware, so maybe what you've got is fine.

I also don't particularly like the tuple response, especially as something temporary (changing it would be a breaking change).

@Kludex
Copy link
Sponsor Member

Kludex commented Oct 3, 2022

What should we do with this PR?

@adriangb
Copy link
Member

adriangb commented Oct 3, 2022

The conclusion from discussion in the meeting was that it's probably best to just push users towards writing pure ASGI middleware or solutions provided by higher-level packages like FastAPI. The logic behind this decision is that users generally are not writing one-off middleware and any sort of reusable middleware component can take the time to be implemented as a pure ASGI middleware.

So I think we should close this PR? Florimond, what do you think? Maybe this can be a 3rd party package?

@Kludex
Copy link
Sponsor Member

Kludex commented Oct 5, 2022

Closing it just to organize a bit the PRs.

Feel free to reopen if you think it's still relevant. 🙏

@Kludex Kludex closed this Oct 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants