Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include routing information for middleware #685

Open
cypai opened this issue Oct 22, 2019 · 22 comments · Fixed by #1649
Open

Include routing information for middleware #685

cypai opened this issue Oct 22, 2019 · 22 comments · Fixed by #1649
Labels
feature New feature or request

Comments

@cypai
Copy link

cypai commented Oct 22, 2019

Is there a way to obtain which route the request is taking in middleware? Right now, the only available thing is request.url.path, which doesn't give me a nice name for metrics.

I see in https://github.com/steinnes/timing-asgi/blob/master/timing_asgi/integrations/starlette.py that it basically just loops through all the routes to obtain the matching one. Is there a better way to do it?

Looping in that manner also runs into an issue where if there are multiple matching routes, like
/resource/default/1
/resource/1/1
it can't find the right one since they both match.

Important

  • We're using Polar.sh so you can upvote and help fund this issue.
  • We receive the funding once the issue is completed & confirmed by you.
  • Thank you in advance for helping prioritize & fund our backlog.
Fund with Polar
@bourdeau
Copy link

bourdeau commented Nov 6, 2019

I got the same problem as @cypai with AuthenticationMiddleware. There is only request.scope["path"] and request.scope["raw_path"] available and I've to make quite some regex to handle the route properly.

Is it possible to build the Route() before *Middleware is called so one can get the route_name and other useful informations in the middleware? Maybe @cypai and I are both missing something?

@tomchristie
Copy link
Member

So, the routing only occurs after the middleware handling.

I think options here are:

  • We could potentially add some public API on the application so that it's easier to premptively resolve the request routing? Ideally we ensure that can be cached, so that the lookup only occurs once.
  • We could add the ability to have some kinds of middleware run after the request routing. (Probably not super keen on that.)

@bourdeau
Copy link

bourdeau commented Nov 6, 2019

Thanks for your answer @tomchristie. I don't know Starlette enough to know which solution would be best. I know that in some other frameworks, there is usually some Events mechanisms with some Listeners attached to it.

If I can be of any help, let me know, we use Starlette in production in my company, so it's critical for us.

@allan-simon
Copy link

@tomchristie why the second option not something you're keen on ? (honest question, I have no opinion)
but yes I've run into this issue because I was trying to put in a ContextVars the route name, so that my logger can get this information

@allan-simon
Copy link

right now a workaround I've found (as I'm using fastapi and not startlette directly, so not sure how much it applies on startette) is subclassing like here https://fastapi.tiangolo.com/tutorial/custom-request-and-route/#custom-apiroute-class-in-a-router and there I can access the routing information I need before/after request

@shizidushu
Copy link

@allan-simon Can you share the code of your workaround?

@allan-simon
Copy link

sure thing

"""The goal is to have the route name of the current request                                                                                                                                 
accessible anywhere by our loggers to help us contextualizing                                                                                                                                
when debugging though kibana or other centralized log visualizer                                                                                                                             
"""                                                                                                                                                                                          
from typing import Callable                                                                                                                                                                  
from typing import Optional                                                                                                                                                                  
from contextvars import ContextVar                                                                                                                                                           
                                                                                                                                                                                             
from fastapi import APIRouter, FastAPI                                                                                                                                                       
from fastapi.routing import APIRoute                                                                                                                                                         
from starlette.requests import Request                                                                                                                                                       
from starlette.responses import Response                                                                                                                                                     
                                                                                                                                                                                             
                                                                                                                                                                                             
_route_name_ctx_var: ContextVar[Optional[str]] = ContextVar("route_name", default=None)                                                                                                      
                                                                                                                                                                                             
                                                                                                                                                                                             
def get_route_name() -> Optional[str]:                                                                                                                                                       
    return _route_name_ctx_var.get()                                                                                                                                                         
                                                                                                                                                                                             
                                                                                                                                                                                             
class LoggedRoute(APIRoute):                                                                                                                                                                 
    def get_route_handler(self) -> Callable:                                                                                                                                                 
        original_route_handler = super().get_route_handler()                                                                                                                                 
        route_name = self.name                                                                                                                                                               
                                                                                                                                                                                             
        # we import here to avoid the circular dependencies                                                                                                                                  
        from app.logging import get_logger                                                                                                                                                   
                                                                                                                                                                                             
        async def custom_route_handler(request: Request) -> Response:                                                                                                                        
                                                                                                                                                                                             
            _route_name_ctx_var.set(route_name)                                                                                                                                              
                                                                                                                                                                                             
            app_logger = get_logger()                                                                                                                                                        
            app_logger.info({"step": "start"})                                                                                                                                               
                                                                                                                                                                                             
            response: Response = await original_route_handler(request)                                                                                                                       
                                                                                                                                                                                             
            app_logger.info({"step": "end", "status_code": response.status_code})                                                                                                            
                                                                                                                                                                                             
            return response                                                                                                                                                                  
                                                                                                                                                                                             
        return custom_route_handler              

and then I use it like this:

from fastapi import APIRouter, Depends, HTTPException
from sqlalchemy.engine import Connection

from app.database import get_connection
from app.logged_route import LoggedRoute


router = APIRouter(route_class=LoggedRoute)

@shizidushu
Copy link

@allan-simon Thanks. What is get_route_name used for?

@allan-simon
Copy link

@shizidushu here I will not be able to simply copy paste you code, but here basically the usage:

I've configured all the loggers to have a JSON Formatter in order for my logs to be then integrated into a centralized log system (ELK stack)
in addition to this all the loggers have also a logfilter which add in the json , the field "route_name" (and other field , like user_id , request_id etc. ) so that if anywhere in the code I do "logger.info("hey") " I will be able to know it was called in the context of a given route

the goal is that this way it's easy to make statistics by api endpoint name

@shizidushu
Copy link

@allan-simon Thanks for your explanation. So, in your case, it is used to expose the route context ("route_name") to logger. It is an inspiring idea!

@bejoinka
Copy link

bejoinka commented May 29, 2020

Question for the group here... tagging onto this thread because I think it may be related:

Am I correct in thinking route discovery is currently the best way to handle a pipeline of middleware for a given route or group of routes?

A little backdrop... I am thinking about the equivalent of a policies.js file in nodejs, where you may build a pipeline of checks for a given request:

admin_policies = [https_redirect, is_logged_in, is_admin]
user_policies = [https_redirect, is_logged_in]

My understanding is that policies (which are I think identical in implementation to middleware, though serve a subtly different purpose) are not directly supported in starlette, which is fine. That said, it might be nice to be able to assign functions (at the router level) assign those functions/routes to those policies so anyone who hits that route goes through that same pipeline.

I've found this kind of structure quite nice, and while starlette is meant to be lightweight and may expect something else to plug into it (which is fine), I'm curious if anyone has approached something similar... and if so, how?

EDIT: Please let me know if you think this question is more appropriate as its own issue.

@frjonsen
Copy link

Just wanted to add my voice to this issue. We want to log response times per endpoint. As this is done at the end of the request, it is perfectly fine for us to not know the endpoint name until the request has been resolved.

The workaround posted by @allan-simon might work, but I would rather handle response times in a middleware than in the routes themselves.

@adriangb
Copy link
Member

As a FastAPI user, I feel that one of the advantages of their dependency injection system is being able to configure it on a per route, per router or per app basis. Maybe Starlette could support something similar by having middleware for routes & mounts? That way there is no need to have an option in the middleware that says match/exclude these routes. But this would still require having some sort of "middleware after routing".

@adriangb
Copy link
Member

We could add the ability to have some kinds of middleware run after the request routing. (Probably not super keen on that.)

@tomchristie could you elaborate on why you think this isn't a good option? I think the implementation is relatively straightforward.

@billcrook
Copy link

Just an FYI, this feature would enable a ticket in the newrelic python agent project. The newrelic agent would be able to name transactions using the route path pattern and group transactions properly.

@sk-
Copy link
Contributor

sk- commented Sep 10, 2022

Commenting here, as I also want to do something similar but no in a middleware. My use case is to automatically report eh target path to Opentelemetry. Attributes in metrics should have a lw cardinality, that's why we need a way to get the original path.

FastAPI adds the matching route to the scope, so the following function will get the original path.

def _collect_target_attribute(
    scope: typing.Dict[str, typing.Any]
) -> typing.Optional[str]:
    # FastAPI
    root_path = scope.get("root_path", "")

    route = scope.get("route")
    if not route:
        return None
    path_format = getattr(route, "path_format", None)
    if path_format:
        return f"{root_path}{path_format}"

    return None

sk- added a commit to sk-/opentelemetry-python-contrib that referenced this issue Sep 10, 2022
This PR adds the target information for metrics reported by instrumentation/asgi.

Unfortunately, there's no ASGI standard to reliably get this information, and I was only able to get it for FastAPI.

I also tried to get the info with Sanic and Starlette (encode/starlette#685), but there's nothing in the scope allowing to recreate the route.

Besides the included unit tests, the logic was tested using the following app:

```python
import io
import fastapi

app = fastapi.FastAPI()

def dump_scope(scope):
    b = io.StringIO()
    print(scope, file=b)
    return b.getvalue()

@app.get("/test/{id}")
def test(id: str, req: fastapi.Request):
    print(req.scope)
    return {"target": _collect_target_attribute(req.scope), "scope": dump_scope(req.scope)}

sub_app = fastapi.FastAPI()

@sub_app.get("/test/{id}")
def sub_test(id: str, req: fastapi.Request):
    print(req.scope)
    return {"target": _collect_target_attribute(req.scope), "scope": dump_scope(req.scope)}

app.mount("/sub", sub_app)
```

Partially fixes open-telemetry#1116

Note to reviewers: I tried to touch as less as possible, so that we don;t require a refactor before this change. However, we could consider changing `collect_request_attributes` so that it returns both a trace attributes and a metrics attributes.
Wihout that change we cannot add the `HTTP_TARGET` attribute to the list of metric atttributes, because it will be present but with high cardinality.
sk- added a commit to sk-/opentelemetry-python-contrib that referenced this issue Sep 10, 2022
This PR adds the target information for metrics reported by instrumentation/asgi.

Unfortunately, there's no ASGI standard to reliably get this information, and I was only able to get it for FastAPI.

I also tried to get the info with Sanic and Starlette (encode/starlette#685), but there's nothing in the scope allowing to recreate the route.

Besides the included unit tests, the logic was tested using the following app:

```python
import io
import fastapi

app = fastapi.FastAPI()

def dump_scope(scope):
    b = io.StringIO()
    print(scope, file=b)
    return b.getvalue()

@app.get("/test/{id}")
def test(id: str, req: fastapi.Request):
    print(req.scope)
    return {"target": _collect_target_attribute(req.scope), "scope": dump_scope(req.scope)}

sub_app = fastapi.FastAPI()

@sub_app.get("/test/{id}")
def sub_test(id: str, req: fastapi.Request):
    print(req.scope)
    return {"target": _collect_target_attribute(req.scope), "scope": dump_scope(req.scope)}

app.mount("/sub", sub_app)
```

Partially fixes open-telemetry#1116

Note to reviewers: I tried to touch as less as possible, so that we don;t require a refactor before this change. However, we could consider changing `collect_request_attributes` so that it returns both a trace attributes and a metrics attributes.
Wihout that change we cannot add the `HTTP_TARGET` attribute to the list of metric atttributes, because it will be present but with high cardinality.
@Kludex
Copy link
Sponsor Member

Kludex commented Sep 21, 2022

This is closed by #1649, and it will be available in Starlette 0.21.0.

🏃‍♂️ 💨

@basepi
Copy link

basepi commented Sep 21, 2022

Unfortunately, this issue ended up with some scope creep/tangents.

The original issue wanted to get the matching route name for a request in order to correctly identify metrics for a given route. This is also the solution I was interested in, as we're currently iterating over the routes to try to find the correct one.

However, it appears others in this thread were interested in middleware which operates only on certain routes/mounts. #1649 solves their problem, but doesn't appear to solve my issue (or the original issue by @cypai).

Am I missing something? Does #1649 make it easier to access the route name from inside this new class of "mount" middlewares? I'm pretty certain it doesn't do anything to solve the issue of finding the route name for a global middleware added via app.add_middleware().

Should I open a new issue, or should we re-open and re-claim the original description of this issue? Or am I missing something obvious? 😅

@adriangb
Copy link
Member

Sorry if we closed this prematurely @basepi.

My thought would be that if you want metrics per route you should wrap each route with a middleware. If you want to send metrics from a centralized middleware (e.g. for batching) you could have the middleware that wraps each route just set values in scope and then the central middleware is the one that actually makes requests/IO. Does that make sense?

@adriangb adriangb reopened this Sep 21, 2022
@basepi
Copy link

basepi commented Sep 21, 2022

@adriangb It does make sense, but doesn't really work well for APM agents (Which is the use case for @billcrook and I).

We're trying to name our APM traces using the route name instead of the URL, since the latter has cardinality problems. (And routes are a better way to group traces than URL anyway)

We want minimal work on the part of the user -- add our middleware, and we do all the work for you, creating traces for each transaction that comes through your app.

Currently this require us to iterate over the routes to get the route name, as I mentioned before. It has minor performance implications, besides being a bit hacky.

I'll understand if this is a situation where our use case isn't one you want to support, but unfortunately requiring the user to add a middleware to each route is probably a worse solution than the iterating we're doing.

I can't speak to whether your solution would work for @cypai.

@adriangb
Copy link
Member

adriangb commented Sep 21, 2022

requiring the user to add a middleware to each route is probably a worse solution than the iterating we're doing

I see how this could put a lot of the burden on users, we just prefer explicitness to implicitness. You could probably provide some small shim function to make it a bit nicer for users, but yeah ultimately they would have to do quite a bit of manual boilerplate.

One alternative I can think of is to iterate over all of the routes once and wrap them with a middleware during app startup.

Or just monkeypatch something (I know instrumentation libraries tend to do this for the sake of ease of use, so there is some precedent). Maybe we could provide some global ROUTE_MIDDLEWARE = [] object that libraries can inject "default" middleware into? Just thinking out loud...

FWIW I've also wanted routing information (in particular the path template that was matched) in middleware, I wouldn't be opposed to that feature. I also think it would be interesting for Starlette to provide some utilities like def visit_routes(app: Starlette) -> Iterable[SomeRouteInfo?] or something like that since I think this is a pretty common pattern (also used in FastAPI in include_router IIRC).

@basepi
Copy link

basepi commented Sep 21, 2022

Yeah we have some long-term plans to use monkeypatching to even further reduce code changes for onboarding. In which case we could add a middleware to all the routes that way. Anyway, thanks for re-opening the issue and letting me clarify our use-case. :)

@Kludex Kludex added the feature New feature or request label Dec 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.