Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SessionMiddleware sends a new set-cookie for every request, with unintended results #2019

Open
Kludex opened this issue Jan 27, 2023 Discussed in #2018 · 11 comments
Open
Labels
need confirmation This issue needs confirmation.

Comments

@Kludex
Copy link
Sponsor Member

Kludex commented Jan 27, 2023

Discussed in #2018

Originally posted by sm-Fifteen January 27, 2023
The way SessionMiddleware currently works is that it encodes session data in a signed base64 object with timestamp, in a format not entirely dissimilar to (but incompatible with) JWT. Because of the way it works, being run right before the headers for an HTTP response are sent, and because it doesn't perform many checks besides whether or not there is session data to encode, it will try to update the session cookie whenever a route is called. This is somewhat wasteful, as session data is not usually updated on every request, but it leads to interesting issues as well.

I have a FastAPI application where login is handeled via a session cookie with a separate login process. Several routes are unlocked based on whether or not the cookie is present and based on the user authorizations recorded in said session cookie. When the frontend fetches API data, if the user is logged in, every response contains a new Set-Cookie header, which Firefox ignores, but Chrome acknowledges. None of the API routes alter session data, so the cookie value is always the same, save for the timestamp and signature at the end. This actually runs into an interesting Chrome issue where responses to fetch requests that arrive after the page has been navigated away from still alter the session cookie, even if if the new page changed the session data on load. This means I have this strange, Chrome-exclusive, race-condition-y bug that matches lepture/authlib#334, where the session state data setup by the Oauth flow before redirecting to the third party login form may get clobbered if some /api/slow_route query was pending when the user clicked the login button.

It's unclear if it's actually a chrome bug or unintended behavior from everything working as intended. If Starlette wasn't updating cookies for every request, and the Set-Cookie was actually important here, it's difficult to argue whether or not Chrome should have ignored it.

See here the Starlette test case from the Chromium bug report, which doesn't use SessionMiddleware, but shows the Chrome behavior in action.

from starlette.applications import Starlette
from starlette.responses import HTMLResponse, RedirectResponse, JSONResponse, PlainTextResponse
from starlette.routing import Route
import uvicorn
import datetime
from asyncio import sleep

html_content = """
    <!DOCTYPE html>
    <html>
        <head>
            <meta http-equiv="refresh" content="2;url=/set_cookie_and_redirect" />
            <script>fetch("/set_cookie_on_fetch")</script>
        </head>
        <body>
            Calling slow route via fetch() before changing page.
        </body>
    </html>
"""


async def homepage(request):
    return HTMLResponse(html_content)

async def set_cookie_on_fetch(request):
    res = JSONResponse({'hello':'world'})
    await sleep(5)
    res.set_cookie("my_cookie", "a__" + datetime.datetime.now().isoformat())
    return res

async def set_cookie_and_redirect(request):
    res = RedirectResponse("/final")
    res.set_cookie("my_cookie", "b__" + datetime.datetime.now().isoformat())
    return res

async def final(request):
    cookie_val = request.cookies.get('my_cookie')
    cookie_is_b = cookie_val.partition("__")[0] == 'b'

    if (cookie_is_b):
        res = PlainTextResponse("Cookie value is \"" + cookie_val + "\", all is well. Try refreshing the page.", 200)
    else:
        res = PlainTextResponse("Cookie value is \"" + cookie_val + "\", it shouldn't be.", 400)
    return res

routes = [
    Route('/', homepage),
    Route('/set_cookie_on_fetch', set_cookie_on_fetch),
    Route('/set_cookie_and_redirect', set_cookie_and_redirect),
    Route('/final', final),
]

app = Starlette(debug=True, routes=routes)
uvicorn.run(app, host="0.0.0.0", port=8000)
```</div>

<!-- POLAR PLEDGE BADGE START -->
> [!IMPORTANT]
> - We're using [Polar.sh](https://polar.sh/encode) so you can upvote and help fund this issue.
> - We receive the funding once the issue is completed & confirmed by you.
> - Thank you in advance for helping prioritize & fund our backlog.

<a href="https://polar.sh/encode/starlette/issues/2019">
<picture>
  <source media="(prefers-color-scheme: dark)" srcset="https://polar.sh/api/github/encode/starlette/issues/2019/pledge.svg?darkmode=1">
  <img alt="Fund with Polar" src="https://polar.sh/api/github/encode/starlette/issues/2019/pledge.svg">
</picture>
</a>
<!-- POLAR PLEDGE BADGE END -->
@Kludex Kludex added the need confirmation This issue needs confirmation. label Jan 27, 2023
@sm-Fifteen
Copy link

sm-Fifteen commented Jan 27, 2023

As explained in the Chromium bug report, regarding the code snippet:

The attached file is a minimal Starlette (Python) application where the root is an HTML page that initiates a slow (5 seconds) fetch request to /set_cookie_on_fetch but navigates to /set_cookie_and_redirect after 2 seconds, before the fetch operation has had time to complete. Both operation have a Set-Cookie header with a timestamp and a specific prefix. /set_cookie_and_redirectredirects to a page, /final, that prints the current value for the cookie in question. Immediately after the redirect, the value begins with "b__", as expected, but waiting for the server to return the result of /set_cookie_on_fetch and then refreshing again shows that the cookie's value has now changed and begins with "a__"

This issue does not occur with Firefox or Microsoft Edge, which are the two other browsers I have at hand to test this on.

@maartenbreddels
Copy link

We hit exactly the same issue, and via lepture/authlib#334 found that this was already reported here.

We are developing an OAuth feature for Solara and if we navigate away from the page (to our login endpoint), we also send out a window.navigator.sendBeacon(close_url); to make sure we clean up the application in case the websocket close does not get detected soon. These two requests happen at approximately the same time, and with the dev console open, the sendBeacon seems to win, effectively resetting the cookie that our login redirect just set.

We made a modification to the session middleware, that assigns to the session object a dict like object that keeps track if any key is modified. We now only send the cookie if the session dict was modified, and that solved our issue.
Maybe this implementation makes sense for Starlette as well.

@maartenbreddels
Copy link

My guess is flask does this as well: https://flask.palletsprojects.com/en/2.2.x/api/#flask.session

@gchait
Copy link

gchait commented Jun 2, 2023

Hey, any update? Even doing this isn't a suitable approach since now the response will contain 2 set-cookie headers:

@app.middleware("http")
async def some_middleware(request: Request, call_next):
    response = await call_next(request)
    session = request.cookies.get("session")
    if session:
        response.set_cookie(
            key="session", value=request.cookies.get("session"), httponly=True
        )
    return response
$ curl -IX GET --cookie "session=<XXX>" localhost:8080
HTTP/1.1 307 Temporary Redirect
date: Fri, 02 Jun 2023 08:25:23 GMT
server: uvicorn
content-length: 0
location: /login
set-cookie: session=<YYY>; path=/; Max-Age=1209600; httponly; samesite=lax
set-cookie: session=<XXX>; HttpOnly; Path=/; SameSite=lax

Edit:
I ended up switching to starlette-authlib which apparently works correctly, at least in this scenario.

@hasansezertasan
Copy link

Is there any update on it?

@Kludex
Copy link
Sponsor Member Author

Kludex commented Jan 2, 2024

No... 👀

@ToasterChicken
Copy link

The issue is because the session middleware is always handling every request as needing session data to persist, even if it was already present.

If the scope has data the cookie is set. Then the elif regardless if initial_session_was_empty set to false when the cookie is validated this will erase the cookie.

However fixing this I feel will cause a larger problem and more changes which is why I haven't just submitted a pull request for the change. Adding a couple more cases to the if statement in the send_wraper will temporarily keeps the same session cookie for the lifetime of the session, however that will cause the session to become invalidated after the max_age is reached and the users session will be destroyed.

I feel a complete fix would also need an option for when we want the sessions to refresh beyond the max_age like a refresh window. Like if the current time is with in the max_age minus refresh window, set initial_session_was_empty true (and rename that variable to something more descriptive like needs_persistence or some such) and write a new cookie as long as the current cookie is still valid.

https://github.com/encode/starlette/blob/5f9da0b6564e1b7def4d002f222766db5090735c/starlette/middleware/sessions.py#L57C28-L57C28

Testing the starlete_authlib.middleware it looks like it is persisting the cookie but taking a quick look at the library I suspect it will have the expiration issue. It's storing the expiration in the session but doesn't appear to be doing anything with it after that.

This was referenced Jan 8, 2024
@sm-Fifteen
Copy link

IMO the main problem with the sesion-cookie storage is that it doesn't have a stable encoding, because itsdangerous (being a kind of proto-JWT with no header) includes the timestamp where the data was signed in its payload, so you can't even go the naive route of comparing the (string) cookie value in the request and the cookie value that the middleware is about to inject in the response to make sure it's not being needlessly re-sent. I mentionned this on the FastAPI side (tiangolo/fastapi#754 (comment)), but I personally believe it would be better to switch to JWT for this and use fixed fields like nbf and exp instead of a timestamp+max_age combo, so that the encoded value remains constant if nothing changes.

Obviously the preferable way of going about this would be to skip serializing and signing session data altogether if nothing has changed, but the experiments I've tried make it seem tricky to efficiently and reliably do at the middleware and/or Starlette level.

@ToasterChicken
Copy link

Using the session middleware with a JWS token provided by an external service was my intention with what I was considering it for. The method which the session is taking to verify if it's tampered with wasn't going to affect me other than this extra layer being provided by the sessions middleware is inefficient.

The persistent work flow in the pull request I logged doesn't use the comparison at all to persist the data. That comparison is to check if the token needs to be updated if the data has changed. The persistence either persists until the token is no longer valid, or if there's a refresh window defined and it needs to re-write the token with a new signature. To be fair all of what I committed for my purposes could be moved out of the session middleware and into it's own separate cookie and handled as a function in each call. The middleware makes that a bit more convenient. All the existing middleware is doing is handling the cookie data and making sure it's not tampered with and doing it with basically the same lifecycle as a JWS token, a signed string.

I'm separating the concept of authentication from session. The session middleware in my opinion appears to be intended to use be used for maintaining session state, not necessarily authentication. However as you alluded to here and in the FastAPI comment, the session middleware cookie is effectively no different than a JWT token. JWS tokens borrowed from the concept that the sessions middleware also borrowed from and expanded on it. But comparing these two in my opinion would be confusing two concepts, session state and authentication since we can have session states with out being authenticated. JWT/JWS tokens are intended to provide an assertion by a 3rd party and verify its authenticity, while sessions are not provided by a 3rd party.

Regarding the cryptography aspects, I had a much longer response written but decided to change direction and simplify it how I believe I understand your concern. A JWT (JWS) is the same concept it's a signed string containing data (sessions cookies as they stand now). The only conceptual difference in respects to a session is the JWT token includes a header. Beyond the header the difference between a JWS token and a itsdangerous timestampsigner session cookie can be the keys, but can also be signed with a secret key and identical.

JWS tokens can either be signed with a private secret (like sessions.py now) or with public/private keys, but the RFC suggests they should be asymmetric. The sessions cookie is signed using HMAC SHA-1 with a secret key, omitting the rotating keys and digest_method options in itsdangerous. The itsdangerous response generated in the sessions library at least is including the timestamp adding some randomness to the string before signing so it's not quite so terrible. According to the standards JWS can be signed any way we like, including using a private secret with HMAC SHA-1. The strength of the secret key in either case is as important as the strength of a password or the private key used, key rotation would improve things but then a key store would be needed. Leaking the secret(s) or a private key is the same risk. Signing and verifying on the same service means that it has access to the private secret, or both public and private keys. If the cookie data was signed using the same method as a JWS token, then the two wouldn't be any different.

So long story short, using a secret key is another concern and should be a separate issue. The difficulty in changing the session data to a stronger signing method is less about implementing it in the library, but more with usage, documentation, and tests. So I feel like this separate problem statement should be: The current implementation is using a Secret key, and no method to change the digest_method from the default HMAC SHA-1. A JWT token wouldn't really be a solution to session security since it has the same problem, and risks adding overhead and complexity that isn't necessarily needed by a first party solution.

@maartenbreddels
Copy link

Note that our implementation is at https://github.com/widgetti/solara/blob/master/packages/solara-enterprise/solara_enterprise/auth/middleware.py

This is part of a non-open source licensed part of our mono repo, but consider it open source (the implementation is trival anyway). If someone has the time to turn this into a PR, that would be awesome!

@ToasterChicken
Copy link

ToasterChicken commented Jan 9, 2024

Too add, regarding the concern of getting the timestamp, it's in the cookie value string which is base64 encoded. The string has 3 parts, the Payload, Timestamp and Signature separated by a dot '.' which is the default value for itsdangerous. The timestamp if the string is split on the dots would be element 1 (second) so if we just wanted resolve the timestamp it just needs to be decoded, no different than a JWT/JWS token where you need to decode the payload and get for the 'iat' key. So instead of converting the json, we just split the line on dots and grab element 1. If you were entirely uninterested in the validity of the token, decode element 0 to a string to obtain the json string.

    def sign(self, value: _t_str_bytes) -> bytes:
        """Signs the given string and also attaches time information."""
        value = want_bytes(value)
        timestamp = base64_encode(int_to_bytes(self.get_timestamp()))
        sep = want_bytes(self.sep)
        value = value + sep + timestamp
        return value + sep + self.get_signature(value)

I also don't think relying on the timestamp for anything other than verifying the validity of the message is a good idea.

The pull request I created and linked to this discussion has working persistence and refresh. I just haven't created the test cases for the additional features and wanted more discussion about the subject.

@maartenbreddels I intend to take a look at your sample, it looks lot cleaner than what I submitted. So maybe I will make it into a PR :)

ToasterChicken added a commit to ToasterChicken/starlette that referenced this issue Feb 22, 2024
- Issues encode#2019
Added same_site for CHIPS cookie support
- Discussion encode#2441

Updated Documentation
Updated Unit Test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need confirmation This issue needs confirmation.
Projects
None yet
Development

No branches or pull requests

6 participants