Early data support #179

njsmith opened this issue Dec 1, 2019 · 4 comments

njsmith commented Dec 1, 2019

I just had a long discussion with @sethmlarson as we tried to wrap our heads around RFC 8470, the RFC for supporting TLS 1.3 "early data" in HTTP. It's super complicated, so I'm going to try to write down what (I think) we figured out, before I forget again.

First, usually we're lazy and just talk about "clients" and "proxies" and "servers", but for this we need much more precise terminology. So in this post I'm going to be super anal about this. An HTTP request may traverse multiple hops:

user agent → intermediary → intermediary → origin

The user agent is where the request starts, then it bounces through zero or more intermediaries, until it finally reaches the origin, which actually handles the request and sends back the response.

At each hop, there's a client and a server: on the first hop, the user agent is the client and the first intermediary is the server, but then on the next hop that intermediary is the client and the next intermediary is the server, etc.

hip is an HTTP client library. So this means sometimes we're running inside the user agent (most of the time), and sometimes we're running inside an intermediary (for example, if someone implements an HTTP reverse proxy by writing a web server that turns around and calls hip to forward requests to another server).

OK, now, early data. The way this works is, TLS servers hand out "session tickets" to TLS clients, and if the client hangs onto that ticket, then the next time they make a connection they can use the ticket to skip some steps and make the connection faster.
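
For concreteness, here's roughly what the ticket-reuse half looks like with the stdlib ssl module (just a sketch, assuming a TLS 1.2 connection and using example.com as a placeholder; CPython's ssl module doesn't expose TLS 1.3 tickets or early data at all):

```python
import socket
import ssl

HOST = "example.com"  # placeholder host for illustration
ctx = ssl.create_default_context()

# First connection: the server may hand back a resumable session (ticket).
with socket.create_connection((HOST, 443)) as raw:
    with ctx.wrap_socket(raw, server_hostname=HOST) as tls:
        session = tls.session  # may be None or ticket-less

# Later connection: present the saved session so the handshake can skip steps.
if session is not None and session.has_ticket:
    with socket.create_connection((HOST, 443)) as raw:
        with ctx.wrap_socket(raw, server_hostname=HOST, session=session) as tls:
            print("resumed:", tls.session_reused)
```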

When a server issues a ticket, there's a flag it can set that tells the client whether it's allowed to use "early data". If this flag is set, then when the client uses the ticket to make a new connection, then it can start sending encrypted application data right away, in the first volley of packets. This is nice because it reduces latency – you get to skip a round trip. BUT the trade-off is, since the client + server haven't had a chance to negotiate any per-connection entropy yet, the early data can't be "customized" to this connection. Therefore, if someone evil is snooping on the network and observes this early data, they can save a copy, and then later make a new connection and play the data back – a "replay attack".

This is somewhat limited: the data is encrypted, so the attacker doesn't necessarily know what it says – but they may be able to make a good guess based on context, e.g. if I convince you to send me $5 on paypal, and then I see you make an HTTPS connection to paypal and capture the data, then I can guess that the data I captured is likely to be an encrypted version of "send $5 to @njsmith". If I replay that a few hundred times then it might have some nice effects. Well, nice for me, anyway; not so nice for you.

And, after the early data, the client/server still have to complete the handshake before they can continue, and an attacker can't do this, because this requires having the actual session ticket, not just a copy of the early data. So this means that an attacker can't actually get a response. And if a server waits for the handshake to be complete, then they can retroactively tell that the early data actually was real, not a replay attack. But the whole point of early data is that you want to get started processing the request early, without waiting for the handshake. And when I'm attacking paypal, I don't really care if I can't see the responses to my replayed messages, just so long as the money gets transferred.

So, servers can't just blindly treat early data like regular data. Like, for Paypal, if the request is GET / then it's fine to process that as early data so the homepage loads faster, but if it's POST /transfer-money?to=njsmith&... then processing that as early data is not OK. In general, origin servers need some awkward application-specific logic to decide what's OK and what's not; there's nothing the IETF can do to help with that. Instead, RFC 8470 is about answering: What do user-agents and intermediaries need to do, in order to make it possible for origins to apply their awkward application-specific logic?

For a user-agent, it's not too complicated: if they have a request that they think might be a good candidate for early-data (e.g. a GET with no body), and they have a valid session ticket with the early data flag set, then they can go ahead and try to use TLS early data. Otherwise, they make the request like normal.
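
As a sketch, the user-agent-side check might look something like this (all names here are hypothetical, not hip's actual API):

```python
SAFE_METHODS = {"GET", "HEAD"}

def may_attempt_early_data(request, ticket):
    # Hypothetical helper: only try early data for requests that are safe to
    # replay (bodyless GET/HEAD here) and only when we hold a session ticket
    # whose early-data flag is set; otherwise make the request like normal.
    return (
        request.method in SAFE_METHODS
        and request.body is None
        and ticket is not None
        and ticket.allows_early_data
    )
```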

If they do use TLS early data, then the server can say eh this is like GET /, I don't care about replay attacks, I'm going to accept the early data and respond immediately. That's easy. But if the server thinks the request is one where replay attacks would be bad, then it has three options (see the sketch after this list):

  • It can use some TLS-level messages to reject the early data entirely. In this case it's like it was never sent, and the client's TLS library will finish the handshake and then send it again as regular data.

  • It can stash the early data in an internal buffer somewhere, and wait to process it until after the handshake completes. Once the handshake is complete, we know that there's no replay attack, so this is equivalent to the case above – it's just a question of whether you're more worried about server memory (for the internal buffer) versus network bandwidth (for the re-sending).

  • It can send back a 425 Too Early response, in which case the client has to make a whole new request, and this time it shouldn't even try to use early data.
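
Very roughly, the origin-side decision looks like this (a loose sketch with hypothetical helpers; in reality the TLS-level rejection happens inside the TLS stack during the handshake, not at the HTTP layer):

```python
def handle_request_received_as_early_data(request, conn):
    # Hypothetical names throughout; this just illustrates the options above.
    if is_replay_safe(request):
        return process(request)     # e.g. GET /: accept and respond right away
    if conn.can_reject_early_data:
        conn.reject_early_data()    # TLS-level reject: the client re-sends it
        return None                 # as regular data after the handshake
    if willing_to_buffer(request):
        conn.wait_for_handshake()   # buffer: once this returns, we know the
        return process(request)     # request wasn't a replay
    return Response(status=425, reason="Too Early")  # make it the client's problem
```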

If you're a user-agent, the first two cases are pretty trivial: your TLS library probably handles them internally. And for the last case, a 425 response, you can transparently retry without worrying. So when hip is acting as a user-agent, then it can try using early data opportunistically, without any user intervention, and automatically handle any issues that arise without hassling the user.
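
In code, the opportunistic user-agent flow is basically this (hypothetical names again, just to show the shape):

```python
def send_opportunistically(transport, request):
    # TLS-level rejection and server-side buffering are handled inside the
    # TLS library and never surface here; 425 is the only case we see.
    response = transport.send(request, try_early_data=True)
    if response.status == 425:
        # Too Early: retry the same request, this time without early data.
        response = transport.send(request, try_early_data=False)
    return response
```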

But sometimes hip might be acting as an intermediary, and this gets trickier. As an intermediary, you might receive incoming early data, and want to immediately forward it on to the origin server without waiting for the handshake to complete. But, this is risky: you know that this is early data, and might be a replay attack. But if you send it on to the origin server as normal, then this information might get lost – the origin server has no way to tell that this is a potential replay attack, and it might fail to reject POST /transfer-money?to=njsmith&.... So you need to tell the origin server.

You might think well, you should just send the next request as early data yourself, so that the server will know! But, two problems: (1) one of the server's options for being paranoid is to accept the early data and silently hold onto it until the handshake completes, and then treat it as regular data. And your TLS library probably doesn't have an option to be like "hey send this early data, but then pause and don't finish the handshake until I tell you to". So the server might decide that actually this data is safe against replays, even if you send it as early data, and have no way to tell whether this has happened. (2) you might not have a usable TLS session ticket, so you can't necessarily send early data anyway. Heck, your connection to the upstream server might not even be using TLS at all.

(A common configuration where this would happen is if the intermediary is a reverse proxy inside the same data center as the origin server: the user-agent is really far away, so you want to minimize round trips between the user-agent and the proxy, but the intermediary and origin are really close, so it's fine if you have to do a full handshake before you can pass on the request.)

So, you can't rely on sending the next request as early data. And for hip that's very convenient, because it means we don't need to expose a force_early_data=True config option.

Instead, if you're in this situation, what you have to do is (a) make sure that the upstream server supports RFC 8470, via some kind of out-of-band configuration (again, this is OK for the reverse-proxy-inside-a-data-center case), and (b) add an Early-Data: 1 header to the forwarded request.
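
So the intermediary's forwarding step looks roughly like this (a sketch with made-up names; the origin_supports_rfc8470 flag stands in for whatever out-of-band configuration is used):

```python
def forward_upstream(request, downstream_conn, upstream, origin_supports_rfc8470):
    if request.arrived_as_early_data:
        if origin_supports_rfc8470:
            # Mark the forwarded request so the origin can still tell it
            # might be a replay.
            request.headers["Early-Data"] = "1"
        else:
            # Safe fallback: don't forward until the user-agent's handshake
            # has finished, at which point it's just a normal request.
            downstream_conn.wait_for_handshake()
    return upstream.send(request)
```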

And then on the server side, if it sees Early-Data: 1 and decides that this request is unsafe, then it has to use the 425 response method; the other two options aren't allowed.

And then when the intermediary sees the 425 response, it can't just blindly retry it immediately, like a user-agent would: it has to either forward the 425 back to the user-agent, or else it has to at least wait for the user-agent to complete the handshake, so that the intermediary knows that there isn't a replay attack happening, and then it can immediately re-send the request to the origin.
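
Sketching that intermediary-side 425 handling (again, hypothetical names):

```python
def handle_origin_425(downstream_conn, upstream, request, response_425):
    if downstream_conn.handshake_complete:
        # The user-agent's handshake finished, so we know the request was
        # not a replay; it's safe to immediately re-send it to the origin.
        return upstream.send(request)
    # Otherwise the only safe options are to wait, or to hand the 425 back
    # toward the user-agent; here we just pass it back.
    return response_425
```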

So this is what makes things awkward: when hip is a user-agent, it wants to opportunistically use early-data, and to transparently handle 425 responses. But when hip is an intermediary, it can't handle 425 responses transparently.

However, I think there is a solution:

If someone is building an intermediary and wants to handle incoming early-data requests, then they have to do something to explicitly opt in to that. No HTTP server is going to blindly deliver early-data to applications that aren't expecting it. So we can assume that whoever is calling hip already knows that they're dealing with early data.

And, the folks building this intermediary have to know whether the upstream server supports RFC 8470, and if it does, they have to add the Early-Data: 1 header. So from hip's perspective, the presence of an Early-Data: 1 header in the request will happen if and only if it's being used as part of an intermediary, and the request it's being asked to send needs the special intermediary handling.

So I think the bottom line is: hip can opportunistically attempt to use early data, and it can automatically retry on 425 responses, except that on requests where Early-Data: 1 is set, then it should return 425 responses back to the application. (And then they can handle that however they want.)
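
In other words, the policy could reduce to something like this tiny check (hypothetical helper, just to pin down the rule):

```python
def should_auto_retry_425(request):
    # hip acting as a plain user-agent: transparently retry on 425.
    # Early-Data: 1 present: hip is being used inside an intermediary, so
    # the 425 must be surfaced to the application instead.
    return request.headers.get("Early-Data") != "1"
```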

sethmlarson (Contributor) commented

Interesting points I just read at the very end of the TLS/SSL Python docs:

  • Session tickets are no longer sent as part of the initial handshake and are handled differently. SSLSocket.session and SSLSession are not compatible with TLS 1.3.

  • TLS 1.3 features like early data, deferred TLS client cert request, signature algorithm configuration, and rekeying are not supported yet.

So it looks like Early-Data isn't a worry except for aioquic, which does support Early-Data?


njsmith commented Dec 1, 2019

That sounds plausible, yeah. Python's TLS ecosystem will probably catch up with this stuff eventually, but... TLS APIs are messy.

The part I was worried about immediately was figuring out if early data support was going to be incompatible with architectural choices we're making now. But it sounds like it won't be too bad.

The one consequence I can think of is that you shouldn't even attempt to use early data unless you know your request body is rewindable. So this might mean we will need to expose that as part of our internal request body abstraction. But since that's an internal abstraction, we can wait and add it later.
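
For concreteness, the check might look something like this (a sketch; the real thing would live inside hip's internal request-body abstraction, whatever shape that ends up taking):

```python
import io

def is_rewindable(body) -> bool:
    # A body is only a candidate for early data if we can replay it after a
    # TLS-level rejection or a 425 response: in-memory bytes or a seekable
    # file-like object qualify, a one-shot iterator/generator does not.
    if body is None or isinstance(body, (bytes, bytearray)):
        return True
    if isinstance(body, io.IOBase):
        return body.seekable()
    return False
```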


njsmith commented Dec 2, 2019

Something I don't currently understand at all: how does early data interact with ALPN negotiation?

The issue is: the client sends early data before it actually knows which protocol the server is going to select. RFC 8470 even makes some vague references to this, talking about how the early data chooses a protocol speculatively, and if the server rejects the early data then the client might also have to switch representations before resending it.

But if you have a server that supports http/1.1 + http/2, and it accepts the early data, what does it do with it? Is it supposed to sniff the data to guess which protocol it's using? Is this written down anywhere? It seems like a pretty critical piece of making any of this work.


njsmith commented Dec 2, 2019

It looks like nginx and haproxy are both shipping early data support for http already, so I guess to answer these questions we need to study what they actually do.
