Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix HttpLexer to allow HTTP/2.0 as preface #1521

Merged
merged 2 commits into from Aug 24, 2020
Merged

Conversation

rugk
Copy link
Contributor

@rugk rugk commented Aug 23, 2020

According to RFC7540 (section 3.5), if I interpret it correctly, it actually even has to start with 2.0 and not 2.

As HTTP/3 is also defined there, I have not looked into how that has to be used/started. (i.e. that may need to be fixed as well)

Fixes #1520

According to [RFC7540 (section 3.5)](https://tools.ietf.org/html/rfc7540#section-3.5), if I interpret it correctly, it actually even _has to_ start with `2.0` and not `2`.

As HTTP/3 is also defined there, I have not looked into how that has to be used/started.
@ocket8888
Copy link
Contributor

That part of the spec is talking about TCP frames to be sent to establish an HTTP/2 connection, not the request line of an HTTP/2 request. I think e.g. GET / HTTP/2.0 might be acceptable, but it's certainly doesn't have to end with .0, and curl for example definitely omits that part:

$ curl --http2-prior-knowledge -skv github.com
* Rebuilt URL to: github.com/
*   Trying 140.82.113.4...
* TCP_NODELAY set
* Connected to github.com (140.82.113.4) port 80 (#0)
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7fffd48b8860)
> GET / HTTP/2
> Host: github.com
> User-Agent: curl/7.58.0
> Accept: */*
>
...

@birkenfeld
Copy link
Member

So I've looked at the RFC, and there are several things to note here.

Most important is that HTTP 2 doesn't even have a request header line or status line anymore (see section 8.1.2.1). What is sent after the HTTP/2 negotiation (the PRI * line from section 3.5) is mostly binary anyway. What curl shows here is not what is going on the wire, but a representation/"translation" of this HTTP/2 conversation to a HTTP1-like text-only version.

Still it makes sense to accept this in the lexer, since something like this is the most likely representation of a HTTP/2 session used as its input...

@rugk
Copy link
Contributor Author

rugk commented Aug 24, 2020

What is sent after the HTTP/2 negotiation (the PRI * line from section 3.5) is mostly binary anyway

Ah, yeah, so actually that does include the string HTTP/2.0. But well yeah, I guess for interpretation/display purposes, we just should not be so strict and accept requests that look like a proper HTTP request.

@birkenfeld birkenfeld merged commit bc67241 into pygments:master Aug 24, 2020
@rugk rugk deleted the patch-1 branch August 24, 2020 16:53
@rugk
Copy link
Contributor Author

rugk commented Aug 24, 2020

Note: Remember to check whether HTTP/3 may need the same addition.

@Anteru Anteru added the changelog-update Items which need to get mentioned in the changelog label Sep 3, 2020
@Anteru Anteru added this to the 2.7 milestone Sep 3, 2020
@Anteru Anteru removed the changelog-update Items which need to get mentioned in the changelog label Sep 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HTTP lexer fails on HTTP/2.0 (HTTP/2 works)
4 participants