Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce rack.protocol header for handling version-agnostic protocol upgrades. #1954

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Expand Up @@ -23,6 +23,7 @@ All notable changes to this project will be documented in this file. For info on
- `rack.hijack?` (partial hijack) and `rack.hijack` (full hijack) are now independently optional.
- `rack.hijack_io` has been removed completely.
- `rack.response_finished` is an optional environment key which contains an array of callable objects that must accept `#call(env, status, headers, error)` and are invoked after the response is finished (either successfully or unsucessfully).
- `rack.protocol` is an optional environment key and response header for handling connection upgrades.

### Removed

Expand Down
18 changes: 16 additions & 2 deletions SPEC.rdoc
Expand Up @@ -82,6 +82,10 @@ Rack-specific variables:
<tt>rack.hijack</tt>:: See below, if present, an object responding
to +call+ that is used to perform a full
hijack.
<tt>rack.protocol</tt>:: If the request is an HTTP/1 upgrade or
HTTP/2 CONNECT with +:protocol+ pseudo
header, this is set to an +Array+
containing the value(s) of that header.
Additional environment specifications have approved to
standardized middleware APIs. None of these are required to
be implemented by the server.
Expand Down Expand Up @@ -248,16 +252,26 @@ Header values must be either a String instance,
or an Array of String instances,
such that each String instance must not contain characters below 037.

=== The content-type
==== The +content-type+ Header

There must not be a <tt>content-type</tt> header key when the +Status+ is 1xx,
204, or 304.

=== The content-length
==== The +content-length+ Header

There must not be a <tt>content-length</tt> header key when the
+Status+ is 1xx, 204, or 304.

==== The +rack.protocol+ Header

If the +rack.protocol+ header is present, it must be a String, and
must be one of the values from the +rack.protocol+ array from the
environment.

Setting this value informs the server that it should perform a
connection upgrade. In HTTP/1, this is done using the +upgrade+
header. In HTTP/2, this is done by accepting the request.

=== The Body

The Body is typically an +Array+ of +String+ instances, an enumerable
Expand Down
40 changes: 34 additions & 6 deletions lib/rack/lint.rb
Expand Up @@ -78,8 +78,9 @@ def response
end

## and the *body*.
check_content_type(@status, @headers)
check_content_length(@status, @headers)
check_content_type_header(@status, @headers)
check_content_length_header(@status, @headers)
check_rack_protocol_header(@status, @headers)
@head_request = @env[REQUEST_METHOD] == HEAD

@lint = (@env['rack.lint'] ||= []) << self
Expand Down Expand Up @@ -179,6 +180,11 @@ def check_environment(env)
## to +call+ that is used to perform a full
## hijack.

## <tt>rack.protocol</tt>:: If the request is an HTTP/1 upgrade or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not in favor of the rack.protocol naming. It seems like it would be which HTTP version is in use (SERVER_PROTOCOL) or http or https (rack.url_scheme).

I assume the HTTP/1 upgrade header is already available in HTTP_UPGRADE. Is the HTTP/2 protocol pseudo header exposed anywhere yet?

Absent a better idea, I recommend switching to rack.upgrade_protocol, since it looks this this is used for upgrading the protocol used for the connection.

Regardless of what we select, I think we want to explicitly specify that this in an array (even if it can only contain one value in certain HTTP versions), since it looks like we specify that the env key should be an array later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTTP/2 literally called this the :protocol pseudo header. I actually don't have a strong opinion about the name but I've already used this name for my current implementation of websocket simply because I had to choose something and it seemed like the logical approach.

I also checked https://datatracker.ietf.org/doc/html/rfc3875 and could find no reference about how to handle connection: upgrade requests. It doesn't seem like the CGI specification which we follow for our env keys has any guidance on the matter.

rack.upgrade_protocol - I would suggest we avoid this, or the "upgrade" terminology in general, as HTTP/2 explicitly calls this out in a number of places as being unsupported and I think the word "upgrade" is deprecated in favour of "connect" and "protocol".

The specific introduction of the protocol pseudo header is here: https://www.rfc-editor.org/rfc/rfc8441.html#section-4

I'm fine with specifying that it should be an Array.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rack.connect_protocol maybe? rack.protocol is too confusing as a name for this, IMO.

I would still like to know how changing the connection protocol is currently handled for HTTP/2 in rack. Is it impossible with the current design, does it require server-specific code, or is it possible and just implemented a different way?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rack.protocol is the intersection of all the different semantics. connect_protocol is too specific to HTTP/2 that it makes the HTTP/1 use case clumsy. It doesn't make a lot of sense to have rack.connect_protocol in the response headers map to an HTTP/1 connection upgrade IMHO. I understand the value of being more specific, but this isn't all or nothing, the intersection between HTTP/1 and HTTP/2 semantics is very wide and doesn't invite more specific naming IMHO. In other words, being too specific in the naming is equally confusing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably should have been more specific. There is nothing that ties the :protocol pseudo header to the CONNECT method, so linking them as rack.connect_protocol would be a bit clumsy if later on another method was introduced that also used the :protocol pseudo header with a different method.

## HTTP/2 CONNECT with +:protocol+ pseudo
## header, this is set to an +Array+
## containing the value(s) of that header.

## Additional environment specifications have approved to
## standardized middleware APIs. None of these are required to
## be implemented by the server.
Expand Down Expand Up @@ -671,9 +677,9 @@ def check_header_value(key, value)
end

##
## === The content-type
## ==== The +content-type+ Header
##
def check_content_type(status, headers)
def check_content_type_header(status, headers)
headers.each { |key, value|
## There must not be a <tt>content-type</tt> header key when the +Status+ is 1xx,
## 204, or 304.
Expand All @@ -687,9 +693,9 @@ def check_content_type(status, headers)
end

##
## === The content-length
## ==== The +content-length+ Header
##
def check_content_length(status, headers)
def check_content_length_header(status, headers)
headers.each { |key, value|
if key == 'content-length'
## There must not be a <tt>content-length</tt> header key when the
Expand All @@ -714,6 +720,28 @@ def verify_content_length(size)
end
end

##
## ==== The +rack.protocol+ Header
##
def check_rack_protocol_header(status, headers)
## If the +rack.protocol+ header is present, it must be a String, and
## must be one of the values from the +rack.protocol+ array from the
## environment.
protocol = headers['rack.protocol']
if protocol
request_protocols = Array(@env['rack.protocol'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the env entry is required to be an array if present, we shouldn't need to use Array to turn it into an array.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with that, however in HTTP/2, the semantics are that it should only contain a single value, i.e. a string, while in HTTP/1, it can have multiple values. I'm not sure if we want to just specify that it can be a string (single value HTTP/2 semantics) or Array (multiple values HTTP/1 semantics) or an Array with a single value (HTTP/2 wrapped in HTTP/1 semantics). It's probably easier to use if we always assume it's an Array, but 99.9999999999% of the time, this will always be a single value, websocket. I've never seen anyone with an upgrade header like upgrade: websocket, smtp, redis. I think that's why HTTP/2 standardised it as a single-value pseudo header - realistically no one is asking to upgrade to several distinct protocols, and/or protocol versions. I don't have a strong opinion, but I consider the HTTP/1 semantic to be largely "deprecated" based on the directions the standards appear to be going in. I don't have a strong opinion here either way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is going to 99.9999999999% be a string in HTTP/1, and 100% in HTTP/2, we probably should just specify it must be a string. If a client submits upgrade: websocket, smtp, and the app uses the response header upgrade: websocket, it wouldn't pass Lint since it wouldn't match, and I think that's OK. We shouldn't design for the 00.0000000001% use case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be okay with this, but async-http will internally follow the HTTP semantics and has it as an Array, which is how servers should be checking it. Therefore, I think we should encourage servers to implement the behaviour correctly, essentially:

if env['rack.protocol'].find?{|protocol| protocol.casecmp?('websocket')}


if request_protocols.empty?
raise LintError, "rack.protocol header is #{protocol.inspect}, but rack.protocol was not set in request!"
elsif !request_protocols.include?(protocol)
raise LintError, "rack.protocol header is #{protocol.inspect}, but should be one of #{request_protocols.inspect} from the request!"
end
end
end
##
## Setting this value informs the server that it should perform a
## connection upgrade. In HTTP/1, this is done using the +upgrade+
## header. In HTTP/2, this is done by accepting the request.
##
## === The Body
##
Expand Down
34 changes: 34 additions & 0 deletions test/spec_lint.rb
Expand Up @@ -832,4 +832,38 @@ def call(env, status, headers, error)
[200, {}, ["foo"]]
}).call(env({ "rack.response_finished" => [-> (env) {}, lambda { |env| }, callable_object], "content-length" => "3" })).first.must_equal 200
end

it "notices when the respones protocol is specified in the response but not in the request" do
app = Rack::Lint.new(lambda{|env|
[101, {'rack.protocol' => 'websocket'}, ["foo"]]
})

lambda do
app.call(env())
end
.must_raise(Rack::Lint::LintError)
.message.must_match(/rack.protocol header is "websocket", but rack.protocol was not set in request/)
end

it "notices when the respones protocol is specified in the response but not in the request" do
app = Rack::Lint.new(lambda{|env|
[101, {'rack.protocol' => 'websocket'}, ["foo"]]
})

lambda do
app.call(env('rack.protocol' => ['smtp']))
end
.must_raise(Rack::Lint::LintError)
.message.must_match(/rack.protocol header is "websocket", but should be one of \["smtp"\] from the request!/)
end

it "pass valid rack.protocol" do
app = Rack::Lint.new(lambda{|env|
[101, {'rack.protocol' => 'websocket'}, ["foo"]]
})

response = app.call(env({'rack.protocol' => 'websocket'}))

response.first.must_equal 101
end
end