Separate headers by CR LF #471

dallagi · 2023-08-28T07:04:14Z

While using HTTPretty together with the latest version of aiohttp, aiohttp failed to parse the response generated by HTTPretty because of how the headers are separated.
Specifically, HTTPretty separates them with LF (\n), while aiohttp expects CR LF (\r\n).

This behavior (on aiohttp's side) is also mentioned in aio-libs/aiohttp#7494 (comment).

By searching a bit on the Internet (eg. here) it seems that http parsers are expected to be lenient wrt parsing newlines, so I'm not sure this is technically a fault on HTTPretty's side.

However I think it could be better to stay on the safe side and just use \r\n as separator for headers, which is the change I implemented with this PR.
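
To illustrate the difference described above, here is a minimal sketch that serializes a response head with each separator. It is not HTTPretty's code: `serialize_head` is a hypothetical helper, and the substring check only approximates what a strict parser does when it looks for the end of the header section.

```python
def serialize_head(status_line, headers, newline=b"\r\n"):
    """Join a status line and header lines, then close with an empty line.

    Hypothetical helper for illustration only.
    """
    lines = [status_line] + [f"{name}: {value}" for name, value in headers]
    return b"".join(line.encode("ascii") + newline for line in lines) + newline


headers = [("Content-Type", "text/plain"), ("Content-Length", "2")]
crlf_head = serialize_head("HTTP/1.1 200 OK", headers)
lf_head = serialize_head("HTTP/1.1 200 OK", headers, newline=b"\n")

# A strict reader finds the end of the header section at CRLF CRLF.
print(b"\r\n\r\n" in crlf_head)  # True
print(b"\r\n\r\n" in lf_head)    # False
```

With bare LF separators the CRLF CRLF terminator never appears, so a parser that requires it cannot tell where the headers end.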

Dreamsorcerer · 2023-09-11T14:32:10Z

> By searching a bit on the Internet (eg. here) it seems that http parsers are expected to be lenient wrt parsing newlines, so I'm not sure this is technically a fault on HTTPretty's side.

It's literally not an HTTP message, which is clearly a mistake for a library with HTTP in its name. ;)

HTTP-message = start-line CRLF
               *( field-line CRLF )
               CRLF
               [ message-body ]
https://www.rfc-editor.org/rfc/rfc9112.html#section-2.1-1

Allowing LF is completely optional, and can lead to security issues, particularly on the server side:

Although the line terminator for the start-line and fields is the sequence CRLF, a recipient MAY recognize a single LF as a line terminator and ignore any preceding CR.
https://www.rfc-editor.org/rfc/rfc9112.html#section-2.2-3
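
The two recipient behaviors that the quoted paragraph permits can be sketched as follows. Both function names are hypothetical, and this is a simplified illustration of line splitting only, not how aiohttp or any other parser is implemented.

```python
import re


def split_lines_strict(head):
    """Split on CRLF only and reject any bare LF."""
    lines = head.split(b"\r\n")
    if any(b"\n" in line for line in lines):
        raise ValueError("bare LF found in header section")
    return lines


def split_lines_lenient(head):
    """Accept a single LF as a terminator, ignoring a preceding CR."""
    return re.split(rb"\r?\n", head)


lf_only = b"HTTP/1.1 200 OK\nContent-Length: 0\n"
print(split_lines_lenient(lf_only))

try:
    split_lines_strict(lf_only)
except ValueError as exc:
    print("strict parser rejected the message:", exc)
```

A sender cannot know which behavior a recipient picked, so emitting CRLF is the only form that works with both.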

Dreamsorcerer · 2023-09-11T14:37:47Z

httpretty/core.py

@@ -1130,7 +1130,7 @@ def fill_filekind(self, fk):
            )

        for item in string_list:
-            fk.write(utf8(item) + b'\n')
+            fk.write(utf8(item) + b'\r\n')


Also, UTF-8 is not valid in headers:

Field values are usually constrained to the range of US-ASCII characters [USASCII]. Fields needing a greater range of characters can use an encoding, such as the one defined in [RFC8187].
https://www.rfc-editor.org/rfc/rfc9110.html#section-5.5-4

A recipient MUST parse an HTTP message as a sequence of octets in an encoding that is a superset of US-ASCII [USASCII]. Parsing an HTTP message as a stream of Unicode characters, without regard for the specific encoding, creates security vulnerabilities due to the varying ways that string processing libraries handle invalid multibyte character sequences that contain the octet LF (%x0A).
https://www.rfc-editor.org/rfc/rfc9112.html#section-2.2-2
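
One way to act on the two passages above is to encode field values as US-ASCII and to use the RFC 8187 form for values that need more characters. This is a rough sketch under those assumptions, not a proposed change to HTTPretty. Both helper names are hypothetical, and `rfc8187_ext_value` produces only the `ext-value` part, with no language tag.

```python
from urllib.parse import quote


def encode_field_value(value):
    """Encode a field value as US-ASCII, failing loudly on anything else."""
    return value.encode("ascii")


def rfc8187_ext_value(value):
    """Percent-encode a value as an RFC 8187 ext-value with the UTF-8 charset."""
    return "UTF-8''" + quote(value, safe="", encoding="utf-8")


print(encode_field_value("text/plain"))   # b'text/plain'
print(rfc8187_ext_value("€ rates"))       # UTF-8''%E2%82%AC%20rates

try:
    encode_field_value("naïve")
except UnicodeEncodeError:
    print("non-ASCII field value rejected")
```

Failing on non-ASCII input surfaces the problem at the point where the header is built, rather than leaving it to whichever parser reads the bytes later.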

Commit fb6fe3e: Separate headers by CR LF
