Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

httptools.parser.errors.HttpParserInvalidURLError on fragmented first line of the HTTP request. #1262

Closed
2 tasks done
div1001 opened this issue Nov 23, 2021 · 6 comments · Fixed by #1263
Closed
2 tasks done
Assignees

Comments

@div1001
Copy link

div1001 commented Nov 23, 2021

Checklist

  • The bug is reproducible against the latest release and/or master.
  • There are no similar issues or pull requests to fix it yet.

Describe the bug

I've ran into issue with uvicorn+httptools, HttpParserInvalidURLError is raised and the server closes connection when the first line of the http/1.* request is split into several tcp packets.
It also seems like everything is fine if h11 used instead of httptools.

To reproduce

import socket
import threading as th
from time import sleep

import uvicorn

########## HELPERS ##########
def receive_all(sock):
    chunks = []
    while True:
        chunk = sock.recv(1024)
        if not chunk:
            break

        chunks.append(chunk)

    return b''.join(chunks)


async def stub_app(scope, receive, send):
    event = await receive()
    d = b'whatever'
    await send({
        'type': 'http.response.start',
        'status': 200,
        'headers': [
            [b'Content-Length', str(len(d)).encode()],
            [b'Content-Type', b'text/plain'],
        ]
    })
    await send({'type': 'http.response.body', 'body': d, 'more_body': False})
#############################


def send_fragmented_req(path):

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect(('127.0.0.1', 5000))
    d = (
        f'GET {path} HTTP/1.1\r\n'
        'Host: localhost\r\n'
        'Connection: close\r\n\r\n'
    ).encode()

    split = len(path) // 2
    # send first part
    sock.sendall(d[:split])
    sleep(0.01)  # important, ~100% error, without this the chances of reproducing the error are pretty low

    # send second part
    sock.sendall(d[split:])
    
    # read response
    resp = receive_all(sock)
    sock.shutdown(socket.SHUT_RDWR)
    sock.close()
    return resp


def test():
    t = th.Thread(target=lambda: uvicorn.run(stub_app, host='127.0.0.1', port=5000, http='httptools'))
    t.daemon = True
    t.start()
    sleep(1)    # wait for unicorn to start
    
    path = '/?param=' + 'q' * 10
    print(send_fragmented_req(path).decode())


if __name__ == '__main__':
    test()


# Console output:
# INFO:     Started server process [748603]
# INFO:     Waiting for application startup.
# INFO:     ASGI 'lifespan' protocol appears unsupported.
# INFO:     Application startup complete.
# INFO:     Uvicorn running on http://127.0.0.1:5000 (Press CTRL+C to quit)
# WARNING:  Invalid HTTP request received.
# Traceback (most recent call last):
#   File "httptools/parser/parser.pyx", line 258, in httptools.parser.parser.cb_on_url
#   File "/_/lib/python3.8/site-packages/uvicorn/protocols/http/httptools_impl.py", line 193, in on_url
#     parsed_url = httptools.parse_url(url)
#   File "httptools/parser/url_parser.pyx", line 105, in httptools.parser.url_parser.parse_url
# httptools.parser.errors.HttpParserInvalidURLError: invalid url b'am=qqqqqqqqqq'
#
# During handling of the above exception, another exception occurred:
#
# Traceback (most recent call last):
#   File "/_/lib/python3.8/site-packages/uvicorn/protocols/http/httptools_impl.py", line 131, in data_received
#     self.parser.feed_data(data)
#   File "httptools/parser/parser.pyx", line 212, in httptools.parser.parser.HttpParser.feed_data
# httptools.parser.errors.HttpParserCallbackError: `on_url` callback error

Expected behavior

HTTP response:

HTTP/1.1 200 OK
date: Tue, 23 Nov 2021 09:36:10 GMT
server: uvicorn
content-length: 8
content-type: text/plain

whatever

Actual behavior

Server closes connection without a response.

Environment

Running uvicorn 0.15.0 with CPython 3.8.2 on Linux
Dependencies:
uvicorn==0.15.0
h11==0.12.0
httptools==0.3.0

@euri10
Copy link
Member

euri10 commented Nov 23, 2021

thanks for the detailed report !

I'm not sure there's much we can do about it, httptools is using for its .parse_url https://github.com/nodejs/http-parser.

so unless upstream (http-parser) can deal with tcp fragmentation in url (which I doubt since it's been a while it's unmaintained) or lhttp that does the parsing httptools uses under the hood, implements it (long standing issue nodejs/llhttp#7) then we're stuck I'm afraid.

happy to reopen in case things change or if these thoughts are incorrect @div1001

@euri10 euri10 closed this as completed Nov 23, 2021
@euri10 euri10 reopened this Nov 23, 2021
@euri10
Copy link
Member

euri10 commented Nov 23, 2021

reopening I have a fix in mind

@euri10
Copy link
Member

euri10 commented Nov 23, 2021

this might be a good first approach from here: https://github.com/encode/uvicorn/pull/778/files, want to takle it @div1001 ?

there might be side effect from this PR but I think that might solve your fragmentation issue

@div1001
Copy link
Author

div1001 commented Nov 23, 2021

want to takle it @div1001 ?

I've understood your solution idea.
If it's ok I'd like to try. I'd like to take some time for it though, to look aroud codebase since i know little about it at the moment.

@euri10
Copy link
Member

euri10 commented Nov 23, 2021

I've understood your solution idea.
If it's ok I'd like to try. I'd like to take some time for it though, to look aroud codebase since i know little about it at the moment.

sure ! take #1263 if you want as a baseline, I did that quickly so I'm not sure there are no edge cases, would be good to add tests for that : in the protocols module that should be the best, not sure how to handle the fragmentation in a test though but in that module we deal with several bare http requests so that would be a good place to add your snippet adapted for our tests format.

@malthejorgensen
Copy link

Yay – the issue from #344 was reproduced! Good job @div1001!

@euri10 euri10 assigned euri10 and unassigned div1001 Feb 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants