Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Customize max_line_size and max_field_size in HttpParser? #2988

Open
alexandru opened this issue May 10, 2018 · 10 comments
Open

Customize max_line_size and max_field_size in HttpParser? #2988

alexandru opened this issue May 10, 2018 · 10 comments

Comments

@alexandru
Copy link

alexandru commented May 10, 2018

We're using aiohttp 2.x with Python 3.5.2 and trying to upgrade to the latest aiohttp 3.2.0 with Python 3.6.5.

We're getting errors such as these:

[2018-05-10 08:31:55 +0000] [12471] [ERROR] Error handling request
Traceback (most recent call last):
  File "/project/lib/python3.6/site-packages/aiohttp/web_protocol.py", line 235, in data_received
    messages, upgraded, tail = self._request_parser.feed_data(data)
  File "aiohttp/_http_parser.pyx", line 297, in aiohttp._http_parser.HttpParser.feed_data
  File "aiohttp/_http_parser.pyx", line 425, in aiohttp._http_parser.cb_on_header_value
aiohttp.http_exceptions.LineTooLong: 400, message='Got more than 8190 bytes (9383) when reading Header value is too long.'

This was happening in aiohttp 2.x as well, because I believe the standard specifies that length as the maximum, however the real world doesn't always agree with standards 😒

To fix this, with aiohttp 2.x we were monkey-patching HttpParser in order to override max_line_size and max_field_size. Here's the horrible things we've been doing: gist — this file being imported in our gunicorn_worker.py.

Now that we've upgraded to 3.x, this monkey patching no longer works. Don't know why.

Any way to customize max_line_size and max_field_size?
Possibly without monkey patching, but not necessarily, atm I just want the thing to work?

Environment:

  • Python 3.6.5
  • aiohttp 3.2.0
  • Gunicorn 19.8.1 (via aiohttp.worker.GunicornWebWorker)

See the full requirements.txt if interested.

Cheers,

@asvetlov
Copy link
Member

Nobody implemented it yet.
We have the very similar issue in backlog: #2304

@pposca
Copy link
Contributor

pposca commented May 26, 2018

Where would be the best place to expose max_line_size and max_field_size ? Using an Application init param, like with client_max_size?

@asvetlov
Copy link
Member

@pposca let's push params into BaseRequest first.
After that, we can think how to configure it the best way.
The library has several configuration parameters, would be nice to have a generalized solution for this.
The objection to Application.__init__ is that the class is used for sub-applications too, but parameters like max_line_size make sense only for root application.
Maybe we can live with it, or perhaps we will end up with config parameter which should be None for any sub-app. I don't know, let's discuss it later.

@pposca
Copy link
Contributor

pposca commented May 31, 2018

@asvetlov OK. There is also a max_headers param that seems related with max_line_size and max_field_size. Shouldn't it be added as well? It has sense for me and it would be free.

@asvetlov
Copy link
Member

Agree, please support max_headers as well

@EvgenyKhaliper
Copy link

Guys, any progress with it ?

@asvetlov
Copy link
Member

The issue is waiting for a champion

@alexpantyukhin
Copy link

I looked at it a little and found there is such chain Application._make_handler -> Server.call -> RequestHandler -> HttpRequestParser . _make_handler already takes some parameters with kwargs. Maybe it possible to merge kwargs with some config dictionary which are passed into __init__.

@bmbouter
Copy link
Contributor

bmbouter commented Apr 8, 2020

Our project is affected by the issue, and I'd like to open a PR to resolve it. Since 2018 I believe the need is no longer on Application._make_handler and instead to be configured on the ApplicationRunner created here instead. Do you agree?

Rather than adding options one-by-one, is there a way we can have many options specified by an application running with GunicornWebWorker? Here are some ideas:

Idea 1: A subclass pass extra kwargs to ApplicationRunner

To do this we could add an internal method named, e.g. _get_app_runner_instance(self) which return a configured instance of ApplicationRunner.

diff --git a/aiohttp/worker.py b/aiohttp/worker.py
index 73ba6e38..21fb671d 100644
--- a/aiohttp/worker.py
+++ b/aiohttp/worker.py
@@ -64,6 +64,17 @@ class GunicornWebWorker(base.Worker):
 
         sys.exit(self.exit_code)
 
+    def _get_application_runner_instance(self, app, **kwargs):
+        access_log = self.log.access_log if self.cfg.accesslog else None
+        runner = web.AppRunner(app,
+                               logger=self.log,
+                               keepalive_timeout=self.cfg.keepalive,
+                               access_log=access_log,
+                               access_log_format=self._get_valid_log_format(
+                                   self.cfg.access_log_format),
+                               **kwargs)
+        return runner
+
     async def _run(self) -> None:
         if isinstance(self.wsgi, Application):
             app = self.wsgi
@@ -73,13 +84,8 @@ class GunicornWebWorker(base.Worker):
             raise RuntimeError("wsgi app should be either Application or "
                                "async function returning Application, got {}"
                                .format(self.wsgi))
-        access_log = self.log.access_log if self.cfg.accesslog else None
-        runner = web.AppRunner(app,
-                               logger=self.log,
-                               keepalive_timeout=self.cfg.keepalive,
-                               access_log=access_log,
-                               access_log_format=self._get_valid_log_format(
-                                   self.cfg.access_log_format))
+
+        runner = self._get_application_runner_instance(app)
         await runner.setup()
 
         ctx = self._create_ssl_context(self.cfg) if self.cfg.is_ssl else None```

To allow GunicornWebWorker add additional "default" config options in the future without those config options also being added to every subclass there ever was, subclasses should use super() to inject the kwargs. Here would be an example of a subclass of CustomGunicornWebWorker that overrides max_line_size and max_field_size:

class CustomGunicornWebWorker(GunicornWebWorker):

    def _get_application_runner_instance(self, app):
        extra_kwargs = {"max_line_size" = 16000, "max_field_size": 16000}
        return super()._get_application_runner_instance(app, **extra_kwargs)

Idea 2: Have GunicornWebWorker accept a new application_runner_kwargs somehow

The application GunicornWebWorker patch to apply the patch (below) would be straightforward, but where does application_runner_kwargs come from?

diff --git a/aiohttp/worker.py b/aiohttp/worker.py
index 73ba6e38..b2c7194f 100644
--- a/aiohttp/worker.py
+++ b/aiohttp/worker.py
@@ -79,7 +79,8 @@ class GunicornWebWorker(base.Worker):
                                keepalive_timeout=self.cfg.keepalive,
                                access_log=access_log,
                                access_log_format=self._get_valid_log_format(
-                                   self.cfg.access_log_format))
+                                   self.cfg.access_log_format),
+                               **application_runner_kwargs)
         await runner.setup()
 
         ctx = self._create_ssl_context(self.cfg) if self.cfg.is_ssl else None

loven-doo added a commit to loven-doo/aiohttp that referenced this issue Apr 12, 2021
These changes allow the configuration of web.AppRunner object inside application code. This makes possible to use custom configs of Aiohttp server (like 'max_line_size' parameters setup) in pair with Gunicorn.
Associated issue: aio-libs#2988 (reference)
@kylemacfarlane
Copy link

I just ran into this when using aiohttp as a client to read from a server which is returning a gigantic CSP header.

The source of the problem seems to be here where a hardcoded non-customisable parser is used:

self._parser = HttpResponseParser(
self,
self._loop,
read_bufsize,
timer=timer,
payload_exception=ClientPayloadError,
response_with_body=not skip_payload,
read_until_eof=read_until_eof,
auto_decompress=auto_decompress,
)

I can't figure out how to get a override this method properly. It seems like you need a custom connector class providing a custom connection class providing a custom protocol which finally provides a custom parser, but I can't figure it out.

Monkey patching HttpResponseParser directly doesn't work, perhaps because of the C extensions.

In the end this monkey patch works but I hate it:

from typing import Optional

from aiohttp.client_exceptions import ClientPayloadError
from aiohttp.client_proto import ResponseHandler
from aiohttp.helpers import BaseTimerContext
from aiohttp.http import HttpResponseParser


def set_response_params(
    self,
    *,
    timer: Optional[BaseTimerContext] = None,
    skip_payload: bool = False,
    read_until_eof: bool = False,
    auto_decompress: bool = True,
    read_timeout: Optional[float] = None,
    read_bufsize: int = 2 ** 16
) -> None:
    self._skip_payload = skip_payload

    self._read_timeout = read_timeout
    self._reschedule_timeout()

    self._parser = HttpResponseParser(
        self,
        self._loop,
        read_bufsize,
        timer=timer,
        payload_exception=ClientPayloadError,
        response_with_body=not skip_payload,
        read_until_eof=read_until_eof,
        auto_decompress=auto_decompress,
        # These three lines are the important change
        max_line_size=8190 * 4,
        max_headers=32768 * 4,
        max_field_size=8190 * 4
    )

    if self._tail:
        data, self._tail = self._tail, b""
        self.data_received(data)


ResponseHandler.set_response_params = set_response_params

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants