
Pass max_length parameter to ZLibDecompressor #8248

Open

ikrivosheev opened this issue Mar 25, 2024 · 5 comments
Comments

@ikrivosheev
ikrivosheev commented Mar 25, 2024

Is your feature request related to a problem?

Steps to reproduce:

  1. Create file: dd if=/dev/zero bs=1M count=1024 | gzip > 1G.gz
  2. Start nginx on localhost with location:
location /1G.gz {
    try_files $uri /tmp/1G.gz;
    add_header Content-Encoding gzip;
}
  3. Try to download the file:
async with aiohttp.ClientSession(headers=headers) as session:
    async with session.get('http://localhost/1G.gz') as response:
        async for content in response.content.iter_any():
            print(len(content))
  4. Check memory usage: it grows very large. I run my service in Docker with a memory limit on the container, and the process gets OOMKilled.

Describe the solution you'd like

It would be great to have an option to pass max_length to ZLibDecompressor (https://github.com/aio-libs/aiohttp/blob/master/aiohttp/compression_utils.py#L104).

Describe alternatives you've considered

Change the default max_length from 0 to 4 KB or greater.

Related component

Client

Additional context

No response

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@Dreamsorcerer
Member

At a glance, if I've understood that correctly, changing the parameter would just stop you from receiving the full response (as that method is only going to be called when new data is received). Isn't the problem more that it shouldn't be calling that method with too much data? As you are using the streaming API, I'd expect it to only receive a few KBs of data each time and call that decompress method with it each time. So, if you see high memory usage, I wonder if the issue is actually with the streaming API instead...

@mykola-mokhnach might have some ideas as they last worked on the compression code.

@ikrivosheev
Author

> At a glance, if I've understood that correctly, changing the parameter would just stop you from receiving the full response (as that method is only going to be called when new data is received). Isn't the problem more that it shouldn't be calling that method with too much data? As you are using the streaming API, I'd expect it to only receive a few KBs of data each time and call that decompress method with it each time. So, if you see high memory usage, I wonder if the issue is actually with the streaming API instead...
>
> @mykola-mokhnach might have some ideas as they last worked on the compression code.

For example, 1 KB of compressed data can become about 1 MB or more after decompression, because the buffer limit applies to the data coming in from the socket, not to the decompressed output.
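To illustrate the ratio involved: highly repetitive data, like the zero-filled file from the reproduction steps, compresses by roughly three orders of magnitude, so a few KB read from the socket can expand to megabytes in memory. A minimal sketch with the standard zlib module (the reported file uses gzip, which is the same deflate stream with a different header):

```python
import zlib

# 1 MiB of zeros, like the dd-generated file in the reproduction steps
raw = b"\x00" * (1024 * 1024)
compressed = zlib.compress(raw)

# A tiny compressed payload expands back to the full 1 MiB
ratio = len(raw) / len(compressed)
print(f"{len(compressed)} bytes compressed -> {len(raw)} bytes raw (~{ratio:.0f}x)")
```

So a single few-KB socket read can hand the decompressor enough input to produce a multi-megabyte chunk in one call, regardless of how small the read buffer is.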

@Dreamsorcerer
Member

Dreamsorcerer commented Mar 26, 2024

Right, good point. So, if it's data with a very high compression ratio it could become a problem. I suspect this would still need some refactoring in order to get it to return the full data.

i.e. Instead of something like:

```python
for chunk in data_received:
    yield decompress(chunk)
```

we'd need something more along the lines of:

```python
for chunk in data_received:
    while chunk:
        result = decompress(chunk)
        yield result.data
        chunk = result.unconsumed_tail
```
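This loop maps directly onto the standard zlib API: `Decompress.decompress(data, max_length)` caps the output size and stashes unprocessed input in `unconsumed_tail`. A runnable sketch of the idea, assuming plain zlib rather than aiohttp's wrapper (`bounded_decompress` is a hypothetical helper name, not existing aiohttp code):

```python
import zlib

def bounded_decompress(compressed: bytes, max_length: int = 4096):
    """Yield decompressed output in pieces of at most max_length bytes.

    Hypothetical helper sketching the unconsumed_tail loop; not aiohttp code.
    """
    d = zlib.decompressobj()
    data = compressed
    while not d.eof:
        # max_length caps the returned buffer; input that could not be
        # processed yet is saved in d.unconsumed_tail
        out = d.decompress(data, max_length)
        if out:
            yield out
        elif not data:
            break  # truncated stream: no input left and no pending output
        # an empty tail while d.eof is still False means zlib has buffered
        # output; passing b"" on the next iteration drains it in bounded pieces
        data = d.unconsumed_tail

# Usage: even at a ~1000x ratio, no single piece exceeds the cap.
raw = b"\x00" * (1024 * 1024)
pieces = list(bounded_decompress(zlib.compress(raw)))
assert b"".join(pieces) == raw
assert max(len(p) for p in pieces) <= 4096
```

Looping on `d.eof` rather than on leftover input matters: zlib can hold buffered output even after all input bytes are consumed, so the drain step is what keeps peak memory bounded.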

@Dreamsorcerer
Member

Feel free to try and create a PR. I'd assume changing the default behaviour would be fine, as the purpose of the streaming API is to limit memory usage, so this looks like an oversight to me.

Ideally, this would need to be implemented in the brotli decompressor too.

@ikrivosheev
Author

@Dreamsorcerer thank you! I'll work on it. I created an issue in brotli; it doesn't have the option.
