Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transparent (content-encoding) compression #456

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jimdigriz
Copy link
Contributor

@jimdigriz jimdigriz commented Nov 16, 2017

An initial stab at merging my wrapper into hackney its-self which addresses issue #155

To use this, just pass compress as an option to hackney:{request,get,...}.

Copy link

@seriyps seriyps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good. All notes are non-critical.

src/hackney_http.erl Outdated Show resolved Hide resolved
src/hackney_pool.erl Outdated Show resolved Hide resolved
{done, Rest} ->
Rest2 = iolist_to_binary(zlib:inflate(Z, Rest)),
ok = zlib:inflateEnd(Z),
ok = zlib:close(Z),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this is the only condition when zlib handle might be closed? Is it possible to leave some zlib handles stale in case of errors or smth like that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, pretty sure we leak if something implodes (ie. server prematurely closes stream).

Really at this point more interested in:

  • does this work for others
  • anyone see any regressions
  • does compress as an option suit the users of hackney
  • is there a better layer for me to plumb this in or less intrusively

Fixing bugs/edge cases I suspect would not change drastically the shape of this patch but changing the plumbing would, as would maybe some fixes for regressions. Make sense?

I dislike wasting time if I am on the wrong footpath and no one tells me :)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"does compress as an option suit the users of hackney. " yes

@seriyps
Copy link

seriyps commented Nov 28, 2017

Tests a failing because of a dialyzer warning:

_build/default/lib/hackney/src/hackney_http.erl
 111: Record construction #hparser{type::'auto',max_line_length::4096,max_empty_lines::10,empty_lines::0,state::'on_first_line',buffer::<<>>,method::<<>>,partial_headers::[],te::<<>>,ce::<<>>,connection::<<>>,ctype::<<>>,location::<<>>,body_state::'waiting',encoding::'undefined'} violates the declared type of field encoding::{atom(),_}
 523: Function parse_options/2 will never be called

I would suggest you to change type of encoding field to {atom(), any()} | undefined.

@fenollp
Copy link

fenollp commented Nov 28, 2017

Shouldn't the new compress option be inferred from the accept header instead of needing to be explicitly specified? Was it not done this way for backwards compatibility reasons?

@jimdigriz
Copy link
Contributor Author

I used compress to avoid having to create a API call so then the application can tell what to put in its headers (today we support deflate/gzip simply as it brings in no dependencies). I suspect to a developer most of the time they think more "compress where possible" rather than "please only use deflate".

However, this is how I think of using hackney, if I need something else, I just hack the source and submit a PR, others I am willing to accept might prefer something else :)

@fenollp
Copy link

fenollp commented Nov 28, 2017

Works great for me. Thanks :)

@@ -325,7 +328,40 @@ parse_trailers(St, Acc) ->
_ -> error
end.

parse_body(St=#hparser{body_state=waiting, te=TE, clen=Length,
-define(MAX_WBITS, 15). % zconf.h
parse_body(St=#hparser{body_state=waiting, ce=CE, clen=Length, method=Method}) ->
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have some inconsistency here: body will be unzipped regardless of compress option. But maybe we should check that both Content-Encoding header and compress option are active to decompress the body?
Because, with your current implementation, there will be no way at all to get compressed body as is - without decompression. For example if I need to get compressed body for some reason (eg, send it further or store on disk without looking into it... proxy...) I may set Accept-Encoding manualy and will get compressed body in response. But if I need transparent compression, I'll use compress option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I'll get back with a fix in a day or two.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jimdigriz did you fix it?

Copy link
Contributor Author

@jimdigriz jimdigriz Jan 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a stab at it but noticed that to push the 'do transparent compression' flag into #hparser{} is quite invasive. I do plan to return to this but Meat Space(tm) has caught up with me. :)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hrm which also let me think i should revisit that data structure, it's quite duplicating the #client{} one ...

Copy link

@cgorshing cgorshing Feb 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really appreciate the work you are doing, this is something that I am needing. A couple questions for you.
So the compress option means both to compress the request and to decompress the response? I see these as two different concerns. Maybe I've missed something reading through the comments.
I agree with seriyps above, I would only want to decompress the response if I pass in the option.
One case that I have, I don't really care about my request being compressed, but I care about the response being uncompressed. When I see the option compress I think there might be another option for the response. Thinking out loud.
Appreciate the work you have done, I'm wanting to get this into prod soon myself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree completely too with @seriyps thoughts too, it was something I had not thought of at the time. As @benoitc noticed too, to resolve this will probably involve a substantial change and I have not really looked much into it since.

@benoitc my concern is only wasting everyones time. If you can describe the changes you would like to see to replumb this, when I find the time to progress this, I would then be able to move in a direction that works for you.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benoitc Any idea how to help @jimdigriz ^?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jimdigriz I believe this was resolved with the last commit? Unfortunately, I can't see what changed in the last commit because of force-push..

@jimdigriz
Copy link
Contributor Author

jimdigriz commented May 29, 2020

Finally found the time to improve the compress flag support; I moved the functionality from hackney_http into hackney_response which seems more appropriate as Content-Encoding is application layer feature rather than part of the transport.

I chose to not add guards to skip initialising zlib for clen=0/bad_int and HEAD requests as the user chooses those values and could also choose to not use compress too. This was just my preference, I am not wedded to it and happy to change that behaviour.

I went to try adding unit tests, but looking around it seems to be non-trivial to add gzip/deflate support to gunicorn so got stopped in my tracks. Any suggestions on what to do here?

@benoitc
Copy link
Owner

benoitc commented Oct 28, 2021

@jimdigriz i added the tests using the new test platform. will land with the next RC.

@jadeallenx
Copy link

@benoitc Any update on this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants