Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compression encoding preference #287

Open
daxpedda opened this issue Aug 5, 2022 · 16 comments
Open

Compression encoding preference #287

daxpedda opened this issue Aug 5, 2022 · 16 comments
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. E-help-wanted Call for participation: Help is requested to fix this issue. E-medium Call for participation: Experience needed to fix: Medium / intermediate

Comments

@daxpedda
Copy link

daxpedda commented Aug 5, 2022

Feature Request

Motivation

Currently, when using Firefox for example, the encoding used by CompressionLayer is GZIP, which is unfortunate, considering that Brotli produces far smaller files and Firefox does support it.

Proposal

Currently the encoding used by CompressionLayer, is the first one listed in Accept-Encoding, which, as far as I can tell, is not mandated by the specification, see RFC 9110 Section 12.5.3 and 8.4.1.

Furthermore, most servers don't handle it this way and just pick the one from the list they prefer the most. When clients want to specify which one they prefer, they have to use quality values, which is already supported by CompressionLayer (as far as I can tell).

So my proposal is:

  • Specify a default server preference for each encoding instead of picking the first one.
  • Expose an interface to let the user specify a preference.

Happy to make a PR of course.

Alternatives

For example, if Brotli is the encoding preferred, users could always just only enable that crate feature, considering Brotli is widely supported. Obviously this isn't ideal.

Otherwise implementing CompressionLayer by hand always works.

@davidpdrsn
Copy link
Member

Adding a method for specifying the preferred compression (if the client doesn't have a preference) sounds good to me!

@davidpdrsn davidpdrsn added C-feature-request Category: A feature request, i.e: not implemented / a PR. E-help-wanted Call for participation: Help is requested to fix this issue. E-medium Call for participation: Experience needed to fix: Medium / intermediate labels Dec 2, 2022
@casey
Copy link
Contributor

casey commented Jan 26, 2023

This would be very nice! I enabled brotli and gzip, and although it wasn't specified in the docs, I assumed that if brotli and gzip were both acceptable to the client, brotli would be used.

@casey
Copy link
Contributor

casey commented Jan 26, 2023

With #325, tower-http will now prefer brotli > gzip > deflate, when more than one encoding is acceptable to the client. There still isn't a way for users of tower-axum to express a preference between compression schemes, but the new defaults might mean there isn't as much of a need to do so, so that might be enough to close this issue.

@dirkjonker
Copy link

dirkjonker commented Mar 8, 2023

Brotli decreases performance for large json files by an order of magnitude for me, roughly 10 times slower than gzip. The file size may be a bit smaller but the huge latency penalty isn't worth it.

I noticed it when updating Axum and tower-http 0.4, where brotli is now the default (the release notes mention a change in compression preferences, but not that it might have such an impact).

The (not so scientific) numbers:
Original file size: 101.76 kB
gzip: 7.02 kB
Brotli: 5.15 kB

Latency, release mode:
no compression: ~ 33 ms
gzip: ~ 35 ms
brotli: ~ 115 ms

Latency, debug mode:
no compression: ~ 70 ms
gzip: ~ 75 ms
brotli: ~ 800 ms

@daxpedda
Copy link
Author

daxpedda commented Mar 9, 2023

This is a known problem that keeps popping up when using async-compression.
See Nullus157/async-compression#174.

To summarize, tower-http uses the default compression level of async-compression (src), which uses the default compression level of brotli (src), which is by default the highest compression level (src).

So this can either be "fixed" by changing the default level in brotli, or not use the default level in tower-http.

@dirkjonker
Copy link

Thank you for the insights. There does not seem to be an obvious (documented) way to configure this, so for the time being I will just set no_br() when using the CompressionLayer.

@FSMaxB
Copy link

FSMaxB commented Mar 21, 2023

I ran into a case that was way way worse, it also took me a while to figure out that the CompressionLayer was the cause:

Brotli: 8.3s
gzip: 310ms
No compression: 35ms

It's a moderately large JSON: ~770KiB

@daxpedda
Copy link
Author

I asked at brotli if there is any interest on their side to change this: dropbox/rust-brotli#93.

@Ameobea
Copy link
Contributor

Ameobea commented Apr 13, 2023

Defaulting to the max compression level for dynamic content is definitely the wrong behavior for this. NGINX uses level 4 as a default for on-the-fly compression which might make sense for this crate to use as well.

Is it possible to select the compression level for whatever compression lib we're using? If it's not exposed as an option to even this crate, it might make sense to disable support for it. This degree of slowness could almost be considered a bug; it caused a whole crop of mysterious performance regressions in our app that took a bunch of effort to track down after just bumping some dependencies.

Ideally, I feel that the compression level should be configurable on the compression layer itself by the end user.

@davidpdrsn
Copy link
Member

Is someone up for submitting a PR?

@Ameobea
Copy link
Contributor

Ameobea commented Apr 14, 2023

It seems that support for specifying compression level on CompressionLayer was already implemented 3 weeks ago: 4ebe811

Seems to just not be released yet.

The only remaining piece would be to override the async-compression default with a lower level for brotli, and maybe other algorithms as well, since as I understand it their default is max level.

I think that could be accomplished as a special-case here: https://github.com/tower-rs/tower-http/blob/master/tower-http/src/compression/body.rs#L335

I can put up a PR to do that if the plan sounds good to you.

@Ameobea
Copy link
Contributor

Ameobea commented Apr 14, 2023

I created a PR to implement that: #356

Happy to discuss further there or handle any changes you'd like to see. I've also not tested this change and will def. want to do that before merging.

@privettoli
Copy link
Contributor

privettoli commented Jan 20, 2024

While the changes in PR #356 are beneficial for new projects, they don't address a specific need in our Rust rewrite scenarios. For example, we require the capability to utilize the first supported encoding, as per our project specifications. This functionality is not facilitated by the current implementation in PRs #356 and #325, and its inclusion would be greatly beneficial for our use cases.

@jplatte
Copy link
Collaborator

jplatte commented Feb 17, 2024

@privettoli Can you expand a little bit on that? Clients can also send a quality value, do you expect the ordering of the encoding list to act as an extra (fallback) preference if qualities of multiple encodings in the list are equal?

@privettoli
Copy link
Contributor

@jplatte sorry for the delay. Existing services have a certain implementation of ordering for equal-quality algorithms. For example, they might be picking up the first in the list or preferring gzip over all other. For certain teams it might be important to preserve the behavior when rewriting their service to Rust+Hyper with motivation to reduce customer impact and/or confusion when new version begins to accept all or part of the traffic. For many services it would not be critical to preserve identical behavior, for some it might be.

@jplatte
Copy link
Collaborator

jplatte commented Mar 5, 2024

I'm not that interested in what might be that case for somebody else but what your use case is. And I still don't understand it unfortunately. Do you want to be able to pick the first from the list regardless of q-values, or the first one between those with the highest q-value? The former I would consider a buggy implementation so I'd rather not put in much work to support that.. but maybe a callable that takes the header value, has to parse it internally and return a preferred encoding from it could work?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. E-help-wanted Call for participation: Help is requested to fix this issue. E-medium Call for participation: Experience needed to fix: Medium / intermediate
Projects
None yet
Development

No branches or pull requests

8 participants