No support for encoding such as gzip or brotli? #1481
Comments
Isn't gzip or brotli something that's supposed to be done by a sysadmin (server administrator) instead of a web developer? Do you use Apache, nginx or IIS? Contact your web host for advice on turning it on, or use the Sitepoint forum.
We host ourselves and have compression enabled on the web server (nginx). The HTTP client that performs the request (a web browser, wget/curl, or any other application) normally announces which types of data compression it supports. Based on that, the web server returns either an uncompressed or a compressed response. So what I see in our logs is that Selfoss makes a request but without any
So I looked in the code base, but couldn't find a reference to compression methods. I only saw 'accept-encoding' in a .htaccess file. In other words, it looks like Selfoss (or the client that does the HTTP requests) does not support any form of data compression. This indirectly means every single request the software makes is "wasting" additional bytes that have to be sent over the internet. Maybe also good to add: I don't use Selfoss myself, so I can't test it from the "client" side. The reason for reaching out is to improve clients and save a lot of internet traffic in the long run. I hope this clarifies the story behind the request a bit better.
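To make the negotiation described above concrete, here is a minimal Python sketch of how a server might pick a response encoding from the `Accept-Encoding` request header. The function names (`parse_accept_encoding`, `choose_encoding`, `respond`) and the choice to serve an uncompressed body when the header is missing are assumptions for illustration; they are not taken from nginx or Selfoss.

```python
import gzip


def parse_accept_encoding(header):
    """Parse an Accept-Encoding value into {coding: q}; q defaults to 1.0."""
    prefs = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        coding, _, params = part.partition(";")
        q = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat an unparsable weight as "not acceptable"
        prefs[coding.strip().lower()] = q
    return prefs


def choose_encoding(header, server_supported=("gzip",)):
    """Return the coding to use, or None for an uncompressed body."""
    if header is None:
        # Assumption: with no header at all, this sketch sends the body as-is.
        return None
    prefs = parse_accept_encoding(header)
    best, best_q = None, 0.0
    for coding in server_supported:  # server order breaks ties
        q = prefs.get(coding, prefs.get("*", 0.0))
        if q > best_q:
            best, best_q = coding, q
    return best


def respond(body, accept_encoding):
    """Return (headers, payload) for a hypothetical response."""
    if choose_encoding(accept_encoding) == "gzip":
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    return {}, body


# A repetitive feed-like body, as RSS/Atom documents tend to be.
feed = b"<item><title>Example</title><link>https://example.org/</link></item>" * 200
plain_headers, plain = respond(feed, None)
gz_headers, compressed = respond(feed, "gzip, deflate, br")
print(len(plain), len(compressed))
```

A client that omits the header gets the full-size body in this sketch, while one that announces `gzip` gets a much smaller payload for the same content.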
I can add gzip within 5 seconds, just like I did in 2010 when I added some lines to nginx should have something similar in mod_deflate for gzip
zlib compression
brotli
Brotli is a technology made by Google. As it's relatively new, I think it has to be installed on the server as a module, much like other open-source compression technology that has been available as a server extension module for over 20 years.
Thanks for reporting. Looks like you are right. Running `<?php error_log(var_export(getallheaders(), true), 0);` reveals selfoss is only sending the following headers:

array (
'Host' => '127.0.0.1:8000',
'User-Agent' => 'Selfoss/2.20-SNAPSHOT (+https://selfoss.aditu.de)',
'Referer' => 'http://127.0.0.1:8000/',
'Accept' => 'application/atom+xml, application/rss+xml, application/rdf+xml;q=0.9, application/xml;q=0.8, text/xml;q=0.8, text/html;q=0.7, unknown/unknown;q=0.1, application/unknown;q=0.1, */*;q=0.1',
)

Compared to e.g. Firefox:

array (
'Host' => '127.0.0.1:8000',
'User-Agent' => 'Mozilla/5.0 (X11; Linux x86_64; rv:124.0) Gecko/20100101 Firefox/124.0',
'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
'Accept-Language' => 'en-GB,en;q=0.8,cs;q=0.5,en-US;q=0.3',
'Accept-Encoding' => 'gzip, deflate, br',
'DNT' => '1',
'Connection' => 'keep-alive',
'Upgrade-Insecure-Requests' => '1',
'Sec-Fetch-Dest' => 'document',
'Sec-Fetch-Mode' => 'navigate',
'Sec-Fetch-Site' => 'none',
'Sec-Fetch-User' => '?1',
)

We use the Guzzle HTTP client library, which uses curl internally, so I had assumed it sends the correct headers automatically, especially since decoding of encoded values is enabled by default. But curl itself only sends:

array (
'Host' => '127.0.0.1:8000',
'User-Agent' => 'curl/8.6.0',
'Accept' => '*/*',
'Accept-Encoding' => 'deflate, gzip, br, zstd',
)

Will look into it.
Guzzle does not send the `Accept-Encoding` header by default. That is equivalent to sending `Accept-Encoding: *`: https://www.rfc-editor.org/rfc/rfc9110#field.accept-encoding

Most servers will probably return an uncompressed body in response to that, which can be considered wasteful and can trigger crawler detection systems: #1481

Others might even opt to use a compression method that is not supported by the system (e.g. when libcurl is not compiled with brotli support).

Let's force Guzzle to let curl send an `Accept-Encoding` header reflecting which compression methods it supports: guzzle/guzzle#3215
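The idea of announcing only what the client can actually decode can be sketched in a few lines of Python. This is an illustration of the principle only, not how Guzzle or libcurl implement it; the `DECODERS` table, the optional third-party module names (`brotli`, `zstandard`) and the helper names are assumptions.

```python
import importlib.util

# Hypothetical mapping from content-coding token to the Python module this
# sketch would use to decode it. zlib ships with the standard library and
# covers gzip and deflate; brotli and zstandard are optional packages.
DECODERS = {
    "gzip": "zlib",
    "deflate": "zlib",
    "br": "brotli",
    "zstd": "zstandard",
}


def supported_codings(decoders=DECODERS):
    """List the codings whose decoder module is importable on this system."""
    return [
        coding
        for coding, module in decoders.items()
        if importlib.util.find_spec(module) is not None
    ]


def accept_encoding_header(codings):
    """Build an Accept-Encoding value; fall back to identity if nothing is supported."""
    return ", ".join(codings) if codings else "identity"


header = accept_encoding_header(supported_codings())
print("Accept-Encoding:", header)
```

Because the header is derived from what is installed, a system without brotli support never advertises `br`, which avoids the failure mode of receiving a body it cannot decode.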
Turns out Guzzle overrides curl headers to not send Thanks again for bringing it to our attention.
Thanks for your quick response and actions. I noticed a few more issues with other RSS feed readers, which gave me the idea to blog about it, while also keeping track of the actions taken and sharing in return. Hopefully it also inspires developers, publishers, and users of RSS to improve things together.
I see in our log files that your software is being blocked as it does not provide any accept-encoding headers. Our rationale for doing this is to limit outdated or bad-behaving systems/crawlers while saving on resources (on our end, but especially on the internet in general). In this case, I was surprised to see a modern tool being blocked as well.
I guess this is a feature request: Is it possible to add compression support to the project (and save a lot of bytes on the internet)?