Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UI - 304 'Not Modified' cached handling causes some requests to fail, can affect reverse-proxies #2053

Open
bt90 opened this issue Dec 21, 2023 · 36 comments
Assignees
Labels
upstream-bug issue with library - not directly with changedetection user-interface

Comments

@bt90
Copy link

bt90 commented Dec 21, 2023

** TLDR; ** Looks like an upstream bug that causes this, scroll to bottom

Describe the bug

I'm running changedetection with caddy as a reverse proxy. While the UI is perfectly usable, I am getting some errors in the browser's network tab and also in caddy's logs:

grafik

Dez 21 11:42:59 proliant caddy[48932]: {"level":"info","ts":1703155379.7331445,"msg":"Unsolicited response received on idle HTTP channel starting with \"0\\r\\n\\r\\n\"; err=<nil>"}
Dez 21 11:43:04 proliant caddy[48932]: {"level":"info","ts":1703155384.3559086,"msg":"Unsolicited response received on idle HTTP channel starting with \"0\\r\\n\\r\\n\"; err=<nil>"}
Dez 21 11:43:04 proliant caddy[48932]: {"level":"info","ts":1703155384.5456474,"msg":"Unsolicited response received on idle HTTP channel starting with \"0\\r\\n\\r\\n\"; err=<nil>"}
Dez 21 11:43:04 proliant caddy[48932]: {"level":"info","ts":1703155384.5492866,"msg":"Unsolicited response received on idle HTTP channel starting with \"0\\r\\n\\r\\n\"; err=<nil>"}

The issue seems to have been reported previously as #1977, but with little to no information on how to reproduce it or what exactly the problem is.

The error message itself comes from the go standard library used by caddy, and is an indication that the backend (in this case changedetection) is violating the HTTP spec as described in golang/go#19895 (comment)

I have only been able to reproduce this with the browser cache enabled. My guess is that the handling of 304 responses is the culprit here. The backend sets the content-length: 0 header. If any content is passed anyway, this would trigger the error.

grafik

Version

v0.45.9

To Reproduce

  1. run changedetection behind caddy or traefik
  2. switch between Groups and Settings with the browser cache enabled
  3. check the network tab and the logs of the reverse proxy

Expected behavior

No errors.

Desktop

  • OS: Windows 10
  • Browser: Firefox
  • Version: 121
@bt90
Copy link
Author

bt90 commented Dec 21, 2023

curl also complains when I mimic the 304 request directly without using a reverse proxy:

curl -vvv 'http://localhost:5000/static/images/gradient-border.png' -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0' -H 'Accept: image/avif,image/webp,*/*' -H 'Accept-Language: de,en-US;q=0.7,en;q=0.3' -H 'Accept-Encoding: gzip, deflate, br' -H 'Connection: keep-alive' -H 'If-Modified-Since: Wed, 20 Dec 2023 14:34:01 GMT' -H 'If-None-Match: "1703082841.0-21990-1682970004"' -H 'TE: trailers'
*   Trying 127.0.0.1:5000...
* Connected to localhost (127.0.0.1) port 5000 (#0)
> GET /static/images/gradient-border.png HTTP/1.1
> Host: localhost:5000
> User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0
> Accept: image/avif,image/webp,*/*
> Accept-Language: de,en-US;q=0.7,en;q=0.3
> Accept-Encoding: gzip, deflate, br
> Connection: keep-alive
> If-Modified-Since: Wed, 20 Dec 2023 14:34:01 GMT
> If-None-Match: "1703082841.0-21990-1682970004"
> TE: trailers
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 304 NOT MODIFIED
< Content-Disposition: inline; filename=gradient-border.png
< Cache-Control: no-cache, max-age=0
< Expires: Thu, 21 Dec 2023 11:19:16 GMT
< Etag: "1703082841.0-21990-1682970004"
< Date: Thu, 21 Dec 2023 11:19:16 GMT
< Referrer-Policy: no-referrer
< Vary: Accept-Encoding
< Transfer-Encoding: chunked
< Connection: keep-alive
<
* Excess found: excess = 5 url = /static/images/gradient-border.png (zero-length body)
* Connection #0 to host localhost left intact

Edit: apparently the content-length: 0 header is set by caddy as it's missing in the curl output.

@dgtlmoon dgtlmoon changed the title Some requests fail if changedetection is running behind a reverse proxy [caddy] Some requests fail if changedetection is running behind a reverse proxy Dec 21, 2023
@dgtlmoon
Copy link
Owner

thanks for the nice report.. yeah interesting, it seems to be (atleast in your screenshot) only the static assets right? and only images..

These are served here

return send_from_directory("static/{}".format(group), path=filename)

what's interesting is that requesting static assets leaves a not amazing Expires

< Content-Type: text/javascript; charset=utf-8
< Content-Length: 369
< Last-Modified: Fri, 01 Dec 2023 17:09:07 GMT
< Cache-Control: no-cache, max-age=0
< Expires: Thu, 21 Dec 2023 14:21:33 GMT
< Etag: "1701450547.258573-856-4003010530:br"
< Date: Thu, 21 Dec 2023 14:21:33 GMT
< Vary: Accept-Encoding
< Content-Encoding: br
< Connection: keep-alive

even tho there's no real specific caching setup there

@dgtlmoon
Copy link
Owner

To be open, I dont know anything about caddy and I wont be installing caddy to figure it out (i dont have the time) although I'm happy to work people on this if they provide more debug info, one thing you can try is google something like 'flask caddy python zero content length' or something?

@bt90
Copy link
Author

bt90 commented Dec 21, 2023

The problem is that caddy and traefik are both using the go standard library which tends to follow the spec very closely:

The 304 response MUST NOT contain a message-body, and thus is always terminated by the first empty line after the header fields.

The presence of any kind of body triggers the error. curl seems to be a bit more lenient and only logs it as warning:

Excess found: excess = 5 url = /static/images/gradient-border.png (zero-length body)

My previous curl example should be enough to reproduce this warning without any kind of reverse proxy setup.

@dgtlmoon
Copy link
Owner

The problem is that caddy and traefik are both using the go standard library which tends to follow the spec very closely:

but running your curl command gives me a 200, not 304 when i'm not using caddy... so its some configuration problem with your end I believe

* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Disposition: inline; filename=gradient-border.png
< Content-Type: image/png
< Content-Length: 21990
< Last-Modified: Sun, 19 Nov 2023 13:16:01 GMT
< Cache-Control: no-cache, max-age=0
< Expires: Thu, 21 Dec 2023 15:49:47 GMT
< Etag: "1700399761.0490255-21990-3255705226"
< Date: Thu, 21 Dec 2023 15:49:47 GMT
< Connection: keep-alive
< 

no 304, always 200

@dgtlmoon
Copy link
Owner

I dont understand what you're trying to say here - I can see that the application always returns 200, I never see 304.

@bt90
Copy link
Author

bt90 commented Dec 21, 2023

Most likely because the date values in the header fields of my example are too old now to trigger a 304 response in the backend.

I derived the example from the network tab in Firefox by copying the curl command for a request that yielded a 304 response.

All curl related tests were done without caddy. localhost:5000 is directly bound to the changedetection docker container.

@dgtlmoon
Copy link
Owner

dgtlmoon commented Dec 21, 2023

Most likely because the date values in the header fields of my example are too old now to trigger a 304 response in the backend.

If I make a request that yields the response

curl -vvv 'http://localhost:5000/static/images/gradient-border.png'

* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Disposition: inline; filename=gradient-border.png
< Content-Type: image/png
< Content-Length: 21990
< Last-Modified: Sun, 19 Nov 2023 13:16:01 GMT
< Cache-Control: no-cache, max-age=0
< Expires: Thu, 21 Dec 2023 16:42:42 GMT
< Etag: "1700399761.0490255-21990-3255705226"
< Date: Thu, 21 Dec 2023 16:42:42 GMT
< Vary: Accept-Encoding

And I take the values from that response, and make another curl request which should set the right headers, I can trigger 304

curl -vvv 'http://localhost:5000/static/images/gradient-border.png' \
    -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0' \
    -H 'Accept: image/avif,image/webp,*/*' \
    -H 'Accept-Language: de,en-US;q=0.7,en;q=0.3' \
    -H 'Accept-Encoding: gzip, deflate, br' \
    -H 'Connection: keep-alive' \
    -H 'If-Modified-Since: Sun, 19 Nov 2023 13:16:01 GMT' \
    -H 'If-None-Match: "1700399761.0490255-21990-3255705226"' \
    -H 'TE: trailers'

and then monitor it with tcpdump/wireshark, I can see 0 there..

image

is that what this comes down to?

@bt90
Copy link
Author

bt90 commented Dec 21, 2023

Exactly.

You can compare it with e.g.

curl -vvv http://httpstat.us/304

@dgtlmoon
Copy link
Owner

hmm maybe .local/lib/python3.10/site-packages/eventlet/wsgi.py line 527...

            if use_chunked[0]:
                # Write the chunked encoding
                towrite.append(six.b("%x" % (len(data),)) + b"\r\n" + data + b"\r\n")

sorry for the screenshot from the breakpoint debugger..

image

@dgtlmoon
Copy link
Owner

dgtlmoon commented Dec 21, 2023

                if 'content-length' not in header_list:
                    if self.request_version == 'HTTP/1.1':
                        use_chunked[0] = True
                        towrite.append(b'Transfer-Encoding: chunked\r\n')

yeah ok... so it means something like https://stackoverflow.com/questions/3811595/flask-werkzeug-how-to-attach-http-content-length-header-to-file-download

hmmm but content-length is never passed, and if I add it manually to the headers its ignored..

@dgtlmoon
Copy link
Owner

So yeah, if I change the response to

            response = make_response( send_from_directory("static/{}".format(group), path=filename))
            response.headers['Content-Length'] =  "1000"
            response.headers['Some-Test'] = "ok"
            return response

it still doesnt show the content-length header and shows that weird use_chunk/ chunking / insert a "0" glitch... maybe some option is missing from send_file ??

@dgtlmoon
Copy link
Owner

googling something like "send_from_directory" python flask 304 "not modified" does not really give me much clues... i'm at an end to what todo

@bt90
Copy link
Author

bt90 commented Dec 21, 2023

I think that 0 is in fact the start of the chunked transfer encoding: https://en.wikipedia.org/wiki/Chunked_transfer_encoding#Format

Each chunk starts with the number of octets of the data it embeds expressed as a hexadecimal number in ASCII followed by optional parameters (chunk extension) and a terminating ␍␊ sequence, followed by the chunk data. The chunk is terminated by ␍␊.

So the output would describe an empty chunk:

0\r\n\r\n

This smells like an upstream issue.

@bt90
Copy link
Author

bt90 commented Dec 21, 2023

Could you try it with the conditional option set to True ? The documentation isn't very clear about its purpose but it seems to be related to etag/304 handling.

https://tedboy.github.io/flask/interface_api.useful_funcs.html#flask.send_file

@dgtlmoon
Copy link
Owner

            response = make_response(send_from_directory(os.path.join("static", group), filename,conditional=True))
            response.headers['Some-Test'] = "ok"
            return response

same output..

< HTTP/1.1 304 NOT MODIFIED
< Content-Disposition: inline; filename=gradient-border.png
< Cache-Control: no-cache, max-age=0
< Expires: Fri, 22 Dec 2023 06:24:49 GMT
< Etag: "1700399761.0490255-21990-3255705226"
< Date: Fri, 22 Dec 2023 06:24:49 GMT
< Some-Test: ok
< Vary: Accept-Encoding
< Transfer-Encoding: chunked
< Connection: keep-alive
< 
* Excess found: excess = 5 url = /static/images/gradient-border.png (zero-length body)
* Connection #0 to host localhost left intact

maybe when they added X-Sendfile it broke the support here somehow hmmm, this is upstream for sure, but weird that theres not much visibility on this issue from other projects

@dgtlmoon
Copy link
Owner

It is strongly recommended to activate either X-Sendfile support in your webserver or (if no authentication happens) to tell the webserver to serve files for the given path on its own without calling into the web application for improved performance.

yes, however things like screenshots should be security/access limited

@bt90
Copy link
Author

bt90 commented Dec 22, 2023

Can you derive a minimal reproducible example from this? That would be enough to raise an issue at https://github.com/pallets/flask/issues

@dgtlmoon
Copy link
Owner

@bt90 actually send_file is part of werkzeug not flask https://github.com/pallets/werkzeug

@dgtlmoon
Copy link
Owner

@bt90 pallets/werkzeug#2392 :-) is this the same ?

@bt90
Copy link
Author

bt90 commented Dec 22, 2023

Sounds related, but shouldn't that fix already be part of the version used by changedetection?

I just noticed that flask dropped the 3.0.0 release in September. Should we test with this version combined with the latest werkzeug release before creating an issue?

On the other hand, it's a new major release and I have no idea how much work that means for you...

Also: Sorry to waste your time on such minor issues 😅

@dgtlmoon
Copy link
Owner

@bt90 i'm unable to replicate this bug outside of changedetection.io , could you try this repo and tell me if you can tweak the headers and make the bug appear? https://github.com/dgtlmoon/flask-304-modified-extra-in-reply-test

@dgtlmoon
Copy link
Owner

Sounds related, but shouldn't that fix already be part of the version used by changedetection?

@bt90 yes, see the repo https://github.com/dgtlmoon/flask-304-modified-extra-in-reply-test , we need to find out why it happens in this repo, but not that test repo

@dgtlmoon
Copy link
Owner

@bt90 found it!!! when i added the eventlet.wsgi.server wrapper it came back

@bt90
Copy link
Author

bt90 commented Dec 22, 2023

This issue has certainly been a fun ride 😆

@bt90
Copy link
Author

bt90 commented Dec 22, 2023

eventlet/eventlet#746 ?

@dgtlmoon
Copy link
Owner

they are talking about 204 not 304 right?

@dgtlmoon
Copy link
Owner

@bt90 @Constantin1489 #2029 might be a workaround

@dgtlmoon
Copy link
Owner

yeah I think trying a gunicorn or gevent branch could be the way here (or it will add more issues :-) )

@bt90
Copy link
Author

bt90 commented Dec 22, 2023

@dgtlmoon thanks for your detective work. Filled eventlet/eventlet#862 and blatantly stole your Wireshark screenshot 😉

@dgtlmoon
Copy link
Owner

No probs, a wireshark screenshot is worth a thousand words :-) added you as a collab on that 304 test repo

@dgtlmoon dgtlmoon changed the title [caddy] Some requests fail if changedetection is running behind a reverse proxy 304 'Not Modified' cached handling causes some requests to fail, can affect reverse-proxies Dec 22, 2023
@bt90
Copy link
Author

bt90 commented Dec 22, 2023

Hmm i can't seem to get the example to switch to chunked transfer encoding.

*   Trying 127.0.0.1:5000...
* Connected to localhost (127.0.0.1) port 5000 (#0)
> GET /static HTTP/1.1
> Host: localhost:5000
> User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0
> Accept: image/avif,image/webp,*/*
> Accept-Language: de,en-US;q=0.7,en;q=0.3
> Accept-Encoding: gzip, deflate, br
> Connection: keep-alive
> If-Modified-Since: Fri, 23 Dec 2023 10:14:33 GMT
> If-None-Match: "1703244189.8334846-21990-4220917683"
> TE: trailers
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 304 NOT MODIFIED
< Server: Werkzeug/3.0.1 Python/3.10.12
< Date: Fri, 22 Dec 2023 11:26:20 GMT
< Content-Disposition: inline; filename=image.png
< Cache-Control: no-cache
< ETag: "1703244189.8334846-21990-4220917683"
< Date: Fri, 22 Dec 2023 11:26:20 GMT
< Connection: close
<
* Closing connection 0

@dgtlmoon
Copy link
Owner

added the venv steps to the README.md on that test project to make sure we're all on the same page

@bt90
Copy link
Author

bt90 commented Dec 22, 2023

added the venv steps to the README.md on that test project to make sure we're all on the same page

That solved it. I can reproduce it now:

*   Trying 127.0.0.1:5000...
* Connected to localhost (127.0.0.1) port 5000 (#0)
> GET /static HTTP/1.1
> Host: localhost:5000
> User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0
> Accept: image/avif,image/webp,*/*
> Accept-Language: de,en-US;q=0.7,en;q=0.3
> Accept-Encoding: gzip, deflate, br
> Connection: keep-alive
> If-Modified-Since: Fri, 22 Dec 2023 10:14:33 GMT
> If-None-Match: "1703245183.629759-13861-4220917683"
> TE: trailers
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 304 NOT MODIFIED
< Content-Disposition: inline; filename=image.png
< Cache-Control: no-cache
< Etag: "1703245183.629759-13861-4220917683"
< Date: Fri, 22 Dec 2023 11:42:00 GMT
< Transfer-Encoding: chunked
< Connection: keep-alive
<
* Excess found: excess = 5 url = /static (zero-length body)
* Connection #0 to host localhost left intact

@dgtlmoon dgtlmoon added upstream-bug issue with library - not directly with changedetection and removed triage labels Dec 22, 2023
@dgtlmoon dgtlmoon changed the title 304 'Not Modified' cached handling causes some requests to fail, can affect reverse-proxies UI - 304 'Not Modified' cached handling causes some requests to fail, can affect reverse-proxies Dec 26, 2023
@bt90
Copy link
Author

bt90 commented Dec 27, 2023

Potential fix: eventlet/eventlet#747

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
upstream-bug issue with library - not directly with changedetection user-interface
Projects
None yet
Development

No branches or pull requests

2 participants