Docker logging blocks and never recovers #22502
Comments
ping @aaronlehmann @unclejack PTAL
Can you test with
@tonistiigi I don't have access to 1.11 at the moment, will have to test tomorrow (I'm on UTC+1). You should be able to repeat the steps above in a minute or two if you have a 1.11 to test against.
@relistan I tested with your script and #21840 does seem to solve this. Note that point 6 behavior is somewhat by design and is not fixed. If you have one reader that is very slow and one that is very fast, we first buffer to memory, but when this buffer fills up we are going to rely on the slower reader's speed (0 in your case). If we didn't do that we would just run out of memory. Maybe we could provide something to tweak the buffer size. The bug in
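To make the mechanism concrete, here is a minimal Go sketch (an illustration, not Docker's actual code) of a fast writer feeding a bounded in-memory buffer that is drained by a slow reader; once the buffer fills, every write waits on the reader:

```go
// Illustration only: a producer (the container's stdout) writes into a bounded
// buffer; a slow consumer (the attach client) drains it. Once the buffer is
// full, the producer's send blocks, so the producer now runs at the consumer's
// speed.
package main

import (
	"fmt"
	"time"
)

func main() {
	buffer := make(chan string, 8) // bounded in-memory buffer

	// Slow reader: drains one line every 200ms (or ~0 lines/s in the bug report).
	go func() {
		for range buffer {
			time.Sleep(200 * time.Millisecond)
		}
	}()

	// Fast writer: stands in for the containerized app writing to stdout.
	for i := 0; i < 50; i++ {
		start := time.Now()
		buffer <- fmt.Sprintf("log line %d", i) // blocks once the buffer is full
		if wait := time.Since(start); wait > 50*time.Millisecond {
			fmt.Printf("write %d stalled %v waiting on the slow reader\n", i, wait)
		}
	}
}
```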
Thanks for testing @tonistiigi. I don't believe that is the right behavior in item 6 above, even if it's by design. Docker should drop log messages to the slow reader, not block the containerized app itself. Note that in item 6 it's not just the slow reader that is blocked, it's the whole stdout of the container. That's a big change in functionality vs 1.9 and also not very friendly behavior in a production environment. For example, someone watching the
I don't think dropping data from the log stream would necessarily be the correct behavior. This could break the formatting of the logs and lead to unexpected gaps in the log data. Perhaps it would be reasonable as an option. Alternatively, the buffer size could be made configurable.
@aaronlehmann so you are saying it's acceptable behavior to block after a buffer is reached in a production environment? Dropping data or corrupting the formatting is definitely much better than blocking (thus making the app unusable). Would it be feasible to at least have this as a config option, to drop instead of block in such cases?
Isn't there a separate buffer for each reader? That's what it looks like from the code. So if I have the daemon logging to a file, and two clients watching
I don't think this has anything to do with logs or the logging speed; the problem appears because there is an attach stream that refuses to read data. (Of course, if the log driver were considerably slower than the attach stream, then this would be the opposite.) There can be 2 buffers per io stream: one if there is an attach stream and the other if a log driver is attached. New
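A rough sketch of that two-buffers-per-stream shape (again an illustration, not the daemon's code): the same line goes to two independently buffered sinks, and a sink whose client never reads eventually stalls the whole write even though the other sink keeps up:

```go
// Illustration: one stream fanned out to two independently buffered sinks,
// e.g. a log driver and an attach client. The write completes only once both
// buffers have accepted the line, so a stuck attach client eventually stalls
// the stream even if the log driver keeps up.
package main

import (
	"fmt"
	"time"
)

func main() {
	logDriver := make(chan string, 64)   // drained quickly below
	attachClient := make(chan string, 8) // never drained: a client that won't read

	go func() {
		for range logDriver {
			// fast consumer: discard immediately
		}
	}()

	for i := 0; i < 20; i++ {
		line := fmt.Sprintf("line %d", i)
		logDriver <- line // always succeeds: the driver keeps up

		select {
		case attachClient <- line: // fills up after 8 lines...
		case <-time.After(time.Second):
			fmt.Println("write stalled: attach buffer full and nobody is reading")
			return
		}
	}
}
```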
@tonistiigi we're talking about logs here, right? Having worked on production systems for over 20 years, I can't remember a time when I would have preferred to have complete logs over having the application up and running. The only exception being audit logging, something generally not entrusted to generalized logging anyway. This is the reason many people don't log to disk, and why traditional stream logging was done over UDP, for example.

I know you said you don't think it's about logging speed, but it is. The tool I provided above is the fastest way to make it happen, since it doesn't read anything. In the production environment where this hit us, Heka was consuming the stream, but it had gotten backed up and was slow. I know this because we replaced our log appender with an asynchronous one. We then attached metrics to the remaining buffer size for our asynchronous appender. You can see it go up and down as Heka consumes messages. Of course, since we're on 1.10.3 it eventually sits at low water mark forever because the Docker engine stops consuming. Making the buffer bigger would make this happen less frequently perhaps, but it is not a solution.

Also, no one has addressed that this was a fundamental change in the Docker engine and that it wasn't in the changelogs. I know this is hard to get right all the time, but it should be retroactively added to the appropriate changelog so that anyone who has not already upgraded can see it.

If there are actually people who prefer this behavior, can we have a CLI flag to switch it? If not, I'm not sure what logging solution I'll recommend in production that leverages the Docker daemon, including in "Docker: Up and Running". The only other way around it is to use a shim like we used to with the Spotify logger or some other supervisory process in the container that does the right thing.
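As an illustration of the kind of asynchronous appender described here (the names and structure are hypothetical, not Heka's actual code): a bounded queue in front of the real sink that drops on overflow and exposes its backlog so it can be graphed as a metric:

```go
// Hypothetical sketch of an asynchronous, dropping log appender. Application
// writes go into a bounded in-memory queue, a background goroutine forwards
// them to the real sink, and Buffered() exposes the backlog as a gauge. A full
// queue drops the line instead of blocking the application.
package main

import (
	"fmt"
	"io"
	"os"
)

type AsyncAppender struct {
	queue chan []byte
}

func NewAsyncAppender(sink io.Writer, size int) *AsyncAppender {
	a := &AsyncAppender{queue: make(chan []byte, size)}
	go func() {
		for msg := range a.queue {
			sink.Write(msg) // the possibly slow downstream consumer
		}
	}()
	return a
}

func (a *AsyncAppender) Write(p []byte) (int, error) {
	msg := append([]byte(nil), p...) // copy: the caller may reuse p
	select {
	case a.queue <- msg:
	default: // queue full: shed the line rather than block the app
	}
	return len(p), nil
}

// Buffered reports how many lines are waiting to be forwarded.
func (a *AsyncAppender) Buffered() int { return len(a.queue) }

func main() {
	logsink := NewAsyncAppender(os.Stdout, 1024)
	fmt.Fprintln(logsink, "application log line")
	fmt.Println("backlog:", logsink.Buffered())
}
```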
Logging is peripheral to the primary function of the application. This behaviour leads to a system where, as soon as you flag an app in Docker to enable debug-level logging for a while, and that debugging is voluminous, the system under debug becomes unusable and impossible to debug. Operationally you'd much rather sacrifice logging integrity. Systems where logging is so important as to be a critical function of the app will/should not be doing that logging to STD{OUT,ERR}, so sacrificing integrity of STD{OUT,ERR} logs seems reasonable to me over sacrificing the entire app.
I understand there are applications where it is absolutely critical that no log is lost. I would argue that is by far outweighed by applications that are more than happy to shed a few logs to survive a heavy load event. If it were configurable you could give people the option of 100% log capture, but I would say the default should be to throw logs on the floor for a consumer with a full buffer.
Someone able to cause a denial of service by using a slow log reader is even more dangerous, as "blocked" in this case is just another way to say DoS. In the field, one simply never assumes that syslogs are 100% complete because syslog, like ICMP, is one of the first things that is dropped to ensure the responsiveness of the service. Some systems note "## syslog events were dropped" so that purposeful gaps can be differentiated from gaps caused by communication interruption. That may be helpful here to some extent.
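A small sketch of that convention, assuming a bounded queue in front of the slow consumer: lines are shed when the queue is full, and an explicit marker is injected once there is room again so the gap is visible in the stream:

```go
// Sketch of the "note the gap" idea: when lines must be dropped because the
// downstream is full, inject an explicit marker once there is room again, so
// a purposeful gap can be told apart from silent data loss.
package main

import (
	"fmt"
	"os"
	"time"
)

type MarkingWriter struct {
	out     chan string
	dropped int // touched only by the single producer goroutine
}

func NewMarkingWriter(size int) *MarkingWriter {
	w := &MarkingWriter{out: make(chan string, size)}
	go func() {
		for line := range w.out {
			fmt.Fprint(os.Stdout, line) // stands in for the syslog/collector hop
		}
	}()
	return w
}

func (w *MarkingWriter) WriteLine(line string) {
	// If lines were dropped earlier and there is room again, record the gap.
	if w.dropped > 0 {
		select {
		case w.out <- fmt.Sprintf("## %d log events were dropped ##\n", w.dropped):
			w.dropped = 0
		default:
		}
	}
	select {
	case w.out <- line:
	default:
		w.dropped++ // still full: shed this line rather than block
	}
}

func main() {
	w := NewMarkingWriter(1024)
	w.WriteLine("hello\n")
	time.Sleep(100 * time.Millisecond) // demo only: let the consumer flush
}
```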
@rnelson0 In this case, a slow reader could have caused unbounded buffer growth, eventually causing the entire daemon to crash. A slow
Sorry if it was unclear, I was making analogies between a log reader dropping logs and how syslogs are treated, but not trying to directly relate them.
The speed of a `docker logs` reader does not affect the write speed of the container.
@cpuguy83 @tonistiigi Are you suggesting that the reproduction steps above do not cause the application to be blocked? It sounds like you're now saying that this bug is not valid. It seems that #21840 also proves that this can happen. I suggest that you try it on 1.10.3 with the reproduction steps above; it definitely blocks. And we have production evidence that it does. I thought we were all agreed on the premise at least and simply arguing the correctness of the implementation.

@tonistiigi the JSON logger never dropped logs before v1.10, but it did not block the container, either. It had a different issue: it just blew up in memory. That was very bad behavior, but at least in production we could manage around it: you had a little time before things crapped out and you could monitor for it. What we see now is the container just blocks. If you believe the write speed should not be affected, then it looks likely that there is instead a concurrency issue somewhere that causes that effect.

The problem with all of the other drivers, including the syslog driver, is that you give up the ability to look at the log stream from a single container with
@relistan The example is contrived. This would only happen with either a blocked/slow logging driver (note this is a driver, not

(Some) logging drivers can be configured for async logging; this is your choice to risk dropping logs.

Issues with recovery after unblocking (e.g. by closing down the client connection) are fixed in 1.11.

I could see about implementing a way to make sure that attached clients can't block, but this would also be a significant change for some.
And I take that back, I'm not seeing the recovery issue on master.
@relistan
@cpuguy83 @tonistiigi the behavior of
@mjvaldez Attached clients get a 1MB buffer. This is a rather large buffer, and clients would probably have to just not be reading from the attach stream at all to hit this buffer limit. In practice, in any environment, why is something attached to a container in such a way?
@cpuguy83 please see comments above where relistan describes the use case and ramifications of the behavior we saw in production.
@mjvaldez AFAICT this is centered around logging. Can you explain how you have logging set up?
@relistan What version of mozilla-heka do you use? Seems old versions used the attach stream but it was fixed in mozilla-services/heka@bd5c4e8
The fundamental misunderstanding here is that it was not clear to me (and I think also at least the authors of Logspout, and Heka until recently) that

I was not aware of that, and I guess no one at Docker (at least none of you guys) knew that at least two of the major third party logging systems were, until not very long ago, using the attach stream.

I am a major contributor on the Heka Docker plugin and we've been running it at Nitro just fine until we upgraded to Docker v1.10 in production, and the change took out our site not long after. We'll pull in Rob's recent changes to the Heka Docker plugin that switch it to the logs endpoint.

Nonetheless, this functional change to the behavior of the attach stream

I would suggest that a 1MB buffer (not quite, it's 1e6) is also not anywhere near big enough. With log line average length around 100-150 bytes, this is 6,666 to 10,000 lines of buffer. A high throughput production app can blow that buffer very quickly, especially in debug logging mode.

Finally, an explicit warning that
Inadvertently closed rather than commented.
Just fyi, for 1.14 we've added a new logging mode that you can pass through via
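For reference, in later releases this surfaced as the `mode` and `max-buffer-size` log options (shown here as documented for current Docker; the exact spelling the comment had in mind for 1.14 may differ):

```
docker run -d \
  --log-driver json-file \
  --log-opt mode=non-blocking \
  --log-opt max-buffer-size=4m \
  busybox sh -c "while true; do date; done"
```

In non-blocking mode the daemon buffers up to `max-buffer-size` of log messages per container and drops messages rather than blocking the application's writes.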
Is there any way to recover the logs if you accidentally froze it? I am having this issue with version 1.13.1. When doing
Still hangs on
@Laisky If you have a client attached to the container stdio and the client is not consuming it (or is consuming too slowly), you will get blocks on write.
@cpuguy83 thank you for your answer. I can reproduce the problem by simply running one container with docker 18.03.0-ce, build 0520e24 (also on 18.01 9ee9f40):
the docker image consists of:

```python
import datetime

while 1:
    print(datetime.datetime.now())
```

and after running a few minutes, the python script in this container will be blocked:

```
# strace -tt -p 26500
Process 26500 attached
17:00:09.487341 write(1, "\n", 1
```
@Laisky I cannot reproduce
Until I stop the container. |
You can permanently block a production container by listening too slowly on a log socket with any client. Docker eventually blocks the container process, and it never recovers. The container will remain blocked on stdout/stderr unless it's handling output asynchronously and dealing with output overruns on its own. This is new behavior introduced in November, which utterly changes the behavior of logging in Docker. It is not even reflected in the changelogs.
Output of `docker version`:

Output of `docker info`:

Additional environment details (AWS, VirtualBox, physical, etc.):
Running on AWS, repeatable on VMWare
Steps to reproduce the issue:

1. `docker run -d busybox sh -c "while true; do date; done"`. Note the container ID for later.
2. Get `blockingreader.go` from here. This connects up to a container, sets up some pipes, and then never reads from them, letting Docker buffer internally.
3. `go get github.com/fsouza/go-dockerclient`
4. `docker logs --tail=2 -f <your_container_id>` in another terminal
5. `go run blockingreader.go <your_container_id>`
6. The `busybox` container stops logging, but the app is not just not logging, it's actually totally blocked. You could hook GDB up to it if you really want to see. We took stack dumps of our app to prove it. This is bad because you've now blocked a production app with a client connection to the logs.
7. Kill the `blockingreader` process, which will release the Docker client socket, and you would expect to see Docker unblock the container. In extensive testing, this does not happen. Even when the blocked reader is gone, Docker continues to block the container.

Describe the results you received:
Docker eventually blocks on high output logging if any client is slow and the internal logging buffer fills. This was not expected. It then also does not release the container when the socket is disconnected. This was really not expected.
Describe the results you expected:
Docker would continue to accept logging output from containerized applications without blocking (or leaking GBs of RAM like it used to). I would have expected a ring buffer implementation or some kind of sliding window without blocking at all.
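For illustration, a minimal Go sketch of such a ring buffer: a fixed-size window of the most recent lines where a new write overwrites the oldest entry instead of ever blocking the writer:

```go
// Sketch of a sliding-window log store: appends never block; once the buffer
// is full, the oldest line is overwritten.
package main

import "fmt"

type RingLog struct {
	lines []string
	next  int
	full  bool
}

func NewRingLog(size int) *RingLog { return &RingLog{lines: make([]string, size)} }

// Append never blocks: when the buffer is full it overwrites the oldest line.
func (r *RingLog) Append(line string) {
	r.lines[r.next] = line
	r.next = (r.next + 1) % len(r.lines)
	if r.next == 0 {
		r.full = true
	}
}

// Snapshot returns the retained lines, oldest first.
func (r *RingLog) Snapshot() []string {
	if !r.full {
		return append([]string(nil), r.lines[:r.next]...)
	}
	return append(append([]string(nil), r.lines[r.next:]...), r.lines[:r.next]...)
}

func main() {
	r := NewRingLog(3)
	for i := 0; i < 5; i++ {
		r.Append(fmt.Sprintf("line %d", i))
	}
	fmt.Println(r.Snapshot()) // [line 2 line 3 line 4]
}
```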
Additional information you deem important (e.g. issue happens only occasionally):
100% repeatable. We tripped into this issue because we use Mozilla's Heka to log all of our container output. Heka connects to the Docker logs endpoint and sometimes becomes slow under heavy logging. In this way you can take out an entire server of Docker containers because of the permanent blocking behavior of the logs.
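For reference, a rough approximation of what a blocking reader like the `blockingreader.go` used in the reproduction steps can look like (a reconstruction using `github.com/fsouza/go-dockerclient`, not the original file): it attaches to the container's output through a pipe whose read end is never read, so the attach stream stops consuming immediately:

```go
// Approximation of a "blocking reader": attach to a container's output and
// never read it, so the daemon's attach buffer fills and writes stall.
package main

import (
	"io"
	"log"
	"os"

	docker "github.com/fsouza/go-dockerclient"
)

func main() {
	if len(os.Args) < 2 {
		log.Fatal("usage: blockingreader <container_id>")
	}

	client, err := docker.NewClientFromEnv() // honours DOCKER_HOST etc.
	if err != nil {
		log.Fatal(err)
	}

	// The write side of the pipe receives the container's stdout/stderr,
	// but nothing ever reads the other end, so the attach stream stalls.
	_, pw := io.Pipe()

	err = client.AttachToContainer(docker.AttachToContainerOptions{
		Container:    os.Args[1],
		OutputStream: pw,
		ErrorStream:  pw,
		Stdout:       true,
		Stderr:       true,
		Stream:       true,
	})
	if err != nil {
		log.Fatal(err)
	}
}
```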