awslogs: Prevent close from being blocked on log #47748

Open
wants to merge 1 commit into
base: master

Member

@cpuguy83 cpuguy83 commented Apr 23, 2024

Before this change, a call to Close could block if the channel used to buffer logs was full.
When that happens, the container state ends up wedged, causing a deadlock on anything that needs to lock the container state.

This replaces the channel, whose semantics are difficult to manage here, with something more suitable for the situation.

Closes #39523
I can't say for sure whether this resolves every report in #39523, but with the limited information provided I think this is the best we can do.
If others still experience an issue, they'll need to open a new issue with the details needed to track the problem down.
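
To illustrate the failure mode described above, here is a minimal sketch of a queue whose Close cannot be blocked by a full buffer. It is not the implementation from this PR: the names `queue`, `newQueue`, and `errClosed` are hypothetical, and the approach (a separate signal channel that is closed exactly once) is only one way to get this property.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errClosed is a hypothetical sentinel error for this sketch.
var errClosed = errors.New("queue closed")

// queue is a hypothetical stand-in, not the type introduced by the PR.
type queue struct {
	ch        chan string   // buffered data channel
	closed    chan struct{} // closed exactly once to signal shutdown
	closeOnce sync.Once
}

func newQueue(size int) *queue {
	return &queue{
		ch:     make(chan string, size),
		closed: make(chan struct{}),
	}
}

// Enqueue blocks while the buffer is full, but gives up as soon as
// Close has been called instead of waiting forever.
func (q *queue) Enqueue(msg string) error {
	// Check for shutdown first so a closed queue never accepts messages.
	select {
	case <-q.closed:
		return errClosed
	default:
	}
	select {
	case q.ch <- msg:
		return nil
	case <-q.closed:
		return errClosed
	}
}

// Close only closes the signal channel. It never sends on or waits for
// the data channel, so a full buffer cannot block it.
func (q *queue) Close() {
	q.closeOnce.Do(func() { close(q.closed) })
}

func main() {
	q := newQueue(1)
	fmt.Println(q.Enqueue("first"))  // buffer is now full
	q.Close()                        // returns immediately
	fmt.Println(q.Enqueue("second")) // rejected rather than blocked
}
```

In this sketch the consumer would also need to watch the signal channel (or the data channel would need to be closed separately) to know when to stop reading; that part is left out to keep the example short.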

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
@cpuguy83 cpuguy83 marked this pull request as ready for review May 2, 2024 21:47
Contributor

@austinvazquez austinvazquez left a comment

@cpuguy83 nice refactor here! Overall changes LGTM. Just had one or two questions but nothing blocking.

"sync"

"github.com/docker/docker/daemon/logger"
"github.com/pkg/errors"
Contributor

Minor nit: should we use the built-in errors package here instead?

Member Author

We use pkg/errors everywhere in moby, mostly because it handles attaching stack traces.

// MessageQueue is a queue for log messages.
//
// [MessageQueue.Enqueue] will block unless/until there is a call to
// [MessageQueue.Dequeue].
Contributor

Should this be updated to say that dequeuing is the act of reading from the channel returned by MessageQueue.Receiver?

Member Author

Yes indeed! Forgot to update this after changing the implementation.

@@ -576,7 +578,7 @@ func (l *logStream) collectBatch(created chan bool) {
}
l.publishBatch(batch)
batch.reset()
- case msg, more := <-l.messages:
+ case msg, more := <-chLogs:
Contributor

The channel will be closed once the message queue is closed, so any buffered messages will not be handled by the current read implementation. This behavior existed before, though, so perhaps it should be a separate issue.

Admittedly I also cannot think of a clean solution to abstract the complexity of reading from the underlying channel of the message queue structure here.

Member Author

This should only return after the buffer is emptied (more is false only after the last message is drained).
https://go.dev/play/p/NR4WOn-XUCs
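
For readers who cannot open the playground link, the following is a small sketch of the same language behavior. It is not a copy of the linked snippet, and `drainAfterClose` is a hypothetical helper name. It shows that a receive from a closed buffered channel keeps returning the buffered values, and the second result only becomes false once the buffer is empty.

```go
package main

import "fmt"

// drainAfterClose fills a buffered channel with n values, closes it,
// and then receives until the channel reports that it is closed and empty.
func drainAfterClose(n int) []int {
	ch := make(chan int, n)
	for i := 0; i < n; i++ {
		ch <- i
	}
	close(ch) // buffered values remain readable after close

	var got []int
	for {
		v, more := <-ch
		if !more {
			// Reached only after every buffered value was received.
			return got
		}
		got = append(got, v)
	}
}

func main() {
	fmt.Println(drainAfterClose(3)) // prints [0 1 2]
}
```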

Contributor

Oh nice! TIL, the wording in https://go.dev/tour/concurrency/4 threw me off, but your example proves the correct behavior.

Development

Successfully merging this pull request may close these issues.

docker stats hangs
2 participants