
When disk I/O was very heavy, klog calling file sync with locking caused all log recording to be blocked for a long time. #403

Open
1978629634 opened this issue Apr 2, 2024 · 8 comments · May be fixed by #404
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@1978629634

1978629634 commented Apr 2, 2024

/kind bug

What steps did you take and what happened:

We found an issue in our Kubernetes cluster: when disk I/O was very heavy, a node would occasionally become not ready.
We traced this to klog calling fsync while holding its mutex, which blocked kubelet's log writing. The container runtime status check then took more than 10 seconds, so kubelet considered the container runtime unhealthy and set the node to not ready status.

The kernel has an excellent, efficient buffered I/O mechanism. fsync ensures that logs are persisted to disk, but we should not call fsync while holding the lock: it significantly hurts performance and defeats the kernel's buffered I/O, making it little different from using direct I/O (which is generally only used by databases).

A higher-performance approach is to spawn a separate, lock-free goroutine that periodically calls fsync (fsync does not block buffered writes); a sketch of this follows the klog code below.

https://github.com/kubernetes/klog/blob/main/klog.go#L1224

// lockAndFlushAll is like flushAll but locks l.mu first.
func (l *loggingT) lockAndFlushAll() {
	l.mu.Lock()
	l.flushAll()
	l.mu.Unlock()
}

// flushAll flushes all the logs and attempts to "sync" their data to disk.
// l.mu is held.
func (l *loggingT) flushAll() {
	// Flush from fatal down, in case there's trouble flushing.
	for s := severity.FatalLog; s >= severity.InfoLog; s-- {
		file := l.file[s]
		if file != nil {
			_ = file.Flush() // ignore error
			_ = file.Sync()  // ignore error
		}
	}
	if logging.loggerOptions.flush != nil {
		logging.loggerOptions.flush()
	}
}
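
For example, here is a minimal, self-contained sketch of that approach. The syncedLog type and periodicSync function are made up for illustration, not klog's API: writes and Flush stay under a mutex because the buffered writer is not thread-safe, while a background goroutine calls Sync on its own timer without holding that mutex.

// Sketch only: a background goroutine periodically fsyncs the log file
// while normal writes keep going through a mutex-protected bufio.Writer.
// The type and function names here are illustrative, not klog's API.
package main

import (
	"bufio"
	"os"
	"sync"
	"time"
)

type syncedLog struct {
	mu   sync.Mutex
	buf  *bufio.Writer
	file *os.File
}

func (l *syncedLog) Write(p []byte) (int, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.buf.Write(p)
}

// flush moves buffered data into the kernel page cache; it needs the lock
// because bufio.Writer is not safe for concurrent use.
func (l *syncedLog) flush() error {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.buf.Flush()
}

// periodicSync calls Sync without holding l.mu, so a slow fsync under heavy
// disk I/O blocks only this goroutine, not concurrent Write calls.
func (l *syncedLog) periodicSync(interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			_ = l.flush()     // short critical section
			_ = l.file.Sync() // may block on disk; writers are unaffected
		case <-stop:
			return
		}
	}
}

func main() {
	f, err := os.CreateTemp("", "sync-sketch-*.log")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	defer f.Close()

	l := &syncedLog{buf: bufio.NewWriter(f), file: f}
	stop := make(chan struct{})
	go l.periodicSync(time.Second, stop)

	_, _ = l.Write([]byte("hello\n"))
	time.Sleep(1500 * time.Millisecond)
	close(stop)
	_ = l.flush()
	_ = l.file.Sync()
}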

What did you expect to happen:

klog should not block normal log recording while flushing dirty pages to disk under heavy I/O.


@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 2, 2024
@1978629634
Author

@pohly I'm facing an issue and would appreciate it if you could take a look when you have a chance.

@pohly

pohly commented Apr 2, 2024

So your proposal is that lockAndFlushAll shouldn't hold the lock while flushing, right? That seems reasonable to me, one just has to be careful about still holding the lock while accessing the shared data structures. Basically flushAll needs to be split into "gather information about what to do" (while holding the lock) and "do it" (while not holding the lock).

Do you want to work on a PR?

@1978629634
Author

/assign

@1978629634
Author

> So your proposal is that lockAndFlushAll shouldn't hold the lock while flushing, right? That seems reasonable to me, one just has to be careful about still holding the lock while accessing the shared data structures. Basically flushAll needs to be split into "gather information about what to do" (while holding the lock) and "do it" (while not holding the lock).
>
> Do you want to work on a PR?

Working on the PR sounds great. I'm available and would be happy to take that on.

@1978629634
Author

> So your proposal is that lockAndFlushAll shouldn't hold the lock while flushing, right? That seems reasonable to me, one just has to be careful about still holding the lock while accessing the shared data structures. Basically flushAll needs to be split into "gather information about what to do" (while holding the lock) and "do it" (while not holding the lock).
>
> Do you want to work on a PR?

I suggest still acquiring the lock when calling file.Flush() (which writes the klog buffer into the kernel buffer), since the underlying buffer is not safe for concurrent access. However, file.Sync() (the fsync system call that flushes the kernel buffer to disk) does not need to be called under the lock; let the kernel handle it.
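
For illustration, here is a rough sketch of that split applied to the flushAll code quoted above: flush each buffer and collect the writers while holding l.mu, then release the lock before calling Sync. It reuses the identifiers from the snippet (assuming klog's flushSyncWriter interface as the element type of l.file) and is only a sketch of the idea, not necessarily what the PR will do.

// Sketch only: split the flush into a locked "flush + gather" phase and an
// unlocked "sync" phase, so a slow fsync cannot block concurrent logging.
func (l *loggingT) lockAndFlushAll() {
	l.mu.Lock()
	// Phase 1 (locked): drain klog's buffers into the kernel page cache and
	// remember which writers still need an fsync. Flush needs the lock
	// because syncBuffer is not thread-safe.
	var toSync []flushSyncWriter
	for s := severity.FatalLog; s >= severity.InfoLog; s-- {
		if file := l.file[s]; file != nil {
			_ = file.Flush() // ignore error
			toSync = append(toSync, file)
		}
	}
	if logging.loggerOptions.flush != nil {
		logging.loggerOptions.flush()
	}
	l.mu.Unlock()

	// Phase 2 (unlocked): fsync may block for a long time under heavy disk
	// I/O, but goroutines writing log entries can still acquire l.mu.
	for _, file := range toSync {
		_ = file.Sync() // ignore error
	}
}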

@1978629634 1978629634 linked a pull request Apr 2, 2024 that will close this issue
@pohly

pohly commented Apr 3, 2024

> I suggest still acquiring the lock when calling file.Flush() (which writes the klog buffer into the kernel buffer), since the underlying buffer is not safe for concurrent access.

Indeed, syncBuffer (the implementation behind l.file) is not thread-safe.

@1978629634
Author

glog had the same problem; after I reported it to the glog team (golang#67), they fixed it in release v1.2.1 (golang#68).

@logicalhan
Member

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 2, 2024