forked from golang/glog
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ktesting: tone down warning about leaked test goroutine #401
Open
pohly
wants to merge
1
commit into
kubernetes:main
Choose a base branch
from
pohly:ktesting-warning-delay
base: main
Could not load branches
Branch not found: {{ refName }}
Could not load tags
Nothing to show
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems will bury the real goroutine leakage if the duration is less than stopGracePeriod, do I understand right here?
Another point I can't get here is if the test completes, why will it wait for another 10s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the "leaked" goroutine stops after less than
stopGracePeriod
, then it wasn't really a leak. It was "just" a test that didn't block and wait for goroutines at the end. This is very common.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally misunderstood the meaning here. I'm not quite understand how it works but I still have a silly question like: Will we wait and detect the goroutine leakage after test completes? Or why the duration from stop will be greater than
stopGracePeriod
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we don't wait. That is something that uber/goleak does, but not this code. This code is merely reporting on leaked goroutines if and when it detects them, which is when they keep logging.
The
stopGracePeriod
is the grace period that goroutines have before their log output triggers this warning. I can rename it if you have a better suggestion.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recalled goleak has maxRetries, then can't we just configure the goleak? It's configurable now. Both the retry times and sleep duration. I think somehow the same as
stopGracePeriod
. Will this lead to other problems?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is klog. We cannot depend on goleak here, nor do I want to re-implement it.
The reason why this warning exists is because the per-test logger has to do something. It cannot call
testing.T.Log
after the test thattesting.T
belongs to has terminated, it would crash - I know, because that has happened initially.When it detects that a test has stopped, it handles output differently and writes to stderr. Issuing this warning seemed usedful, but printing it immediately has confused Kubernetes developers because the goroutine hasn't really leaked. This PR is an attempt to keep the warning for goroutines that are more likely to be real leaks while not printing it for goroutines which terminate quickly after test completion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely agree.
Yes for klog... I mean for callers, like test framework. I suggested in this way is just because the goroutine leakage detection duration is mostly decided by goleak, the maxRetries(default to 20), maxSleep(default to 100ms) specifically, the
duration ~= maxRetries * (maxSleep + filterStacks)
, have no idea whether 10s fits here. However, the feature is reverted... uber-go/goleak#116, so no way to configure now.As a user of klog, happy with either way to avoid confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So LGTM?
Leak detection with goleak only works when tests run sequentially or in TestMain. I'm not sure how useful it is in an E2E test suite, because the actual code under test runs elsewhere. For integration and unit tests, individual test owners have to integrate goleak. Some apiserver-based tests have it.
So this is about how klog will interact with goleak. I don't have a strong opinion here. I picked 10 seconds because it should be long enough to no produce "false positives" for the "I am shutting down after test completion" goroutines and short enough to give feedback for others. Because it's not causing a test failure, it can be lower than the goleak timeouts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm