kubelet: revert contextual logging support #110869

Closed

pohly commented Jun 29, 2022

What type of PR is this?

/kind bug
/kind failing-test

What this PR does / why we need it:

It turned out that using testing.T for logging caused a data race and a
potential panic: the tests keep goroutines running, and testing.T must not be
used anymore after the test has completed (#110854).

Either the code must be fixed to terminate all goroutines before a test
ends (seems non-trivial because the usage of goroutines is fairly complex in
the shutdown manager code), or ktesting must handle this case. A solution for
this is pending in kubernetes/klog#337.

Either way, solving this will take a bit longer. In the meantime, we should
revert the change to get unit testing stable again.
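
As a rough illustration of the second option (a logging layer that tolerates goroutines outliving the test), here is a minimal Go sketch. It is not the ktesting implementation, nor the change proposed in kubernetes/klog#337. The names guardedLogger, sink and recorder are hypothetical, and the only assumption is that the sink exposes a Log method shaped like the one on testing.T.

```go
package main

import (
	"fmt"
	"sync"
)

// sink is the subset of *testing.T that this sketch relies on.
type sink interface {
	Log(args ...interface{})
}

// guardedLogger forwards messages to a sink until Stop is called.
// Hypothetical type for illustration only; not part of klog or ktesting.
type guardedLogger struct {
	mu      sync.Mutex
	out     sink
	stopped bool
	dropped int
}

func newGuardedLogger(out sink) *guardedLogger {
	return &guardedLogger{out: out}
}

// Log forwards msg to the sink and returns true, or drops it and
// returns false once the logger has been stopped.
func (l *guardedLogger) Log(msg string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.stopped {
		l.dropped++
		return false
	}
	// The mutex is held during the call, so Stop cannot return while
	// the sink is still in use.
	l.out.Log(msg)
	return true
}

// Stop turns all later Log calls into no-ops.
func (l *guardedLogger) Stop() {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.stopped = true
}

// Dropped reports how many messages arrived after Stop.
func (l *guardedLogger) Dropped() int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.dropped
}

// recorder is a stand-in sink so the sketch runs outside "go test".
type recorder struct{ lines []string }

func (r *recorder) Log(args ...interface{}) {
	r.lines = append(r.lines, fmt.Sprint(args...))
}

func main() {
	rec := &recorder{}
	l := newGuardedLogger(rec)
	l.Log("before test end")
	l.Stop()
	l.Log("after test end")
	fmt.Println(len(rec.lines), l.Dropped()) // prints "1 1"
}
```

In a real test the sink would be the *testing.T itself and Stop would be registered with t.Cleanup(l.Stop), so that goroutines still running after the test only increment the drop counter instead of touching testing.T.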

Which issue(s) this PR fixes:

Related-to: #110854

Special notes for your reviewer:

To keep this a local change (no change to vendor), ktesting remains a
dependency of the test and thus of Kubernetes.

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 29, 2022
@k8s-ci-robot
Contributor

@pohly: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 29, 2022
@pohly
Contributor Author

pohly commented Jun 29, 2022

/cc @aojea @kerthcet

@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 29, 2022
@aojea
Member

aojea commented Jun 29, 2022

pkg/kubelet/kubelet.go:824:3: unknown field 'Logger' in struct literal of type nodeshutdown.Config

@pohly pohly force-pushed the kubelet-shutdown-test-revert branch from 08c53e0 to 063c25d Compare June 30, 2022 06:05
@pohly pohly force-pushed the kubelet-shutdown-test-revert branch from 063c25d to 8384e72 Compare June 30, 2022 07:56
name string
fields fields
wantErr bool
exceptOutputContains string
Member

Should this be expect (exceptOutputContains looks like a typo)? Or is it on purpose?

Contributor Author

I am reverting to the code before my changes. It was like that.

// hijack the klog output
tmpWriteBuffer := new(buffer)
klog.SetOutput(tmpWriteBuffer)
klog.LogToStderr(false)
Member

kerthcet commented Jun 30, 2022

Should we restore this afterwards with defer klog.LogToStderr(true)?

Contributor Author

Same here - I am just reverting.
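
The restore-on-exit pattern raised in this thread can be sketched as follows, with the standard library's log package standing in for klog. That substitution is an assumption made so the example has no external dependencies; klog's SetOutput and LogToStderr are not called here. The helper names captureLog and emit are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
)

// captureLog runs fn with the standard logger redirected into a buffer
// and restores the previous writer and flags before returning.
func captureLog(fn func()) string {
	var buf bytes.Buffer
	prevOut, prevFlags := log.Writer(), log.Flags()
	log.SetOutput(&buf)
	log.SetFlags(0) // no timestamps, so the captured text is stable
	defer func() {
		// Undo the global changes even if fn panics.
		log.SetOutput(prevOut)
		log.SetFlags(prevFlags)
	}()
	fn()
	return buf.String()
}

// emit writes one line through the standard logger.
func emit(msg string) {
	log.Print(msg)
}

func main() {
	out := captureLog(func() { emit("shutting down") })
	fmt.Printf("%q\n", out) // prints "shutting down\n"
}
```

Because the logger state is global, restoring it in a deferred function keeps one test's capture from leaking into the tests that run after it.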

@kerthcet
Member

This looks good to me overall, but I would like to leave the final review to someone more familiar with this code.

@pohly
Contributor Author

pohly commented Jun 30, 2022

Here's the diff against the code before PR #110504:

$ git diff 0669ba386bde2e756bc9c6779ad4a4f036200f28 -- pkg/kubelet/kubelet.go pkg/kubelet/kubelet_test.go pkg/kubelet/nodeshutdown 
diff --git a/pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux_test.go b/pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux_test.go
index e682c0963f0..e58f2d4dea1 100644
--- a/pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux_test.go
+++ b/pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux_test.go
@@ -45,6 +45,14 @@ import (
        probetest "k8s.io/kubernetes/pkg/kubelet/prober/testing"
        "k8s.io/utils/clock"
        testingclock "k8s.io/utils/clock/testing"
+
+       // https://github.com/kubernetes/kubernetes/pull/110504 started using
+       // ktesting for the first time in Kubernetes and therefore added it to
+       // "vendor". We need to revert that PR for a short while (see
+       // https://github.com/kubernetes/kubernetes/issues/110854) but to
+       // make the revert local to this directory, we keep importing that
+       // package (no change to vendor).
+       _ "k8s.io/klog/v2/ktesting/init"
 )
 
 // lock is to prevent systemDbus from being modified in the case of concurrency.

Member

endocrimes left a comment

Simple revert that fixes the data race

/lgtm
/assign @mrunalp

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 7, 2022
@pohly
Contributor Author

pohly commented Jul 7, 2022

/hold

Let's merge #111001 instead.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 7, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: endocrimes, pohly
To complete the pull request process, please assign dchen1107 after the PR has been reviewed.
You can assign the PR to them by writing /assign @dchen1107 in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@pohly
Contributor Author

pohly commented Jul 8, 2022

/close

The klog update was merged, so the revert should no longer be needed.

@k8s-ci-robot
Contributor

@pohly: Closed this PR.

Labels
area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/bug Categorizes issue or PR as related to a bug. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
6 participants