Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

contextual logging #108995

Merged
merged 5 commits into from Mar 30, 2022
Merged

contextual logging #108995

merged 5 commits into from Mar 30, 2022

Conversation

pohly
Copy link
Contributor

@pohly pohly commented Mar 24, 2022

What type of PR is this?

/kind feature

What this PR does / why we need it:

This completes the infrastructure support for the contextual logging feature:

  • support in the JSON logger for being called directly
  • the ContextualLogging feature gate

Which issue(s) this PR fixes:

Related-to: kubernetes/enhancements#3077

Special notes for your reviewer:

The current implementation of the feature gate uses the approach suggested by @liggitt in #105797 (comment): if the code is meant to use non-default state for the contextual logging feature, a FeatureGate instance must be passed in explicitly.

Because enabling that in cli.Run and logs.InitLogs would be a massive churn, only commands using ValidateAndApply will support contextual logging.

Does this PR introduce a user-facing change?

The infrastructure for contextual logging is complete (feature gate implemented, JSON backend ready).

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/issues/3077

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/apiserver area/cloudprovider area/kubelet area/test kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. wg/structured-logging Categorizes an issue or PR as relevant to WG Structured Logging. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 24, 2022
InitLogs overrides the klog default and turns contextual logging off. This
ensures that it is only enabled in Kubernetes commands that explicitly enable
it via a feature gate. A feature gate for it gets defined in
k8s.io/component-base/logs and is then used by Options.ValidateAndApply.

The effect of disabling contextual logging is very limited according to
benchmarks with kube-scheduler. The feature gets added anyway to satisfy the
PRR recommendation that features should be controllable.

The following commands have support for contextual logging:
- kube-apiserver
- kube-controller-manager
- kubelet
- kube-scheduler
- component-base/logs example

Supporting a feature gate check in ValidateAndApply and not in InitLogs is a
simplification: changing InitLogs to accept a FeatureGate would have implied
changing also component-base/cli.Run. This didn't seem worthwhile because
ValidateAndApply already covers the relevant commands.
@enj enj added this to Needs Triage in SIG Auth Old Mar 29, 2022
@liggitt
Copy link
Member

liggitt commented Mar 29, 2022

cmd and feature gate changes lgtm

@liggitt
Copy link
Member

liggitt commented Mar 29, 2022

/approve
for cmd / feature gate changes

will defer lgtm to @serathius
hold to prevent accidental merge, unhold once this has lgtm

@liggitt liggitt added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 29, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt, pohly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 29, 2022
@ehashman ehashman added this to Needs Reviewer in SIG Node PR Triage Mar 29, 2022
@ehashman
Copy link
Member

/retest-required
boskos outage related

Copy link
Member

@ehashman ehashman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/hold cancel
/lgtm

none of my comments are blocking, afaict all of @serathius's feedback was addressed. We can address any remaining issues in a follow-up.

@@ -38,11 +38,15 @@ var (
// NewJSONLogger creates a new json logr.Logger and its associated
// flush function. The separate error stream is optional and may be nil.
// The encoder config is also optional.
func NewJSONLogger(infoStream, errorStream zapcore.WriteSyncer, encoderConfig *zapcore.EncoderConfig) (logr.Logger, func()) {
func NewJSONLogger(v config.VerbosityLevel, infoStream, errorStream zapcore.WriteSyncer, encoderConfig *zapcore.EncoderConfig) (logr.Logger, func()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we do have some k8s SIGs projects that vendor this... https://cs.k8s.io/?q=NewJSONLogger&i=nope&files=&excludeFiles=&repos=

We may want to add an action required release note saying that this interface has changed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we normally announce API changes in staging repos in the release notes, do we?

It's not relevant for end-users and developers who update their dependencies will notice because their code won't compile anymore, which then can be solved easily by looking at a diff.

@@ -53,13 +57,13 @@ func NewJSONLogger(infoStream, errorStream zapcore.WriteSyncer, encoderConfig *z
encoder := zapcore.NewJSONEncoder(*encoderConfig)
var core zapcore.Core
if errorStream == nil {
core = zapcore.NewCore(encoder, infoStream, zapcore.Level(-127))
core = zapcore.NewCore(encoder, infoStream, zapV)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding from the zapcore docs is that zapcore.Level(-127) is almost the lowest possible priority debug level (the lowest being -128).

LevelEnabler decides whether a given logging level is enabled when logging a message.

from https://pkg.go.dev/go.uber.org/zap/zapcore#LevelEnabler ... honestly the off-by-1 was probably a typo.

This change will allow us to shed any logs below the verbosity threshold... makes sense to me.

} else {
highPriority := zap.LevelEnablerFunc(func(lvl zapcore.Level) bool {
return lvl >= zapcore.ErrorLevel
return lvl >= zapcore.ErrorLevel && lvl >= zapV
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

n.b. zapcore.ErrorLevel = 2
https://pkg.go.dev/go.uber.org/zap/zapcore#Level

V=2 is our default log level so passing 0 results in the current k8s default behaviour, higher values = more logs.

})
lowPriority := zap.LevelEnablerFunc(func(lvl zapcore.Level) bool {
return lvl < zapcore.ErrorLevel
return lvl < zapcore.ErrorLevel && lvl >= zapV
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that a consequence of this is that logs could be escalated to the Error stream when they previously were Info?

Answering my own question: no, not possible, because config.VerbosityLevel is an unsigned int:

@@ -239,7 +239,7 @@ func TestKlogIntegration(t *testing.T) {
t.Run(tc.name, func(t *testing.T) {
var buffer bytes.Buffer
writer := zapcore.AddSync(&buffer)
logger, _ := NewJSONLogger(writer, nil, nil)
logger, _ := NewJSONLogger(100, writer, nil, nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to include a comment on the magic constant 100. I deduced it means we log at V=100, flags above in the test set V=2 or lower.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> #109194


// This is the default in Kubernetes. Options.ValidateAndApply
// will override this with the result of a feature gate check.
klog.EnableContextualLogging(false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why set this to false and not contextualLoggingDefault?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we reach beta, contextualLoggingDefault will be true. At that point, we'll probably still want the feature to be off in those commands which don't have a feauture gate parameter because then a beta feature would be permanently enabled for them. That seems too soon, we probably only should do that when the feature is GA.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 29, 2022
@ehashman ehashman moved this from Needs Reviewer to Done in SIG Node PR Triage Mar 29, 2022
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 29, 2022
@k8s-ci-robot k8s-ci-robot merged commit 5b8dbfb into kubernetes:master Mar 30, 2022
SIG Auth Old automation moved this from Needs Triage to Closed / Done Mar 30, 2022
pohly added a commit to pohly/kubernetes that referenced this pull request May 31, 2022
k8s-publishing-bot pushed a commit to kubernetes/component-base that referenced this pull request Jun 10, 2022
This addresses review feedback from
kubernetes/kubernetes#108995 (comment).

Kubernetes-commit: 7b8d711d0255be17de4e6e6918e8e03120af6ab9
muyangren2 pushed a commit to muyangren2/kubernetes that referenced this pull request Jul 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/cloudprovider area/kubectl area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on. wg/structured-logging Categorizes an issue or PR as relevant to WG Structured Logging.
Projects
Archived in project
SIG Auth Old
Closed / Done
Development

Successfully merging this pull request may close these issues.

None yet

6 participants