Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logcheck: contextual logging + enhanced checks #297

Merged
merged 11 commits into from Mar 21, 2022
3 changes: 2 additions & 1 deletion hack/tools/go.mod
Expand Up @@ -3,6 +3,7 @@ module k8s.io/klog/hack/tools
go 1.15

require (
github.com/golangci/golangci-lint v1.43.0
golang.org/x/exp v0.0.0-20210220032938-85be41e4509f
golang.org/x/tools v0.1.0
golang.org/x/tools v0.1.7
)
1,136 changes: 1,133 additions & 3 deletions hack/tools/go.sum

Large diffs are not rendered by default.

83 changes: 80 additions & 3 deletions hack/tools/logcheck/README.md
@@ -1,5 +1,82 @@
This directory contains tool for checking use of unstructured logs in a package. It is created to prevent regression after packages have been migrated to use structured logs.
This directory contains a linter for checking log calls. It was originally
created to detect when unstructured logging calls like `klog.Infof` get added
to files that should only use structured logging calls like `klog.InfoS`
and now also supports other checks.

**Installation:**`go install k8s.io/klog/hack/tools/logcheck`
**Usage:** `$logcheck.go <package_name>`
# Installation

`go install k8s.io/klog/hack/tools/logcheck`

# Usage

`$logcheck.go <package_name>`
`e.g $logcheck ./pkg/kubelet/lifecycle/`

# Configuration

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't find this configuration format better than hack/verify-structured-logging.sh. Have we first looked if it is possible to simplify the script before writing our own custom configuration format?

Copy link
Author

@pohly pohly Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you look at the commit message? It cites several other advantages of integrating with golangci-lint:

  • faster checking because finding files and parsing is shared
    with other linters
  • support for // nolint:logcheck

Moving the logic of what to check in which file out of the script and into the checker is required for that integration.

This new configuration format also scales better: instead of maintaining different lists of files where unstructured calls are banned and another list where all klog logging calls are banned we only need to maintain one config file.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the script: others are welcome to try and simplify it, but I consider that a dead end and won't spend time on it. The issue about fixing it has been open for months without progress, too.

Copy link

@serathius serathius Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if the advantages of golangci-lint you listed impact k8s enough to motivate this change. Personally I would really want to avoid introducing custom configuration as it generates much more work long term. For maintainability of kubernetes we need to ensure that we not only consider cost of implementation, but also how much work other contributors will spend time to understand, debug, fix and extend this format.

I would prefer for introduction of contextual logging to logcheck not to include this. However I don't have enough experience in maintaining validation scripts for K8s, so maybe I'm wrong. I think best would be to get some feedback from ./hack/ directory owners as they will be de final approvers on the configuration.
/cc bentheelder cblecker deads2k fejta lavalamp liggitt pwittrock SataQiu spiffxp sttts mikedanese

Copy link
Author

@pohly pohly Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @BenTheElder @cblecker @deads2k @fejta @lavalamp @liggitt @pwittrock @SataQiu @spiffxp @sttts @mikedanese

Guys, can anyone comment on this question: do you prefer to integrate log checking into the common golangci-lint invocation (drawback: a more complex configuration file is needed) or maintaining several different scripts under hack, one for structured logging and another for contextual logging (drawbacks: current script is complex and broken, slower overall, no support for nolint:logcheck)?

The support for integration with golangci-lint is in this PR. The prototype integration into Kubernetes is currently in kubernetes/kubernetes@master...pohly:logcheck

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me add that I had asked about this before going in this direction: https://kubernetes.slack.com/archives/C09QZ4DQB/p1643049767036400

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here's a PR with the Kubernetes changes, updated right now with the latest code from this PR: kubernetes/kubernetes#108159


Checks can be enabled or disabled globally via command line flags and env
variables. In addition, the global setting for a check can be modified per file
via a configuration file. That file contains lines in this format:

```
<checks> <regular expression>
```

`<checks>` is a comma-separated list of the names of checks that get enabled or
disabled when a file name matches the regular expression. A check gets disabled
when its name has `-` as prefix and enabled when there is no prefix or `+` as
prefix. Only checks that are mentioned explicitly are modified. All regular
expressions are checked in order, so later lines can override the previous
ones.

In this example, checking for klog calls is enabled for all files under
`pkg/scheduler` in the Kubernetes repo except for `scheduler.go`
itself. Parameter checking is disabled everywhere.

```
klog,-parameters k8s.io/kubernetes/pkg/scheduler/.*
-klog k8s.io/kubernetes/pkg/scheduler/scheduler.go
```
pohly marked this conversation as resolved.
Show resolved Hide resolved

The names of all supported checks are the ones used as sub-section titles in
the next section.

# Checks

## structured (enabled by default)

Unstructured klog logging calls are flagged as error.

## klog (disabled by default)
pohly marked this conversation as resolved.
Show resolved Hide resolved

None of the klog logging methods may be used. This is even stricter than
`unstructured`. Instead, code should retrieve a logr.Logger from klog and log
through that.

## parameters (enabled by default)

This ensures that if certain logging functions are allowed and are used, those
functions are passed correct parameters.

### all calls

Format strings are not allowed where plain strings are expected.

### structured logging calls

Key/value parameters for logging calls are checked:
- For each key there must be a value.
- Keys must be constant strings.

This also warns about code that is valid, for example code that collects
key/value pairs in an `[]interface` variable before passing that on to a log
call. Such valid code can use `nolint:logcheck` to disable the warning (when
invoking logcheck through golangci-lint) or the `parameters` check can be
disabled for the file.

## with-helpers (disabled by default)

`logr.Logger.WithName`, `logr.Logger.WithValues` and `logr.NewContext` must not
be used. The corresponding helper calls from `k8s.io/klogr` should be used
instead. This is relevant when support contextual logging is disabled at
runtime in klog.
194 changes: 3 additions & 191 deletions hack/tools/logcheck/main.go
Expand Up @@ -17,199 +17,11 @@ limitations under the License.
package main

import (
"flag"
"fmt"
"go/ast"
"go/token"
"strings"

"golang.org/x/exp/utf8string"
"golang.org/x/tools/go/analysis"
"golang.org/x/tools/go/analysis/singlechecker"
)

type config struct {
// When enabled, logcheck will ignore calls to unstructured klog methods (Info, Infof, Error, Errorf, Warningf, etc)
allowUnstructured bool
}
"k8s.io/klog/hack/tools/logcheck/pkg"
)

func main() {
singlechecker.Main(analyser())
}

func analyser() *analysis.Analyzer {
c := config{}
logcheckFlags := flag.NewFlagSet("", flag.ExitOnError)
logcheckFlags.BoolVar(&c.allowUnstructured, "allow-unstructured", c.allowUnstructured, `when enabled, logcheck will ignore calls to unstructured
klog methods (Info, Infof, Error, Errorf, Warningf, etc)`)

return &analysis.Analyzer{
Name: "logcheck",
Doc: "Tool to check use of unstructured logging patterns.",
Run: func(pass *analysis.Pass) (interface{}, error) {
return run(pass, &c)
},
Flags: *logcheckFlags,
}
}

func run(pass *analysis.Pass, c *config) (interface{}, error) {

for _, file := range pass.Files {

ast.Inspect(file, func(n ast.Node) bool {

// We are intrested in function calls, as we want to detect klog.* calls
// passing all function calls to checkForFunctionExpr
if fexpr, ok := n.(*ast.CallExpr); ok {
checkForFunctionExpr(fexpr, pass, c)
}

return true
})
}
return nil, nil
}

// checkForFunctionExpr checks for unstructured logging function, prints error if found any.
func checkForFunctionExpr(fexpr *ast.CallExpr, pass *analysis.Pass, c *config) {

fun := fexpr.Fun
args := fexpr.Args

/* we are extracting external package function calls e.g. klog.Infof fmt.Printf
and eliminating calls like setLocalHost()
basically function calls that has selector expression like .
*/
if selExpr, ok := fun.(*ast.SelectorExpr); ok {
// extracting function Name like Infof
fName := selExpr.Sel.Name

// for nested function cases klog.V(1).Infof scenerios
// if selExpr.X contains one more caller expression which is selector expression
// we are extracting klog and discarding V(1)
if n, ok := selExpr.X.(*ast.CallExpr); ok {
if _, ok = n.Fun.(*ast.SelectorExpr); ok {
selExpr = n.Fun.(*ast.SelectorExpr)
}
}

// extracting package name
pName, ok := selExpr.X.(*ast.Ident)

if ok && pName.Name == "klog" {
// Matching if any unstructured logging function is used.
if !isUnstructured((fName)) {
// if format specifier is used, check for arg length will most probably fail
// so check for format specifier first and skip if found
if checkForFormatSpecifier(fexpr, pass) {
return
}
if fName == "InfoS" {
isKeysValid(args[1:], fun, pass, fName)
} else if fName == "ErrorS" {
isKeysValid(args[2:], fun, pass, fName)
}
} else if !c.allowUnstructured {
msg := fmt.Sprintf("unstructured logging function %q should not be used", fName)
pass.Report(analysis.Diagnostic{
Pos: fun.Pos(),
Message: msg,
})
}
}
}
}

func isUnstructured(fName string) bool {

// List of klog functions we do not want to use after migration to structured logging.
unstrucured := []string{
"Infof", "Info", "Infoln", "InfoDepth",
"Warning", "Warningf", "Warningln", "WarningDepth",
"Error", "Errorf", "Errorln", "ErrorDepth",
"Fatal", "Fatalf", "Fatalln", "FatalDepth",
"Exit", "Exitf", "Exitln", "ExitDepth",
}

for _, name := range unstrucured {
if fName == name {
return true
}
}

return false
}

// isKeysValid check if all keys in keyAndValues is string type
func isKeysValid(keyValues []ast.Expr, fun ast.Expr, pass *analysis.Pass, funName string) {
if len(keyValues)%2 != 0 {
pass.Report(analysis.Diagnostic{
Pos: fun.Pos(),
Message: fmt.Sprintf("Additional arguments to %s should always be Key Value pairs. Please check if there is any key or value missing.", funName),
})
return
}

for index, arg := range keyValues {
if index%2 != 0 {
continue
}
lit, ok := arg.(*ast.BasicLit)
if !ok {
pass.Report(analysis.Diagnostic{
Pos: fun.Pos(),
Message: fmt.Sprintf("Key positional arguments are expected to be inlined constant strings. Please replace %v provided with string value", arg),
})
continue
}
if lit.Kind != token.STRING {
pass.Report(analysis.Diagnostic{
Pos: fun.Pos(),
Message: fmt.Sprintf("Key positional arguments are expected to be inlined constant strings. Please replace %v provided with string value", lit.Value),
})
continue
}
isASCII := utf8string.NewString(lit.Value).IsASCII()
if !isASCII {
pass.Report(analysis.Diagnostic{
Pos: fun.Pos(),
Message: fmt.Sprintf("Key positional arguments %s are expected to be lowerCamelCase alphanumeric strings. Please remove any non-Latin characters.", lit.Value),
})
}
}
}

func checkForFormatSpecifier(expr *ast.CallExpr, pass *analysis.Pass) bool {
if selExpr, ok := expr.Fun.(*ast.SelectorExpr); ok {
// extracting function Name like Infof
fName := selExpr.Sel.Name
if specifier, found := hasFormatSpecifier(expr.Args); found {
msg := fmt.Sprintf("structured logging function %q should not use format specifier %q", fName, specifier)
pass.Report(analysis.Diagnostic{
Pos: expr.Fun.Pos(),
Message: msg,
})
return true
}
}
return false
}

func hasFormatSpecifier(fArgs []ast.Expr) (string, bool) {
formatSpecifiers := []string{
"%v", "%+v", "%#v", "%T",
"%t", "%b", "%c", "%d", "%o", "%O", "%q", "%x", "%X", "%U",
"%e", "%E", "%f", "%F", "%g", "%G", "%s", "%q", "%p",
}
for _, fArg := range fArgs {
if arg, ok := fArg.(*ast.BasicLit); ok {
for _, specifier := range formatSpecifiers {
if strings.Contains(arg.Value, specifier) {
return specifier, true
}
}
}
}
return "", false
singlechecker.Main(pkg.Analyser())
}