Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: testinglogger: per-test, structured logging #240

Closed
wants to merge 1 commit into from

Conversation

pohly
Copy link

@pohly pohly commented May 27, 2021

What this PR does / why we need it:

This logr implementation can be used in tests to ensure that output
is associated with the currently running test. Compared to using the
global klog instance in a standard Go test that has some advantages:

  • Log messages are associated with the currently running test.
    Test can run in parallel without interleaving their output.
  • Log output is only printed when a test fails, unless "go test -v"
    is used.
  • Because of that, the logger can print all log messages. It is
    not necessary to add klog.InitFlags to tests and then remember
    to increase the log level when invoking the test.

Special notes for your reviewer:

The motivation for hosting this logger in klog is that it shares code with klogr and klog itself and that it may also be useful for others.

I originally wrote this for PMEM-CSI (copying some code from klog), but now would also like to use it in external-provisioner and thus need a way to share it properly.

I envision that this might also become relevant for Kubernetes, once we figure out how to pass a logr instance around instead of always using the global klog.

Release note:

new testinglogger: per-test, structured logging

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 27, 2021
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label May 27, 2021
@pohly
Copy link
Author

pohly commented May 27, 2021

@pohly
Copy link
Author

pohly commented May 28, 2021

The motivation for hosting this logger in klog is that it shares code with klogr and klog itself and that it may also be useful for others.

@serathius suggested on Slack that k8s.io/component-base/logs might be another place to put this. I already pointed out the shared code which would have to be forked when doing this. Now some other arguments occurred to me why k8s.io/klog might be a better place:

  • klog has strict semantic versioning, anything in staging doesn't
  • updates to this code can be released as needed in klog, whereas in Kubernetes they are tied to the Kubernetes release cycle

@pohly
Copy link
Author

pohly commented May 31, 2021

/hold

Let's merge PR #242 first.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 31, 2021
@pohly
Copy link
Author

pohly commented May 31, 2021

Here is another example of using this: kubernetes-csi/external-provisioner#639

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 3, 2021
pohly referenced this pull request Aug 3, 2021
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 3, 2021
@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pohly
To complete the pull request process, please assign lavalamp after the PR has been reviewed.
You can assign the PR to them by writing /assign @lavalamp in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 3, 2021
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 3, 2021
@pohly
Copy link
Author

pohly commented Aug 3, 2021

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 3, 2021
@dims
Copy link
Member

dims commented Aug 3, 2021

more discussion here: https://kubernetes.slack.com/archives/CG3518SFJ/p1627997622000100?thread_ts=1611932732.008800&cid=CG3518SFJ

In principle, i've seen many folks struggle with testing their own code outside of k/k with klog and having curated ways to do this will help everyone.

So LGTM from me. I'd like to see examples and markdown etc to help with the cause.

@pohly
Copy link
Author

pohly commented Aug 4, 2021

/hold

I noticed that after rebasing and adapting to logr 1.0.0, call locations are not printed correctly anymore:

$ go test -run TestCallDepth -v .
=== RUN   TestCallDepth
    logr.go:208: INFO hello world
--- PASS: TestCallDepth (0.00s)

This is a new test that I wrote as quick way to check my suspicion. It only passes because I don't know how to capture and validate the output:

func TestCallDepth(t *testing.T) {
	logger := New(t)
	logger.Info("hello world")
}

This used to work when logr.Logger was an interface because the test code directly called the logger implementation and that code then could do the right thing, i.e. calling testing.T.Helper() before producing output with testingT.Log().

Now logr assumes that stack unwinding skips frames by counting them. But that assumption does not hold for the testing package, which expects intermediate calls to mark themselves as "to be skipped". https://github.com/go-logr/logr/blob/622c97b4eb90a99cc628f4773662eb0015ce0f2a/logr.go#L206-L210 doesn't do that and thus ends up being logged as call site.

One solution is to extend logr with a new optional sink interface that provides a Helper method and let logr.go call that if available.

Another is to extend the standard library with a testing.T.CallerIsHelper(int level) function that allows the testing logger implementation to mark the code in logr as to be skipped.

I prefer modifying logr because it can be completed in a reasonable time frame and will also work with current Go versions.

/cc @thockin

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 4, 2021
@pohly
Copy link
Author

pohly commented Aug 4, 2021

I prefer modifying logr because it can be completed in a reasonable time frame and will also work with current Go versions.

I was worried that modifying logr would make it harder to use because log helper functions would have to support two different optional APIs for skipping themselves. But I think that can be addressed by implementing logger.Helper such that it also does the right thing when the log sink implements CallDepthLogSink.

Let me write this down in a PR, otherwise it's all too abstract to follow...

@pohly
Copy link
Author

pohly commented Aug 4, 2021

Let me write this down in a PR, otherwise it's all too abstract to follow...

See go-logr/logr#60. Unfortunately that approach didn't work because implementing the proposed new log sink method runs into the same testing.T limitation: it's not possible to mark the caller as helper function.

That just leaves extending the Go standard library, which will take time and only work in future Go versions. Is it worth merging this PR in the meantime, even if call sites are not logged properly yet?

@pohly
Copy link
Author

pohly commented Sep 7, 2021

I noticed that after rebasing and adapting to logr 1.0.0, call locations are not printed correctly anymore

This got fixed in logr v1.1.0. I rebased on top of that.

/hold cancel

One of my goals was to have the same output in tests as in normal Kubernetes logs. If this isn't considered important enough to have a second logger that wraps testing.T, then we can also drop this PR.

I added an example that prints the same messages through both loggers:

$ go test -v ./testinglogger/example/
=== RUN   TestKlog
    example_test.go:54: INFO hello world
    example_test.go:55: ERROR failed err="failed: some error"
    example_test.go:56: INFO vebosity 1
    example_test.go:57: INFO main/helper: with prefix
    example_test.go:58: INFO key/value pairs int=1 float=2 pair="(1, 2)"
--- PASS: TestKlog (0.00s)
=== RUN   TestLogr
    example_test.go:54: "ts"="2021-09-07 14:45:39.458400" "level"=0 "msg"="hello world"
    example_test.go:55: "ts"="2021-09-07 14:45:39.458528" "msg"="failed" "error"="failed: some error"
    example_test.go:56: "ts"="2021-09-07 14:45:39.458558" "level"=1 "msg"="vebosity 1"
    example_test.go:57: main/helper: "ts"="2021-09-07 14:45:39.458576" "level"=0 "msg"="with prefix"
    example_test.go:58: "ts"="2021-09-07 14:45:39.458591" "level"=0 "msg"="key/value pairs" "int"=1 "float"=2 "pair"={}
--- PASS: TestLogr (0.00s)

The main difference of course is that the output from the logger in this PR is more like plain text. I personally find it more readable and would prefer to use that instead of logr/testing. Another option would be to copy the klog plain text formatting code into logr/funcr and use it as an alternative to the current format.

Another difference: funcr intentionally doesn't call String, which makes printing of the "pair" example fairly useless. However, in contrast to what the documentation says, Error is called. @thockin: wouldn't it be better to also call String?

Finally, time stamping can be enabled for logr/testing and it includes the message verbosity. Both could be added also to the klog logger.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 7, 2021
This logr implementation can be used in tests to ensure that output
is associated with the currently running test. Compared to using the
global klog instance in a standard Go test that has some advantages:
- Log messages are associated with the currently running test.
  Tests can run in parallel without interleaving their output.
- Log output is only printed when a test fails, unless "go test -v"
  is used.
- Because of that, the logger can print all log messages. It is
  not necessary to add klog.InitFlags to tests and then remember
  to increase the log level when invoking the test.

The motivation for hosting this logger in klog is that it shares the formatting
code with klogr. Conceptually this is identical to go-logr/logr/testing, just
the output is different (klog text format vs. JSON).

  $ go test -v ./testinglogger/example/
  === RUN   TestKlog
      example_test.go:69: INFO hello world
      example_test.go:70: ERROR failed err="failed: some error"
      example_test.go:71: INFO verbosity 1
      example_test.go:72: INFO main/helper: with prefix
      example_test.go:73: INFO key/value pairs int=1 float=2 pair="(1, 2)" kobj="kube-system/sally"
  --- PASS: TestKlog (0.00s)
  === RUN   TestLogr
      example_test.go:69: "ts"="2021-09-07 16:44:54.307551" "level"=0 "msg"="hello world"
      example_test.go:70: "ts"="2021-09-07 16:44:54.307664" "msg"="failed" "error"="failed: some error"
      example_test.go:71: "ts"="2021-09-07 16:44:54.307686" "level"=1 "msg"="verbosity 1"
      example_test.go:72: main/helper: "ts"="2021-09-07 16:44:54.307703" "level"=0 "msg"="with prefix"
      example_test.go:73: "ts"="2021-09-07 16:44:54.307733" "level"=0 "msg"="key/value pairs" "int"=1 "float"=2 "pair"={} "kobj"={"name":"sally","namespace":"kube-system"}
  --- PASS: TestLogr (0.00s)
@pohly
Copy link
Author

pohly commented Sep 7, 2021

Not calling String is also bad for KObj:

    example_test.go:73: INFO key/value pairs int=1 float=2 pair="(1, 2)" kobj="kube-system/sally"
vs.
    example_test.go:73: "ts"="2021-09-07 16:44:54.307733" "level"=0 "msg"="key/value pairs" "int"=1 "float"=2 "pair"={} "kobj"={"name":"sally","namespace":"kube-system"}

@pohly
Copy link
Author

pohly commented Sep 8, 2021

On the other hand, perhaps that's the intention: in plain text, format as <namespace>/<name> for the sake of readability, in JSON, format as object?

That works for klog's current ObjectReference struct because it only contains those two fields. But when implementing something like the smarter KObj2 from #261 where creating that struct is avoided, dumping the entire wrapped object isn't intended. Could be solved by supporting MarshalJSON instead of String in funcr.

I checked what zap does. It uses (in this order):

  • String if defined
  • MarshalJSON if defined
  • compact representation of the set fields ({"KMetadata":{"metadata":{"name":"pod1","namespace":"kube-system","creationTimestamp":null},"spec":{"containers":null},"status":{}}} instead of {KMetadata:&Pod{ObjectMeta:{pod1 kube-system 0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []},Spec:PodSpec{Volumes:[]Volume{},... (klog) or {"KMetadata":{"":{"kind":"","apiVersion":""},"metadata":{"name":"pod1","generateName":"","namespace":"kube-system","selfLink":"","uid":"","resourceVersion":"","generation":0,... (funcr))

@thockin
Copy link
Member

thockin commented Sep 9, 2021

Error() is only called on the explict error argument (not arbitrary K-V pairs) in the Error(...) path.

Calling String() and Error() and MarshalJSON() and MarshalYAML() and ... on arbitrary args has 2 problems (I did implement that but never merged it). 1) It has a cost - every arg needs to have N additional interface assertions; but more importantly 2) if we do that, then we eventually have to add a flag which says NOT to. CF spew, which added an IgnoreMethods option because sometimes you want the raw struct. As it stands, you can CHOOSE to call String() or Error() at the call site.

If we have lazy-built kobj (or anything) that means every callsite needs to call o.String().

I am skeptical of the optimization in #261, though. First copying strings in Go is fast. I bet all the cost is going into reflect. Why do we need to handle nil objects at all? Why not make KMetadata carry an IsNil() or IsValid() method (needing to push that all the way back up-stack is unpleasant)? Or make core types return "" if nil? I haven't investigated any of these ideas, obviously :) If you add the interface assertions to funcr, how much of your optimization goes away?

Second, premature optimization? Do we have any signal that kObj conversion accounts for any significant fraction of any real test? Lastly, The Go team handwaved at a possible optimization to peek at and inline Logger.Info and Logger.V and elide the pack/unpack of the args slice completely when the level would not fire a log. That doesn't exist yet, but it seems to erase your concern (assuming the optimization is valid)

There are alternatives I guess - we could force people who want "raw" output to say so. E.g. https://play.golang.org/p/0jteCzMMYuT but that's pretty ugly and requires ANOTHER optional LogSink interface.

I'm open to better answers.

@pohly
Copy link
Author

pohly commented Sep 9, 2021

Second, premature optimization? Do we have any signal that kObj conversion accounts for any significant fraction of any real test?

I'm not convinced either that KObj2 in that other PR is worthwhile. I tried out the idea because I hadn't seen anywhere that this had been considered, so I wanted to benchmark it. The benchmark shows some advantage, but not a whole lot.

The bigger question is the one which I raised here: what do we expect to be logged for KObj (or some replacement)?

Apparently it is "/" for text format and a JSON struct for json (as @serathius said in #261 (comment)), but that is not how it currently works in Kubernetes for the json format because zap doesn't do what was expected and instead calls String.

Ignoring the handling of KObj for a second, what do you think about a testing logger which outputs in human-readable format? Should that be here (this PR) or in go-logr (with an extended funcr formatter)?

@pohly
Copy link
Author

pohly commented Sep 9, 2021

Let's discuss the desired output for KObj in kubernetes/kubernetes#104877

@pohly
Copy link
Author

pohly commented Oct 6, 2021

Ignoring the handling of KObj for a second, what do you think about a testing logger which outputs in human-readable format? Should that be here (this PR) or in go-logr (with an extended funcr formatter)?

Let's come back to this. Originally, I wanted to avoid forking the output formatting code from klog (hence this PR). But considering the long-term goal of reducing dependencies of, say, client-go, it might make more sense to have a stand-alone logr.Logger implementation.

I find the klog output more readable than the JSON formatting from funcr. Besides that, Kubernetes will depend on support for the "text" format for quite a while longer - I am not sure whether the goal is to deprecate it and replace it with JSON. The KEP doesn't mention such a goal.

Considering these two points, I'm now leaning towards adding a plain-text output mode to funcr and closing this PR.

@pohly pohly changed the title testinglogger: per-test, structured logging RFC: testinglogger: per-test, structured logging Oct 6, 2021
@k8s-ci-robot
Copy link

@pohly: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 22, 2021
pohly added a commit to pohly/kubernetes that referenced this pull request Dec 8, 2021
This originally was proposed as enhancement for klog in
kubernetes/klog#240: the klog output formatting gets
turned into utility code which then can be used with a bit of boilerplate code
in a LogSink implementation.

The helper code from kubernetes/klog#240 was
de-duplicating keys because that was what the original klogr code did.

This de-duplication gets removed here because:
- Neither klog nor zap do that.
- It hides a problem in invalid log calls (keys should be unique).
- The code did massive allocations because each append() for the
  final slice created a new slice.
- It did not handle missing values as intended (inserted a nil value
  instead of MISSING).
@pohly
Copy link
Author

pohly commented Dec 17, 2021

I moved this code into the POC for the contextual logging KEP - see kubernetes/enhancements#3078

Support for logging during a unit test will be part of that KEP.

/close

@k8s-ci-robot
Copy link

@pohly: Closed this PR.

In response to this:

I moved this code into the POC for the contextual logging KEP - see kubernetes/enhancements#3078

Support for logging during a unit test will be part of that KEP.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

pohly added a commit to pohly/kubernetes that referenced this pull request Jan 28, 2022
This originally was proposed as enhancement for klog in
kubernetes/klog#240: the klog output formatting gets
turned into utility code which then can be used with a bit of boilerplate code
in a LogSink implementation.

The helper code from kubernetes/klog#240 was
de-duplicating keys because that was what the original klogr code did.

This de-duplication gets removed here because:
- Neither klog nor zap do that.
- It hides a problem in invalid log calls (keys should be unique).
- The code did massive allocations because each append() for the
  final slice created a new slice.
- It did not handle missing values as intended (inserted a nil value
  instead of MISSING).
pohly added a commit to pohly/kubernetes that referenced this pull request Feb 1, 2022
This originally was proposed as enhancement for klog in
kubernetes/klog#240: the klog output formatting gets
turned into utility code which then can be used with a bit of boilerplate code
in a LogSink implementation.

The helper code from kubernetes/klog#240 was
de-duplicating keys because that was what the original klogr code did.

This de-duplication gets removed here because:
- Neither klog nor zap do that.
- It hides a problem in invalid log calls (keys should be unique).
- The code did massive allocations because each append() for the
  final slice created a new slice.
- It did not handle missing values as intended (inserted a nil value
  instead of MISSING).
pohly added a commit to pohly/kubernetes that referenced this pull request Feb 8, 2022
This originally was proposed as enhancement for klog in
kubernetes/klog#240: the klog output formatting gets
turned into utility code which then can be used with a bit of boilerplate code
in a LogSink implementation.

The helper code from kubernetes/klog#240 was
de-duplicating keys because that was what the original klogr code did.

This de-duplication gets removed here because:
- Neither klog nor zap do that.
- It hides a problem in invalid log calls (keys should be unique).
- The code did massive allocations because each append() for the
  final slice created a new slice.
- It did not handle missing values as intended (inserted a nil value
  instead of MISSING).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants