Add SharedInformer and SharedInformerFactory setDebugItems #3121

codefromthecrypt · 2024-02-27T15:04:12Z

This centralizes logging into the existing ReflectorRunnable type, notably adding conditional logging of events.

This is toggled by setDebugItems(boolean) on SharedInformer and SharedInformerFactory, because ReflectorRunnable isn't directly accessible by end users.

Now you can configure these independently of ApiType or, if your needs are simple, derive the setting from the logger.

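// Dump items only when the ReflectorRunnable logger is already at DEBUG.
boolean shouldDebugItems = LogFactory.getLog(ReflectorRunnable.class).isDebugEnabled();
// Apply the same setting to the factory and to each informer.
sharedInformerFactory.setDebugItems(shouldDebugItems);
serviceInformer.setDebugItems(shouldDebugItems);
endpointsInformer.setDebugItems(shouldDebugItems);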

Then, you can see streaming updates like this (in this case, I deleted a pod intentionally so it would be recreated):

2024-02-28T01:29:56.980Z DEBUG 1 --- [s.V1Endpoints-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : V1Endpoints#Receiving resourceVersion 154844
2024-02-28T01:30:06.582Z DEBUG 1 --- [s.V1Endpoints-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : V1Endpoints#Next item class V1Endpoints {
    apiVersion: v1
    kind: Endpoints
    metadata: class V1ObjectMeta {
        annotations: {endpoints.kubernetes.io/last-change-trigger-time=2024-02-28T01:30:06Z}
        creationTimestamp: 2024-02-27T05:54:17Z
        deletionGracePeriodSeconds: null
        deletionTimestamp: null
        finalizers: null
        generateName: null
        generation: null
        labels: {app.kubernetes.io/instance=zipkin, app.kubernetes.io/managed-by=Helm, app.kubernetes.io/name=zipkin, app.kubernetes.io/version=3.0.6, helm.sh/chart=zipkin-0.2.0}
        managedFields: [class V1ManagedFieldsEntry {
            apiVersion: v1
            fieldsType: FieldsV1
            fieldsV1: {f:metadata={f:annotations={.={}, f:endpoints.kubernetes.io/last-change-trigger-time={}}, f:labels={.={}, f:app.kubernetes.io/instance={}, f:app.kubernetes.io/managed-by={}, f:app.kubernetes.io/name={}, f:app.kubernetes.io/version={}, f:helm.sh/chart={}}}, f:subsets={}}
            manager: k3s
            operation: Update
            subresource: null
            time: 2024-02-28T01:30:06Z
        }]
        name: zipkin
        namespace: default
        ownerReferences: null
        resourceVersion: 154855
        selfLink: null
        uid: 462ae678-e912-4491-bc93-3851580cae10
    }
    subsets: [class V1EndpointSubset {
        addresses: [class V1EndpointAddress {
            hostname: null
            ip: 10.42.0.225
            nodeName: colima
            targetRef: class V1ObjectReference {
                apiVersion: null
                fieldPath: null
                kind: Pod
                name: zipkin-57667879f-kc49k
                namespace: default
                resourceVersion: null
                uid: 8670d47d-7582-4d0c-93b3-cff41d8a5a50
            }
        }]
        notReadyAddresses: null
        ports: [class CoreV1EndpointPort {
            appProtocol: null
            name: http-query
            port: 9411
            protocol: TCP
        }]
    }]
}
2024-02-28T01:30:06.586Z DEBUG 1 --- [s.V1Endpoints-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : V1Endpoints#Receiving resourceVersion 154855
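
To make the toggle concrete, here is a minimal, self-contained sketch of conditional item logging. It is not the code in this PR: `DebugItemsSketch`, its constructor, and the `Consumer<String>` standing in for a DEBUG-level logger are all hypothetical, and the message formats are only modeled on the log lines above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a reflector loop; not the client library's ReflectorRunnable.
final class DebugItemsSketch {
  private final String tag;            // simple type name used as the log tag
  private final Consumer<String> log;  // assumption: stand-in for a DEBUG-level logger
  private volatile boolean debugItems; // off by default, so items are not dumped

  DebugItemsSketch(Class<?> apiType, Consumer<String> log) {
    this.tag = apiType.getSimpleName();
    this.log = log;
  }

  void setDebugItems(boolean debugItems) {
    this.debugItems = debugItems;
  }

  // Called once per event: the item dump is conditional, the version line is not.
  void onItem(Object item, String resourceVersion) {
    if (debugItems) {
      log.accept(tag + "#Next item " + item);
    }
    log.accept(tag + "#Receiving resourceVersion " + resourceVersion);
  }

  public static void main(String[] args) {
    List<String> lines = new ArrayList<>();
    DebugItemsSketch reflector = new DebugItemsSketch(StringBuilder.class, lines::add);
    reflector.onItem("first", "154844");  // logs only the resourceVersion line
    reflector.setDebugItems(true);
    reflector.onItem("second", "154855"); // logs the item, then the resourceVersion line
    lines.forEach(System.out::println);
  }
}
```

Keeping the flag per instance is what would let one noisy type stay quiet while another is inspected.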

This also reduces the verbosity of log lines by making the log tag match the simple name of the type instead of the type's toString(). As a side effect, the tag also matches the item toString values, which also use the simple name.

For example, without this change, the log message prefix is extremely long and includes "class io.kubernetes.client.openapi.models." which isn't needed to disambiguate the watched types.

2024-02-28T01:07:55.114Z DEBUG 1 --- [els.V1Service-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : class io.kubernetes.client.openapi.models.V1Service#Extract resourceVersion 154265 list meta
2024-02-28T01:07:55.114Z DEBUG 1 --- [els.V1Service-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : class io.kubernetes.client.openapi.models.V1Service#Initial items [class V1Service {
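
The difference between the two tags comes down to `Class.toString()` versus `Class.getSimpleName()`. A small sketch, using `java.util.ArrayList` in place of a model type so it runs without the client library (`LogTagSketch` and its method names are illustrative only):

```java
import java.util.ArrayList;

// Illustrative only: compares the two ways of deriving a log tag from a Class.
final class LogTagSketch {
  // Previous behavior as described above: the tag is the type's toString().
  static String verboseTag(Class<?> type) {
    return type.toString();
  }

  // Behavior after this change as described above: the tag is the simple name.
  static String simpleTag(Class<?> type) {
    return type.getSimpleName();
  }

  public static void main(String[] args) {
    // prints "class java.util.ArrayList#Extract resourceVersion"
    System.out.println(verboseTag(ArrayList.class) + "#Extract resourceVersion");
    // prints "ArrayList#Extract resourceVersion"
    System.out.println(simpleTag(ArrayList.class) + "#Extract resourceVersion");
  }
}
```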

This adds debug logging for the logger: io.kubernetes.client.util.Watch

Notably, this logs when a watch is created, receives a new response body line, is exhausted, or closed.

The below is an example log from spring-cloud-kubernetes discoveryserver which uses watches. As you'll notice, its log format already includes the thread name. However, the path watched is in the log message so that formats that don't include the thread can disambiguate.

Feedback welcome!

See kubernetes-client#275 (comment)

```
2024-02-27T14:56:19.253Z DEBUG 1 --- [s.V1Endpoints-1] [ ] io.kubernetes.client.util.Watch : creating watch /api/v1/namespaces/default/endpoints
2024-02-27T14:56:19.256Z DEBUG 1 --- [els.V1Service-1] [ ] io.kubernetes.client.util.Watch : creating watch /api/v1/namespaces/default/services
--snip--
2024-02-27T14:57:46.137Z DEBUG 1 --- [s.V1Endpoints-1] [ ] io.kubernetes.client.util.Watch : /api/v1/namespaces/default/endpoints response line: {"type":"MODIFIED","object":{"kind":"Endpoints","apiVersion":"v1","metadata":{"name":"demo2","namespace":"default","uid":"7f00905a-2f1f-4f19-ab0f-13e8131fc6d5","resourceVersion":"151340","creationTimestamp":"2024-02-23T07:43:53Z","labels":{"app.kubernetes.io/instance":"demo2","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"demo2","app.kubernetes.io/version":"1.16.0","helm.sh/chart":"demo2-0.1.0"},"managedFields":[{"manager":"k3s","operation":"Update","apiVersion":"v1","time":"2024-02-23T07:43:54Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{".":{},"f:app.kubernetes.io/instance":{},"f:app.kubernetes.io/managed-by":{},"f:app.kubernetes.io/name":{},"f:app.kubernetes.io/version":{},"f:helm.sh/chart":{}}}}}]}}}
```

Signed-off-by: Adrian Cole <adrian@tetrate.io>

codefromthecrypt · 2024-02-27T15:06:24Z

ps happy to add integration tests if this feels helpful. cc @spencergibb @wind57 @ryanjbaxter as I feel with this log detail, it is a lot easier to reason with the result of API responses in spring-cloud-kubernetes-discoveryserver. Totally ack that in a realistic size cluster, this is way too much logging.

brendandburns · 2024-02-27T17:41:08Z

util/src/main/java/io/kubernetes/client/util/Watch.java

@@ -92,6 +93,8 @@ public static <T> Watch<T> createWatch(ApiClient client, Call call, Type watchTy
      throw new ApiException("Watch is incompatible with debugging mode active.");
    }
    try {
+      String watchName = call.request().url().encodedPath();


We may want to include the query parameters here since they can also modify the watch.
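
One way this suggestion could look, sketched with `java.net.URI` so it runs without OkHttp. In the real code the URL is an OkHttp `HttpUrl`, which exposes `encodedQuery()` alongside `encodedPath()`. `WatchNameSketch` and the example host below are made up for illustration.

```java
import java.net.URI;

// Illustrative only: builds a watch name from the path plus the query, when one exists.
final class WatchNameSketch {
  static String watchName(URI uri) {
    String query = uri.getRawQuery(); // null when the URI has no query component
    return query == null ? uri.getRawPath() : uri.getRawPath() + "?" + query;
  }

  public static void main(String[] args) {
    // Hypothetical host; the path mirrors the ones in the logs above.
    URI uri = URI.create(
        "https://k8s.example/api/v1/namespaces/default/endpoints?watch=true&labelSelector=app%3Dzipkin");
    System.out.println(watchName(uri));
  }
}
```

Two watches on the same path with different selectors would then produce distinct names in the log.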

brendandburns · 2024-02-27T17:42:37Z

Generally this looks ok to me. I do think that it is going to introduce a bunch of log spam in large clusters.

Perhaps we want the ability to turn it on/off for specific watches?

other than that lgtm.

codefromthecrypt · 2024-02-27T22:53:25Z

Thanks for the review @brendandburns. In analyzing how to propagate a "trace logging" flag, it seems the most sensible place to instrument is ReflectorRunnable where all the other logging is happening, and where the tagging (to manage the interleaving issue) is going on. I'll post a different version in the next commit.


codefromthecrypt · 2024-02-28T01:35:39Z

Changed impl and description, PTAL. If keen, I'll polish with tests.

yue9944882

thanks for adding this, additionally i'm slightly concerned if dumping the whole object might make the log content too lengthy but overall the code will help improve the informer's observability

/lgtm

codefromthecrypt · 2024-02-28T03:00:12Z

@yue9944882 hate to thrash, but another way is to add an ItemListener that gets these instead, and then someone can make a logging variant of it? (could also make a tracing variant of it, too)
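
For reference, a rough sketch of what such a listener could look like. Nothing here exists in the client library: the `ItemListener` interface, the `LoggingItemListener` variant, and the `Consumer<String>` log sink are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of the listener idea; none of these types exist in the client library.
final class ItemListenerSketch {
  // Callback that would receive each item as it arrives.
  interface ItemListener<T> {
    void onItem(T item);
  }

  // Logging variant: formats each item with a type tag and hands it to a sink.
  static final class LoggingItemListener<T> implements ItemListener<T> {
    private final String tag;
    private final Consumer<String> sink; // assumption: stand-in for a logger

    LoggingItemListener(Class<T> apiType, Consumer<String> sink) {
      this.tag = apiType.getSimpleName();
      this.sink = sink;
    }

    @Override
    public void onItem(T item) {
      sink.accept(tag + "#Next item " + item);
    }
  }

  public static void main(String[] args) {
    List<String> lines = new ArrayList<>();
    ItemListener<String> listener = new LoggingItemListener<>(String.class, lines::add);
    listener.onItem("zipkin");
    lines.forEach(System.out::println);
  }
}
```

A tracing variant would implement the same interface and record a span event instead of writing a log line.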

codefromthecrypt · 2024-02-28T05:35:03Z

ps ignore my last comment. I violated the KISS principle. This change fits in with existing practice, is simple and doesn't prevent something more sophisticated later.

codefromthecrypt · 2024-02-28T05:35:57Z

on soft approval from both folks, I'll add tests!

k8s-ci-robot · 2024-03-05T17:01:54Z

New changes are detected. LGTM label has been removed.

k8s-ci-robot · 2024-03-05T17:01:56Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: codefromthecrypt
Once this PR has been reviewed and has the lgtm label, please ask for approval from yue9944882. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-triage-robot · 2024-06-03T23:45:29Z

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

Mark this PR as fresh with /remove-lifecycle stale
Close this PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot requested review from brendandburns and yue9944882 February 27, 2024 15:04

k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 27, 2024

codefromthecrypt mentioned this pull request Feb 27, 2024

Synchronous watches #275

Closed

brendandburns reviewed Feb 27, 2024


k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 28, 2024

codefromthecrypt changed the title ~~Add debug logging for Watch~~ Add SharedInformer and SharedInformerFactory setDebugItems Feb 28, 2024

codefromthecrypt requested a review from brendandburns February 28, 2024 01:35

yue9944882 reviewed Feb 28, 2024


k8s-ci-robot assigned yue9944882 Feb 28, 2024

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 28, 2024

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 2, 2024

Merge branch 'master' into watch-log

14c791b

k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 5, 2024

k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 5, 2024

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 3, 2024
