Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SharedInformer and SharedInformerFactory setDebugItems #3121

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

codefromthecrypt
Copy link
Contributor

@codefromthecrypt codefromthecrypt commented Feb 27, 2024

This centralizes logging into the existing ReflectorRunnable type, notably adding conditional logging of events.

This is toggled by setDebugItems(bool) on SharedInformer and SharedInformerFactory as ReflectorRunnable isn't directly accessible by end users.

Now, you can configure these independent of ApiType or if your needs are simple, use the logger.

boolean shouldDebugItems = LogFactory.getLog(ReflectorRunnable.class).isDebugEnabled();
sharedInformerFactory.setDebugItems(shouldDebugItems);
serviceInformer.setDebugItems(shouldDebugItems);
endpointsInformer.setDebugItems(shouldDebugItems);

Then, you can see streaming updates like this (in this case, I deleted a pod intentionally so it would be recreated):

2024-02-28T01:29:56.980Z DEBUG 1 --- [s.V1Endpoints-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : V1Endpoints#Receiving resourceVersion 154844
2024-02-28T01:30:06.582Z DEBUG 1 --- [s.V1Endpoints-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : V1Endpoints#Next item class V1Endpoints {
    apiVersion: v1
    kind: Endpoints
    metadata: class V1ObjectMeta {
        annotations: {endpoints.kubernetes.io/last-change-trigger-time=2024-02-28T01:30:06Z}
        creationTimestamp: 2024-02-27T05:54:17Z
        deletionGracePeriodSeconds: null
        deletionTimestamp: null
        finalizers: null
        generateName: null
        generation: null
        labels: {app.kubernetes.io/instance=zipkin, app.kubernetes.io/managed-by=Helm, app.kubernetes.io/name=zipkin, app.kubernetes.io/version=3.0.6, helm.sh/chart=zipkin-0.2.0}
        managedFields: [class V1ManagedFieldsEntry {
            apiVersion: v1
            fieldsType: FieldsV1
            fieldsV1: {f:metadata={f:annotations={.={}, f:endpoints.kubernetes.io/last-change-trigger-time={}}, f:labels={.={}, f:app.kubernetes.io/instance={}, f:app.kubernetes.io/managed-by={}, f:app.kubernetes.io/name={}, f:app.kubernetes.io/version={}, f:helm.sh/chart={}}}, f:subsets={}}
            manager: k3s
            operation: Update
            subresource: null
            time: 2024-02-28T01:30:06Z
        }]
        name: zipkin
        namespace: default
        ownerReferences: null
        resourceVersion: 154855
        selfLink: null
        uid: 462ae678-e912-4491-bc93-3851580cae10
    }
    subsets: [class V1EndpointSubset {
        addresses: [class V1EndpointAddress {
            hostname: null
            ip: 10.42.0.225
            nodeName: colima
            targetRef: class V1ObjectReference {
                apiVersion: null
                fieldPath: null
                kind: Pod
                name: zipkin-57667879f-kc49k
                namespace: default
                resourceVersion: null
                uid: 8670d47d-7582-4d0c-93b3-cff41d8a5a50
            }
        }]
        notReadyAddresses: null
        ports: [class CoreV1EndpointPort {
            appProtocol: null
            name: http-query
            port: 9411
            protocol: TCP
        }]
    }]
}
2024-02-28T01:30:06.586Z DEBUG 1 --- [s.V1Endpoints-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : V1Endpoints#Receiving resourceVersion 154855

This also reduces the verbosity of log lines, by making the log tag match the simple name of the type instead of the type's toString(). As a side effect, this also matches the item toString values who also use the simple name.

For example, without this change, the log message prefix is extremely long and includes "class io.kubernetes.client.openapi.models." which isn't needed to disambiguate types watched.

2024-02-28T01:07:55.114Z DEBUG 1 --- [els.V1Service-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : class io.kubernetes.client.openapi.models.V1Service#Extract resourceVersion 154265 list meta
2024-02-28T01:07:55.114Z DEBUG 1 --- [els.V1Service-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : class io.kubernetes.client.openapi.models.V1Service#Initial items [class V1Service {

This adds debug logging for the logger: io.kubernetes.client.util.Watch

Notably, this logs when a watch is created, receives a new response body
line, is exhausted, or closed.

The below is an example log from spring-cloud-kubernetes discoveryserver
which uses watches. As you'll notice, its log format already includes
the thread name. However, the path watched is in the log message so that
formats that don't include the thread can disambiguate.

Feedback welcome!

See kubernetes-client#275 (comment)

```
2024-02-27T14:56:19.253Z DEBUG 1 --- [s.V1Endpoints-1] [                                                 ] io.kubernetes.client.util.Watch          : creating watch /api/v1/namespaces/default/endpoints
2024-02-27T14:56:19.256Z DEBUG 1 --- [els.V1Service-1] [                                                 ] io.kubernetes.client.util.Watch          : creating watch /api/v1/namespaces/default/services
--snip--
2024-02-27T14:57:46.137Z DEBUG 1 --- [s.V1Endpoints-1] [
] io.kubernetes.client.util.Watch          :
/api/v1/namespaces/default/endpoints response line:
{"type":"MODIFIED","object":{"kind":"Endpoints","apiVersion":"v1","metadata":{"name":"demo2","namespace":"default","uid":"7f00905a-2f1f-4f19-ab0f-13e8131fc6d5","resourceVersion":"151340","creationTimestamp":"2024-02-23T07:43:53Z","labels":{"app.kubernetes.io/instance":"demo2","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"demo2","app.kubernetes.io/version":"1.16.0","helm.sh/chart":"demo2-0.1.0"},"managedFields":[{"manager":"k3s","operation":"Update","apiVersion":"v1","time":"2024-02-23T07:43:54Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{".":{},"f:app.kubernetes.io/instance":{},"f:app.kubernetes.io/managed-by":{},"f:app.kubernetes.io/name":{},"f:app.kubernetes.io/version":{},"f:helm.sh/chart":{}}}}}]}}}
```

Signed-off-by: Adrian Cole <adrian@tetrate.io>
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 27, 2024
@codefromthecrypt
Copy link
Contributor Author

ps happy to add integration tests if this feels helpful. cc @spencergibb @wind57 @ryanjbaxter as I feel with this log detail, it is a lot easier to reason with the result of API responses in spring-cloud-kubernetes-discoveryserver. Totally ack that in a realistic size cluster, this is way too much logging.

@@ -92,6 +93,8 @@ public static <T> Watch<T> createWatch(ApiClient client, Call call, Type watchTy
throw new ApiException("Watch is incompatible with debugging mode active.");
}
try {
String watchName = call.request().url().encodedPath();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to include the query parameters here since they can also modify the watch.

@brendandburns
Copy link
Contributor

Generally this looks ok to me. I do think that it is going to introduce a bunch of log spam in large clusters.

Perhaps we want the ability to turn it on/off for specific watches?

other than that lgtm.

@codefromthecrypt
Copy link
Contributor Author

Thanks for the review @brendandburns. In analyzing how to propagate a "trace logging" flag, it seems the most sensible place to instrument is ReflectorRunnable where all the other logging is happening, and where the tagging (to manage the interleaving issue) is going on. I'll post a different version in the next commit.

This centralizes logging into the existing `ReflectorRunnable` type, notably adding conditional logging of events.

This is toggled by `setDebugItems(bool)` on `SharedInformer` and `SharedInformerFactory` as `ReflectorRunnable` isn't directly accessible by end users.

Now, you can configure these independent of `ApiType` or if your needs are simple, use the logger.
```java
boolean shouldDebugItems = LogFactory.getLog(ReflectorRunnable.class).isDebugEnabled();
sharedInformerFactory.setDebugItems(shouldDebugItems);
serviceInformer.setDebugItems(shouldDebugItems);
endpointsInformer.setDebugItems(shouldDebugItems);
```

Then, you can see streaming updates like this:

```
2024-02-28T01:29:56.980Z DEBUG 1 --- [s.V1Endpoints-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : V1Endpoints#Receiving resourceVersion 154844
2024-02-28T01:30:06.582Z DEBUG 1 --- [s.V1Endpoints-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : V1Endpoints#Next item class V1Endpoints {
    apiVersion: v1
    kind: Endpoints
    metadata: class V1ObjectMeta {
        annotations: {endpoints.kubernetes.io/last-change-trigger-time=2024-02-28T01:30:06Z}
        creationTimestamp: 2024-02-27T05:54:17Z
        deletionGracePeriodSeconds: null
        deletionTimestamp: null
        finalizers: null
        generateName: null
        generation: null
        labels: {app.kubernetes.io/instance=zipkin, app.kubernetes.io/managed-by=Helm, app.kubernetes.io/name=zipkin, app.kubernetes.io/version=3.0.6, helm.sh/chart=zipkin-0.2.0}
        managedFields: [class V1ManagedFieldsEntry {
            apiVersion: v1
            fieldsType: FieldsV1
            fieldsV1: {f:metadata={f:annotations={.={}, f:endpoints.kubernetes.io/last-change-trigger-time={}}, f:labels={.={}, f:app.kubernetes.io/instance={}, f:app.kubernetes.io/managed-by={}, f:app.kubernetes.io/name={}, f:app.kubernetes.io/version={}, f:helm.sh/chart={}}}, f:subsets={}}
            manager: k3s
            operation: Update
            subresource: null
            time: 2024-02-28T01:30:06Z
        }]
        name: zipkin
        namespace: default
        ownerReferences: null
        resourceVersion: 154855
        selfLink: null
        uid: 462ae678-e912-4491-bc93-3851580cae10
    }
    subsets: [class V1EndpointSubset {
        addresses: [class V1EndpointAddress {
            hostname: null
            ip: 10.42.0.225
            nodeName: colima
            targetRef: class V1ObjectReference {
                apiVersion: null
                fieldPath: null
                kind: Pod
                name: zipkin-57667879f-kc49k
                namespace: default
                resourceVersion: null
                uid: 8670d47d-7582-4d0c-93b3-cff41d8a5a50
            }
        }]
        notReadyAddresses: null
        ports: [class CoreV1EndpointPort {
            appProtocol: null
            name: http-query
            port: 9411
            protocol: TCP
        }]
    }]
}
2024-02-28T01:30:06.586Z DEBUG 1 --- [s.V1Endpoints-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : V1Endpoints#Receiving resourceVersion 154855
```

This also reduces the verbosity of log lines, by making the log tag match the simple name of the type instead of the type's `toString()`. As a side effect, this also matches the item `toString` values who also use the simple name.

For example, without this change, the log message prefix is extremely long and includes "class io.kubernetes.client.openapi.models." which isn't needed to disambiguate types watched.

```
2024-02-28T01:07:55.114Z DEBUG 1 --- [els.V1Service-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : class io.kubernetes.client.openapi.models.V1Service#Extract resourceVersion 154265 list meta
2024-02-28T01:07:55.114Z DEBUG 1 --- [els.V1Service-1] [                                                 ] i.k.c.informer.cache.ReflectorRunnable   : class io.kubernetes.client.openapi.models.V1Service#Initial items [class V1Service {
```

Signed-off-by: Adrian Cole <adrian@tetrate.io>
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 28, 2024
@codefromthecrypt codefromthecrypt changed the title Add debug logging for Watch Add SharedInformer and SharedInformerFactory setDebugItems Feb 28, 2024
@codefromthecrypt
Copy link
Contributor Author

Changed impl and description, PTAL. If keen, I'll polish with tests.

Copy link
Member

@yue9944882 yue9944882 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for adding this, additionally i'm slightly concerned if dumping the whole object might make the log content too lengthy but overall the code will help improve the informer's observability

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 28, 2024
@codefromthecrypt
Copy link
Contributor Author

@yue9944882 hate to thrash, but another way is to add an ItemListener that gets these instead, and then someone can make a logging variant of it? (could also make a tracing variant of it, too)

@codefromthecrypt
Copy link
Contributor Author

ps ignore my last comment. I violated the KISS principle. This change fits in with existing practice, is simple and doesn't prevent something more sophisticated later.

@codefromthecrypt
Copy link
Contributor Author

on soft approval from both folks, I'll add tests!

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 2, 2024
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 5, 2024
@k8s-ci-robot
Copy link
Contributor

New changes are detected. LGTM label has been removed.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: codefromthecrypt
Once this PR has been reviewed and has the lgtm label, please ask for approval from yue9944882. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants