Graceful Node Shutdown does not update endpoints for terminating pods #116965

Open
mikekap opened this issue Mar 28, 2023 · 34 comments · May be fixed by #117822
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@mikekap
Contributor

mikekap commented Mar 28, 2023

What happened?

When graceful node shutdown is enabled, I expect that services are drained appropriately during pod shutdown (i.e. the same way that kubectl delete pod would work). However this doesn't seem to be the case - the pod never enters the Terminating state according to kubectl and doesn't get removed from the service endpoints until it enters the Completed or Failed state. This causes connection errors and lost requests in Cluster-routed services.

What did you expect to happen?

Once pod termination starts, the EndpointSlice should be updated so that the terminating pod is removed or marked as terminating, even before the pod is fully shut down. This is what happens when a pod is removed for any other reason (e.g. kubectl delete pod).
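
For reference, when a pod is deleted normally (e.g. kubectl delete pod) on a 1.26 cluster, the endpointslice controller is expected to flip the endpoint's conditions to something like the following while the preStop hook is still running - ready goes false, terminating goes true, and serving keeps tracking the readiness probe:

  conditions:
    ready: false
    serving: true
    terminating: true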

How can we reproduce it (as minimally and precisely as possible)?

You can see the repro at https://github.com/mikekap/vagrant-kubeadm-kubernetes/tree/graceful-shutdown-repro . The repro sets up a 2 node cluster with graceful shutdown enabled and a simple deployment/service.
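
For context, graceful node shutdown in the repro is driven by kubelet configuration along these lines (a sketch, not the repro's exact file; the values match what the kubelet later logs in this thread as shutdownGracePeriodRequested="1m0s" and shutdownGracePeriodCriticalPods="10s"):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
shutdownGracePeriod: 60s
shutdownGracePeriodCriticalPods: 10s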

Here is the deployment/service YAML, pasted again:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-deployment
spec:
  selector:
    matchLabels:
      app: echo
  replicas: 2
  template:
    metadata:
      labels:
        app: echo
    spec:
      containers:
      - name: echo
        image: mendhak/http-https-echo
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "echo 'Begin shutdown' > /proc/1/fd/1; sleep 20; echo 'End shutdown' > /proc/1/fd/1; exit 0"]
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: echo
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
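
Assuming the manifests above are saved as echo.yaml (the filename is just for illustration), they can be applied with:

kubectl apply -f echo.yaml
kubectl rollout status deployment/echo-deployment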

To reproduce:

  • Run the deployment, preferably one pod on each node.
  • Run watch -n 1 'kubectl describe endpointslice my-service-5j2nw' in a separate terminal (unfortunately you'll need to figure out the right generated name; see the label-based variant below)
  • Use the virtualbox UI to send an ACPI shutdown to one of the nodes.

Within 20 seconds (the preStop hook duration), the EndpointSlice should be updated to have the pod that is terminating removed. This is what happens when a pod gets removed for any other reason (e.g. kubectl delete pod).

To confirm that everything else worked, you can reboot the node and check the journal logs from the previous boot (via something like journalctl -b -1) to see the shutdown sequence triggering the kubelet graceful shutdown logic.
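
If you'd rather not look up the generated slice name, here is a label-based variant of the watch command (a sketch; it relies on the kubernetes.io/service-name label that the endpointslice controller puts on managed slices):

watch -n 1 'kubectl get endpointslices -l kubernetes.io/service-name=my-service -o custom-columns=NAME:.metadata.name,READY:.endpoints[*].conditions.ready,SERVING:.endpoints[*].conditions.serving,TERMINATING:.endpoints[*].conditions.terminating'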

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.2", GitCommit:"5835544ca568b757a8ecae5c153f317e5736700e", GitTreeState:"clean", BuildDate:"2022-09-21T14:33:49Z", GoVersion:"go1.19.1", Compiler:"gc", Platform:"darwin/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.3", GitCommit:"9e644106593f3f4aa98f8a84b23db5fa378900bd", GitTreeState:"clean", BuildDate:"2023-03-15T13:33:12Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

None

OS version

# On Linux:
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.2 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.2 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
$ uname -a
Linux master-node 5.15.0-67-generic #74-Ubuntu SMP Wed Feb 22 14:14:39 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Install tools

kubeadm

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@mikekap mikekap added the kind/bug Categorizes issue or PR as related to a bug. label Mar 28, 2023
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 28, 2023
@mikekap
Contributor Author

mikekap commented Mar 28, 2023

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 28, 2023
@olderTaoist
Contributor

/assign

@olderTaoist
Contributor

olderTaoist commented Mar 29, 2023

When graceful node shutdown is enabled, I expect that services are drained appropriately during pod shutdown (i.e. the same way that kubectl delete pod would work). However this doesn't seem to be the case - the pod never enters the Terminating state according to kubectl and doesn't get removed from the service endpoints until it enters the Completed or Failed state. This causes connection errors and lost requests in Cluster-routed services.

I don't know your service's backend implementation, but routing only to healthy backends is an important feature. The terminating pod in the EndpointSlice is not ready, as shown below, so new traffic shouldn't be routed to it.

Name:         my-service-rxsv4
Namespace:    default
Labels:       endpointslice.kubernetes.io/managed-by=endpointslice-controller.k8s.io
              kubernetes.io/service-name=my-service
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2023-03-29T08:41:01Z
AddressType:  IPv4
Ports:
  Name     Port  Protocol
  ----     ----  --------
  <unset>  8080  TCP
Endpoints:
  - Addresses:  10.244.2.2
    Conditions:
      Ready:    false
    Hostname:   <unset>
    TargetRef:  Pod/echo-deployment-754f54d797-hjptn
    Topology:   kubernetes.io/hostname=kubernetes1.23.8-worker3
  - Addresses:  10.244.1.16
    Conditions:
      Ready:    true
    Hostname:   <unset>
    TargetRef:  Pod/echo-deployment-754f54d797-hstr7
    Topology:   kubernetes.io/hostname=kubernetes1.23.8-worker

That said, it does seem unreasonable for the terminating pod to remain in the EndpointSlice; it should behave like Endpoints does.
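
For what it's worth, a quick way to compare how the two APIs represent the same pod during the shutdown (plain kubectl, nothing specific to this repro):

kubectl get endpoints my-service -o yaml
kubectl get endpointslices -l kubernetes.io/service-name=my-service -o yaml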

@aojea
Member

aojea commented Mar 29, 2023

Within 5 seconds (the preStop hook duration), the EndpointSlice should be updated to have the pod that is terminating removed. This is what happens when a pod gets removed for any other reason (e.g. kubectl delete pod).

It will be important to check the logs of the endpointslice controller in the controller-manager and correlate them with the pod and node states.

@mikekap
Contributor Author

mikekap commented Mar 29, 2023

Terminating pods appearing in EndpointSlice might be intentional due to KEP-1672, though that's just a guess from the name.

However, I'm not seeing what you posted above. What I'm seeing is that neither the Endpoints object nor the EndpointSlice is updated for the duration of the shutdown - they only change once the shutdown ends. The readiness check continues reporting true. The only thing that gets updated is the status of the Node.

I updated the repro a little to print the preStop hook sequence in the log and to increase the grace period. It doesn't materially change anything. Here's a video of what happens on my machine using the repo I linked, with controller logs: https://drive.google.com/file/d/1-_hsvkJq3cOtLxEzQ2R8Fy4MEXka2Z82/view?usp=sharing . Both the Endpoints object and the EndpointSlice still list the terminating pod after "Begin shutdown" appears.
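
For anyone else reproducing this, the pod and node state can be watched in parallel during the shutdown with something like (the app=echo label comes from the manifest above):

kubectl get pods -l app=echo -o wide --watch
kubectl get nodes --watch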

@SergeyKanzhelev
Member

Do you see the terminating condition on the EndpointSlice (https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#terminating) change for the terminating pod?

/triage needs-information

Also, if you can share kubelet.log and the endpointslice controller logs that @aojea requested, that would be helpful.
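
For a quick check, something along these lines should print each endpoint's conditions (a sketch; adjust the label selector to the service in question):

kubectl get endpointslices -l kubernetes.io/service-name=my-service -o jsonpath='{range .items[*].endpoints[*]}{.targetRef.name}{": "}{.conditions}{"\n"}{end}'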

@k8s-ci-robot k8s-ci-robot added the triage/needs-information Indicates an issue needs more information in order to work on it. label Mar 29, 2023
@SergeyKanzhelev SergeyKanzhelev moved this from Triage to Needs Information in SIG Node Bugs Mar 29, 2023
@mikekap
Contributor Author

mikekap commented Mar 29, 2023

The EndpointSlice conditions remain like this for the entire duration:

  conditions:
    ready: true
    serving: true
    terminating: false

I captured the logs in the video I posted above as well. Here are the logs pasted (from a second attempt). I shut down the node at 18:02:58. Spoiler: there's nothing there. Is there a flag to turn on more specific logs for the endpoint slice controller?
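
One way to get more detail (a sketch, assuming the kubeadm static-pod layout used in this repro) is to raise the controller-manager's klog verbosity by adding a flag to its static pod manifest - most controller sync messages only show up at -v=4 or higher:

# /etc/kubernetes/manifests/kube-controller-manager.yaml - add under spec.containers[0].command
    - --v=4

The kubelet picks up the manifest change and restarts the controller-manager pod automatically.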

kube-controller-manager.log
I0329 17:44:20.259182       1 serving.go:348] Generated self-signed cert in-memory
I0329 17:44:21.113389       1 controllermanager.go:182] Version: v1.26.3
I0329 17:44:21.113680       1 controllermanager.go:184] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
I0329 17:44:21.114684       1 secure_serving.go:210] Serving securely on 127.0.0.1:10257
I0329 17:44:21.115427       1 leaderelection.go:248] attempting to acquire leader lease kube-system/kube-controller-manager...
I0329 17:44:21.115695       1 dynamic_cafile_content.go:157] "Starting controller" name="request-header::/etc/kubernetes/pki/front-proxy-ca.crt"
I0329 17:44:21.115791       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I0329 17:44:21.115866       1 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
E0329 17:44:22.929615       1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: leases.coordination.k8s.io "kube-controller-manager" is forbidden: User "system:kube-controller-manager" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-system"
I0329 17:44:25.227809       1 leaderelection.go:258] successfully acquired lease kube-system/kube-controller-manager
I0329 17:44:25.228405       1 event.go:294] "Event occurred" object="kube-system/kube-controller-manager" fieldPath="" kind="Lease" apiVersion="coordination.k8s.io/v1" type="Normal" reason="LeaderElection" message="master-node_a9fb8898-c5ec-4850-8f26-8abe3edd4cb9 became leader"
I0329 17:44:25.253865       1 shared_informer.go:273] Waiting for caches to sync for tokens
I0329 17:44:25.273293       1 controllermanager.go:622] Started "podgc"
I0329 17:44:25.273722       1 gc_controller.go:102] Starting GC controller
I0329 17:44:25.273759       1 shared_informer.go:273] Waiting for caches to sync for GC
I0329 17:44:25.283313       1 node_lifecycle_controller.go:492] Controller will reconcile labels.
I0329 17:44:25.283388       1 controllermanager.go:622] Started "nodelifecycle"
W0329 17:44:25.283399       1 core.go:221] configure-cloud-routes is set, but no cloud provider specified. Will not configure cloud provider routes.
W0329 17:44:25.283404       1 controllermanager.go:600] Skipping "route"
I0329 17:44:25.283542       1 node_lifecycle_controller.go:527] Sending events to api server.
I0329 17:44:25.283598       1 node_lifecycle_controller.go:538] Starting node controller
I0329 17:44:25.283612       1 shared_informer.go:273] Waiting for caches to sync for taint
I0329 17:44:25.292067       1 controllermanager.go:622] Started "ttl-after-finished"
I0329 17:44:25.292225       1 ttlafterfinished_controller.go:104] Starting TTL after finished controller
I0329 17:44:25.292476       1 shared_informer.go:273] Waiting for caches to sync for TTL after finished
I0329 17:44:25.321570       1 controllermanager.go:622] Started "attachdetach"
I0329 17:44:25.322201       1 attach_detach_controller.go:328] Starting attach detach controller
I0329 17:44:25.322611       1 shared_informer.go:273] Waiting for caches to sync for attach detach
I0329 17:44:25.332003       1 controllermanager.go:622] Started "endpointslicemirroring"
I0329 17:44:25.332246       1 endpointslicemirroring_controller.go:211] Starting EndpointSliceMirroring controller
I0329 17:44:25.332261       1 shared_informer.go:273] Waiting for caches to sync for endpoint_slice_mirroring
I0329 17:44:25.342539       1 controllermanager.go:622] Started "daemonset"
I0329 17:44:25.343172       1 daemon_controller.go:265] Starting daemon sets controller
I0329 17:44:25.343239       1 shared_informer.go:273] Waiting for caches to sync for daemon sets
I0329 17:44:25.351166       1 controllermanager.go:622] Started "cronjob"
I0329 17:44:25.351346       1 cronjob_controllerv2.go:137] "Starting cronjob controller v2"
I0329 17:44:25.351678       1 shared_informer.go:273] Waiting for caches to sync for cronjob
I0329 17:44:25.354670       1 shared_informer.go:280] Caches are synced for tokens
E0329 17:44:25.360123       1 core.go:92] Failed to start service controller: WARNING: no cloud provider provided, services of type LoadBalancer will fail
W0329 17:44:25.360207       1 controllermanager.go:600] Skipping "service"
E0329 17:44:25.362385       1 core.go:207] failed to start cloud node lifecycle controller: no cloud provider provided
W0329 17:44:25.362429       1 controllermanager.go:600] Skipping "cloud-node-lifecycle"
I0329 17:44:25.368959       1 controllermanager.go:622] Started "clusterrole-aggregation"
I0329 17:44:25.369352       1 clusterroleaggregation_controller.go:188] Starting ClusterRoleAggregator
I0329 17:44:25.369655       1 shared_informer.go:273] Waiting for caches to sync for ClusterRoleAggregator
I0329 17:44:25.378420       1 controllermanager.go:622] Started "pv-protection"
I0329 17:44:25.378542       1 pv_protection_controller.go:75] Starting PV protection controller
I0329 17:44:25.378785       1 shared_informer.go:273] Waiting for caches to sync for PV protection
I0329 17:44:25.482651       1 controllermanager.go:622] Started "ephemeral-volume"
I0329 17:44:25.482726       1 controller.go:169] Starting ephemeral volume controller
I0329 17:44:25.482734       1 shared_informer.go:273] Waiting for caches to sync for ephemeral
I0329 17:44:25.633281       1 controllermanager.go:622] Started "endpoint"
I0329 17:44:25.633495       1 endpoints_controller.go:178] Starting endpoint controller
I0329 17:44:25.633539       1 shared_informer.go:273] Waiting for caches to sync for endpoint
I0329 17:44:25.975502       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for limitranges
I0329 17:44:25.975777       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for controllerrevisions.apps
I0329 17:44:25.976011       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for podtemplates
W0329 17:44:25.976097       1 shared_informer.go:550] resyncPeriod 16h24m13.897255872s is smaller than resyncCheckPeriod 20h37m54.968684142s and the informer has already started. Changing it to 20h37m54.968684142s
I0329 17:44:25.976825       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for serviceaccounts
I0329 17:44:25.976925       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for networkpolicies.networking.k8s.io
I0329 17:44:25.976951       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for ingresses.networking.k8s.io
I0329 17:44:25.976974       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for roles.rbac.authorization.k8s.io
I0329 17:44:25.977007       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for deployments.apps
I0329 17:44:25.977215       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for csistoragecapacities.storage.k8s.io
I0329 17:44:25.977547       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for leases.coordination.k8s.io
I0329 17:44:25.977938       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for endpoints
I0329 17:44:25.978454       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for horizontalpodautoscalers.autoscaling
I0329 17:44:25.978737       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for poddisruptionbudgets.policy
I0329 17:44:25.979088       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for daemonsets.apps
I0329 17:44:25.979255       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for cronjobs.batch
I0329 17:44:25.979450       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for jobs.batch
I0329 17:44:25.985947       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for rolebindings.rbac.authorization.k8s.io
I0329 17:44:25.986052       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for replicasets.apps
I0329 17:44:25.986079       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for statefulsets.apps
I0329 17:44:25.986106       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for endpointslices.discovery.k8s.io
I0329 17:44:25.986128       1 controllermanager.go:622] Started "resourcequota"
I0329 17:44:25.986550       1 resource_quota_controller.go:277] Starting resource quota controller
I0329 17:44:25.986560       1 shared_informer.go:273] Waiting for caches to sync for resource quota
I0329 17:44:25.986781       1 resource_quota_monitor.go:295] QuotaMonitor running
I0329 17:44:26.187306       1 controllermanager.go:622] Started "garbagecollector"
I0329 17:44:26.187498       1 garbagecollector.go:154] Starting garbage collector controller
I0329 17:44:26.187582       1 shared_informer.go:273] Waiting for caches to sync for garbage collector
I0329 17:44:26.187626       1 graph_builder.go:291] GraphBuilder running
I0329 17:44:26.434244       1 controllermanager.go:622] Started "replicaset"
I0329 17:44:26.434361       1 replica_set.go:201] Starting replicaset controller
I0329 17:44:26.434371       1 shared_informer.go:273] Waiting for caches to sync for ReplicaSet
I0329 17:44:26.635794       1 controllermanager.go:622] Started "disruption"
I0329 17:44:26.635830       1 disruption.go:424] Sending events to api server.
I0329 17:44:26.635867       1 disruption.go:435] Starting disruption controller
I0329 17:44:26.635873       1 shared_informer.go:273] Waiting for caches to sync for disruption
I0329 17:44:26.788951       1 controllermanager.go:622] Started "replicationcontroller"
I0329 17:44:26.789052       1 replica_set.go:201] Starting replicationcontroller controller
I0329 17:44:26.789060       1 shared_informer.go:273] Waiting for caches to sync for ReplicationController
I0329 17:44:26.935227       1 certificate_controller.go:112] Starting certificate controller "csrsigning-kubelet-serving"
I0329 17:44:26.935244       1 shared_informer.go:273] Waiting for caches to sync for certificate-csrsigning-kubelet-serving
I0329 17:44:26.935262       1 dynamic_serving_content.go:132] "Starting controller" name="csr-controller::/etc/kubernetes/pki/ca.crt::/etc/kubernetes/pki/ca.key"
I0329 17:44:26.935602       1 certificate_controller.go:112] Starting certificate controller "csrsigning-kubelet-client"
I0329 17:44:26.935611       1 shared_informer.go:273] Waiting for caches to sync for certificate-csrsigning-kubelet-client
I0329 17:44:26.935623       1 dynamic_serving_content.go:132] "Starting controller" name="csr-controller::/etc/kubernetes/pki/ca.crt::/etc/kubernetes/pki/ca.key"
I0329 17:44:26.935867       1 certificate_controller.go:112] Starting certificate controller "csrsigning-kube-apiserver-client"
I0329 17:44:26.935876       1 shared_informer.go:273] Waiting for caches to sync for certificate-csrsigning-kube-apiserver-client
I0329 17:44:26.935889       1 dynamic_serving_content.go:132] "Starting controller" name="csr-controller::/etc/kubernetes/pki/ca.crt::/etc/kubernetes/pki/ca.key"
I0329 17:44:26.936108       1 controllermanager.go:622] Started "csrsigning"
I0329 17:44:26.936161       1 certificate_controller.go:112] Starting certificate controller "csrsigning-legacy-unknown"
I0329 17:44:26.936166       1 shared_informer.go:273] Waiting for caches to sync for certificate-csrsigning-legacy-unknown
I0329 17:44:26.936183       1 dynamic_serving_content.go:132] "Starting controller" name="csr-controller::/etc/kubernetes/pki/ca.crt::/etc/kubernetes/pki/ca.key"
I0329 17:44:36.998464       1 range_allocator.go:109] No Secondary Service CIDR provided. Skipping filtering out secondary service addresses.
I0329 17:44:36.998764       1 controllermanager.go:622] Started "nodeipam"
I0329 17:44:36.999126       1 node_ipam_controller.go:155] Starting ipam controller
I0329 17:44:36.999503       1 shared_informer.go:273] Waiting for caches to sync for node
E0329 17:44:37.082324       1 namespaced_resources_deleter.go:162] unable to get all supported resources from server: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
I0329 17:44:37.082557       1 controllermanager.go:622] Started "namespace"
I0329 17:44:37.082702       1 namespace_controller.go:195] Starting namespace controller
I0329 17:44:37.083406       1 shared_informer.go:273] Waiting for caches to sync for namespace
I0329 17:44:37.104763       1 controllermanager.go:622] Started "horizontalpodautoscaling"
I0329 17:44:37.105222       1 horizontal.go:181] Starting HPA controller
I0329 17:44:37.105300       1 shared_informer.go:273] Waiting for caches to sync for HPA
I0329 17:44:37.116242       1 controllermanager.go:622] Started "pvc-protection"
I0329 17:44:37.116599       1 pvc_protection_controller.go:99] "Starting PVC protection controller"
I0329 17:44:37.117124       1 shared_informer.go:273] Waiting for caches to sync for PVC protection
I0329 17:44:37.130583       1 controllermanager.go:622] Started "statefulset"
I0329 17:44:37.130734       1 stateful_set.go:152] Starting stateful set controller
I0329 17:44:37.130891       1 shared_informer.go:273] Waiting for caches to sync for stateful set
I0329 17:44:37.132483       1 controllermanager.go:622] Started "csrcleaner"
I0329 17:44:37.132702       1 cleaner.go:82] Starting CSR cleaner controller
I0329 17:44:37.147976       1 controllermanager.go:622] Started "bootstrapsigner"
I0329 17:44:37.148274       1 shared_informer.go:273] Waiting for caches to sync for bootstrap_signer
I0329 17:44:37.160552       1 controllermanager.go:622] Started "persistentvolume-expander"
I0329 17:44:37.160972       1 expand_controller.go:340] Starting expand controller
I0329 17:44:37.161019       1 shared_informer.go:273] Waiting for caches to sync for expand
I0329 17:44:37.170562       1 controllermanager.go:622] Started "serviceaccount"
I0329 17:44:37.170773       1 serviceaccounts_controller.go:111] Starting service account controller
I0329 17:44:37.170989       1 shared_informer.go:273] Waiting for caches to sync for service account
I0329 17:44:37.181090       1 controllermanager.go:622] Started "job"
I0329 17:44:37.181387       1 job_controller.go:191] Starting job controller
I0329 17:44:37.181481       1 shared_informer.go:273] Waiting for caches to sync for job
I0329 17:44:37.184259       1 controllermanager.go:622] Started "csrapproving"
I0329 17:44:37.184403       1 certificate_controller.go:112] Starting certificate controller "csrapproving"
I0329 17:44:37.184444       1 shared_informer.go:273] Waiting for caches to sync for certificate-csrapproving
I0329 17:44:37.289238       1 controllermanager.go:622] Started "tokencleaner"
I0329 17:44:37.289427       1 tokencleaner.go:111] Starting token cleaner controller
I0329 17:44:37.289563       1 shared_informer.go:273] Waiting for caches to sync for token_cleaner
I0329 17:44:37.289922       1 shared_informer.go:280] Caches are synced for token_cleaner
I0329 17:44:37.442330       1 controllermanager.go:622] Started "persistentvolume-binder"
I0329 17:44:37.442464       1 pv_controller_base.go:318] Starting persistent volume controller
I0329 17:44:37.442476       1 shared_informer.go:273] Waiting for caches to sync for persistent volume
I0329 17:44:37.591165       1 controllermanager.go:622] Started "endpointslice"
I0329 17:44:37.591430       1 endpointslice_controller.go:257] Starting endpoint slice controller
I0329 17:44:37.591695       1 shared_informer.go:273] Waiting for caches to sync for endpoint_slice
I0329 17:44:37.738674       1 controllermanager.go:622] Started "deployment"
I0329 17:44:37.738835       1 deployment_controller.go:154] "Starting controller" controller="deployment"
I0329 17:44:37.738984       1 shared_informer.go:273] Waiting for caches to sync for deployment
I0329 17:44:37.889823       1 controllermanager.go:622] Started "ttl"
I0329 17:44:37.889984       1 ttl_controller.go:120] Starting TTL controller
I0329 17:44:37.890032       1 shared_informer.go:273] Waiting for caches to sync for TTL
I0329 17:44:38.038471       1 controllermanager.go:622] Started "root-ca-cert-publisher"
I0329 17:44:38.038762       1 publisher.go:101] Starting root CA certificate configmap publisher
I0329 17:44:38.038810       1 shared_informer.go:273] Waiting for caches to sync for crt configmap
I0329 17:44:38.041054       1 shared_informer.go:273] Waiting for caches to sync for resource quota
W0329 17:44:38.057002       1 actual_state_of_world.go:541] Failed to update statusUpdateNeeded field in actual state of world: Failed to set statusUpdateNeeded to needed true, because nodeName="master-node" does not exist
I0329 17:44:38.074322       1 shared_informer.go:280] Caches are synced for ClusterRoleAggregator
I0329 17:44:38.081546       1 shared_informer.go:280] Caches are synced for job
I0329 17:44:38.087283       1 shared_informer.go:280] Caches are synced for certificate-csrapproving
I0329 17:44:38.089101       1 shared_informer.go:280] Caches are synced for ReplicationController
I0329 17:44:38.091465       1 shared_informer.go:280] Caches are synced for TTL
I0329 17:44:38.094515       1 shared_informer.go:280] Caches are synced for namespace
I0329 17:44:38.095778       1 shared_informer.go:280] Caches are synced for ephemeral
I0329 17:44:38.095964       1 shared_informer.go:280] Caches are synced for taint
I0329 17:44:38.096046       1 node_lifecycle_controller.go:1438] Initializing eviction metric for zone: 
W0329 17:44:38.096119       1 node_lifecycle_controller.go:1053] Missing timestamp for Node master-node. Assuming now as a timestamp.
I0329 17:44:38.096164       1 node_lifecycle_controller.go:1254] Controller detected that zone  is now in state Normal.
I0329 17:44:38.096607       1 taint_manager.go:206] "Starting NoExecuteTaintManager"
I0329 17:44:38.096667       1 taint_manager.go:211] "Sending events to api server"
I0329 17:44:38.096945       1 event.go:294] "Event occurred" object="master-node" fieldPath="" kind="Node" apiVersion="v1" type="Normal" reason="RegisteredNode" message="Node master-node event: Registered Node master-node in Controller"
E0329 17:44:38.097146       1 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
I0329 17:44:38.087915       1 shared_informer.go:280] Caches are synced for service account
I0329 17:44:38.087927       1 shared_informer.go:280] Caches are synced for GC
I0329 17:44:38.087985       1 shared_informer.go:280] Caches are synced for PV protection
I0329 17:44:38.099052       1 shared_informer.go:280] Caches are synced for TTL after finished
I0329 17:44:38.100587       1 shared_informer.go:280] Caches are synced for node
I0329 17:44:38.100655       1 range_allocator.go:167] Sending events to api server.
I0329 17:44:38.100701       1 range_allocator.go:171] Starting range CIDR allocator
I0329 17:44:38.100725       1 shared_informer.go:273] Waiting for caches to sync for cidrallocator
I0329 17:44:38.100748       1 shared_informer.go:280] Caches are synced for cidrallocator
I0329 17:44:38.105938       1 shared_informer.go:280] Caches are synced for HPA
E0329 17:44:38.114577       1 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
I0329 17:44:38.117716       1 shared_informer.go:280] Caches are synced for PVC protection
I0329 17:44:38.119761       1 shared_informer.go:273] Waiting for caches to sync for garbage collector
I0329 17:44:38.124153       1 range_allocator.go:372] Set node master-node PodCIDR to [172.16.0.0/24]
I0329 17:44:38.133904       1 shared_informer.go:280] Caches are synced for stateful set
I0329 17:44:38.137337       1 shared_informer.go:280] Caches are synced for certificate-csrsigning-legacy-unknown
I0329 17:44:38.137488       1 shared_informer.go:280] Caches are synced for ReplicaSet
I0329 17:44:38.137560       1 shared_informer.go:280] Caches are synced for disruption
I0329 17:44:38.137669       1 shared_informer.go:280] Caches are synced for certificate-csrsigning-kube-apiserver-client
I0329 17:44:38.137897       1 shared_informer.go:280] Caches are synced for certificate-csrsigning-kubelet-serving
I0329 17:44:38.137973       1 shared_informer.go:280] Caches are synced for certificate-csrsigning-kubelet-client
I0329 17:44:38.139471       1 shared_informer.go:280] Caches are synced for crt configmap
I0329 17:44:38.139529       1 shared_informer.go:280] Caches are synced for deployment
I0329 17:44:38.144566       1 shared_informer.go:280] Caches are synced for daemon sets
I0329 17:44:38.144690       1 shared_informer.go:280] Caches are synced for persistent volume
I0329 17:44:38.149458       1 shared_informer.go:280] Caches are synced for bootstrap_signer
I0329 17:44:38.152381       1 shared_informer.go:280] Caches are synced for cronjob
I0329 17:44:38.161708       1 shared_informer.go:280] Caches are synced for expand
I0329 17:44:38.223092       1 shared_informer.go:280] Caches are synced for attach detach
I0329 17:44:38.291183       1 shared_informer.go:280] Caches are synced for resource quota
I0329 17:44:38.297172       1 shared_informer.go:280] Caches are synced for endpoint_slice
I0329 17:44:38.333085       1 shared_informer.go:280] Caches are synced for endpoint_slice_mirroring
I0329 17:44:38.334962       1 shared_informer.go:280] Caches are synced for endpoint
I0329 17:44:38.342374       1 shared_informer.go:280] Caches are synced for resource quota
I0329 17:44:38.688777       1 shared_informer.go:280] Caches are synced for garbage collector
I0329 17:44:38.688868       1 garbagecollector.go:163] Garbage collector: all resource monitors have synced. Proceeding to collect garbage
I0329 17:44:38.721936       1 shared_informer.go:280] Caches are synced for garbage collector
I0329 17:44:38.962531       1 event.go:294] "Event occurred" object="kube-system/calico-kube-controllers" fieldPath="" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScalingReplicaSet" message="Scaled up replica set calico-kube-controllers-57b57c56f to 1"
I0329 17:44:38.967324       1 event.go:294] "Event occurred" object="kube-system/metrics-server" fieldPath="" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScalingReplicaSet" message="Scaled up replica set metrics-server-85dfd9f587 to 1"
I0329 17:44:38.967373       1 event.go:294] "Event occurred" object="default/echo-deployment" fieldPath="" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScalingReplicaSet" message="Scaled up replica set echo-deployment-d658654dc to 2"
I0329 17:44:38.967568       1 event.go:294] "Event occurred" object="kube-system/coredns" fieldPath="" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScalingReplicaSet" message="Scaled up replica set coredns-787d4945fb to 2"
I0329 17:44:39.045181       1 event.go:294] "Event occurred" object="kube-system/kube-proxy" fieldPath="" kind="DaemonSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: kube-proxy-r4z4s"
I0329 17:44:39.052824       1 event.go:294] "Event occurred" object="kube-system/calico-node" fieldPath="" kind="DaemonSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: calico-node-gmchc"
I0329 17:44:39.272219       1 event.go:294] "Event occurred" object="kube-system/calico-kube-controllers-57b57c56f" fieldPath="" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: calico-kube-controllers-57b57c56f-2mld7"
I0329 17:44:39.281771       1 event.go:294] "Event occurred" object="kube-system/metrics-server-85dfd9f587" fieldPath="" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: metrics-server-85dfd9f587-526gn"
I0329 17:44:39.298651       1 event.go:294] "Event occurred" object="kube-system/coredns-787d4945fb" fieldPath="" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: coredns-787d4945fb-cbjgc"
I0329 17:44:39.376524       1 event.go:294] "Event occurred" object="kube-system/coredns-787d4945fb" fieldPath="" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: coredns-787d4945fb-4wj8k"
I0329 17:44:39.377219       1 event.go:294] "Event occurred" object="default/echo-deployment-d658654dc" fieldPath="" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: echo-deployment-d658654dc-nq5wx"
I0329 17:44:39.402610       1 event.go:294] "Event occurred" object="default/echo-deployment-d658654dc" fieldPath="" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: echo-deployment-d658654dc-n96q7"
E0329 17:45:08.386017       1 resource_quota_controller.go:417] unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
I0329 17:45:08.386237       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for networksets.crd.projectcalico.org
I0329 17:45:08.386282       1 resource_quota_monitor.go:218] QuotaMonitor created object count evaluator for networkpolicies.crd.projectcalico.org
I0329 17:45:08.386528       1 shared_informer.go:273] Waiting for caches to sync for resource quota
I0329 17:45:08.690887       1 shared_informer.go:280] Caches are synced for resource quota
W0329 17:45:08.767657       1 garbagecollector.go:752] failed to discover some groups: map[metrics.k8s.io/v1beta1:the server is currently unable to handle the request]
E0329 17:45:08.775842       1 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0329 17:45:08.789507       1 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
I0329 17:45:08.790986       1 shared_informer.go:273] Waiting for caches to sync for garbage collector
I0329 17:45:09.491928       1 shared_informer.go:280] Caches are synced for garbage collector
E0329 17:45:38.707225       1 resource_quota_controller.go:417] unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
W0329 17:45:39.587820       1 garbagecollector.go:752] failed to discover some groups: map[metrics.k8s.io/v1beta1:the server is currently unable to handle the request]
E0329 17:46:08.731545       1 resource_quota_controller.go:417] unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
W0329 17:46:09.642296       1 garbagecollector.go:752] failed to discover some groups: map[metrics.k8s.io/v1beta1:the server is currently unable to handle the request]
E0329 17:46:38.746253       1 resource_quota_controller.go:417] unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
W0329 17:46:39.715956       1 garbagecollector.go:752] failed to discover some groups: map[metrics.k8s.io/v1beta1:the server is currently unable to handle the request]
W0329 17:46:42.059229       1 actual_state_of_world.go:541] Failed to update statusUpdateNeeded field in actual state of world: Failed to set statusUpdateNeeded to needed true, because nodeName="worker-node01" does not exist
I0329 17:46:42.103010       1 range_allocator.go:372] Set node worker-node01 PodCIDR to [172.16.1.0/24]
I0329 17:46:42.127848       1 event.go:294] "Event occurred" object="kube-system/kube-proxy" fieldPath="" kind="DaemonSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: kube-proxy-d94qh"
I0329 17:46:42.132599       1 event.go:294] "Event occurred" object="kube-system/calico-node" fieldPath="" kind="DaemonSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: calico-node-ffp7j"
W0329 17:46:42.503162       1 topologycache.go:232] Can't get CPU or zone information for worker-node01 node
W0329 17:46:43.122804       1 node_lifecycle_controller.go:1053] Missing timestamp for Node worker-node01. Assuming now as a timestamp.
I0329 17:46:43.123023       1 event.go:294] "Event occurred" object="worker-node01" fieldPath="" kind="Node" apiVersion="v1" type="Normal" reason="RegisteredNode" message="Node worker-node01 event: Registered Node worker-node01 in Controller"
E0329 17:47:08.779181       1 resource_quota_controller.go:417] unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
W0329 17:47:09.787004       1 garbagecollector.go:752] failed to discover some groups: map[metrics.k8s.io/v1beta1:the server is currently unable to handle the request]
E0329 17:47:38.803273       1 resource_quota_controller.go:417] unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
W0329 17:47:39.815692       1 garbagecollector.go:752] failed to discover some groups: map[metrics.k8s.io/v1beta1:the server is currently unable to handle the request]
W0329 17:49:36.642933       1 topologycache.go:232] Can't get CPU or zone information for worker-node01 node
W0329 17:49:36.643821       1 actual_state_of_world.go:541] Failed to update statusUpdateNeeded field in actual state of world: Failed to set statusUpdateNeeded to needed true, because nodeName="worker-node02" does not exist
I0329 17:49:36.664881       1 range_allocator.go:372] Set node worker-node02 PodCIDR to [172.16.2.0/24]
I0329 17:49:36.697228       1 event.go:294] "Event occurred" object="kube-system/kube-proxy" fieldPath="" kind="DaemonSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: kube-proxy-dkw9j"
I0329 17:49:36.720105       1 event.go:294] "Event occurred" object="kube-system/calico-node" fieldPath="" kind="DaemonSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: calico-node-g5mcr"
W0329 17:49:38.161727       1 node_lifecycle_controller.go:1053] Missing timestamp for Node worker-node02. Assuming now as a timestamp.
I0329 17:49:38.162203       1 event.go:294] "Event occurred" object="worker-node02" fieldPath="" kind="Node" apiVersion="v1" type="Normal" reason="RegisteredNode" message="Node worker-node02 event: Registered Node worker-node02 in Controller"
W0329 17:49:39.691377       1 topologycache.go:232] Can't get CPU or zone information for worker-node01 node
I0329 17:49:40.793578       1 event.go:294] "Event occurred" object="kubernetes-dashboard/kubernetes-dashboard" fieldPath="" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScalingReplicaSet" message="Scaled up replica set kubernetes-dashboard-6c7ccbcf87 to 1"
I0329 17:49:40.821788       1 event.go:294] "Event occurred" object="kubernetes-dashboard/kubernetes-dashboard-6c7ccbcf87" fieldPath="" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: kubernetes-dashboard-6c7ccbcf87-vssvh"
I0329 17:49:40.927636       1 event.go:294] "Event occurred" object="kubernetes-dashboard/dashboard-metrics-scraper" fieldPath="" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScalingReplicaSet" message="Scaled up replica set dashboard-metrics-scraper-7bc864c59 to 1"
I0329 17:49:40.960028       1 event.go:294] "Event occurred" object="kubernetes-dashboard/dashboard-metrics-scraper-7bc864c59" fieldPath="" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: dashboard-metrics-scraper-7bc864c59-zz598"
I0329 17:49:43.177221       1 event.go:294] "Event occurred" object="kubernetes-dashboard/kubernetes-dashboard-6c7ccbcf87-vssvh" fieldPath="" kind="Pod" apiVersion="" type="Normal" reason="TaintManagerEviction" message="Cancelling deletion of Pod kubernetes-dashboard/kubernetes-dashboard-6c7ccbcf87-vssvh"
I0329 17:49:43.177311       1 event.go:294] "Event occurred" object="kubernetes-dashboard/dashboard-metrics-scraper-7bc864c59-zz598" fieldPath="" kind="Pod" apiVersion="" type="Normal" reason="TaintManagerEviction" message="Cancelling deletion of Pod kubernetes-dashboard/dashboard-metrics-scraper-7bc864c59-zz598"
I0329 17:49:52.172015       1 event.go:294] "Event occurred" object="default/echo-deployment-d658654dc" fieldPath="" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: echo-deployment-d658654dc-8gkvk"
W0329 18:02:58.062499       1 topologycache.go:232] Can't get CPU or zone information for worker-node01 node
I0329 18:02:59.691439       1 event.go:294] "Event occurred" object="kubernetes-dashboard/dashboard-metrics-scraper-7bc864c59" fieldPath="" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: dashboard-metrics-scraper-7bc864c59-gc8tt"
I0329 18:02:59.762007       1 event.go:294] "Event occurred" object="kubernetes-dashboard/kubernetes-dashboard-6c7ccbcf87" fieldPath="" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: kubernetes-dashboard-6c7ccbcf87-tfx44"
I0329 18:03:18.692228       1 event.go:294] "Event occurred" object="default/echo-deployment-d658654dc" fieldPath="" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: echo-deployment-d658654dc-6qbvk"
W0329 18:05:59.952210       1 topologycache.go:232] Can't get CPU or zone information for worker-node02 node
I0329 18:06:03.461551       1 event.go:294] "Event occurred" object="kubernetes-dashboard/kubernetes-dashboard-6c7ccbcf87-vssvh" fieldPath="" kind="Pod" apiVersion="" type="Normal" reason="TaintManagerEviction" message="Cancelling deletion of Pod kubernetes-dashboard/kubernetes-dashboard-6c7ccbcf87-vssvh"
I0329 18:06:03.462453       1 event.go:294] "Event occurred" object="kubernetes-dashboard/dashboard-metrics-scraper-7bc864c59-zz598" fieldPath="" kind="Pod" apiVersion="" type="Normal" reason="TaintManagerEviction" message="Cancelling deletion of Pod kubernetes-dashboard/dashboard-metrics-scraper-7bc864c59-zz598"
I0329 18:06:03.462743       1 event.go:294] "Event occurred" object="default/echo-deployment-d658654dc-8gkvk" fieldPath="" kind="Pod" apiVersion="" type="Normal" reason="TaintManagerEviction" message="Cancelling deletion of Pod default/echo-deployment-d658654dc-8gkvk"
kubelet.log
Mar 29 17:49:13 worker-node02 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Mar 29 17:49:13 worker-node02 kubelet[4596]: E0329 17:49:13.196597    4596 run.go:74] "command failed" err="failed to validate kubelet flags: the container runtime endpoint address was not specified or empty, use --container-runtime-endpoint to set"
Mar 29 17:49:13 worker-node02 systemd[1]: kubelet.service: Current command vanished from the unit file, execution of the command list won't be resumed.
Mar 29 17:49:13 worker-node02 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Mar 29 17:49:13 worker-node02 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Mar 29 17:49:13 worker-node02 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Mar 29 17:49:13 worker-node02 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Mar 29 17:49:13 worker-node02 kubelet[4634]: E0329 17:49:13.760008    4634 run.go:74] "command failed" err="failed to validate kubelet flags: the container runtime endpoint address was not specified or empty, use --container-runtime-endpoint to set"
Mar 29 17:49:13 worker-node02 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Mar 29 17:49:13 worker-node02 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Mar 29 17:49:22 worker-node02 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Mar 29 17:49:22 worker-node02 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Mar 29 17:49:22 worker-node02 kubelet[5319]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collector will get sandbox image information from CRI.
Mar 29 17:49:22 worker-node02 kubelet[5319]: I0329 17:49:22.999361    5319 server.go:198] "--pod-infra-container-image will not be pruned by the image garbage collector in kubelet and should also be set in the remote runtime"
Mar 29 17:49:23 worker-node02 kubelet[5319]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collector will get sandbox image information from CRI.
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.650221    5319 server.go:412] "Kubelet version" kubeletVersion="v1.26.1"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.650505    5319 server.go:414] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.651809    5319 server.go:836] "Client rotation is on, will bootstrap in background"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.681671    5319 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.694583    5319 server.go:659] "--cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.697917    5319 container_manager_linux.go:267] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.698203    5319 container_manager_linux.go:272] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: KubeletOOMScoreAdj:-999 ContainerRuntime: CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>} {Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] CPUManagerPolicy:none CPUManagerPolicyOptions:map[] ExperimentalTopologyManagerScope:container CPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none ExperimentalTopologyManagerPolicyOptions:map[]}
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.699195    5319 topology_manager.go:134] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.699413    5319 container_manager_linux.go:308] "Creating device plugin manager"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.700517    5319 state_mem.go:36] "Initialized new in-memory state store"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.722092    5319 kubelet.go:398] "Attempting to sync node with API server"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.722455    5319 kubelet.go:286] "Adding static pod path" path="/etc/kubernetes/manifests"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.724164    5319 kubelet.go:297] "Adding apiserver pod source"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.725532    5319 apiserver.go:42] "Waiting for node sync before watching apiserver pods"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.739636    5319 kuberuntime_manager.go:244] "Container runtime initialized" containerRuntime="cri-o" version="1.26.1" apiVersion="v1"
Mar 29 17:49:23 worker-node02 kubelet[5319]: W0329 17:49:23.743400    5319 probe.go:268] Flexvolume plugin directory at /usr/libexec/kubernetes/kubelet-plugins/volume/exec/ does not exist. Recreating.
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.747169    5319 nodeshutdown_manager_linux.go:137] "Creating node shutdown manager" shutdownGracePeriodRequested="1m0s" shutdownGracePeriodCriticalPods="10s" shutdownGracePeriodByPodPriority=[{Priority:0 ShutdownGracePeriodSeconds:50} {Priority:2000000000 ShutdownGracePeriodSeconds:10}]
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.747510    5319 server.go:1186] "Started kubelet"
Mar 29 17:49:23 worker-node02 kubelet[5319]: E0329 17:49:23.755795    5319 kubelet.go:1386] "Image garbage collection failed once. Stats initialization may not have completed yet" err="failed to get imageFs info: unable to find data in memory cache"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.758748    5319 fs_resource_analyzer.go:67] "Starting FS ResourceAnalyzer"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.770366    5319 server.go:161] "Starting to listen" address="0.0.0.0" port=10250
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.772856    5319 server.go:451] "Adding debug handlers to kubelet server"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.781367    5319 volume_manager.go:293] "Starting Kubelet Volume Manager"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.785647    5319 desired_state_of_world_populator.go:151] "Desired state populator starts to run"
Mar 29 17:49:23 worker-node02 kubelet[5319]: E0329 17:49:23.885447    5319 container_manager_linux.go:945] "Unable to get rootfs data from cAdvisor interface" err="unable to find data in memory cache"
Mar 29 17:49:23 worker-node02 kubelet[5319]: W0329 17:49:23.943444    5319 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:anonymous" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
Mar 29 17:49:23 worker-node02 kubelet[5319]: E0329 17:49:23.949633    5319 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:anonymous" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
Mar 29 17:49:23 worker-node02 kubelet[5319]: W0329 17:49:23.950041    5319 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Service: services is forbidden: User "system:anonymous" cannot list resource "services" in API group "" at the cluster scope
Mar 29 17:49:23 worker-node02 kubelet[5319]: E0329 17:49:23.966374    5319 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User "system:anonymous" cannot list resource "services" in API group "" at the cluster scope
Mar 29 17:49:23 worker-node02 kubelet[5319]: W0329 17:49:23.951435    5319 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Node: nodes "worker-node02" is forbidden: User "system:anonymous" cannot list resource "nodes" in API group "" at the cluster scope
Mar 29 17:49:23 worker-node02 kubelet[5319]: E0329 17:49:23.966929    5319 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Node: failed to list *v1.Node: nodes "worker-node02" is forbidden: User "system:anonymous" cannot list resource "nodes" in API group "" at the cluster scope
Mar 29 17:49:23 worker-node02 kubelet[5319]: E0329 17:49:23.952464    5319 controller.go:146] failed to ensure lease exists, will retry in 200ms, error: leases.coordination.k8s.io "worker-node02" is forbidden: User "system:anonymous" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-node-lease"
Mar 29 17:49:23 worker-node02 kubelet[5319]: E0329 17:49:23.952493    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c7efae8d0", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"Starting", Message:"Starting kubelet.", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 747457232, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 747457232, time.Local), Count:1, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events is forbidden: User "system:anonymous" cannot create resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.961713    5319 kubelet_node_status.go:70] "Attempting to register node" node="worker-node02"
Mar 29 17:49:23 worker-node02 kubelet[5319]: E0329 17:49:23.974850    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc34f1", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasSufficientMemory", Message:"Node worker-node02 status is now: NodeHasSufficientMemory", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961451761, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961451761, time.Local), Count:1, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events is forbidden: User "system:anonymous" cannot create resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:23 worker-node02 kubelet[5319]: E0329 17:49:23.975278    5319 kubelet_node_status.go:92] "Unable to register node with API server" err="nodes is forbidden: User \"system:anonymous\" cannot create resource \"nodes\" in API group \"\" at the cluster scope" node="worker-node02"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.980926    5319 cpu_manager.go:214] "Starting CPU manager" policy="none"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.981572    5319 cpu_manager.go:215] "Reconciling" reconcilePeriod="10s"
Mar 29 17:49:23 worker-node02 kubelet[5319]: I0329 17:49:23.981757    5319 state_mem.go:36] "Initialized new in-memory state store"
Mar 29 17:49:23 worker-node02 kubelet[5319]: E0329 17:49:23.987763    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc673f", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasNoDiskPressure", Message:"Node worker-node02 status is now: NodeHasNoDiskPressure", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961464639, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961464639, time.Local), Count:1, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events is forbidden: User "system:anonymous" cannot create resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:23 worker-node02 kubelet[5319]: E0329 17:49:23.993583    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc7668", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasSufficientPID", Message:"Node worker-node02 status is now: NodeHasSufficientPID", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961468520, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961468520, time.Local), Count:1, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events is forbidden: User "system:anonymous" cannot create resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:24 worker-node02 kubelet[5319]: I0329 17:49:24.003505    5319 policy_none.go:49] "None policy: Start"
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.005208    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc34f1", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasSufficientMemory", Message:"Node worker-node02 status is now: NodeHasSufficientMemory", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961451761, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961673297, time.Local), Count:2, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc34f1" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:24 worker-node02 kubelet[5319]: I0329 17:49:24.008529    5319 memory_manager.go:169] "Starting memorymanager" policy="None"
Mar 29 17:49:24 worker-node02 kubelet[5319]: I0329 17:49:24.008742    5319 state_mem.go:35] "Initializing new in-memory state store"
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.013124    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc673f", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasNoDiskPressure", Message:"Node worker-node02 status is now: NodeHasNoDiskPressure", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961464639, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961684197, time.Local), Count:2, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc673f" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.044883    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc7668", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasSufficientPID", Message:"Node worker-node02 status is now: NodeHasSufficientPID", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961468520, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961686568, time.Local), Count:2, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc7668" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:24 worker-node02 kubelet[5319]: I0329 17:49:24.088591    5319 manager.go:455] "Failed to read data from checkpoint" checkpoint="kubelet_internal_checkpoint" err="checkpoint is not found"
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.091117    5319 eviction_manager.go:261] "Eviction manager: failed to get summary stats" err="failed to get node info: node \"worker-node02\" not found"
Mar 29 17:49:24 worker-node02 kubelet[5319]: I0329 17:49:24.091400    5319 plugin_manager.go:118] "Starting Kubelet Plugin Manager"
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.102309    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c93c31c08", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeAllocatableEnforced", Message:"Updated Node Allocatable limit across pods", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 24, 96121864, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 24, 96121864, time.Local), Count:1, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events is forbidden: User "system:anonymous" cannot create resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:24 worker-node02 kubelet[5319]: I0329 17:49:24.105903    5319 kubelet_network_linux.go:63] "Initialized iptables rules." protocol=IPv4
Mar 29 17:49:24 worker-node02 kubelet[5319]: I0329 17:49:24.165022    5319 kubelet_network_linux.go:63] "Initialized iptables rules." protocol=IPv6
Mar 29 17:49:24 worker-node02 kubelet[5319]: I0329 17:49:24.165382    5319 status_manager.go:176] "Starting to sync pod status with apiserver"
Mar 29 17:49:24 worker-node02 kubelet[5319]: I0329 17:49:24.165659    5319 kubelet.go:2113] "Starting kubelet main sync loop"
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.165831    5319 kubelet.go:2137] "Skipping pod synchronization" err="PLEG is not healthy: pleg has yet to be successful"
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.173853    5319 controller.go:146] failed to ensure lease exists, will retry in 400ms, error: leases.coordination.k8s.io "worker-node02" is forbidden: User "system:anonymous" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-node-lease"
Mar 29 17:49:24 worker-node02 kubelet[5319]: I0329 17:49:24.176533    5319 kubelet_node_status.go:70] "Attempting to register node" node="worker-node02"
Mar 29 17:49:24 worker-node02 kubelet[5319]: W0329 17:49:24.179374    5319 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.RuntimeClass: runtimeclasses.node.k8s.io is forbidden: User "system:anonymous" cannot list resource "runtimeclasses" in API group "node.k8s.io" at the cluster scope
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.179584    5319 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.RuntimeClass: failed to list *v1.RuntimeClass: runtimeclasses.node.k8s.io is forbidden: User "system:anonymous" cannot list resource "runtimeclasses" in API group "node.k8s.io" at the cluster scope
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.180428    5319 kubelet_node_status.go:92] "Unable to register node with API server" err="nodes is forbidden: User \"system:anonymous\" cannot create resource \"nodes\" in API group \"\" at the cluster scope" node="worker-node02"
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.181089    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc34f1", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasSufficientMemory", Message:"Node worker-node02 status is now: NodeHasSufficientMemory", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961451761, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 24, 176506536, time.Local), Count:3, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc34f1" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.186811    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc673f", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasNoDiskPressure", Message:"Node worker-node02 status is now: NodeHasNoDiskPressure", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961464639, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 24, 176512538, time.Local), Count:3, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc673f" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.190136    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc7668", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasSufficientPID", Message:"Node worker-node02 status is now: NodeHasSufficientPID", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961468520, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 24, 176514599, time.Local), Count:3, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc7668" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.578556    5319 controller.go:146] failed to ensure lease exists, will retry in 800ms, error: leases.coordination.k8s.io "worker-node02" is forbidden: User "system:anonymous" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-node-lease"
Mar 29 17:49:24 worker-node02 kubelet[5319]: I0329 17:49:24.582160    5319 kubelet_node_status.go:70] "Attempting to register node" node="worker-node02"
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.587346    5319 kubelet_node_status.go:92] "Unable to register node with API server" err="nodes is forbidden: User \"system:anonymous\" cannot create resource \"nodes\" in API group \"\" at the cluster scope" node="worker-node02"
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.588007    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc34f1", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasSufficientMemory", Message:"Node worker-node02 status is now: NodeHasSufficientMemory", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961451761, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 24, 582116830, time.Local), Count:4, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc34f1" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.591801    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc673f", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasNoDiskPressure", Message:"Node worker-node02 status is now: NodeHasNoDiskPressure", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961464639, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 24, 582122798, time.Local), Count:4, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc673f" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.595349    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc7668", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasSufficientPID", Message:"Node worker-node02 status is now: NodeHasSufficientPID", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961468520, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 24, 582127009, time.Local), Count:4, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc7668" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:24 worker-node02 kubelet[5319]: W0329 17:49:24.817128    5319 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Node: nodes "worker-node02" is forbidden: User "system:anonymous" cannot list resource "nodes" in API group "" at the cluster scope
Mar 29 17:49:24 worker-node02 kubelet[5319]: E0329 17:49:24.817439    5319 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Node: failed to list *v1.Node: nodes "worker-node02" is forbidden: User "system:anonymous" cannot list resource "nodes" in API group "" at the cluster scope
Mar 29 17:49:25 worker-node02 kubelet[5319]: W0329 17:49:25.197126    5319 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Service: services is forbidden: User "system:anonymous" cannot list resource "services" in API group "" at the cluster scope
Mar 29 17:49:25 worker-node02 kubelet[5319]: E0329 17:49:25.197747    5319 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User "system:anonymous" cannot list resource "services" in API group "" at the cluster scope
Mar 29 17:49:25 worker-node02 kubelet[5319]: W0329 17:49:25.323987    5319 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:anonymous" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
Mar 29 17:49:25 worker-node02 kubelet[5319]: E0329 17:49:25.324376    5319 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:anonymous" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
Mar 29 17:49:25 worker-node02 kubelet[5319]: E0329 17:49:25.383015    5319 controller.go:146] failed to ensure lease exists, will retry in 1.6s, error: leases.coordination.k8s.io "worker-node02" is forbidden: User "system:anonymous" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-node-lease"
Mar 29 17:49:25 worker-node02 kubelet[5319]: I0329 17:49:25.388362    5319 kubelet_node_status.go:70] "Attempting to register node" node="worker-node02"
Mar 29 17:49:25 worker-node02 kubelet[5319]: E0329 17:49:25.393135    5319 kubelet_node_status.go:92] "Unable to register node with API server" err="nodes is forbidden: User \"system:anonymous\" cannot create resource \"nodes\" in API group \"\" at the cluster scope" node="worker-node02"
Mar 29 17:49:25 worker-node02 kubelet[5319]: E0329 17:49:25.394930    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc34f1", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasSufficientMemory", Message:"Node worker-node02 status is now: NodeHasSufficientMemory", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961451761, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 25, 388324240, time.Local), Count:5, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc34f1" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:25 worker-node02 kubelet[5319]: E0329 17:49:25.399977    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc673f", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasNoDiskPressure", Message:"Node worker-node02 status is now: NodeHasNoDiskPressure", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961464639, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 25, 388331417, time.Local), Count:5, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc673f" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:25 worker-node02 kubelet[5319]: E0329 17:49:25.403524    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc7668", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasSufficientPID", Message:"Node worker-node02 status is now: NodeHasSufficientPID", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961468520, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 25, 388334787, time.Local), Count:5, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc7668" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:25 worker-node02 kubelet[5319]: W0329 17:49:25.722736    5319 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.RuntimeClass: runtimeclasses.node.k8s.io is forbidden: User "system:anonymous" cannot list resource "runtimeclasses" in API group "node.k8s.io" at the cluster scope
Mar 29 17:49:25 worker-node02 kubelet[5319]: E0329 17:49:25.723084    5319 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.RuntimeClass: failed to list *v1.RuntimeClass: runtimeclasses.node.k8s.io is forbidden: User "system:anonymous" cannot list resource "runtimeclasses" in API group "node.k8s.io" at the cluster scope
Mar 29 17:49:26 worker-node02 kubelet[5319]: W0329 17:49:26.768736    5319 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Node: nodes "worker-node02" is forbidden: User "system:anonymous" cannot list resource "nodes" in API group "" at the cluster scope
Mar 29 17:49:26 worker-node02 kubelet[5319]: E0329 17:49:26.769902    5319 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Node: failed to list *v1.Node: nodes "worker-node02" is forbidden: User "system:anonymous" cannot list resource "nodes" in API group "" at the cluster scope
Mar 29 17:49:26 worker-node02 kubelet[5319]: E0329 17:49:26.986715    5319 controller.go:146] failed to ensure lease exists, will retry in 3.2s, error: leases.coordination.k8s.io "worker-node02" is forbidden: User "system:anonymous" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-node-lease"
Mar 29 17:49:26 worker-node02 kubelet[5319]: I0329 17:49:26.994921    5319 kubelet_node_status.go:70] "Attempting to register node" node="worker-node02"
Mar 29 17:49:26 worker-node02 kubelet[5319]: E0329 17:49:26.999112    5319 kubelet_node_status.go:92] "Unable to register node with API server" err="nodes is forbidden: User \"system:anonymous\" cannot create resource \"nodes\" in API group \"\" at the cluster scope" node="worker-node02"
Mar 29 17:49:26 worker-node02 kubelet[5319]: E0329 17:49:26.999691    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc34f1", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasSufficientMemory", Message:"Node worker-node02 status is now: NodeHasSufficientMemory", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961451761, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 26, 994885992, time.Local), Count:6, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc34f1" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:27 worker-node02 kubelet[5319]: E0329 17:49:27.002684    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc673f", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasNoDiskPressure", Message:"Node worker-node02 status is now: NodeHasNoDiskPressure", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961464639, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 26, 994894464, time.Local), Count:6, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc673f" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:27 worker-node02 kubelet[5319]: E0329 17:49:27.005487    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc7668", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasSufficientPID", Message:"Node worker-node02 status is now: NodeHasSufficientPID", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961468520, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 26, 994896749, time.Local), Count:6, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc7668" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:27 worker-node02 kubelet[5319]: W0329 17:49:27.113162    5319 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:anonymous" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
Mar 29 17:49:27 worker-node02 kubelet[5319]: E0329 17:49:27.113556    5319 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:anonymous" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
Mar 29 17:49:27 worker-node02 kubelet[5319]: W0329 17:49:27.734969    5319 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Service: services is forbidden: User "system:anonymous" cannot list resource "services" in API group "" at the cluster scope
Mar 29 17:49:27 worker-node02 kubelet[5319]: E0329 17:49:27.735290    5319 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User "system:anonymous" cannot list resource "services" in API group "" at the cluster scope
Mar 29 17:49:28 worker-node02 kubelet[5319]: W0329 17:49:28.292273    5319 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.RuntimeClass: runtimeclasses.node.k8s.io is forbidden: User "system:anonymous" cannot list resource "runtimeclasses" in API group "node.k8s.io" at the cluster scope
Mar 29 17:49:28 worker-node02 kubelet[5319]: E0329 17:49:28.292777    5319 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.RuntimeClass: failed to list *v1.RuntimeClass: runtimeclasses.node.k8s.io is forbidden: User "system:anonymous" cannot list resource "runtimeclasses" in API group "node.k8s.io" at the cluster scope
Mar 29 17:49:30 worker-node02 kubelet[5319]: E0329 17:49:30.190412    5319 controller.go:146] failed to ensure lease exists, will retry in 6.4s, error: leases.coordination.k8s.io "worker-node02" is forbidden: User "system:anonymous" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-node-lease"
Mar 29 17:49:30 worker-node02 kubelet[5319]: I0329 17:49:30.201525    5319 kubelet_node_status.go:70] "Attempting to register node" node="worker-node02"
Mar 29 17:49:30 worker-node02 kubelet[5319]: E0329 17:49:30.206234    5319 kubelet_node_status.go:92] "Unable to register node with API server" err="nodes is forbidden: User \"system:anonymous\" cannot create resource \"nodes\" in API group \"\" at the cluster scope" node="worker-node02"
Mar 29 17:49:30 worker-node02 kubelet[5319]: E0329 17:49:30.206553    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc34f1", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasSufficientMemory", Message:"Node worker-node02 status is now: NodeHasSufficientMemory", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961451761, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 30, 201473598, time.Local), Count:7, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc34f1" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:30 worker-node02 kubelet[5319]: E0329 17:49:30.210005    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc673f", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasNoDiskPressure", Message:"Node worker-node02 status is now: NodeHasNoDiskPressure", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961464639, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 30, 201492432, time.Local), Count:7, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc673f" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:30 worker-node02 kubelet[5319]: E0329 17:49:30.212623    5319 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"worker-node02.1750f51c8bbc7668", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"worker-node02", UID:"worker-node02", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"NodeHasSufficientPID", Message:"Node worker-node02 status is now: NodeHasSufficientPID", Source:v1.EventSource{Component:"kubelet", Host:"worker-node02"}, FirstTimestamp:time.Date(2023, time.March, 29, 17, 49, 23, 961468520, time.Local), LastTimestamp:time.Date(2023, time.March, 29, 17, 49, 30, 201495623, time.Local), Count:7, Type:"Normal", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "worker-node02.1750f51c8bbc7668" is forbidden: User "system:anonymous" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)
Mar 29 17:49:32 worker-node02 kubelet[5319]: W0329 17:49:32.068738    5319 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Node: nodes "worker-node02" is forbidden: User "system:anonymous" cannot list resource "nodes" in API group "" at the cluster scope
Mar 29 17:49:32 worker-node02 kubelet[5319]: E0329 17:49:32.069206    5319 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Node: failed to list *v1.Node: nodes "worker-node02" is forbidden: User "system:anonymous" cannot list resource "nodes" in API group "" at the cluster scope
Mar 29 17:49:32 worker-node02 kubelet[5319]: W0329 17:49:32.119492    5319 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.Service: services is forbidden: User "system:anonymous" cannot list resource "services" in API group "" at the cluster scope
Mar 29 17:49:32 worker-node02 kubelet[5319]: E0329 17:49:32.119785    5319 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User "system:anonymous" cannot list resource "services" in API group "" at the cluster scope
Mar 29 17:49:33 worker-node02 kubelet[5319]: W0329 17:49:33.120012    5319 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:anonymous" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
Mar 29 17:49:33 worker-node02 kubelet[5319]: E0329 17:49:33.120700    5319 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:anonymous" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
Mar 29 17:49:33 worker-node02 kubelet[5319]: I0329 17:49:33.674904    5319 transport.go:135] "Certificate rotation detected, shutting down client connections to start using new credentials"
Mar 29 17:49:34 worker-node02 kubelet[5319]: E0329 17:49:34.091952    5319 eviction_manager.go:261] "Eviction manager: failed to get summary stats" err="failed to get node info: node \"worker-node02\" not found"
Mar 29 17:49:34 worker-node02 kubelet[5319]: E0329 17:49:34.316024    5319 csi_plugin.go:295] Failed to initialize CSINode: error updating CSINode annotation: timed out waiting for the condition; caused by: nodes "worker-node02" not found
Mar 29 17:49:35 worker-node02 kubelet[5319]: E0329 17:49:35.557139    5319 csi_plugin.go:295] Failed to initialize CSINode: error updating CSINode annotation: timed out waiting for the condition; caused by: nodes "worker-node02" not found
Mar 29 17:49:36 worker-node02 kubelet[5319]: E0329 17:49:36.601473    5319 nodelease.go:49] "Failed to get node when trying to set owner ref to the node lease" err="nodes \"worker-node02\" not found" node="worker-node02"
Mar 29 17:49:36 worker-node02 kubelet[5319]: I0329 17:49:36.608079    5319 kubelet_node_status.go:70] "Attempting to register node" node="worker-node02"
Mar 29 17:49:36 worker-node02 kubelet[5319]: I0329 17:49:36.960982    5319 kubelet_node_status.go:73] "Successfully registered node" node="worker-node02"
Mar 29 17:49:36 worker-node02 kubelet[5319]: E0329 17:49:36.999045    5319 kubelet_node_status.go:458] "Error getting the current node from lister" err="node \"worker-node02\" not found"
[... same "Error getting the current node from lister" / "node \"worker-node02\" not found" message repeated roughly every 100 ms from 17:49:37 through 17:49:39 ...]
Mar 29 17:49:39 worker-node02 kubelet[5319]: E0329 17:49:39.884676    5319 kubelet_node_status.go:458] "Error getting the current node from lister" err="node \"worker-node02\" not found"
Mar 29 17:49:39 worker-node02 kubelet[5319]: I0329 17:49:39.997503    5319 kuberuntime_manager.go:1114] "Updating runtime config through cri with podcidr" CIDR="172.16.2.0/24"
Mar 29 17:49:39 worker-node02 kubelet[5319]: I0329 17:49:39.998942    5319 kubelet_network.go:61] "Updating Pod CIDR" originalPodCIDR="" newPodCIDR="172.16.2.0/24"
Mar 29 17:49:39 worker-node02 kubelet[5319]: I0329 17:49:39.999957    5319 kubelet_node_status.go:493] "Fast updating node status as it just became ready"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.756302    5319 apiserver.go:52] "Watching apiserver"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.783470    5319 topology_manager.go:210] "Topology Admit Handler"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.783848    5319 topology_manager.go:210] "Topology Admit Handler"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.791625    5319 desired_state_of_world_populator.go:159] "Finished populating initial desired state of world"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.798996    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-proxy\" (UniqueName: \"kubernetes.io/configmap/075134a2-8dbf-46d2-98b1-6f2d722cfee7-kube-proxy\") pod \"kube-proxy-dkw9j\" (UID: \"075134a2-8dbf-46d2-98b1-6f2d722cfee7\") " pod="kube-system/kube-proxy-dkw9j"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.799358    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"bpffs\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-bpffs\") pod \"calico-node-g5mcr\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") " pod="kube-system/calico-node-g5mcr"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.799594    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"cni-log-dir\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-cni-log-dir\") pod \"calico-node-g5mcr\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") " pod="kube-system/calico-node-g5mcr"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.800396    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"cni-net-dir\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-cni-net-dir\") pod \"calico-node-g5mcr\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") " pod="kube-system/calico-node-g5mcr"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.800650    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"host-local-net-dir\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-host-local-net-dir\") pod \"calico-node-g5mcr\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") " pod="kube-system/calico-node-g5mcr"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.800895    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"policysync\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-policysync\") pod \"calico-node-g5mcr\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") " pod="kube-system/calico-node-g5mcr"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.801166    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"lib-modules\" (UniqueName: \"kubernetes.io/host-path/075134a2-8dbf-46d2-98b1-6f2d722cfee7-lib-modules\") pod \"kube-proxy-dkw9j\" (UID: \"075134a2-8dbf-46d2-98b1-6f2d722cfee7\") " pod="kube-system/kube-proxy-dkw9j"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.801413    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-cxprl\" (UniqueName: \"kubernetes.io/projected/075134a2-8dbf-46d2-98b1-6f2d722cfee7-kube-api-access-cxprl\") pod \"kube-proxy-dkw9j\" (UID: \"075134a2-8dbf-46d2-98b1-6f2d722cfee7\") " pod="kube-system/kube-proxy-dkw9j"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.801761    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"lib-modules\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-lib-modules\") pod \"calico-node-g5mcr\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") " pod="kube-system/calico-node-g5mcr"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.802031    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"nodeproc\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-nodeproc\") pod \"calico-node-g5mcr\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") " pod="kube-system/calico-node-g5mcr"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.802204    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-6wb2h\" (UniqueName: \"kubernetes.io/projected/8402f0e8-dc01-4de3-8c0f-585b491d49a1-kube-api-access-6wb2h\") pod \"calico-node-g5mcr\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") " pod="kube-system/calico-node-g5mcr"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.802401    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"var-run-calico\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-var-run-calico\") pod \"calico-node-g5mcr\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") " pod="kube-system/calico-node-g5mcr"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.803471    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"var-lib-calico\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-var-lib-calico\") pod \"calico-node-g5mcr\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") " pod="kube-system/calico-node-g5mcr"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.803705    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"xtables-lock\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-xtables-lock\") pod \"calico-node-g5mcr\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") " pod="kube-system/calico-node-g5mcr"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.804434    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"cni-bin-dir\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-cni-bin-dir\") pod \"calico-node-g5mcr\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") " pod="kube-system/calico-node-g5mcr"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.804683    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"xtables-lock\" (UniqueName: \"kubernetes.io/host-path/075134a2-8dbf-46d2-98b1-6f2d722cfee7-xtables-lock\") pod \"kube-proxy-dkw9j\" (UID: \"075134a2-8dbf-46d2-98b1-6f2d722cfee7\") " pod="kube-system/kube-proxy-dkw9j"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.805348    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"sys-fs\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-sys-fs\") pod \"calico-node-g5mcr\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") " pod="kube-system/calico-node-g5mcr"
Mar 29 17:49:40 worker-node02 kubelet[5319]: I0329 17:49:40.805534    5319 reconciler.go:41] "Reconciler: start to sync state"
Mar 29 17:49:41 worker-node02 kubelet[5319]: I0329 17:49:41.165122    5319 topology_manager.go:210] "Topology Admit Handler"
Mar 29 17:49:41 worker-node02 kubelet[5319]: I0329 17:49:41.208568    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"tmp-volume\" (UniqueName: \"kubernetes.io/empty-dir/b210cff4-254b-4942-9c41-82388850e7a8-tmp-volume\") pod \"kubernetes-dashboard-6c7ccbcf87-vssvh\" (UID: \"b210cff4-254b-4942-9c41-82388850e7a8\") " pod="kubernetes-dashboard/kubernetes-dashboard-6c7ccbcf87-vssvh"
Mar 29 17:49:41 worker-node02 kubelet[5319]: I0329 17:49:41.208639    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kubernetes-dashboard-certs\" (UniqueName: \"kubernetes.io/secret/b210cff4-254b-4942-9c41-82388850e7a8-kubernetes-dashboard-certs\") pod \"kubernetes-dashboard-6c7ccbcf87-vssvh\" (UID: \"b210cff4-254b-4942-9c41-82388850e7a8\") " pod="kubernetes-dashboard/kubernetes-dashboard-6c7ccbcf87-vssvh"
Mar 29 17:49:41 worker-node02 kubelet[5319]: I0329 17:49:41.208671    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-gzsch\" (UniqueName: \"kubernetes.io/projected/b210cff4-254b-4942-9c41-82388850e7a8-kube-api-access-gzsch\") pod \"kubernetes-dashboard-6c7ccbcf87-vssvh\" (UID: \"b210cff4-254b-4942-9c41-82388850e7a8\") " pod="kubernetes-dashboard/kubernetes-dashboard-6c7ccbcf87-vssvh"
Mar 29 17:49:41 worker-node02 kubelet[5319]: I0329 17:49:41.303068    5319 topology_manager.go:210] "Topology Admit Handler"
Mar 29 17:49:41 worker-node02 kubelet[5319]: I0329 17:49:41.414820    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"tmp-volume\" (UniqueName: \"kubernetes.io/empty-dir/27512b05-aa14-4653-8feb-99a4ee019dbf-tmp-volume\") pod \"dashboard-metrics-scraper-7bc864c59-zz598\" (UID: \"27512b05-aa14-4653-8feb-99a4ee019dbf\") " pod="kubernetes-dashboard/dashboard-metrics-scraper-7bc864c59-zz598"
Mar 29 17:49:41 worker-node02 kubelet[5319]: I0329 17:49:41.415362    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-k66lp\" (UniqueName: \"kubernetes.io/projected/27512b05-aa14-4653-8feb-99a4ee019dbf-kube-api-access-k66lp\") pod \"dashboard-metrics-scraper-7bc864c59-zz598\" (UID: \"27512b05-aa14-4653-8feb-99a4ee019dbf\") " pod="kubernetes-dashboard/dashboard-metrics-scraper-7bc864c59-zz598"
Mar 29 17:49:52 worker-node02 kubelet[5319]: I0329 17:49:52.545781    5319 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="kube-system/kube-proxy-dkw9j" podStartSLOduration=-9.223372020309105e+09 pod.CreationTimestamp="2023-03-29 17:49:36 +0000 UTC" firstStartedPulling="2023-03-29 17:49:42.909102202 +0000 UTC m=+19.983180697" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2023-03-29 17:49:52.337600649 +0000 UTC m=+29.411679157" watchObservedRunningTime="2023-03-29 17:49:52.545671383 +0000 UTC m=+29.619749883"
Mar 29 17:49:52 worker-node02 kubelet[5319]: I0329 17:49:52.547325    5319 topology_manager.go:210] "Topology Admit Handler"
Mar 29 17:49:52 worker-node02 kubelet[5319]: W0329 17:49:52.566908    5319 reflector.go:424] object-"default"/"kube-root-ca.crt": failed to list *v1.ConfigMap: configmaps "kube-root-ca.crt" is forbidden: User "system:node:worker-node02" cannot list resource "configmaps" in API group "" in the namespace "default": no relationship found between node 'worker-node02' and this object
Mar 29 17:49:52 worker-node02 kubelet[5319]: E0329 17:49:52.567402    5319 reflector.go:140] object-"default"/"kube-root-ca.crt": Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: configmaps "kube-root-ca.crt" is forbidden: User "system:node:worker-node02" cannot list resource "configmaps" in API group "" in the namespace "default": no relationship found between node 'worker-node02' and this object
Mar 29 17:49:52 worker-node02 kubelet[5319]: I0329 17:49:52.624118    5319 reconciler_common.go:253] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-jr2b5\" (UniqueName: \"kubernetes.io/projected/86bce07d-98ce-4c7c-a3eb-bb23753bcaa0-kube-api-access-jr2b5\") pod \"echo-deployment-d658654dc-8gkvk\" (UID: \"86bce07d-98ce-4c7c-a3eb-bb23753bcaa0\") " pod="default/echo-deployment-d658654dc-8gkvk"
Mar 29 17:49:53 worker-node02 kubelet[5319]: E0329 17:49:53.744368    5319 projected.go:292] Couldn't get configMap default/kube-root-ca.crt: failed to sync configmap cache: timed out waiting for the condition
Mar 29 17:49:53 worker-node02 kubelet[5319]: E0329 17:49:53.745381    5319 projected.go:198] Error preparing data for projected volume kube-api-access-jr2b5 for pod default/echo-deployment-d658654dc-8gkvk: failed to sync configmap cache: timed out waiting for the condition
Mar 29 17:49:53 worker-node02 kubelet[5319]: E0329 17:49:53.745678    5319 nestedpendingoperations.go:348] Operation for "{volumeName:kubernetes.io/projected/86bce07d-98ce-4c7c-a3eb-bb23753bcaa0-kube-api-access-jr2b5 podName:86bce07d-98ce-4c7c-a3eb-bb23753bcaa0 nodeName:}" failed. No retries permitted until 2023-03-29 17:49:54.245642839 +0000 UTC m=+31.319721339 (durationBeforeRetry 500ms). Error: MountVolume.SetUp failed for volume "kube-api-access-jr2b5" (UniqueName: "kubernetes.io/projected/86bce07d-98ce-4c7c-a3eb-bb23753bcaa0-kube-api-access-jr2b5") pod "echo-deployment-d658654dc-8gkvk" (UID: "86bce07d-98ce-4c7c-a3eb-bb23753bcaa0") : failed to sync configmap cache: timed out waiting for the condition
Mar 29 17:50:22 worker-node02 kubelet[5319]: I0329 17:50:22.913276    5319 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="kubernetes-dashboard/kubernetes-dashboard-6c7ccbcf87-vssvh" podStartSLOduration=-9.223371993952316e+09 pod.CreationTimestamp="2023-03-29 17:49:40 +0000 UTC" firstStartedPulling="2023-03-29 17:49:43.341007377 +0000 UTC m=+20.415085872" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2023-03-29 17:50:17.847190216 +0000 UTC m=+54.921268720" watchObservedRunningTime="2023-03-29 17:50:22.902460351 +0000 UTC m=+59.976538869"
Mar 29 17:50:32 worker-node02 kubelet[5319]: I0329 17:50:32.994132    5319 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="kubernetes-dashboard/dashboard-metrics-scraper-7bc864c59-zz598" podStartSLOduration=-9.223371983860693e+09 pod.CreationTimestamp="2023-03-29 17:49:40 +0000 UTC" firstStartedPulling="2023-03-29 17:49:43.601677507 +0000 UTC m=+20.675756002" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2023-03-29 17:50:22.91946549 +0000 UTC m=+59.993544001" watchObservedRunningTime="2023-03-29 17:50:32.994082485 +0000 UTC m=+70.068160987"
Mar 29 17:50:32 worker-node02 kubelet[5319]: I0329 17:50:32.995737    5319 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="default/echo-deployment-d658654dc-8gkvk" podStartSLOduration=-9.223371995859076e+09 pod.CreationTimestamp="2023-03-29 17:49:52 +0000 UTC" firstStartedPulling="2023-03-29 17:49:54.579967652 +0000 UTC m=+31.654046144" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2023-03-29 17:50:32.993814844 +0000 UTC m=+70.067893343" watchObservedRunningTime="2023-03-29 17:50:32.995700441 +0000 UTC m=+70.069778943"
Mar 29 17:51:01 worker-node02 kubelet[5319]: I0329 17:51:01.462434    5319 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="kube-system/calico-node-g5mcr" podStartSLOduration=-9.22337195139845e+09 pod.CreationTimestamp="2023-03-29 17:49:36 +0000 UTC" firstStartedPulling="2023-03-29 17:49:42.911898155 +0000 UTC m=+19.985976645" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2023-03-29 17:50:48.290684359 +0000 UTC m=+85.364762862" watchObservedRunningTime="2023-03-29 17:51:01.456324583 +0000 UTC m=+98.530403089"
Mar 29 18:03:00 worker-node02 kubelet[5319]: I0329 18:03:00.231150    5319 setters.go:548] "Node became not ready" node="worker-node02" condition={Type:Ready Status:False LastHeartbeatTime:2023-03-29 18:03:00.226360686 +0000 UTC m=+817.300439173 LastTransitionTime:2023-03-29 18:03:00.226360686 +0000 UTC m=+817.300439173 Reason:KubeletNotReady Message:node is shutting down}
Mar 29 18:03:00 worker-node02 kubelet[5319]: I0329 18:03:00.938975    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"kube-api-access-gzsch\" (UniqueName: \"kubernetes.io/projected/b210cff4-254b-4942-9c41-82388850e7a8-kube-api-access-gzsch\") pod \"b210cff4-254b-4942-9c41-82388850e7a8\" (UID: \"b210cff4-254b-4942-9c41-82388850e7a8\") "
Mar 29 18:03:00 worker-node02 kubelet[5319]: I0329 18:03:00.939598    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"kube-api-access-k66lp\" (UniqueName: \"kubernetes.io/projected/27512b05-aa14-4653-8feb-99a4ee019dbf-kube-api-access-k66lp\") pod \"27512b05-aa14-4653-8feb-99a4ee019dbf\" (UID: \"27512b05-aa14-4653-8feb-99a4ee019dbf\") "
Mar 29 18:03:00 worker-node02 kubelet[5319]: I0329 18:03:00.939829    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"tmp-volume\" (UniqueName: \"kubernetes.io/empty-dir/27512b05-aa14-4653-8feb-99a4ee019dbf-tmp-volume\") pod \"27512b05-aa14-4653-8feb-99a4ee019dbf\" (UID: \"27512b05-aa14-4653-8feb-99a4ee019dbf\") "
Mar 29 18:03:00 worker-node02 kubelet[5319]: I0329 18:03:00.939999    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"tmp-volume\" (UniqueName: \"kubernetes.io/empty-dir/b210cff4-254b-4942-9c41-82388850e7a8-tmp-volume\") pod \"b210cff4-254b-4942-9c41-82388850e7a8\" (UID: \"b210cff4-254b-4942-9c41-82388850e7a8\") "
Mar 29 18:03:00 worker-node02 kubelet[5319]: I0329 18:03:00.940174    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"kubernetes-dashboard-certs\" (UniqueName: \"kubernetes.io/secret/b210cff4-254b-4942-9c41-82388850e7a8-kubernetes-dashboard-certs\") pod \"b210cff4-254b-4942-9c41-82388850e7a8\" (UID: \"b210cff4-254b-4942-9c41-82388850e7a8\") "
Mar 29 18:03:00 worker-node02 kubelet[5319]: W0329 18:03:00.947464    5319 empty_dir.go:525] Warning: Failed to clear quota on /var/lib/kubelet/pods/27512b05-aa14-4653-8feb-99a4ee019dbf/volumes/kubernetes.io~empty-dir/tmp-volume: clearQuota called, but quotas disabled
Mar 29 18:03:00 worker-node02 kubelet[5319]: I0329 18:03:00.948423    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/empty-dir/27512b05-aa14-4653-8feb-99a4ee019dbf-tmp-volume" (OuterVolumeSpecName: "tmp-volume") pod "27512b05-aa14-4653-8feb-99a4ee019dbf" (UID: "27512b05-aa14-4653-8feb-99a4ee019dbf"). InnerVolumeSpecName "tmp-volume". PluginName "kubernetes.io/empty-dir", VolumeGidValue ""
Mar 29 18:03:00 worker-node02 kubelet[5319]: W0329 18:03:00.950924    5319 empty_dir.go:525] Warning: Failed to clear quota on /var/lib/kubelet/pods/b210cff4-254b-4942-9c41-82388850e7a8/volumes/kubernetes.io~empty-dir/tmp-volume: clearQuota called, but quotas disabled
Mar 29 18:03:00 worker-node02 kubelet[5319]: I0329 18:03:00.955372    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/projected/27512b05-aa14-4653-8feb-99a4ee019dbf-kube-api-access-k66lp" (OuterVolumeSpecName: "kube-api-access-k66lp") pod "27512b05-aa14-4653-8feb-99a4ee019dbf" (UID: "27512b05-aa14-4653-8feb-99a4ee019dbf"). InnerVolumeSpecName "kube-api-access-k66lp". PluginName "kubernetes.io/projected", VolumeGidValue ""
Mar 29 18:03:00 worker-node02 kubelet[5319]: I0329 18:03:00.955973    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/empty-dir/b210cff4-254b-4942-9c41-82388850e7a8-tmp-volume" (OuterVolumeSpecName: "tmp-volume") pod "b210cff4-254b-4942-9c41-82388850e7a8" (UID: "b210cff4-254b-4942-9c41-82388850e7a8"). InnerVolumeSpecName "tmp-volume". PluginName "kubernetes.io/empty-dir", VolumeGidValue ""
Mar 29 18:03:00 worker-node02 kubelet[5319]: I0329 18:03:00.957225    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/projected/b210cff4-254b-4942-9c41-82388850e7a8-kube-api-access-gzsch" (OuterVolumeSpecName: "kube-api-access-gzsch") pod "b210cff4-254b-4942-9c41-82388850e7a8" (UID: "b210cff4-254b-4942-9c41-82388850e7a8"). InnerVolumeSpecName "kube-api-access-gzsch". PluginName "kubernetes.io/projected", VolumeGidValue ""
Mar 29 18:03:00 worker-node02 kubelet[5319]: I0329 18:03:00.961317    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/secret/b210cff4-254b-4942-9c41-82388850e7a8-kubernetes-dashboard-certs" (OuterVolumeSpecName: "kubernetes-dashboard-certs") pod "b210cff4-254b-4942-9c41-82388850e7a8" (UID: "b210cff4-254b-4942-9c41-82388850e7a8"). InnerVolumeSpecName "kubernetes-dashboard-certs". PluginName "kubernetes.io/secret", VolumeGidValue ""
Mar 29 18:03:01 worker-node02 kubelet[5319]: I0329 18:03:01.041237    5319 reconciler_common.go:295] "Volume detached for volume \"kube-api-access-k66lp\" (UniqueName: \"kubernetes.io/projected/27512b05-aa14-4653-8feb-99a4ee019dbf-kube-api-access-k66lp\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:01 worker-node02 kubelet[5319]: I0329 18:03:01.041530    5319 reconciler_common.go:295] "Volume detached for volume \"tmp-volume\" (UniqueName: \"kubernetes.io/empty-dir/27512b05-aa14-4653-8feb-99a4ee019dbf-tmp-volume\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:01 worker-node02 kubelet[5319]: I0329 18:03:01.041678    5319 reconciler_common.go:295] "Volume detached for volume \"tmp-volume\" (UniqueName: \"kubernetes.io/empty-dir/b210cff4-254b-4942-9c41-82388850e7a8-tmp-volume\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:01 worker-node02 kubelet[5319]: I0329 18:03:01.041978    5319 reconciler_common.go:295] "Volume detached for volume \"kubernetes-dashboard-certs\" (UniqueName: \"kubernetes.io/secret/b210cff4-254b-4942-9c41-82388850e7a8-kubernetes-dashboard-certs\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:01 worker-node02 kubelet[5319]: I0329 18:03:01.041997    5319 reconciler_common.go:295] "Volume detached for volume \"kube-api-access-gzsch\" (UniqueName: \"kubernetes.io/projected/b210cff4-254b-4942-9c41-82388850e7a8-kube-api-access-gzsch\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:01 worker-node02 kubelet[5319]: I0329 18:03:01.803460    5319 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="cda7376715405fa75d64824c51d5325938fabeb33f41c40c5f810c379331725a"
Mar 29 18:03:01 worker-node02 kubelet[5319]: I0329 18:03:01.816393    5319 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="0c9f84ba5e7fc3375cd9ab700747b8e78b2f915bd25f32dd2258c24899b944de"
Mar 29 18:03:20 worker-node02 kubelet[5319]: I0329 18:03:20.843305    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"kube-api-access-jr2b5\" (UniqueName: \"kubernetes.io/projected/86bce07d-98ce-4c7c-a3eb-bb23753bcaa0-kube-api-access-jr2b5\") pod \"86bce07d-98ce-4c7c-a3eb-bb23753bcaa0\" (UID: \"86bce07d-98ce-4c7c-a3eb-bb23753bcaa0\") "
Mar 29 18:03:20 worker-node02 kubelet[5319]: I0329 18:03:20.853974    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/projected/86bce07d-98ce-4c7c-a3eb-bb23753bcaa0-kube-api-access-jr2b5" (OuterVolumeSpecName: "kube-api-access-jr2b5") pod "86bce07d-98ce-4c7c-a3eb-bb23753bcaa0" (UID: "86bce07d-98ce-4c7c-a3eb-bb23753bcaa0"). InnerVolumeSpecName "kube-api-access-jr2b5". PluginName "kubernetes.io/projected", VolumeGidValue ""
Mar 29 18:03:20 worker-node02 kubelet[5319]: I0329 18:03:20.856434    5319 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="9aea2a5d0aea33642a08b84685ec6ece93ecebc138398a2e79e228111efd772b"
Mar 29 18:03:20 worker-node02 kubelet[5319]: I0329 18:03:20.945286    5319 reconciler_common.go:295] "Volume detached for volume \"kube-api-access-jr2b5\" (UniqueName: \"kubernetes.io/projected/86bce07d-98ce-4c7c-a3eb-bb23753bcaa0-kube-api-access-jr2b5\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252513    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"xtables-lock\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-xtables-lock\") pod \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252572    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"cni-bin-dir\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-cni-bin-dir\") pod \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252593    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"cni-log-dir\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-cni-log-dir\") pod \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252611    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"host-local-net-dir\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-host-local-net-dir\") pod \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252629    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"lib-modules\" (UniqueName: \"kubernetes.io/host-path/075134a2-8dbf-46d2-98b1-6f2d722cfee7-lib-modules\") pod \"075134a2-8dbf-46d2-98b1-6f2d722cfee7\" (UID: \"075134a2-8dbf-46d2-98b1-6f2d722cfee7\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252648    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"lib-modules\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-lib-modules\") pod \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252674    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"kube-api-access-6wb2h\" (UniqueName: \"kubernetes.io/projected/8402f0e8-dc01-4de3-8c0f-585b491d49a1-kube-api-access-6wb2h\") pod \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252691    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"var-run-calico\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-var-run-calico\") pod \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252709    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"sys-fs\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-sys-fs\") pod \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252743    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"bpffs\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-bpffs\") pod \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252784    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"kube-api-access-cxprl\" (UniqueName: \"kubernetes.io/projected/075134a2-8dbf-46d2-98b1-6f2d722cfee7-kube-api-access-cxprl\") pod \"075134a2-8dbf-46d2-98b1-6f2d722cfee7\" (UID: \"075134a2-8dbf-46d2-98b1-6f2d722cfee7\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252813    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"xtables-lock\" (UniqueName: \"kubernetes.io/host-path/075134a2-8dbf-46d2-98b1-6f2d722cfee7-xtables-lock\") pod \"075134a2-8dbf-46d2-98b1-6f2d722cfee7\" (UID: \"075134a2-8dbf-46d2-98b1-6f2d722cfee7\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252842    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"cni-net-dir\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-cni-net-dir\") pod \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252888    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"policysync\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-policysync\") pod \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252918    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"var-lib-calico\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-var-lib-calico\") pod \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252959    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"kube-proxy\" (UniqueName: \"kubernetes.io/configmap/075134a2-8dbf-46d2-98b1-6f2d722cfee7-kube-proxy\") pod \"075134a2-8dbf-46d2-98b1-6f2d722cfee7\" (UID: \"075134a2-8dbf-46d2-98b1-6f2d722cfee7\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.252992    5319 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"nodeproc\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-nodeproc\") pod \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\" (UID: \"8402f0e8-dc01-4de3-8c0f-585b491d49a1\") "
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.268690    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-nodeproc" (OuterVolumeSpecName: "nodeproc") pod "8402f0e8-dc01-4de3-8c0f-585b491d49a1" (UID: "8402f0e8-dc01-4de3-8c0f-585b491d49a1"). InnerVolumeSpecName "nodeproc". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.268801    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-cni-bin-dir" (OuterVolumeSpecName: "cni-bin-dir") pod "8402f0e8-dc01-4de3-8c0f-585b491d49a1" (UID: "8402f0e8-dc01-4de3-8c0f-585b491d49a1"). InnerVolumeSpecName "cni-bin-dir". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.268824    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-cni-log-dir" (OuterVolumeSpecName: "cni-log-dir") pod "8402f0e8-dc01-4de3-8c0f-585b491d49a1" (UID: "8402f0e8-dc01-4de3-8c0f-585b491d49a1"). InnerVolumeSpecName "cni-log-dir". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.268842    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-host-local-net-dir" (OuterVolumeSpecName: "host-local-net-dir") pod "8402f0e8-dc01-4de3-8c0f-585b491d49a1" (UID: "8402f0e8-dc01-4de3-8c0f-585b491d49a1"). InnerVolumeSpecName "host-local-net-dir". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.268967    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-xtables-lock" (OuterVolumeSpecName: "xtables-lock") pod "8402f0e8-dc01-4de3-8c0f-585b491d49a1" (UID: "8402f0e8-dc01-4de3-8c0f-585b491d49a1"). InnerVolumeSpecName "xtables-lock". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.269004    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-lib-modules" (OuterVolumeSpecName: "lib-modules") pod "8402f0e8-dc01-4de3-8c0f-585b491d49a1" (UID: "8402f0e8-dc01-4de3-8c0f-585b491d49a1"). InnerVolumeSpecName "lib-modules". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.269780    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/075134a2-8dbf-46d2-98b1-6f2d722cfee7-lib-modules" (OuterVolumeSpecName: "lib-modules") pod "075134a2-8dbf-46d2-98b1-6f2d722cfee7" (UID: "075134a2-8dbf-46d2-98b1-6f2d722cfee7"). InnerVolumeSpecName "lib-modules". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.269829    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-var-run-calico" (OuterVolumeSpecName: "var-run-calico") pod "8402f0e8-dc01-4de3-8c0f-585b491d49a1" (UID: "8402f0e8-dc01-4de3-8c0f-585b491d49a1"). InnerVolumeSpecName "var-run-calico". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.269867    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-sys-fs" (OuterVolumeSpecName: "sys-fs") pod "8402f0e8-dc01-4de3-8c0f-585b491d49a1" (UID: "8402f0e8-dc01-4de3-8c0f-585b491d49a1"). InnerVolumeSpecName "sys-fs". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.269892    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-bpffs" (OuterVolumeSpecName: "bpffs") pod "8402f0e8-dc01-4de3-8c0f-585b491d49a1" (UID: "8402f0e8-dc01-4de3-8c0f-585b491d49a1"). InnerVolumeSpecName "bpffs". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.405165    5319 reconciler_common.go:295] "Volume detached for volume \"nodeproc\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-nodeproc\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.405220    5319 reconciler_common.go:295] "Volume detached for volume \"cni-bin-dir\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-cni-bin-dir\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.405248    5319 reconciler_common.go:295] "Volume detached for volume \"cni-log-dir\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-cni-log-dir\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.405281    5319 reconciler_common.go:295] "Volume detached for volume \"host-local-net-dir\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-host-local-net-dir\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.405311    5319 reconciler_common.go:295] "Volume detached for volume \"xtables-lock\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-xtables-lock\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.405331    5319 reconciler_common.go:295] "Volume detached for volume \"lib-modules\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-lib-modules\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.405343    5319 reconciler_common.go:295] "Volume detached for volume \"var-run-calico\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-var-run-calico\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.405357    5319 reconciler_common.go:295] "Volume detached for volume \"sys-fs\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-sys-fs\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.405370    5319 reconciler_common.go:295] "Volume detached for volume \"bpffs\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-bpffs\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.405379    5319 reconciler_common.go:295] "Volume detached for volume \"lib-modules\" (UniqueName: \"kubernetes.io/host-path/075134a2-8dbf-46d2-98b1-6f2d722cfee7-lib-modules\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.405433    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/075134a2-8dbf-46d2-98b1-6f2d722cfee7-xtables-lock" (OuterVolumeSpecName: "xtables-lock") pod "075134a2-8dbf-46d2-98b1-6f2d722cfee7" (UID: "075134a2-8dbf-46d2-98b1-6f2d722cfee7"). InnerVolumeSpecName "xtables-lock". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.405493    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-cni-net-dir" (OuterVolumeSpecName: "cni-net-dir") pod "8402f0e8-dc01-4de3-8c0f-585b491d49a1" (UID: "8402f0e8-dc01-4de3-8c0f-585b491d49a1"). InnerVolumeSpecName "cni-net-dir". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.405522    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-policysync" (OuterVolumeSpecName: "policysync") pod "8402f0e8-dc01-4de3-8c0f-585b491d49a1" (UID: "8402f0e8-dc01-4de3-8c0f-585b491d49a1"). InnerVolumeSpecName "policysync". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.449886    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-var-lib-calico" (OuterVolumeSpecName: "var-lib-calico") pod "8402f0e8-dc01-4de3-8c0f-585b491d49a1" (UID: "8402f0e8-dc01-4de3-8c0f-585b491d49a1"). InnerVolumeSpecName "var-lib-calico". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: W0329 18:03:21.455137    5319 empty_dir.go:525] Warning: Failed to clear quota on /var/lib/kubelet/pods/075134a2-8dbf-46d2-98b1-6f2d722cfee7/volumes/kubernetes.io~configmap/kube-proxy: clearQuota called, but quotas disabled
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.457759    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/configmap/075134a2-8dbf-46d2-98b1-6f2d722cfee7-kube-proxy" (OuterVolumeSpecName: "kube-proxy") pod "075134a2-8dbf-46d2-98b1-6f2d722cfee7" (UID: "075134a2-8dbf-46d2-98b1-6f2d722cfee7"). InnerVolumeSpecName "kube-proxy". PluginName "kubernetes.io/configmap", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: E0329 18:03:21.473124    5319 nestedpendingoperations.go:348] Operation for "{volumeName:kubernetes.io/projected/075134a2-8dbf-46d2-98b1-6f2d722cfee7-kube-api-access-cxprl podName:075134a2-8dbf-46d2-98b1-6f2d722cfee7 nodeName:}" failed. No retries permitted until 2023-03-29 18:03:21.957929295 +0000 UTC m=+839.032007790 (durationBeforeRetry 500ms). Error: UnmountVolume.TearDown failed for volume "kube-api-access-cxprl" (UniqueName: "kubernetes.io/projected/075134a2-8dbf-46d2-98b1-6f2d722cfee7-kube-api-access-cxprl") pod "075134a2-8dbf-46d2-98b1-6f2d722cfee7" (UID: "075134a2-8dbf-46d2-98b1-6f2d722cfee7") : unmount failed: signal: terminated
Mar 29 18:03:21 worker-node02 kubelet[5319]: Unmounting arguments: /var/lib/kubelet/pods/075134a2-8dbf-46d2-98b1-6f2d722cfee7/volumes/kubernetes.io~projected/kube-api-access-cxprl
Mar 29 18:03:21 worker-node02 kubelet[5319]: Output:
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.631595    5319 operation_generator.go:890] UnmountVolume.TearDown succeeded for volume "kubernetes.io/projected/8402f0e8-dc01-4de3-8c0f-585b491d49a1-kube-api-access-6wb2h" (OuterVolumeSpecName: "kube-api-access-6wb2h") pod "8402f0e8-dc01-4de3-8c0f-585b491d49a1" (UID: "8402f0e8-dc01-4de3-8c0f-585b491d49a1"). InnerVolumeSpecName "kube-api-access-6wb2h". PluginName "kubernetes.io/projected", VolumeGidValue ""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.631745    5319 reconciler_common.go:295] "Volume detached for volume \"kube-api-access-6wb2h\" (UniqueName: \"kubernetes.io/projected/8402f0e8-dc01-4de3-8c0f-585b491d49a1-kube-api-access-6wb2h\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.631764    5319 reconciler_common.go:295] "Volume detached for volume \"policysync\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-policysync\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.631777    5319 reconciler_common.go:295] "Volume detached for volume \"xtables-lock\" (UniqueName: \"kubernetes.io/host-path/075134a2-8dbf-46d2-98b1-6f2d722cfee7-xtables-lock\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.631788    5319 reconciler_common.go:295] "Volume detached for volume \"cni-net-dir\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-cni-net-dir\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.631798    5319 reconciler_common.go:295] "Volume detached for volume \"var-lib-calico\" (UniqueName: \"kubernetes.io/host-path/8402f0e8-dc01-4de3-8c0f-585b491d49a1-var-lib-calico\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 kubelet[5319]: I0329 18:03:21.632347    5319 reconciler_common.go:295] "Volume detached for volume \"kube-proxy\" (UniqueName: \"kubernetes.io/configmap/075134a2-8dbf-46d2-98b1-6f2d722cfee7-kube-proxy\") on node \"worker-node02\" DevicePath \"\""
Mar 29 18:03:21 worker-node02 systemd[1]: Stopping kubelet: The Kubernetes Node Agent...
Mar 29 18:03:21 worker-node02 systemd[1]: kubelet.service: Deactivated successfully.
Mar 29 18:03:21 worker-node02 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Mar 29 18:03:21 worker-node02 systemd[1]: kubelet.service: Consumed 13.964s CPU time.

@aojea
Copy link
Member

aojea commented Mar 29, 2023

Spoiler: there's nothing there. Is there a flag to turn on more specific logs for the endpoint slice controller?

All the components can increase the verbosity level with the -v flag; -v 4 should be enough to see some important data.

@aojea
Copy link
Member

aojea commented Mar 29, 2023

Within 5 seconds (the preStop hook duration), the EndpointSlice should be updated to have the pod that is terminating removed. This is what happens when a pod gets removed for any other reason (e.g. kubectl delete pod).

The endpointslice reflects the state of the Pod - it literally checks the pod state, the same as you are doing with kubectl describe pod. If it is reporting the conditions

  conditions:
    ready: true
    serving: true
    terminating: false

then you'll need to check the pod at that time; it should be ready. If not, there is a problem we need to investigate further.

@mikekap
Copy link
Contributor Author

mikekap commented Mar 29, 2023

The pod also reports everything is great while shutting down:

Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True

Afaict the pod never even gets marked as Terminating -- only the containers get killed. Here are the same logs with -v 4: https://gist.github.com/mikekap/f55753e23d050bdaf39cd2b06ad747a0 . I don't fully grok the logs, but this seemed like an interesting snippet from the kubelet logs:

Mar 29 19:04:15 worker-node02 kubelet[783]: I0329 19:04:15.041733     783 nodeshutdown_manager_linux.go:265] "Shutdown manager detected new shutdown event, isNodeShuttingDownNow" event=true
Mar 29 19:04:15 worker-node02 kubelet[783]: I0329 19:04:15.042229     783 nodeshutdown_manager_linux.go:273] "Shutdown manager detected new shutdown event" event="shutdown"
Mar 29 19:04:15 worker-node02 kubelet[783]: I0329 19:04:15.042396     783 nodeshutdown_manager_linux.go:322] "Shutdown manager processing shutdown event"
Mar 29 19:04:15 worker-node02 kubelet[783]: I0329 19:04:15.047298     783 nodeshutdown_manager_linux.go:375] "Shutdown manager killing pod with gracePeriod" pod="default/echo-deployment-d658654dc-k8srg" gracePeriod=30
Mar 29 19:04:15 worker-node02 kubelet[783]: I0329 19:04:15.047490     783 pod_workers.go:652] "Pod is being removed by the kubelet, begin teardown" pod="default/echo-deployment-d658654dc-k8srg" podUID=bc22c80a-a843-49da-b7a2-5d70512bb54d
Mar 29 19:04:15 worker-node02 kubelet[783]: I0329 19:04:15.047676     783 pod_workers.go:902] "Processing pod event" pod="default/echo-deployment-d658654dc-k8srg" podUID=bc22c80a-a843-49da-b7a2-5d70512bb54d updateType=1
Mar 29 19:04:15 worker-node02 kubelet[783]: I0329 19:04:15.047829     783 pod_workers.go:1019] "Pod worker has observed request to terminate" pod="default/echo-deployment-d658654dc-k8srg" podUID=bc22c80a-a843-49da-b7a2-5d70512bb54d
Mar 29 19:04:15 worker-node02 kubelet[783]: I0329 19:04:15.049317     783 kubelet.go:1858] "syncTerminatingPod enter" pod="default/echo-deployment-d658654dc-k8srg" podUID=bc22c80a-a843-49da-b7a2-5d70512bb54d
Mar 29 19:04:15 worker-node02 kubelet[783]: I0329 19:04:15.049448     783 kubelet_pods.go:1461] "Generating pod status" pod="default/echo-deployment-d658654dc-k8srg"
Mar 29 19:04:15 worker-node02 kubelet[783]: I0329 19:04:15.050428     783 kubelet_pods.go:1471] "Got phase for pod" pod="default/echo-deployment-d658654dc-k8srg" oldPhase=Running phase=Running
Mar 29 19:04:15 worker-node02 kubelet[783]: I0329 19:04:15.050671     783 kubelet.go:1888] "Pod terminating with grace period" pod="default/echo-deployment-d658654dc-k8srg" podUID=bc22c80a-a843-49da-b7a2-5d70512bb54d gracePeriod=30

@mikekap
Copy link
Contributor Author

mikekap commented Mar 29, 2023

Seems the code around pod status merging is at least partially responsible for the dropped api server update. The shutdown manager code only sets Phase, Message, Reason and adds a DisruptionTarget condition. The status merging code mentioned above disregards updates to all of these fields when the pod still has running containers (i.e. during draining).

Not sure what the right fix is here, but I think that's the root cause? If I'm understanding the code right, this would affect pod eviction and any other kubelet-led graceful pod shutdown.
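
To make the failure mode concrete, here is a minimal, hypothetical sketch of that merge behavior (the names mergePodStatus, podStatus and containersStillRunning are illustrative only, not the kubelet's real code): while containers are still running, a merge rule like this silently discards the terminal status proposed by the shutdown manager.

// Simplified, hypothetical sketch of the merge behavior described above.
package main

import "fmt"

type podStatus struct {
	Phase   string
	Reason  string
	Message string
}

// mergePodStatus keeps the previously reported phase/reason/message while any
// container is still running. Under that rule, the shutdown manager's
// "terminated by node shutdown" update never reaches the API server until the
// containers have finished draining, which matches the symptom in this issue.
func mergePodStatus(oldStatus, newStatus podStatus, containersStillRunning bool) podStatus {
	if containersStillRunning {
		return oldStatus // the proposed terminal status is dropped
	}
	return newStatus
}

func main() {
	current := podStatus{Phase: "Running"}
	proposed := podStatus{
		Phase:   "Failed",
		Reason:  "Terminated",
		Message: "Pod was terminated in response to imminent node shutdown.",
	}
	fmt.Printf("%+v\n", mergePodStatus(current, proposed, true))  // update dropped while draining
	fmt.Printf("%+v\n", mergePodStatus(current, proposed, false)) // surfaces only after containers exit
}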

@olderTaoist olderTaoist removed their assignment Mar 30, 2023
@mikekap
Copy link
Contributor Author

mikekap commented Apr 5, 2023

@SergeyKanzhelev would you need anything else from me to triage this?

This may be a bit presumptuous but I noticed @mimowo & @smarterclayton worked on this area recently. Would either of you have the time to see if my diagnosis in the previous comment is correct/useful?

@SergeyKanzhelev
Copy link
Member

/triage accepted

xref to the KEP: kubernetes/enhancements#2000 to consider when going GA

Will this problem be the same when we try to delete a Pod for some eviction reason, then? The PR is changing the existing tests - we need to understand that the proposed fix is not regressing some other behavior.

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 24, 2023
@SergeyKanzhelev SergeyKanzhelev moved this from Needs Information to Triaged in SIG Node Bugs Jun 24, 2023
@SergeyKanzhelev
Copy link
Member

/remove-triage needs-information

@k8s-ci-robot k8s-ci-robot removed the triage/needs-information Indicates an issue needs more information in order to work on it. label Jun 28, 2023
@mikekap
Copy link
Contributor Author

mikekap commented Jul 4, 2023

Will this problem be the same when we try to delete a Pod for some eviction reason, then? The PR is changing the existing tests - we need to understand that the proposed fix is not regressing some other behavior.

Yes - I believe so. All kubelet-led evictions only set a terminal phase and expect that to propagate to the API server to update readiness conditions. However that currently doesn't happen since that update is squelched until the containers finish terminating.

The overall issue comes from the fact that "terminating" in the endpoints world doesn't match "terminating" in the pod world. i.e.

terminating := pod.DeletionTimestamp != nil
doesn't capture cases where the kubelet is evicting the pod (since in those cases DeletionTimestamp is not set).

@yylt
Copy link
Contributor

yylt commented Aug 3, 2023

This looks like issue #109998.

In my Kubernetes cluster I use a PR to fix this problem, and it has been running continuously for one year.

@aojea
Copy link
Member

aojea commented Aug 3, 2023

doesn't capture cases where the kubelet is evicting the pod (since in those cases DeletionTimestamp is not set).

can you elaborate? is the pod ready during that time?

@mikekap
Copy link
Contributor Author

mikekap commented Aug 3, 2023

doesn't capture cases where the kubelet is evicting the pod (since in those cases DeletionTimestamp is not set).

can you elaborate? is the pod ready during that time?

I'm not sure that I know the code well enough, but the core problem is the title of this ticket. Basically you'll have to define what you mean by "ready" - the readiness probe might be succeeding during this time just fine, but in endpoints-land the ready field should be false. The endpoints code treats a pod that is terminating but still passing its readiness probe as serving but not ready. See the code around

serving := podutil.IsPodReady(pod)
terminating := pod.DeletionTimestamp != nil
// For compatibility reasons, "ready" should never be "true" if a pod is terminating, unless
// publishNotReadyAddresses was set.
ready := service.Spec.PublishNotReadyAddresses || (serving && !terminating)

There is probably a larger discussion to be had, since the assumption that DeletionTimestamp is the marker for "pod is terminating" is prevalent - e.g.

func IsPodTerminating(p *v1.Pod) bool {
	return !podutil.IsPodTerminal(p) &&
		p.DeletionTimestamp != nil
}

and

if curPod.DeletionTimestamp != nil {

I wonder what third-party controllers are also breaking because that assumption just isn't true for kubelet-led evictions. My PR simply fixes a pretty painful symptom (i.e. traffic being sent to dead pods) for services under kubelet-led evictions.
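
For illustration only, a hedged sketch of what a broader check could look like if a controller also wanted to treat kubelet-led terminations as terminating. It assumes the kubelet has added a DisruptionTarget condition (PodDisruptionConditions); this is not what the endpointslice controller does today.

// Hypothetical sketch of a broader "is this pod terminating?" check that also
// covers kubelet-led terminations (node shutdown, eviction, preemption), which
// never set DeletionTimestamp.
package podstate

import (
	v1 "k8s.io/api/core/v1"
)

// IsPodTerminating reports whether the pod is being torn down, regardless of
// whether the termination was API-initiated or kubelet-initiated.
func IsPodTerminating(pod *v1.Pod) bool {
	// API-initiated deletion (kubectl delete, drain, controller scale-down).
	if pod.DeletionTimestamp != nil {
		return true
	}
	// Kubelet-led termination is only visible today through conditions such
	// as DisruptionTarget (or, eventually, a terminal phase).
	for _, cond := range pod.Status.Conditions {
		if cond.Type == v1.DisruptionTarget && cond.Status == v1.ConditionTrue {
			return true
		}
	}
	return false
}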

@yylt
Copy link
Contributor

yylt commented Aug 4, 2023

When executing a shutdown operation, if the connection between the kubelet and the kube-apiserver is lost, any attempt to delete or mark services as unhealthy will be ineffective, because the kube-apiserver cannot be updated.

The kube-apiserver is typically deployed as a static pod or as a systemd service, and it is crucial to ensure that it is the last component to stop, so that pod status in the cluster can still be updated. As a static pod it should be given the highest priority, like this:

apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
  labels:
    k8s-app: kube-apiserver
spec:
  hostNetwork: true
  priorityClassName: system-cluster-critical  # marked to be removed or evicted last
  dnsPolicy: ClusterFirstWithHostNet
  containers:
  - name: kube-apiserver
    image: dodo/hyperkube:v1.26.4
    imagePullPolicy: IfNotPresent
    command:
    - /kube-apiserver
    ...

And when the kube-apiserver cannot be reached but the node has shut down, the nodelifecycle controller takes over and should mark the service as unhealthy; that is how #109998 happened.

@mikekap
Copy link
Contributor Author

mikekap commented Aug 4, 2023

@yylt You might be discussing a different problem there. This issue is not about shutting down the api server - the "graceful node shutdown" is of a worker node, not an API server node. Furthermore - the network is perfect in my case - this is "normal operation" of the graceful node shutdown feature - not related to the network conditions or other things happening to the API server or really anywhere else.

@mimowo
Copy link
Contributor

mimowo commented Aug 7, 2023

@mikekap could you check if this issue reproduces on 1.27?

There were pretty important changes for the pod lifecycle. In particular in #115331 we make sure kubelet transitions a terminating pod to terminal phase before actual deletion (on 1.26 the pod might have been deleted in the Running phase).

This was required for Jobs to make sure the Job controller observes the pod as terminated, as it uses finalizers. However, I'm not sure about endpointslice controller.

@aojea
Copy link
Member

aojea commented Aug 7, 2023

Basically you'll have to define what you mean by "ready"

Ready to serve traffic means the pod is ready and not terminating: it keeps serving existing traffic, but we don't send any new traffic to a pod that we know will eventually disappear and blackhole traffic, creating disruption.

The overall issue comes from the fact that "terminating" in the endpoints world doesn't match "terminating" in the pod world. i.e.

There is probably a larger discussion to be had since the assumption that DeletionTimestamp as the marker for "pod is terminating" is prevalen

What discussion? DeletionTimestamp IS the marker for "pod is terminating"; if not, we have a big problem. Components communicate through the API objects that represent state, and objects have a lifecycle (kubernetes/community#6787). If the pod is being deleted, that has to be represented in the API object so all the components act accordingly.

@mimowo
Copy link
Contributor

mimowo commented Aug 7, 2023

What discussion? DeletionTimestamp IS the marker for "pod is terminating"; if not, we have a big problem. Components communicate through the API objects that represent state, and objects have a lifecycle (kubernetes/community#6787). If the pod is being deleted, that has to be represented in the API object so all the components act accordingly.

The Kubelet evicts pods using the killPodFunc function from a couple of places, including:

  1. nodeshutdown manager
  2. eviction_manager
  3. preemption

The function triggers termination of the containers, so the pods are terminating in a sense, but I don't think the function sets DeletionTimestamp. I suppose it might be a bug in the kubelet, though quite an old one; if so, it might be worth considering changes in other controllers.

cc @bobbypage @smarterclayton @alculquicondor for insights.

EDIT: still, it would be good to see if this remains an issue e2e on 1.27 as suggested in #116965 (comment)

@mikekap
Copy link
Contributor Author

mikekap commented Aug 10, 2023

Just reproduced the same behavior in 1.27.1. The repro repo in the first comment is updated to 1.27.1 now. When sending an ACPI shutdown, the pod begins shutdown (the preStop hook runs) but while the hook is running, the pod continues to have:

  conditions:
    ready: true
    serving: true
    terminating: false

in the endpointslice.

re: the discussion - before this issue, I was under the same impression that DeletionTimestamp is the only thing that signals shutdown. My guess is the kubelet killPodFunc tries to keep the pod object around so operators(?) can see when evictions and the like happen. I don't think I'm going to be useful in the "right approach" discussion - but that difference is the source of this bug.

@mimowo
Copy link
Contributor

mimowo commented Aug 11, 2023

@mikekap thanks for reproducing - at least we know the new version does not help.

I'm not sure what the "right approach" is. IIUC the comment suggests kubelet should set the deletionTimestamp when evicting pods, and this sounds reasonable to me.

Maybe some sig-node folks can give more context on whether this is a bug, or whether there are good reasons for the current behavior.

@noony
Copy link

noony commented Apr 5, 2024

Hi, we are also encountering this issue on 1.27.6.

What we see is that after the node deletion is triggered, the pod stays Running during the preStop lifecycle hook and also when SIGTERM is sent. Then the pod transitions to the Completed status.

The endpoint is updated when a pod enters the Terminating state, but because during shutdown the pod never transitions to Terminating, traffic is still sent to it until the pod is Completed, which causes errors.

@alculquicondor
Copy link
Member

/remove-triage accepted

Can someone from SIG node re-triage this?

IIUC, the question is how/when the kubelet reports to the apiserver that the Pod is terminating during a graceful node shutdown, so that other controllers can react.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. and removed triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Apr 5, 2024
@rphillips
Copy link
Member

rphillips commented Apr 17, 2024

Double check that your service has a sigterm handler that sets the readiness endpoint to false. Some of the behavior described on this ticket is expected behavior. If the readiness endpoint is never set to false, then the pod is expected to stay ready until the entire lifecycle is terminated.

The kubelet also needs to be set up so that, when it runs under systemd, the dependency chain has the kubelet terminate before networking does.

@rphillips
Copy link
Member

Also note, the pre-stop hook blocks the pod termination until it has completed.

Hook handler calls are synchronous within the context of the Pod containing the Container.

https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#hook-handler-execution

@mikekap
Copy link
Contributor Author

mikekap commented Apr 22, 2024

@rphillips I don't think this is specced behavior: readiness probes auto-fail when pods start shutting down normally. See this page: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#container-probes

Note: If you want to be able to drain requests when the Pod is deleted, you do not necessarily need a readiness probe; on deletion, the Pod automatically puts itself into an unready state regardless of whether the readiness probe exists. The Pod remains in the unready state while it waits for the containers in the Pod to stop.

@hshiina
Copy link

hshiina commented May 1, 2024

Double check that your service has a sigterm handler that sets the readiness endpoint to false. Some of the behavior described on this ticket is expected behavior. If the readiness endpoint is never set to false, then the pod is expected to stay ready until the entire lifecycle is terminated.

Because of a bug (#124648), readiness probes don't work during pod terminations triggered by the kubelet, such as eviction and node shutdown. This may be one of the reasons why an endpoint is not updated early.

@aojea
Copy link
Member

aojea commented May 15, 2024

Endpoint is updated when the pod pass in terminating state, but because during shutdown it never transition to Terminating, it stills send traffic to this pod until the pod is Completed, which cause errors.

@rphillips @smarterclayton it seems the main problem here is the lack of a status update: once the status on eviction is propagated as Terminating or Failed, Services will work correctly. We had a similar issue in the past: #110255.

@smarterclayton
Copy link
Contributor

So zooming out:

GracefulShutdown by itself does not say whether the node is coming back or not - it is the responsibility of the agent that "owns" the node (cloud provider, human admin, VM orchestrator, whatever) to determine that. So graceful shutdown by itself can make no decision on whether to delete the pods; it can only correctly represent that the pod is shutting down and not ready.

GracefulShutdown was not intended to replace a controlled drain operation on the node. If the use case described here is assuming that graceful shutdown will guarantee that pods are terminated, that is not generally true and not the design we intended for GracefulShutdown. Any "set delete timestamp" action would have to be opt in and probably not the responsibility of the kubelet generally - the responsibility today belongs to the initiator of the "node going away permanently" action (cloud provider, orchestrator, human) and I could argue it should stay there.

I think we can say that GracefulShutdown is intended to end the processes of pods as best effort within a specific timeframe and report the status of those pods appropriately. At minimum, the pods should be described as non ready, and the containers should be described as stopped with terminal exit codes and/or phase (RestartAlways pods should be left in running phase, because there was no signal that the node is not coming back). But it is best effort - it cannot guarantee due to being part of a distributed system that those status updates will propagate.

The remaining question is whether the node is doing a good enough job during graceful shutdown. I think the answer given this thread is "no" - I would expect the following (as i commented on the other bug):

  1. Pod deletion and pod readiness are orthogonal, they should be kept orthogonal
  2. When the kubelet shuts down a pod gracefully for any reason (including deletion, static pod update, graceful node shutdown) the kubelet should continue to run readiness probes until a readiness check is "not ready" or the container stops, then the kubelet should stop readiness probes and not restart them until the container is started again
    a. if the container has no readiness probe, the ready state should go to False after the prestop completes
  3. The kubelet should aggressively attempt to report the most important status transitions during graceful shutdown, prioritizing ready -> false and container -> stop. This is challenging today because the kubelet status loop is single threaded, but we have discussed improving it.
  4. The initiator of node graceful shutdown is responsible for deciding whether the pod will never come back - if so invoke pod deletion before or concurrent with node graceful shutdown.

Anything I'm missing? Does this match everyone's experience and expectations?
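
For what it's worth, points 2 and 2a amount to a small decision rule. Below is a minimal sketch with invented names (not actual kubelet prober code): keep honoring the readiness probe during any graceful shutdown until it fails or the container stops, and for containers without a probe, flip to not-ready once preStop completes.

// Sketch only: invented types and names, not real kubelet code.
package main

import "fmt"

type containerShutdownState struct {
    running           bool // container process is still running
    hasReadinessProbe bool
    lastProbeReady    bool // most recent readiness probe result
    preStopCompleted  bool // preStop hook has finished executing
}

// readyDuringGracefulShutdown reports whether the container should still be
// considered Ready while its pod is being shut down gracefully.
func readyDuringGracefulShutdown(s containerShutdownState) bool {
    if !s.running {
        return false // a stopped container is never ready
    }
    if s.hasReadinessProbe {
        return s.lastProbeReady // keep honoring the probe until it fails
    }
    return !s.preStopCompleted // no probe: not-ready once preStop has finished
}

func main() {
    fmt.Println(readyDuringGracefulShutdown(containerShutdownState{
        running: true, hasReadinessProbe: true, lastProbeReady: true,
    })) // true: probe still passing mid-shutdown
    fmt.Println(readyDuringGracefulShutdown(containerShutdownState{
        running: true, hasReadinessProbe: false, preStopCompleted: true,
    })) // false: no probe defined and preStop completed
}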

@mikekap
Contributor Author

mikekap commented May 16, 2024

One piece I might want to clarify is point 2:

  1. When the kubelet shuts down a pod gracefully for any reason (including deletion, static pod update, graceful node shutdown) the kubelet should continue to run readiness probes until a readiness check is "not ready" or the container stops, then the kubelet should stop readiness probes and not restart them until the container is started again

    a. if the container has no readiness probe, the ready state should go to False after the prestop completes

This isn't entirely complete - the endpoint slices that include these pods need to be updated correctly. Specifically, there are 3 booleans currently exposed: "ready", "serving", and "terminating". For pods on nodes undergoing graceful node shutdown these should be:

  • serving=(normal readiness probe result) - but this should be true if no probe is defined - same as it is when the pod is running normally.
  • ready=false - regardless of probe result; see the note about terminating pods here.
  • terminating=true - this is sort of debatable, but I think most ingress controllers use this as an indicator to drain traffic permanently (e.g. shut down persistent connections). This helps maintain their correctness.

Unfortunately doing this in the endpoint controller is impossible since the kubelet doesn't expose any indicator that the pod is shutting down. There may be a disruption condition now - so maybe this bug is about having the endpoint controller use that.
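
To make the desired end state concrete, here is a minimal sketch in Go of the mapping in the bullets above, using the real discovery/v1 EndpointConditions type. The nodeShuttingDown flag is an assumption: it stands in for a signal the kubelet does not expose to the EndpointSlice controller today, which is exactly the gap just described.

// Sketch only: conditionsForPod and nodeShuttingDown are hypothetical.
package sketch

import (
    corev1 "k8s.io/api/core/v1"
    discoveryv1 "k8s.io/api/discovery/v1"
)

func boolPtr(b bool) *bool { return &b }

// podReadinessResult returns the pod's Ready condition, which tracks the
// readiness probe (containers without a probe count as ready once started).
func podReadinessResult(pod *corev1.Pod) bool {
    for _, c := range pod.Status.Conditions {
        if c.Type == corev1.PodReady {
            return c.Status == corev1.ConditionTrue
        }
    }
    return false
}

// conditionsForPod computes the three EndpointSlice booleans for a pod that is
// terminating either because it was deleted or because its node is shutting
// down (the latter signal being the hypothetical part).
func conditionsForPod(pod *corev1.Pod, nodeShuttingDown bool) discoveryv1.EndpointConditions {
    terminating := pod.DeletionTimestamp != nil || nodeShuttingDown
    serving := podReadinessResult(pod)
    return discoveryv1.EndpointConditions{
        Serving:     boolPtr(serving),                 // follow the probe while draining
        Ready:       boolPtr(serving && !terminating), // never true once termination begins
        Terminating: boolPtr(terminating),             // lets ingresses drain long-lived connections
    }
}

Whether that terminating signal should come from a pod disruption condition, a new status field, or an actual deletion timestamp is the open design question in this thread.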

Separately - and this is just a non-advanced end user opinion that you can definitely ignore - IMO kubelet eviction should use pod deletion. The problem, ultimately, is that doing anything else violates the "principle of least surprise". During eviction the pod is "Terminating", but unless everyone remembers that there are two ways of checking that - the deletion timestamp and a disruption condition (or something similar) - there will be bugs everywhere. Even kubectl doesn't do this right now (to show "Terminating"). Nobody expects "kubectl drain" and "sudo halt" to do anything different - it's a nasty surprise.

I'm not totally sure I understand what's gained by leaving the undead pods around. In theory you might be able to restart them if the node comes back up, but even if that doesn't happen, the parent controller should be recreating them, right? Then is this done to protect "fully uncontrolled" pods (i.e. non-static and non-daemonset/replicaset/statefulset/etc.)? These are super weird - you can't even use "kubectl delete" to restart them, right? I readily admit I have not run kube on bare metal or with pet nodes (only cattle nodes) - so I'd love to understand how this feature helps folks.

Ultimately if allowing kubelets to actually delete pods is a breaking change in the kube API guarantees, IMO it's worth doing. It likely fixes more bugs than it creates.

All that said, the endpoint slice controller behavior is what this bug report is about. So long as readiness starts reporting false when termination begins, that works to fix this issue.
