
Metrics-server can't scrape nodes after enable_cri_dockerd is set to true #2709

Closed
immanuelfodor opened this issue Sep 27, 2021 · 22 comments

@immanuelfodor

RKE version: v1.3.1

Docker version: (docker version, docker info preferred)

Client: Docker Engine - Community
 Version:           20.10.8
 API version:       1.41
 Go version:        go1.16.6
 Git commit:        3967b7d
 Built:             Fri Jul 30 19:53:39 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.8
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.6
  Git commit:       75249d8
  Built:            Fri Jul 30 19:52:00 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.9
  GitCommit:        e25210fe30a0a703442421b0f60afac609f950a3
 runc:
  Version:          1.0.1
  GitCommit:        v1.0.1-0-g4144b63
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Operating system and kernel: (cat /etc/os-release, uname -r preferred)

NAME="CentOS Stream"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Stream 8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)

Proxmox

cluster.yml file:

Possibly relevant lines:

system_images:
  metrics_server: rancher/mirrored-metrics-server:v0.5.0

enable_cri_dockerd: true

Steps to Reproduce:

Update RKE v1.2.11 -> v1.3.1 (k8s v1.20.9 -> v1.21.5, metrics-server v0.4.1 -> v0.5.0)

Results:

After updating RKE, a second metrics-server instance is created that is unable to become ready and can't scrape the nodes, while the previous instance is still present and keeps working with the old version:

E0927 21:13:19.888512       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node1"                                         
E0927 21:13:19.888556       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node2"                                         
E0927 21:13:19.888591       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node3"                                         
I0927 21:13:24.433910       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"

Diffing the old (good) and the new (bad) replicasets, the only relevant change seems to be the port: 4443 became 443. Maybe the leading 4 went missing?

diff -u replicaset.yaml replicaset-bad.yaml 
--- replicaset.yaml     2021-09-27 22:38:41.236441973 +0200
+++ replicaset-bad.yaml 2021-09-27 23:04:44.510924790 +0200
@@ -4,13 +4,14 @@
   annotations:
     deployment.kubernetes.io/desired-replicas: "1"
     deployment.kubernetes.io/max-replicas: "2"
-    deployment.kubernetes.io/revision: "4"
-  creationTimestamp: "2021-05-09T14:00:19Z"
-  generation: 1
+    deployment.kubernetes.io/revision: "8"
+    deployment.kubernetes.io/revision-history: "5"
+  creationTimestamp: "2021-09-27T20:11:58Z"
+  generation: 3
   labels:
     k8s-app: metrics-server
-    pod-template-hash: 55fdd84cd4
-  name: metrics-server-55fdd84cd4
+    pod-template-hash: 7bf4b68b78
+  name: metrics-server-7bf4b68b78
   namespace: kube-system
   ownerReferences:
   - apiVersion: apps/v1
@@ -19,20 +20,20 @@
     kind: Deployment
     name: metrics-server
     uid: 10b674cb-611f-4292-acc4-a4c095298cf2
-  resourceVersion: "276893173"
-  uid: 5e5cb232-85c7-45b0-b28c-a1e922eeac42
+  resourceVersion: "276904165"
+  uid: 2262ffba-08b7-498d-a32c-279b9c1c4a8e
 spec:
   replicas: 1
   selector:
     matchLabels:
       k8s-app: metrics-server
-      pod-template-hash: 55fdd84cd4
+      pod-template-hash: 7bf4b68b78
   template:
     metadata:
       creationTimestamp: null
       labels:
         k8s-app: metrics-server
-        pod-template-hash: 55fdd84cd4
+        pod-template-hash: 7bf4b68b78
       name: metrics-server
     spec:
       affinity:
@@ -49,11 +50,12 @@
       containers:
       - args:
         - --cert-dir=/tmp
-        - --secure-port=4443
+        - --secure-port=443
         - --kubelet-insecure-tls
         - --kubelet-preferred-address-types=InternalIP
+        - --metric-resolution=15s
         - --logtostderr
-        image: rancher/mirrored-metrics-server:v0.4.1
+        image: rancher/mirrored-metrics-server:v0.5.0
         imagePullPolicy: IfNotPresent
         livenessProbe:
           failureThreshold: 3
@@ -66,7 +68,7 @@
           timeoutSeconds: 1
         name: metrics-server
         ports:
-        - containerPort: 4443
+        - containerPort: 443
           name: https
           protocol: TCP
         readinessProbe:
@@ -75,10 +77,14 @@
             path: /readyz
             port: https
             scheme: HTTPS
+          initialDelaySeconds: 20
           periodSeconds: 10
           successThreshold: 1
           timeoutSeconds: 1
-        resources: {}
+        resources:
+          requests:
+            cpu: 100m
+            memory: 200Mi
         securityContext:
           readOnlyRootFilesystem: true
           runAsNonRoot: true
@@ -105,8 +111,6 @@
       - emptyDir: {}
         name: tmp-dir
 status:
-  availableReplicas: 1
   fullyLabeledReplicas: 1
-  observedGeneration: 1
-  readyReplicas: 1
+  observedGeneration: 3
   replicas: 1
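For reference, a diff like the one above can be produced by dumping both replicasets with kubectl (the pod-template hashes are from this cluster and will differ elsewhere). A sketch — the kubectl lines need the live cluster, so they are shown as comments, and the diff itself runs on minimal inline samples:

```shell
# Dump both replicasets, then diff them. On the live cluster:
# kubectl -n kube-system get rs metrics-server-55fdd84cd4 -o yaml > replicaset.yaml
# kubectl -n kube-system get rs metrics-server-7bf4b68b78 -o yaml > replicaset-bad.yaml
# Minimal samples standing in for the dumps, showing the interesting drift:
printf -- '- --secure-port=4443\n' > /tmp/replicaset.yaml
printf -- '- --secure-port=443\n'  > /tmp/replicaset-bad.yaml
diff -u /tmp/replicaset.yaml /tmp/replicaset-bad.yaml || true  # diff exits 1 when files differ
```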

I also updated the Rancher app from the latest v2.5 patch release (can't remember which version it was) to v2.6.0, together with a monitoring update from 14.5.100 to 100.0.0+up16.6.0. However, RKE is not provisioned from Rancher; I only use Rancher to easily deploy the monitoring stack. The Rancher update happened before the RKE update, and metrics-server was untouched by it. I only got the v0.5.0 failures after the RKE update; v0.4.1 works fine even after manually deleting its replicaset and restoring it from a YAML file backup.

@immanuelfodor
Author

Well, it's not fully working after all; this is what I get after rebooting all three nodes:

$ k top node
NAME        CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%     
node2       276m         5%     4899Mi          56%         
node1       <unknown>    <unknown>   <unknown>       <unknown>
node3       <unknown>    <unknown>   <unknown>       <unknown>

So something is still not right, although it's clear that v0.4.1 is the one serving the metrics, as its log contains errors for nodes 1 and 3:

E0927 21:29:48.436880       1 server.go:132] unable to fully scrape metrics: [unable to fully scrape metrics from node rke-node1: unable to fetch metrics from node node1: Get "https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true": context deadline exceeded, unable to fully scrape metrics from node node3: unable to fetch metrics from node rke-node3: Get "https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true": context deadline exceeded]
E0927 21:30:48.436217       1 server.go:132] unable to fully scrape metrics: [unable to fully scrape metrics from node rke-node3: unable to fetch metrics from node node3: Get "https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true": context deadline exceeded, unable to fully scrape metrics from node node1: unable to fetch metrics from node rke-node1: Get "https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true": context deadline exceeded]
E0927 21:31:48.438900       1 server.go:132] unable to fully scrape metrics: [unable to fully scrape metrics from node rke-node3: unable to fetch metrics from node node3: Get "https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true": context deadline exceeded, unable to fully scrape metrics from node node1: unable to fetch metrics from node rke-node1: Get "https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true": context deadline exceeded]
E0927 21:32:48.436462       1 server.go:132] unable to fully scrape metrics: [unable to fully scrape metrics from node rke-node1: unable to fetch metrics from node node1: Get "https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true": context deadline exceeded, unable to fully scrape metrics from node node3: unable to fetch metrics from node rke-node3: Get "https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true": context deadline exceeded]
E0927 21:33:48.436226       1 server.go:132] unable to fully scrape metrics: [unable to fully scrape metrics from node rke-node1: unable to fetch metrics from node node1: Get "https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true": context deadline exceeded, unable to fully scrape metrics from node node3: unable to fetch metrics from node rke-node3: Get "https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true": context deadline exceeded]

Maybe it's related to the dockershim update? It was enabled when I ran the rke up command with the new images. But I still don't get why two nodes are not reporting metrics while one does.
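To separate a network problem from a slow kubelet, the exact request metrics-server makes can be replayed by hand from each node. A small helper (hypothetical, my naming) that just prints the curl command to run, using the IPs and kubelet port from the errors above:

```shell
# Hypothetical helper: print the curl equivalent of metrics-server's scrape.
# -k matches --kubelet-insecure-tls; 10250 is the kubelet's secure port.
# $TOKEN is left literal — substitute a serviceaccount token when running it.
kubelet_probe_cmd() {
  printf 'curl -sk --max-time 10 "https://%s:10250/stats/summary?only_cpu_and_memory=true" -H "Authorization: Bearer $TOKEN"\n' "$1"
}
for ip in 192.168.1.19 192.168.1.20 192.168.1.21; do
  kubelet_probe_cmd "$ip"
done
```

Running the printed commands from each node should show whether the kubelet answers at all and how long it takes; anything beyond ~10 s would line up with the context deadline exceeded errors.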

@superseb
Contributor

Please use kubernetes_version as described at https://rancher.com/docs/rke/latest/en/upgrades/. Was/is it fully working with v0.4.1 or not? I can't tell from the info provided; the error indicates it can't contact the kubelet on its port.

If it is related to the dockershim update, you can disable it and see if that solves the issue.

@immanuelfodor
Author

Yes, it was working fine with v0.4.1 before the upgrade. However, after the upgrade a new v0.5.0 pod was running alongside the old one, even though only one Deployment (with v0.5.0) was present. This time I made a YAML backup of the running v0.4.1 replicaset and deleted it, in case it conflicted with the new version, but the new one still didn't succeed. Then I restored the v0.4.1 replicaset from the backup YAML file, and it succeeded in scraping the metrics. Then I rebooted all nodes, and now the top command returns just one node's info, not all of them. Firewalld is disabled, as it should be.

Thanks for the idea, now I did these steps:

  1. Set kubernetes_version to v1.21.5-rancher1-1 and ran rke up; it didn't solve the issue.
  2. Then disabled the dockershim option (I didn't know it's not a one-way ticket); the old v0.4.1 replicaset was deleted automatically and v0.5.0 is now the only one running, just as expected with an updated Deployment.
  3. Now top is returning the node resource info fine; not even a reboot was needed, it worked instantly.

So I think the issue is around the dockershim option.

@immanuelfodor
Author

Just out of curiosity, I enabled the dockershim option again and ran rke up; the previously fine v0.5.0 then became unhealthy ("Warning Unhealthy 5s (x3 over 25s) kubelet Readiness probe failed: HTTP probe failed with statuscode: 500") and top returned "Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)".

Disabled dockershim again, ran rke up, and v0.5.0 works fine; top is returning the metrics again.
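The toggle cycle boils down to flipping one line in cluster.yml and re-running rke up. A sketch — the rke/kubectl steps need the live cluster, so they are shown as comments, and the edit is demonstrated on a minimal sample file:

```shell
# Flip enable_cri_dockerd in cluster.yml, then re-provision and re-check.
# Sample file standing in for the real cluster.yml:
cat > /tmp/cluster-sample.yml <<'EOF'
enable_cri_dockerd: true
kubernetes_version: v1.21.5-rancher1-1
EOF
sed -i 's/^enable_cri_dockerd: .*/enable_cri_dockerd: false/' /tmp/cluster-sample.yml
grep '^enable_cri_dockerd' /tmp/cluster-sample.yml
# then: rke up
# verify: kubectl -n kube-system get pods -l k8s-app=metrics-server
#         kubectl top node
```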

@immanuelfodor immanuelfodor changed the title Metrics-server can't scrape nodes after RKE update Metrics-server can't scrape nodes after enable_cri_dockerd is set to true Sep 28, 2021
@superseb
Contributor

@immanuelfodor I ran the same scenario and although I did see the errors in the metrics-server, they went away after ~2 minutes. I also did not encounter the duplicate metrics-server deployment/pods with different images. If there is any more info you can provide on the setup and steps to reproduce, that would help.

@immanuelfodor
Author

immanuelfodor commented Sep 30, 2021

I think the duplicated metrics-server with two different images appeared just because the new metrics-server replicaset/pod never became Healthy, so the Deployment rollout didn't remove the previous replicaset/pod until the new one started up fine. The old replicaset/pod only disappeared automatically after I disabled dockershim and the new one finally became Healthy. I don't think we need to investigate this further.

I re-enabled dockershim, did an rke up, waited for more than 2 minutes, and metrics-server is still unhealthy.


Events:
  Type     Reason     Age                     From     Message
  ----     ------     ----                    ----     -------
  Warning  Unhealthy  4m47s (x93 over 19m)    kubelet  Readiness probe failed: HTTP probe failed with statuscode: 500

Here is the full cluster YAML, maybe it helps:

cluster.yml
nodes:
- address: 192.168.1.19
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - worker
  - etcd
  hostname_override: node1
  user: centos
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_ed25519
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: 192.168.1.20
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - worker
  - etcd
  hostname_override: node2
  user: centos
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_ed25519
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: 192.168.1.21
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - worker
  - etcd
  hostname_override: node3
  user: centos
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_ed25519
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
services:
  etcd:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    win_extra_args: {}
    win_extra_binds: []
    win_extra_env: []
    external_urls: []
    ca_cert: ""
    cert: ""
    key: ""
    path: ""
    uid: 1000
    gid: 1000
    snapshot: true
    retention: 48h
    creation: 6h
    backup_config:
      interval_hours: 12
      retention: 6
  kube-api:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    win_extra_args: {}
    win_extra_binds: []
    win_extra_env: []
    service_cluster_ip_range: 10.43.0.0/16
    service_node_port_range: ""
    pod_security_policy: false
    always_pull_images: false
    secrets_encryption_config:
      enabled: true
    audit_log:
      enabled: true
    admission_configuration: null
    event_rate_limit: null
  kube-controller:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    win_extra_args: {}
    win_extra_binds: []
    win_extra_env: []
    cluster_cidr: 10.42.0.0/16
    service_cluster_ip_range: 10.43.0.0/16
  scheduler:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    win_extra_args: {}
    win_extra_binds: []
    win_extra_env: []
  kubelet:
    image: ""
    extra_args:
      max-pods: 150
      # @see: https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#general-guidelines
      enforce-node-allocatable: "pods"
      system-reserved: "cpu=300m,memory=5Mi,ephemeral-storage=1Gi"
      kube-reserved: "cpu=200m,memory=5Mi,ephemeral-storage=1Gi"
      eviction-hard: "memory.available<1Gi,nodefs.available<10%"
    extra_binds:
    # added ISCSI paths due to OpenEBS cStor requirements
    # @see: https://docs.openebs.io/docs/next/prerequisites.html#rancher
    - "/etc/iscsi:/etc/iscsi"
    - "/sbin/iscsiadm:/sbin/iscsiadm"
    - "/var/lib/iscsi:/var/lib/iscsi"
    - "/lib/modules"
    - "/var/openebs/local:/var/openebs/local"
    - "/usr/lib64/libcrypto.so.1.1:/usr/lib/libcrypto.so.1.1"
    - "/usr/lib64/libopeniscsiusr.so.0.2.0:/usr/lib/libopeniscsiusr.so.0.2.0"
    extra_env: []
    win_extra_args: {}
    win_extra_binds: []
    win_extra_env: []
    cluster_domain: cluster.local
    infra_container_image: ""
    cluster_dns_server: 10.43.0.10
    fail_swap_on: false
    generate_serving_certificate: false
  kubeproxy:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    win_extra_args: {}
    win_extra_binds: []
    win_extra_env: []
network:
  plugin: canal
  options:
    # workaround to get hostnetworked pods DNS resolution working on nodes that don't have a CoreDNS replica running
    # do the rke up then reboot all nodes to apply
    # @see: https://github.com/rancher/k3s/issues/1827#issuecomment-636362097
    # @see: https://github.com/coreos/flannel/issues/1243#issuecomment-589542796
    # @see: https://rancher.com/docs/rke/latest/en/config-options/add-ons/network-plugins/
    canal_flannel_backend_type: host-gw
  mtu: 0
  node_selector: {}
  update_strategy: null
  tolerations: []
authentication:
  strategy: x509
  sans:
  # floating virtual IP with kube-karp, @see: https://github.com/immanuelfodor/kube-karp
  - "192.168.1.10"
  webhook: null
addons: ""
addons_include:
- dashboard/recommended.yml
- dashboard/dashboard-adminuser.yml
system_images:
  etcd: rancher/mirrored-coreos-etcd:v3.4.16-rancher1
  alpine: rancher/rke-tools:v0.1.78
  nginx_proxy: rancher/rke-tools:v0.1.78
  cert_downloader: rancher/rke-tools:v0.1.78
  kubernetes_services_sidecar: rancher/rke-tools:v0.1.78
  kubedns: rancher/mirrored-k8s-dns-kube-dns:1.17.4
  dnsmasq: rancher/mirrored-k8s-dns-dnsmasq-nanny:1.17.4
  kubedns_sidecar: rancher/mirrored-k8s-dns-sidecar:1.17.4
  kubedns_autoscaler: rancher/mirrored-cluster-proportional-autoscaler:1.8.3
  coredns: rancher/mirrored-coredns-coredns:1.8.4
  coredns_autoscaler: rancher/mirrored-cluster-proportional-autoscaler:1.8.3
  nodelocal: rancher/mirrored-k8s-dns-node-cache:1.18.0
  kubernetes: rancher/hyperkube:v1.21.5-rancher1
  flannel: rancher/mirrored-coreos-flannel:v0.14.0
  flannel_cni: rancher/flannel-cni:v0.3.0-rancher6
  calico_node: rancher/mirrored-calico-node:v3.19.2
  calico_cni: rancher/mirrored-calico-cni:v3.19.2
  calico_controllers: rancher/mirrored-calico-kube-controllers:v3.19.2
  calico_ctl: rancher/mirrored-calico-ctl:v3.19.2
  calico_flexvol: rancher/mirrored-calico-pod2daemon-flexvol:v3.19.2
  canal_node: rancher/mirrored-calico-node:v3.19.2
  canal_cni: rancher/mirrored-calico-cni:v3.19.2
  canal_controllers: rancher/mirrored-calico-kube-controllers:v3.19.2
  canal_flannel: rancher/mirrored-coreos-flannel:v0.14.0
  canal_flexvol: rancher/mirrored-calico-pod2daemon-flexvol:v3.19.2
  weave_node: weaveworks/weave-kube:2.8.1
  weave_cni: weaveworks/weave-npc:2.8.1
  pod_infra_container: rancher/mirrored-pause:3.4.1
  ingress: rancher/nginx-ingress-controller:nginx-0.48.1-rancher1
  ingress_backend: rancher/mirrored-nginx-ingress-controller-defaultbackend:1.5-rancher1
  ingress_webhook: rancher/mirrored-jettech-kube-webhook-certgen:v1.5.1
  metrics_server: rancher/mirrored-metrics-server:v0.5.0
  windows_pod_infra_container: rancher/kubelet-pause:v0.1.6
  aci_cni_deploy_container: noiro/cnideploy:5.1.1.0.1ae238a
  aci_host_container: noiro/aci-containers-host:5.1.1.0.1ae238a
  aci_opflex_container: noiro/opflex:5.1.1.0.1ae238a
  aci_mcast_container: noiro/opflex:5.1.1.0.1ae238a
  aci_ovs_container: noiro/openvswitch:5.1.1.0.1ae238a
  aci_controller_container: noiro/aci-containers-controller:5.1.1.0.1ae238a
  aci_gbp_server_container: noiro/gbp-server:5.1.1.0.1ae238a
  aci_opflex_server_container: noiro/opflex-server:5.1.1.0.1ae238a
ssh_key_path: ~/.ssh/id_ed25519
ssh_cert_path: ""
ssh_agent_auth: false
authorization:
  mode: rbac
  options: {}
ignore_docker_version: null
# default null, can be true, @see: https://github.com/rancher/rancher/issues/31943
# my issue, @see: https://github.com/rancher/rke/issues/2709
enable_cri_dockerd: true
kubernetes_version: ""
private_registries: []
ingress:
  provider: nginx
  options:
    use-forwarded-headers: "true"
    proxy-body-size: "80M"
    use-http2: "true"
  node_selector: {}
  extra_args: {}
  dns_policy: ""
  extra_envs: []
  extra_volumes: []
  extra_volume_mounts: []
  update_strategy: null
  # @see: https://github.com/rancher/rke/issues/1876
  # @see: https://github.com/rancher/rke/commit/5a63de09bc21267955461372aa2969cdff6e5b2c
  http_port: 0
  https_port: 0
  # default "", can be hostNetwork, hostPort, @see: https://rancher.com/docs/rke/latest/en/config-options/add-ons/ingress-controllers/#configuring-network-options
  # @see: https://github.com/rancher/rke/issues/2702#issuecomment-928950593
  network_mode: ""
  tolerations: []
  default_backend: null
  default_http_backend_priority_class_name: ""
  nginx_ingress_controller_priority_class_name: ""
cluster_name: "rke"
cloud_provider:
  name: ""
prefix_path: ""
win_prefix_path: ""
addon_job_timeout: 0
bastion_host:
  address: ""
  port: ""
  user: ""
  ssh_key: ""
  ssh_key_path: ""
  ssh_cert: ""
  ssh_cert_path: ""
  # default false, @see: https://github.com/rancher/rke/issues/2525
  ignore_proxy_env_vars: false
monitoring:
  provider: ""
  options: {}
  node_selector: {}
  update_strategy: null
  replicas: null
  tolerations: []
  metrics_server_priority_class_name: ""
restore:
  restore: false
  snapshot_name: ""
# automatic 'rke encrypt rotate-key', @see: https://github.com/rancher/rancher/issues/27735
rotate_encryption_key: false
dns:
  provider: coredns
  upstreamnameservers:
  - 192.168.1.33
  - 192.168.1.34

I've also got these errors firing in Rancher over the last half an hour after re-enabling dockershim:

  • An aggregated API v1beta1.metrics.k8s.io/default has been only 30% available over the last 10m
  • 33.33% of the kubelet/rancher-monitoring-kubelet targets in kube-system namespace are down.
  • Deployment kube-system/metrics-server has not matched the expected number of replicas for longer than 15 minutes.
  • Kubelet has disappeared from Prometheus target discovery.
  • An aggregated API v1beta1.metrics.k8s.io/default has been only 0% available over the last 10m
  • Kubernetes API server client 'kube-controller-manager/192.168.1.21:10252' is experiencing 1.214% errors.

@superseb
Contributor

Please run metrics-server at the highest log verbosity and see what is causing the issue. Is your host firewall enabled or disabled? With or without custom rules?

@immanuelfodor
Author

immanuelfodor commented Sep 30, 2021

Firewalld is disabled on all 3 nodes, and all of them are on the same subnet, so that shouldn't cause any problems; besides, everything is fine when dockershim is not enabled.

Redeployed metrics-server with:

monitoring:
  options:
    # increase metrics server verbosity, @see: https://github.com/rancher/rke/issues/2709#issuecomment-931630835
    # for other params, @see: docker run --rm k8s.gcr.io/metrics-server/metrics-server:v0.5.0 --help
    v: "10"

Since the new one can't become healthy, two pods/replicasets are running (this confirms the previous assumption):


Here are the verbose logs from the new one:

metrics-server verbose logs
I0930 20:38:10.361605       1 round_trippers.go:454] GET https://10.43.0.1:443/api/v1/namespaces/kube-system/configmaps?allowWatchBookmarks=true&fieldSelector=metadata.name%3Dextension-apiserver-authentication&resourceVersion=278291938&timeout=5m46s&timeoutSeconds=346&watch=true 200 OK in 0 milliseconds
I0930 20:38:10.363644       1 round_trippers.go:460] Response Headers:
I0930 20:38:10.363664       1 round_trippers.go:463]     Cache-Control: no-cache, private
I0930 20:38:10.363667       1 round_trippers.go:463]     Content-Type: application/json
I0930 20:38:10.363669       1 round_trippers.go:463]     Date: Thu, 30 Sep 2021 20:38:10 GMT
I0930 20:38:10.364056       1 round_trippers.go:435] curl -k -v -XGET  -H "Accept: application/vnd.kubernetes.protobuf, */*" -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://10.43.0.1:443/api/v1/nodes?allowWatchBookmarks=true&resourceVersion=278291783&timeout=5m29s&timeoutSeconds=329&watch=true'
I0930 20:38:10.364152       1 scraper.go:136] "Scraping node" node="node3"
I0930 20:38:10.364211       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:38:10.364909       1 round_trippers.go:454] GET https://10.43.0.1:443/api/v1/nodes?allowWatchBookmarks=true&resourceVersion=278291783&timeout=5m29s&timeoutSeconds=329&watch=true 200 OK in 0 milliseconds
I0930 20:38:10.364918       1 round_trippers.go:460] Response Headers:
I0930 20:38:10.364942       1 round_trippers.go:463]     Content-Type: application/vnd.kubernetes.protobuf;stream=watch
I0930 20:38:10.364946       1 round_trippers.go:463]     Date: Thu, 30 Sep 2021 20:38:10 GMT
I0930 20:38:10.364950       1 round_trippers.go:463]     Cache-Control: no-cache, private
I0930 20:38:10.458787       1 shared_informer.go:270] caches populated
I0930 20:38:10.458804       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
I0930 20:38:10.458852       1 shared_informer.go:270] caches populated
I0930 20:38:10.458857       1 shared_informer.go:270] caches populated
I0930 20:38:10.458867       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController 
I0930 20:38:10.458859       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
I0930 20:38:10.459169       1 tlsconfig.go:178] loaded client CA [0/"client-ca::kube-system::extension-apiserver-authentication::client-ca-file,client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"]: "kube-apiserver-requestheader-ca" [] issuer="<self>" (2020-04-17 05:37:08 +0000 UTC to 2030-04-15 05:37:08 +0000 UTC (now=2021-09-30 20:38:10.459148421 +0000 UTC))
I0930 20:38:10.459390       1 tlsconfig.go:200] loaded serving cert ["serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"]: "localhost@1633034290" [serving] validServingFor=[127.0.0.1,localhost,localhost] issuer="localhost-ca@1633034289" (2021-09-30 19:38:09 +0000 UTC to 2022-09-30 19:38:09 +0000 UTC (now=2021-09-30 20:38:10.459383182 +0000 UTC))
I0930 20:38:10.459483       1 named_certificates.go:53] loaded SNI cert [0/"self-signed loopback"]: "apiserver-loopback-client@1633034290" [serving] validServingFor=[apiserver-loopback-client] issuer="apiserver-loopback-client-ca@1633034290" (2021-09-30 19:38:10 +0000 UTC to 2022-09-30 19:38:10 +0000 UTC (now=2021-09-30 20:38:10.459478515 +0000 UTC))
I0930 20:38:10.459536       1 tlsconfig.go:178] loaded client CA [0/"client-ca::kube-system::extension-apiserver-authentication::client-ca-file,client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"]: "kube-ca" [] issuer="<self>" (2020-04-17 05:37:08 +0000 UTC to 2030-04-15 05:37:08 +0000 UTC (now=2021-09-30 20:38:10.459530753 +0000 UTC))
I0930 20:38:10.459550       1 tlsconfig.go:178] loaded client CA [1/"client-ca::kube-system::extension-apiserver-authentication::client-ca-file,client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"]: "kube-apiserver-requestheader-ca" [] issuer="<self>" (2020-04-17 05:37:08 +0000 UTC to 2030-04-15 05:37:08 +0000 UTC (now=2021-09-30 20:38:10.459546147 +0000 UTC))
I0930 20:38:10.459638       1 tlsconfig.go:200] loaded serving cert ["serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"]: "localhost@1633034290" [serving] validServingFor=[127.0.0.1,localhost,localhost] issuer="localhost-ca@1633034289" (2021-09-30 19:38:09 +0000 UTC to 2022-09-30 19:38:09 +0000 UTC (now=2021-09-30 20:38:10.459634656 +0000 UTC))
I0930 20:38:10.459723       1 named_certificates.go:53] loaded SNI cert [0/"self-signed loopback"]: "apiserver-loopback-client@1633034290" [serving] validServingFor=[apiserver-loopback-client] issuer="apiserver-loopback-client-ca@1633034290" (2021-09-30 19:38:10 +0000 UTC to 2022-09-30 19:38:10 +0000 UTC (now=2021-09-30 20:38:10.459719213 +0000 UTC))
I0930 20:38:12.305606       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:38:12.305632       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:38:12.305737       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="447.877µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:58808" resp=200
I0930 20:38:22.305842       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:38:22.305863       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:38:22.305966       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="281.7µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:50423" resp=200
I0930 20:38:23.841785       1 round_trippers.go:454] GET https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true  in 13489 milliseconds
I0930 20:38:23.841820       1 round_trippers.go:460] Response Headers:
I0930 20:38:23.841849       1 decode.go:59] "Failed getting node metric timestamp" node=""
I0930 20:38:23.841873       1 round_trippers.go:454] GET https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true  in 13499 milliseconds
E0930 20:38:23.841899       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node1"
I0930 20:38:23.841904       1 round_trippers.go:460] Response Headers:
I0930 20:38:23.841965       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:38:23.842048       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node2"
I0930 20:38:23.842077       1 round_trippers.go:454] GET https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true  in 13477 milliseconds
I0930 20:38:23.842081       1 round_trippers.go:460] Response Headers:
I0930 20:38:23.842089       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:38:23.842105       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node3"
I0930 20:38:23.842117       1 scraper.go:157] "Scrape finished" duration="13.501071844s" nodeCount=0 podCount=0
I0930 20:38:23.842122       1 server.go:139] "Storing metrics"
I0930 20:38:23.842151       1 server.go:144] "Scraping cycle complete"
I0930 20:38:25.341515       1 server.go:136] "Scraping metrics"
I0930 20:38:25.341545       1 scraper.go:114] "Scraping metrics from nodes" nodeCount=3
I0930 20:38:25.348787       1 scraper.go:136] "Scraping node" node="node2"
I0930 20:38:25.348850       1 round_trippers.go:435] curl -k -v -XGET  -H "Authorization: Bearer <masked>" -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" 'https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:38:25.354201       1 scraper.go:136] "Scraping node" node="node3"
I0930 20:38:25.354256       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:38:25.356426       1 scraper.go:136] "Scraping node" node="node1"
I0930 20:38:25.356474       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:38:32.306393       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:38:32.306429       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:38:32.306506       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="215.747µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:36666" resp=200
I0930 20:38:32.306515       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:38:32.306525       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:38:32.306552       1 shared_informer.go:270] caches populated
I0930 20:38:32.306562       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:38:32.306571       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:38:32.306627       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="365.828µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:5725" resp=0
I0930 20:38:38.842568       1 round_trippers.go:454] GET https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true  in 13486 milliseconds
I0930 20:38:38.842584       1 round_trippers.go:460] Response Headers:
I0930 20:38:38.842603       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:38:38.842624       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node1"
I0930 20:38:38.843015       1 round_trippers.go:454] GET https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true  in 13494 milliseconds
I0930 20:38:38.843020       1 round_trippers.go:460] Response Headers:
I0930 20:38:38.843027       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:38:38.843036       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node2"
I0930 20:38:38.843035       1 round_trippers.go:454] GET https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true  in 13488 milliseconds
I0930 20:38:38.843043       1 round_trippers.go:460] Response Headers:
I0930 20:38:38.843054       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:38:38.843068       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node3"
I0930 20:38:38.843082       1 scraper.go:157] "Scrape finished" duration="13.50152628s" nodeCount=0 podCount=0
I0930 20:38:38.843086       1 server.go:139] "Storing metrics"
I0930 20:38:38.843093       1 server.go:144] "Scraping cycle complete"
I0930 20:38:40.341533       1 server.go:136] "Scraping metrics"
I0930 20:38:40.341563       1 scraper.go:114] "Scraping metrics from nodes" nodeCount=3
I0930 20:38:40.358695       1 scraper.go:136] "Scraping node" node="node2"
I0930 20:38:40.358736       1 scraper.go:136] "Scraping node" node="node3"
I0930 20:38:40.358746       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:38:40.358767       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:38:40.360734       1 scraper.go:136] "Scraping node" node="node1"
I0930 20:38:40.360810       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:38:42.305610       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:38:42.305633       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:38:42.305709       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="255.962µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:14800" resp=200
I0930 20:38:42.305730       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:38:42.305738       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:38:42.305760       1 shared_informer.go:270] caches populated
I0930 20:38:42.305769       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:38:42.305777       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:38:42.305795       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="96.913µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:20018" resp=0
I0930 20:38:52.307779       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:38:52.307798       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:38:52.307918       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="343.816µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:52656" resp=200
I0930 20:38:52.308836       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:38:52.308852       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:38:52.308893       1 shared_informer.go:270] caches populated
I0930 20:38:52.308909       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:38:52.308921       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:38:52.309455       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="676.608µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:36819" resp=0
I0930 20:38:53.842730       1 round_trippers.go:454] GET https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true  in 13483 milliseconds
I0930 20:38:53.842750       1 round_trippers.go:460] Response Headers:
I0930 20:38:53.842787       1 decode.go:59] "Failed getting node metric timestamp" node=""
I0930 20:38:53.842822       1 round_trippers.go:454] GET https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true  in 13481 milliseconds
I0930 20:38:53.842844       1 round_trippers.go:460] Response Headers:
I0930 20:38:53.842857       1 decode.go:59] "Failed getting node metric timestamp" node=""
I0930 20:38:53.842864       1 round_trippers.go:454] GET https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true  in 13484 milliseconds
I0930 20:38:53.842868       1 round_trippers.go:460] Response Headers:
I0930 20:38:53.842875       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:38:53.842884       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node3"
E0930 20:38:53.842885       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node1"
E0930 20:38:53.842828       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node2"
I0930 20:38:53.842906       1 scraper.go:157] "Scrape finished" duration="13.501326852s" nodeCount=0 podCount=0
I0930 20:38:53.842911       1 server.go:139] "Storing metrics"
I0930 20:38:53.842919       1 server.go:144] "Scraping cycle complete"
I0930 20:38:55.341571       1 server.go:136] "Scraping metrics"
I0930 20:38:55.341598       1 scraper.go:114] "Scraping metrics from nodes" nodeCount=3
I0930 20:38:55.343821       1 scraper.go:136] "Scraping node" node="node2"
I0930 20:38:55.343871       1 round_trippers.go:435] curl -k -v -XGET  -H "Authorization: Bearer <masked>" -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" 'https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:38:55.344968       1 scraper.go:136] "Scraping node" node="node3"
I0930 20:38:55.345002       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:38:55.347142       1 scraper.go:136] "Scraping node" node="node1"
I0930 20:38:55.347177       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:39:02.305718       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:39:02.305733       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:39:02.305798       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="221.023µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:43868" resp=200
I0930 20:39:02.307924       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:39:02.307939       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:39:02.307971       1 shared_informer.go:270] caches populated
I0930 20:39:02.307987       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:39:02.308003       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:39:02.308032       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="169.864µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:47446" resp=0
I0930 20:39:08.841949       1 round_trippers.go:454] GET https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true  in 13498 milliseconds
I0930 20:39:08.841966       1 round_trippers.go:460] Response Headers:
I0930 20:39:08.841986       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:39:08.842009       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node2"
I0930 20:39:08.842077       1 round_trippers.go:454] GET https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true  in 13497 milliseconds
I0930 20:39:08.842085       1 round_trippers.go:460] Response Headers:
I0930 20:39:08.842098       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:39:08.842111       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node3"
I0930 20:39:08.842153       1 round_trippers.go:454] GET https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true  in 13494 milliseconds
I0930 20:39:08.842163       1 round_trippers.go:460] Response Headers:
I0930 20:39:08.842175       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:39:08.842188       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node1"
I0930 20:39:08.842202       1 scraper.go:157] "Scrape finished" duration="13.500591733s" nodeCount=0 podCount=0
I0930 20:39:08.842213       1 server.go:139] "Storing metrics"
I0930 20:39:08.842225       1 server.go:144] "Scraping cycle complete"
I0930 20:39:10.341600       1 server.go:136] "Scraping metrics"
I0930 20:39:10.341633       1 scraper.go:114] "Scraping metrics from nodes" nodeCount=3
I0930 20:39:10.343860       1 scraper.go:136] "Scraping node" node="node3"
I0930 20:39:10.343950       1 round_trippers.go:435] curl -k -v -XGET  -H "Authorization: Bearer <masked>" -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" 'https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:39:10.347008       1 scraper.go:136] "Scraping node" node="node2"
I0930 20:39:10.347046       1 round_trippers.go:435] curl -k -v -XGET  -H "Authorization: Bearer <masked>" -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" 'https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:39:10.365085       1 scraper.go:136] "Scraping node" node="node1"
I0930 20:39:10.365164       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:39:12.305129       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:39:12.305149       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:39:12.305212       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="247.806µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:35115" resp=200
I0930 20:39:12.307129       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:39:12.307150       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:39:12.307200       1 shared_informer.go:270] caches populated
I0930 20:39:12.307220       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:39:12.307245       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:39:12.307292       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="229.049µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:15735" resp=0
I0930 20:39:22.305628       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:39:22.305662       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:39:22.305708       1 shared_informer.go:270] caches populated
I0930 20:39:22.305730       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:39:22.305741       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:39:22.305760       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:39:22.305771       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="306.604µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:25495" resp=0
I0930 20:39:22.305774       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:39:22.305812       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="80.952µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:62760" resp=200
I0930 20:39:23.842616       1 round_trippers.go:454] GET https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true  in 13477 milliseconds
I0930 20:39:23.842631       1 round_trippers.go:460] Response Headers:
I0930 20:39:23.842658       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:39:23.842680       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node1"
I0930 20:39:23.842719       1 round_trippers.go:454] GET https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true  in 13498 milliseconds
I0930 20:39:23.842726       1 round_trippers.go:460] Response Headers:
I0930 20:39:23.842736       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:39:23.842755       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node3"
I0930 20:39:23.842786       1 round_trippers.go:454] GET https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true  in 13495 milliseconds
I0930 20:39:23.842791       1 round_trippers.go:460] Response Headers:
I0930 20:39:23.842799       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:39:23.842812       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node2"
I0930 20:39:23.842837       1 scraper.go:157] "Scrape finished" duration="13.501180674s" nodeCount=0 podCount=0
I0930 20:39:23.842843       1 server.go:139] "Storing metrics"
I0930 20:39:23.842851       1 server.go:144] "Scraping cycle complete"
I0930 20:39:25.341134       1 server.go:136] "Scraping metrics"
I0930 20:39:25.341164       1 scraper.go:114] "Scraping metrics from nodes" nodeCount=3
I0930 20:39:25.349413       1 scraper.go:136] "Scraping node" node="node1"
I0930 20:39:25.349469       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:39:25.351571       1 scraper.go:136] "Scraping node" node="node2"
I0930 20:39:25.351620       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:39:25.359379       1 scraper.go:136] "Scraping node" node="node3"
I0930 20:39:25.359459       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:39:32.306246       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:39:32.306263       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:39:32.306310       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:39:32.306334       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="227.128µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:44590" resp=200
I0930 20:39:32.306341       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:39:32.306395       1 shared_informer.go:270] caches populated
I0930 20:39:32.306419       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:39:32.306441       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:39:32.306476       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="224.31µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:40883" resp=0
I0930 20:39:38.842178       1 round_trippers.go:454] GET https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true  in 13490 milliseconds
I0930 20:39:38.842197       1 round_trippers.go:460] Response Headers:
I0930 20:39:38.842216       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:39:38.842238       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node2"
I0930 20:39:38.842297       1 round_trippers.go:454] GET https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true  in 13482 milliseconds
I0930 20:39:38.842301       1 round_trippers.go:460] Response Headers:
I0930 20:39:38.842307       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:39:38.842341       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node3"
I0930 20:39:38.842381       1 round_trippers.go:454] GET https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true  in 13492 milliseconds
I0930 20:39:38.842385       1 round_trippers.go:460] Response Headers:
I0930 20:39:38.842392       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:39:38.842401       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node1"
I0930 20:39:38.842428       1 scraper.go:157] "Scrape finished" duration="13.501236237s" nodeCount=0 podCount=0
I0930 20:39:38.842435       1 server.go:139] "Storing metrics"
I0930 20:39:38.842446       1 server.go:144] "Scraping cycle complete"
I0930 20:39:40.341970       1 server.go:136] "Scraping metrics"
I0930 20:39:40.342002       1 scraper.go:114] "Scraping metrics from nodes" nodeCount=3
I0930 20:39:40.349488       1 scraper.go:136] "Scraping node" node="node1"
I0930 20:39:40.349552       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:39:40.358037       1 scraper.go:136] "Scraping node" node="node3"
I0930 20:39:40.358114       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:39:40.361879       1 scraper.go:136] "Scraping node" node="node2"
I0930 20:39:40.361966       1 round_trippers.go:435] curl -k -v -XGET  -H "Authorization: Bearer <masked>" -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" 'https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:39:42.307439       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:39:42.307457       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:39:42.307518       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="287.625µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:30423" resp=200
I0930 20:39:42.307605       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:39:42.307615       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:39:42.307648       1 shared_informer.go:270] caches populated
I0930 20:39:42.307658       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:39:42.307666       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:39:42.307689       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="135.815µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:55023" resp=0
I0930 20:39:52.306863       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:39:52.306880       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:39:52.306944       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="264.454µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:7301" resp=200
I0930 20:39:52.307273       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:39:52.307284       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:39:52.307308       1 shared_informer.go:270] caches populated
I0930 20:39:52.307318       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:39:52.307326       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:39:52.307347       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="136.074µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:22745" resp=0
I0930 20:39:53.842742       1 round_trippers.go:454] GET https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true  in 13493 milliseconds
I0930 20:39:53.842763       1 round_trippers.go:460] Response Headers:
I0930 20:39:53.842804       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:39:53.842833       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node1"
I0930 20:39:53.842921       1 round_trippers.go:454] GET https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true  in 13480 milliseconds
I0930 20:39:53.842931       1 round_trippers.go:460] Response Headers:
I0930 20:39:53.842947       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:39:53.842962       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node2"
I0930 20:39:53.843776       1 round_trippers.go:454] GET https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true  in 13485 milliseconds
I0930 20:39:53.843784       1 round_trippers.go:460] Response Headers:
I0930 20:39:53.843793       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:39:53.843805       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node3"
I0930 20:39:53.843818       1 scraper.go:157] "Scrape finished" duration="13.501802343s" nodeCount=0 podCount=0
I0930 20:39:53.843823       1 server.go:139] "Storing metrics"
I0930 20:39:53.843831       1 server.go:144] "Scraping cycle complete"
I0930 20:39:55.342063       1 server.go:136] "Scraping metrics"
I0930 20:39:55.342104       1 scraper.go:114] "Scraping metrics from nodes" nodeCount=3
I0930 20:39:55.348319       1 scraper.go:136] "Scraping node" node="node1"
I0930 20:39:55.348399       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:39:55.357741       1 scraper.go:136] "Scraping node" node="node2"
I0930 20:39:55.357818       1 round_trippers.go:435] curl -k -v -XGET  -H "Authorization: Bearer <masked>" -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" 'https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:39:55.358923       1 scraper.go:136] "Scraping node" node="node3"
I0930 20:39:55.359054       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:40:02.306075       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:40:02.306091       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:40:02.306147       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="228.866µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:45340" resp=200
I0930 20:40:02.306653       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:40:02.306662       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:40:02.306686       1 shared_informer.go:270] caches populated
I0930 20:40:02.306695       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:40:02.306702       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:40:02.306731       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="119.881µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:1328" resp=0
I0930 20:40:08.842894       1 round_trippers.go:454] GET https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true  in 13485 milliseconds
I0930 20:40:08.842914       1 round_trippers.go:460] Response Headers:
I0930 20:40:08.842937       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:40:08.842961       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node2"
I0930 20:40:08.843007       1 round_trippers.go:454] GET https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true  in 13483 milliseconds
I0930 20:40:08.843015       1 round_trippers.go:460] Response Headers:
I0930 20:40:08.843024       1 decode.go:59] "Failed getting node metric timestamp" node=""
I0930 20:40:08.843030       1 round_trippers.go:454] GET https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true  in 13494 milliseconds
I0930 20:40:08.843037       1 round_trippers.go:460] Response Headers:
E0930 20:40:08.843071       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node3"
I0930 20:40:08.843073       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:40:08.843088       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node1"
I0930 20:40:08.843153       1 scraper.go:157] "Scrape finished" duration="13.501009943s" nodeCount=0 podCount=0
I0930 20:40:08.843161       1 server.go:139] "Storing metrics"
I0930 20:40:08.843173       1 server.go:144] "Scraping cycle complete"
I0930 20:40:10.341484       1 server.go:136] "Scraping metrics"
I0930 20:40:10.341513       1 scraper.go:114] "Scraping metrics from nodes" nodeCount=3
I0930 20:40:10.341545       1 scraper.go:136] "Scraping node" node="node1"
I0930 20:40:10.341627       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:40:10.353173       1 scraper.go:136] "Scraping node" node="node2"
I0930 20:40:10.353238       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:40:10.355549       1 scraper.go:136] "Scraping node" node="node3"
I0930 20:40:10.355640       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:40:12.305607       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:40:12.305628       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:40:12.305693       1 shared_informer.go:270] caches populated
I0930 20:40:12.305708       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:40:12.305719       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:40:12.305766       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="359.921µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:53219" resp=0
I0930 20:40:12.306016       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:40:12.306027       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:40:12.306081       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="118.249µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:57287" resp=200
I0930 20:40:22.305472       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:40:22.305495       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:40:22.305594       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="342.892µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:43010" resp=200
I0930 20:40:22.306678       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:40:22.306693       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:40:22.306721       1 shared_informer.go:270] caches populated
I0930 20:40:22.306732       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:40:22.306741       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:40:22.306784       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="162.961µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:49739" resp=0
I0930 20:40:23.842391       1 round_trippers.go:454] GET https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true  in 13489 milliseconds
I0930 20:40:23.842406       1 round_trippers.go:460] Response Headers:
I0930 20:40:23.842427       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:40:23.842455       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node2"
I0930 20:40:23.842535       1 round_trippers.go:454] GET https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true  in 13500 milliseconds
I0930 20:40:23.842545       1 round_trippers.go:460] Response Headers:
I0930 20:40:23.842536       1 round_trippers.go:454] GET https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true  in 13486 milliseconds
I0930 20:40:23.842559       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:40:23.842579       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node1"
I0930 20:40:23.842559       1 round_trippers.go:460] Response Headers:
I0930 20:40:23.842591       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:40:23.842600       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node3"
I0930 20:40:23.842612       1 scraper.go:157] "Scrape finished" duration="13.501088817s" nodeCount=0 podCount=0
I0930 20:40:23.842616       1 server.go:139] "Storing metrics"
I0930 20:40:23.842623       1 server.go:144] "Scraping cycle complete"
I0930 20:40:25.341896       1 server.go:136] "Scraping metrics"
I0930 20:40:25.341939       1 scraper.go:114] "Scraping metrics from nodes" nodeCount=3
I0930 20:40:25.362301       1 scraper.go:136] "Scraping node" node="node2"
I0930 20:40:25.362434       1 round_trippers.go:435] curl -k -v -XGET  -H "Authorization: Bearer <masked>" -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" 'https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:40:25.363487       1 scraper.go:136] "Scraping node" node="node1"
I0930 20:40:25.363545       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:40:25.365850       1 scraper.go:136] "Scraping node" node="node3"
I0930 20:40:25.365908       1 round_trippers.go:435] curl -k -v -XGET  -H "Authorization: Bearer <masked>" -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" 'https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:40:32.305866       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:40:32.305899       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:40:32.305971       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="264.793µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:53800" resp=200
I0930 20:40:32.307719       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:40:32.307737       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:40:32.307769       1 shared_informer.go:270] caches populated
I0930 20:40:32.307785       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:40:32.307796       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:40:32.307824       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="145.816µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:14569" resp=0
I0930 20:40:38.842592       1 round_trippers.go:454] GET https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true  in 13479 milliseconds
I0930 20:40:38.842618       1 round_trippers.go:460] Response Headers:
I0930 20:40:38.842645       1 decode.go:59] "Failed getting node metric timestamp" node=""
I0930 20:40:38.842594       1 round_trippers.go:454] GET https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true  in 13480 milliseconds
I0930 20:40:38.842701       1 round_trippers.go:460] Response Headers:
E0930 20:40:38.842709       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node1"
I0930 20:40:38.842713       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:40:38.842729       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node2"
I0930 20:40:38.843710       1 round_trippers.go:454] GET https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true  in 13477 milliseconds
I0930 20:40:38.843717       1 round_trippers.go:460] Response Headers:
I0930 20:40:38.843727       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:40:38.843758       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node3"
I0930 20:40:38.843771       1 scraper.go:157] "Scrape finished" duration="13.501821496s" nodeCount=0 podCount=0
I0930 20:40:38.843777       1 server.go:139] "Storing metrics"
I0930 20:40:38.843785       1 server.go:144] "Scraping cycle complete"
I0930 20:40:40.341398       1 server.go:136] "Scraping metrics"
I0930 20:40:40.341423       1 scraper.go:114] "Scraping metrics from nodes" nodeCount=3
I0930 20:40:40.348670       1 scraper.go:136] "Scraping node" node="node3"
I0930 20:40:40.348720       1 round_trippers.go:435] curl -k -v -XGET  -H "Authorization: Bearer <masked>" -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" 'https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:40:40.351274       1 scraper.go:136] "Scraping node" node="node2"
I0930 20:40:40.351330       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:40:40.362794       1 scraper.go:136] "Scraping node" node="node1"
I0930 20:40:40.362849       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:40:42.305515       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:40:42.305539       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:40:42.305604       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="258.066µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:38787" resp=200
I0930 20:40:42.311248       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:40:42.311272       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:40:42.311309       1 shared_informer.go:270] caches populated
I0930 20:40:42.311323       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:40:42.311334       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:40:42.311390       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="218.129µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:11542" resp=0
I0930 20:40:52.308646       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:40:52.308677       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:40:52.308744       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="287.528µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:31002" resp=200
I0930 20:40:52.308978       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:40:52.308989       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:40:52.309026       1 shared_informer.go:270] caches populated
I0930 20:40:52.309035       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:40:52.309044       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:40:52.309064       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="147.709µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:15621" resp=0
I0930 20:40:53.842604       1 round_trippers.go:454] GET https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true  in 13493 milliseconds
I0930 20:40:53.842617       1 round_trippers.go:454] GET https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true  in 13491 milliseconds
I0930 20:40:53.842627       1 round_trippers.go:460] Response Headers:
I0930 20:40:53.842657       1 round_trippers.go:454] GET https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true  in 13479 milliseconds
I0930 20:40:53.842664       1 round_trippers.go:460] Response Headers:
I0930 20:40:53.842678       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:40:53.842704       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node1"
I0930 20:40:53.842657       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:40:53.842723       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node2"
I0930 20:40:53.842621       1 round_trippers.go:460] Response Headers:
I0930 20:40:53.842731       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:40:53.842738       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node3"
I0930 20:40:53.842778       1 scraper.go:157] "Scrape finished" duration="13.501317418s" nodeCount=0 podCount=0
I0930 20:40:53.842782       1 server.go:139] "Storing metrics"
I0930 20:40:53.842790       1 server.go:144] "Scraping cycle complete"
I0930 20:40:55.341611       1 server.go:136] "Scraping metrics"
I0930 20:40:55.341639       1 scraper.go:114] "Scraping metrics from nodes" nodeCount=3
I0930 20:40:55.351958       1 scraper.go:136] "Scraping node" node="node2"
I0930 20:40:55.352032       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:40:55.355139       1 scraper.go:136] "Scraping node" node="node1"
I0930 20:40:55.355195       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:40:55.361445       1 scraper.go:136] "Scraping node" node="node3"
I0930 20:40:55.361522       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:41:02.306042       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:41:02.306060       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:41:02.306106       1 shared_informer.go:270] caches populated
I0930 20:41:02.306118       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:41:02.306132       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:41:02.306167       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="314.747µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:53498" resp=0
I0930 20:41:02.306202       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:41:02.306210       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:41:02.306276       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="106.835µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:4635" resp=200
I0930 20:41:08.842580       1 round_trippers.go:454] GET https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true  in 13487 milliseconds
I0930 20:41:08.842600       1 round_trippers.go:460] Response Headers:
I0930 20:41:08.842621       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:41:08.842664       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node1"
I0930 20:41:08.842935       1 round_trippers.go:454] GET https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true  in 13490 milliseconds
I0930 20:41:08.842946       1 round_trippers.go:460] Response Headers:
I0930 20:41:08.842976       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:41:08.842995       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node2"
I0930 20:41:08.843991       1 round_trippers.go:454] GET https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true  in 13482 milliseconds
I0930 20:41:08.844004       1 round_trippers.go:460] Response Headers:
I0930 20:41:08.844014       1 decode.go:59] "Failed getting node metric timestamp" node=""
E0930 20:41:08.844025       1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node3"
I0930 20:41:08.844043       1 scraper.go:157] "Scrape finished" duration="13.502390379s" nodeCount=0 podCount=0
I0930 20:41:08.844049       1 server.go:139] "Storing metrics"
I0930 20:41:08.844083       1 server.go:144] "Scraping cycle complete"
I0930 20:41:10.341712       1 server.go:136] "Scraping metrics"
I0930 20:41:10.341743       1 scraper.go:114] "Scraping metrics from nodes" nodeCount=3
I0930 20:41:10.342880       1 scraper.go:136] "Scraping node" node="node1"
I0930 20:41:10.342980       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.19:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:41:10.345163       1 scraper.go:136] "Scraping node" node="node3"
I0930 20:41:10.345210       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.21:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:41:10.351825       1 scraper.go:136] "Scraping node" node="node2"
I0930 20:41:10.351889       1 round_trippers.go:435] curl -k -v -XGET  -H "User-Agent: metrics-server/v0.5.0 (linux/amd64) kubernetes/d766094" -H "Authorization: Bearer <masked>" 'https://192.168.1.20:10250/stats/summary?only_cpu_and_memory=true'
I0930 20:41:12.305130       1 handler.go:153] metrics-server: GET "/livez" satisfied by nonGoRestful
I0930 20:41:12.305151       1 pathrecorder.go:240] metrics-server: "/livez" satisfied by exact match
I0930 20:41:12.305203       1 httplog.go:89] "HTTP" verb="GET" URI="/livez" latency="225.203µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:3178" resp=200
I0930 20:41:12.305266       1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
I0930 20:41:12.305275       1 pathrecorder.go:240] metrics-server: "/readyz" satisfied by exact match
I0930 20:41:12.305360       1 shared_informer.go:270] caches populated
I0930 20:41:12.305382       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I0930 20:41:12.305396       1 healthz.go:244] metric-storage-ready check failed: readyz
[-]metric-storage-ready failed: not metrics to serve
I0930 20:41:12.305422       1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="185.042µs" userAgent="kube-probe/1.21" srcIP="192.168.1.20:25469" resp=0

Raboo commented Nov 18, 2021

I'm affected by this as well after upgrading RKE from v1.2.8 to v1.3.2 with enable_cri_dockerd=true.
No firewalls are in play.
Using curl against the same URLs works (apart from auth):

$ curl -i -k 'https://rancher03:10250/stats/summary?only_cpu_and_memory=true'
HTTP/2 401
content-type: text/plain; charset=utf-8
content-length: 12
date: Thu, 18 Nov 2021 13:09:24 GMT
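The anonymous curl above only shows that the port answers quickly without auth. A hedged sketch of the authenticated probe metrics-server actually performs (the `kube-system`/`metrics-server` deployment name is the default RKE addon and may differ in your cluster):

```shell
# Hedged sketch: borrow the metrics-server ServiceAccount token and time the
# same scrape the server performs. The kube-system/metrics-server deployment
# name is an assumption based on the default RKE addon.
TOKEN=$(kubectl -n kube-system exec deploy/metrics-server -- \
  cat /var/run/secrets/kubernetes.io/serviceaccount/token)
time curl -k -i -H "Authorization: Bearer ${TOKEN}" \
  'https://rancher03:10250/stats/summary?only_cpu_and_memory=true'
```

If this takes longer than roughly 10 s to respond, metrics-server's scrape context hits its deadline even though the request eventually succeeds, which would match the `context deadline exceeded` errors in the logs above.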

stale bot commented Feb 3, 2022

This issue/PR has been automatically marked as stale because it has not had activity (commit/comment/label) for 60 days. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the status/stale label Feb 3, 2022
immanuelfodor (Author)

I didn't have time to test this with the latest RKE release yet; please don't close, it's possibly still relevant.

@stale stale bot removed the status/stale label Feb 3, 2022
travisghansen

The issue I created (which was closed in favor of this one) has pretty concise details about how to test and what the issue truly is.

#2716

immanuelfodor (Author)

Tried enabling dockershim with the latest RKE v1.3.7, which now includes k8s v1.22.6 and metrics-server v0.5.1. I get the same errors in the metrics-server logs as in #2709 (comment) and #2709 (comment). Besides, kubectl top now returns Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io). Reverted back to dockershim disabled.

chancez commented Apr 8, 2022

This also impacts rancher-desktop

github-actions bot commented Jun 8, 2022

This repository uses an automated workflow to automatically label issues which have not had any activity (commit/comment/label) for 60 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the workflow can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the workflow will automatically close the issue in 14 days. Thank you for your contributions.

immanuelfodor (Author)

Unstale, I need to test this with v1.22.9

travisghansen

FYI, upstream fixed this a couple of weeks ago: Mirantis/cri-dockerd#15

sysadminxxx commented Jun 12, 2022

Got the same issue with v1.22.9.

time curl  -k -H "Authorization: Bearer ${a}"  https://x.x.x.x:10250/metrics > metrics.txt              
                                        
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  327k    0  327k    0     0  11882      0 --:--:--  0:00:28 --:--:-- 84432
curl -k -H "Authorization: Bearer ${a}" https://x.x.x.x:10250/metrics >  0.01s user 0.01s system 0% cpu 28.199 total
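The 28 s total above is the telling number. As a hedged follow-up (reusing the `${a}` token variable and the masked node IP from the command above), curl's write-out timers can split that time and show whether it is spent in TLS setup or in the kubelet assembling the response body:

```shell
# Hedged sketch: break the ~28 s down with curl's timing variables.
# ${a} is the bearer token from the command above; x.x.x.x is the node IP.
curl -k -s -o /dev/null \
  -H "Authorization: Bearer ${a}" \
  -w 'connect=%{time_connect}s tls=%{time_appconnect}s ttfb=%{time_starttransfer}s total=%{time_total}s\n' \
  'https://x.x.x.x:10250/metrics'
```

A large ttfb with small connect/tls times would point at the kubelet stats path (i.e. the dockershim/cri-dockerd side) rather than the network.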

github-actions bot (Contributor)

This repository uses an automated workflow to automatically label issues which have not had any activity (commit/comment/label) for 60 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the workflow can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the workflow will automatically close the issue in 14 days. Thank you for your contributions.

moray95 commented Aug 31, 2022

Any updates on this? Just tested with RKE 1.3.14 and Kubernetes v1.23.7-rancher1-1; the issue still persists.

stalin4suse

@moray95 - This will be fixed in Rancher v2.6.7 according to #2938

cite commented Sep 8, 2022

I can confirm we are still seeing this issue; see rancher/rancher#38816. We initially thought it was a problem with the OEL kernel, but it's not related to that.
