
cri_dockerd as included in rke-tools has extremely bad performance #2938

Closed

chfrank-cgn opened this issue May 15, 2022 · 17 comments

@chfrank-cgn

RKE version:
1.3.11 - v1.23.6

Docker version: (docker version, docker info preferred)
20.10

Operating system and kernel: (cat /etc/os-release, uname -r preferred)
Ubuntu 20.04 LTS

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)
Azure

cluster.yml file: (relevant settings, provided here as a Terraform rke_config block)

  rke_config {
    kubernetes_version    = var.k8version
    ignore_docker_version = false
    enable_cri_dockerd    = true

    network {
      plugin = "flannel"
    }
  }
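
For anyone using the RKE CLI directly rather than Terraform, a minimal sketch of the equivalent cluster.yml settings looks roughly like this (the kubernetes_version string is an assumption and the nodes: section is omitted, so this is illustrative only):

# Write a minimal cluster.yml with the settings from the Terraform block above
cat > cluster.yml <<'EOF'
# nodes: section omitted for brevity
kubernetes_version: v1.23.6-rancher1-1
ignore_docker_version: false
enable_cri_dockerd: true
network:
  plugin: flannel
EOF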

Steps to Reproduce:
Setting enable_cri_dockerd leads to constant, near-100% CPU consumption on an empty cluster on a 4-vCPU VM (Standard_D4s_v3).
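
For reference, a quick way to see which process is responsible once the cluster is up (commands are illustrative and not part of the original report):

# Sort processes by CPU in batch mode; dockerd sits at or near the top on affected setups
top -b -n 1 -o %CPU | head -n 20
# PID of the Docker daemon, for closer inspection with strace or pidstat
pidof dockerd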

Results:
With Kubernetes 1.24 fast approaching, we should provide a working option for the installed base that wants to continue with RKE1 and Docker.

@tsde

tsde commented May 23, 2022

I'm seeing the same behavior in a simple setup involving one node with 2 CPUs. Here's my spec for reference:

RKE / Kubernetes version:
1.3.11 - v1.23.6

Docker version: (docker version, docker info preferred)
20.10.12

Operating system and kernel: (cat /etc/os-release, uname -r preferred)
Flatcar Linux 3139.2.1

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)
VMWare workstation (using Flatcar Linux OVA)

With enable_cri_dockerd set to false, there's no issue. Setting it to true when creating a cluster results in the behavior described.
Stracing the dockerd process, including its threads, showed a huge amount of constant overlay scanning using newfstatat and lgetxattr syscalls that hogs the CPU (1 to 1.5 cores constantly). This "scanning" involves overlays from running containers as well as already-exited containers.

Small excerpt from strace:

.../...
[pid  1175] newfstatat(AT_FDCWD, "/var/lib/docker/overlay2/272c043a758e089d76818fd1d41fbf611dc57fa51f4c9fb9284c82f500afffca/merged/opt/az/lib/python3.8/site-packages/azure/multiapi/storage/v2015_04_05/common/__pycache__/_error.cpython-38.pyc",  <unfinished ...>
[pid   867] lgetxattr("/var/lib/docker/overlay2/272c043a758e089d76818fd1d41fbf611dc57fa51f4c9fb9284c82f500afffca/merged/opt/az/lib/python3.8/site-packages/azure/mgmt/web/v2021_03_01/operations/_container_apps_revisions_operations.py", "security.capability",  <unfinished ...>
[pid  1175] <... newfstatat resumed>{st_mode=S_IFREG|0644, st_size=214, ...}, AT_SYMLINK_NOFOLLOW) = 0
[pid   867] <... lgetxattr resumed>0xc00d1c6300, 128) = -1 ENODATA (No data available)
[pid   862] <... nanosleep resumed>NULL) = 0
[pid  1175] lgetxattr("/var/lib/docker/overlay2/272c043a758e089d76818fd1d41fbf611dc57fa51f4c9fb9284c82f500afffca-init/merged/opt/az/lib/python3.8/site-packages/azure/multiapi/storage/v2015_04_05/common/__pycache__/_error.cpython-38.pyc", "security.capability",  <unfinished ...>
[pid   867] newfstatat(AT_FDCWD, "/var/lib/docker/overlay2/272c043a758e089d76818fd1d41fbf611dc57fa51f4c9fb9284c82f500afffca-init/merged/opt/az/lib/python3.8/site-packages/azure/mgmt/web/v2021_03_01/operations/_deleted_web_apps_operations.py",  <unfinished ...>
[pid  1175] <... lgetxattr resumed>0xc00cc76580, 128) = -1 ENODATA (No data available)
[pid   867] <... newfstatat resumed>{st_mode=S_IFREG|0644, st_size=13747, ...}, AT_SYMLINK_NOFOLLOW) = 0
[pid   862] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid  1175] lgetxattr("/var/lib/docker/overlay2/272c043a758e089d76818fd1d41fbf611dc57fa51f4c9fb9284c82f500afffca/merged/opt/az/lib/python3.8/site-packages/azure/multiapi/storage/v2015_04_05/common/__pycache__/_error.cpython-38.pyc", "security.capability",  <unfinished ...>
[pid   867] newfstatat(AT_FDCWD, "/var/lib/docker/overlay2/272c043a758e089d76818fd1d41fbf611dc57fa51f4c9fb9284c82f500afffca/merged/opt/az/lib/python3.8/site-packages/azure/mgmt/web/v2021_03_01/operations/_deleted_web_apps_operations.py",  <unfinished ...>
[pid  1175] <... lgetxattr resumed>0xc00cc76600, 128) = -1 ENODATA (No data available)
[pid   867] <... newfstatat resumed>{st_mode=S_IFREG|0644, st_size=13747, ...}, AT_SYMLINK_NOFOLLOW) = 0
[pid   862] <... nanosleep resumed>NULL) = 0
[pid  1175] newfstatat(AT_FDCWD, "/var/lib/docker/overlay2/272c043a758e089d76818fd1d41fbf611dc57fa51f4c9fb9284c82f500afffca-init/merged/opt/az/lib/python3.8/site-packages/azure/multiapi/storage/v2015_04_05/common/__pycache__/models.cpython-38.pyc",  <unfinished ...>
[pid   867] lgetxattr("/var/lib/docker/overlay2/272c043a758e089d76818fd1d41fbf611dc57fa51f4c9fb9284c82f500afffca-init/merged/opt/az/lib/python3.8/site-packages/azure/mgmt/web/v2021_03_01/operations/_deleted_web_apps_operations.py", "security.capability",  <unfinished ...>
[pid   862] futex(0xc00191ed50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  1175] <... newfstatat resumed>{st_mode=S_IFREG|0644, st_size=214, ...}, AT_SYMLINK_NOFOLLOW) = 0
[pid   867] <... lgetxattr resumed>0xc00d1c6480, 128) = -1 ENODATA (No data available)
[pid  5047] <... futex resumed>)        = 0
[pid  1175] newfstatat(AT_FDCWD, "/var/lib/docker/overlay2/272c043a758e089d76818fd1d41fbf611dc57fa51f4c9fb9284c82f500afffca/merged/opt/az/lib/python3.8/site-packages/azure/multiapi/storage/v2015_04_05/common/__pycache__/models.cpython-38.pyc",  <unfinished ...>
[pid   867] futex(0xc000152950, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid   862] <... futex resumed>)        = 1
[pid  5047] newfstatat(AT_FDCWD, "/var/lib/docker/overlay2/ee034feea43ec49dacc8e91bf26552519544814874aff156a378a374b80e31c2-init/merged/opt/az/lib/python3.8/site-packages/invoke/__pycache__",  <unfinished ...>
[pid  1175] <... newfstatat resumed>{st_mode=S_IFREG|0644, st_size=214, ...}, AT_SYMLINK_NOFOLLOW) = 0
[pid  5047] <... newfstatat resumed>{st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
[pid  1175] lgetxattr("/var/lib/docker/overlay2/272c043a758e089d76818fd1d41fbf611dc57fa51f4c9fb9284c82f500afffca-init/merged/opt/az/lib/python3.8/site-packages/azure/multiapi/storage/v2015_04_05/common/__pycache__/models.cpython-38.pyc", "security.capability",  <unfinished ...>
[pid   862] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
.../...
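
A trace like the excerpt above can be captured with an invocation roughly along these lines (the exact command used here is not recorded; the syscall filter is chosen to match the excerpt):

# Attach to dockerd and all of its threads, logging only the two syscalls doing the overlay scanning
strace -f -p "$(pidof dockerd)" -e trace=newfstatat,lgetxattr -o /tmp/dockerd.strace
# Alternatively, collect an aggregate per-syscall time/count summary
strace -f -c -p "$(pidof dockerd)"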

I also noticed that the metrics server does not seem to work properly when this option is enabled:

I0523 21:13:25.464651       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
E0523 21:13:31.248300       1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.16.9.130:10250/metrics/resource\": context deadline exceeded" node="fc-test"
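
For completeness, the metrics breakage can presumably be confirmed like this (the namespace and deployment name below are the usual RKE defaults, assumed here):

# Fails or times out while the scrape errors above are occurring
kubectl top nodes
# Recent metrics-server logs, which contain the "Failed to scrape node" errors
kubectl -n kube-system logs deploy/metrics-server --tail=50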

@jiaqiluo
Member

jiaqiluo commented Jun 6, 2022

This PR in the upstream repo possibly fixes this issue.
So the plan, for now, is to bring the changes from upstream into our fork and then release a new image.

@jiaqiluo
Member

jiaqiluo commented Jun 7, 2022

rke-tools 0.1.81 is available; it contains the latest cri-dockerd (0.2.1).

We will use this new image in the coming KDM release.

Update:
We will use rke-tools 0.1.82 instead, which includes cri-dockerd 0.2.2.
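
Once a node has been updated, something along these lines should confirm which rke-tools image and cri-dockerd version are actually in use (the binary name and its presence inside the kubelet container are assumptions):

# rke-tools image tags present on the node
docker images --format '{{.Repository}}:{{.Tag}}' | grep rke-tools
# Version reported by the cri-dockerd binary running inside the kubelet container
docker exec kubelet cri-dockerd --version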

@Josh-Diamond

Josh-Diamond commented Jun 13, 2022

Ticket #2938 - Test Results

Reproduced with rke v1.3.11 and k8s v1.23.6:

  1. Create a single EC2 instance of type t3.medium
  2. Using rke v1.3.11, run rke up on the instance with ignore_docker_version = false and enable_cri_dockerd = true set in cluster.yml - [k8s v1.23.6-rancher1-1]
  3. Once the cluster successfully builds, ssh into the node and run top to view CPU performance (see the command sketch after this list)
  4. Reproduced - CPU consumption is extremely high on an empty cluster
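
A minimal command sketch for the steps above (the node address and SSH user are placeholders):

# Step 2: bring the cluster up with ignore_docker_version: false and enable_cri_dockerd: true in cluster.yml
rke up --config cluster.yml
# Step 3: SSH to the node and watch CPU usage
ssh ubuntu@<node-ip>
top -o %CPU   # dockerd pinned near 100% CPU on the affected versions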

Screenshot (Step 4): Screen Shot 2022-06-13 at 1 35 09 PM

@Raboo

Raboo commented Jun 16, 2022

In my case, on Flatcar Linux with a couple of 48+ core nodes, it used all available CPU and caused very high load, eventually crashing workloads in the cluster.

@jiaqiluo When will rke-tools 0.1.82 be used in Rancher provisioned RKE1 clusters?

@jiaqiluo
Member

jiaqiluo commented Jun 16, 2022

@Raboo The plan is Rancher 2.6.7 (originally targeted for 2.6.6).

@jiaqiluo
Member

jiaqiluo commented Jun 29, 2022

This can be tested on v1.3.13-rc2: https://github.com/rancher/rke/releases/tag/v1.3.13-rc2

The validation steps should be the same as the reproduction steps in #2938 (comment).

@Josh-Diamond

Ticket #2938 - Test Results [pt. 2]

With rke v1.3.13-rc2 and k8s v1.23.6:

  1. Create a single EC2 instance of type t3.medium
  2. Using rke v1.3.13-rc2, run rke up on the instance with ignore_docker_version = false and enable_cri_dockerd = true set in cluster.yml - [k8s v1.23.6-rancher1-1]
  3. Once the cluster successfully builds, ssh into the node and run top to view CPU performance
  4. Verified - CPU consumption is as expected on an empty cluster

Screenshot (Step 4): proof2

@zube zube bot closed this as completed Jul 5, 2022
@Raboo

Raboo commented Aug 15, 2022

It seems this is not fixed in Rancher v2.6.6, judging by the release notes.

@jiaqiluo
Member

@Raboo You are right; the fix is not in Rancher 2.6.6 but will be in 2.6.7 (as the milestone indicates), which is planned to be released this week.

@cite

cite commented Sep 8, 2022

We are still seeing massive performance problems when using enable_cri_dockerd - are we sure that the correct fixes are applied? I can see that the cri-dockerd from rancher/hyperkube:v1.24.4-rancher1 self-identifies as 0.2.4 (while the included docker client identifies as 1.12.3 from 2016), but we are still seeing really bad performance.
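
For reference, the version check mentioned above can presumably be run straight against the image; something like this (the entrypoint override and binary name are assumptions):

# Print the cri-dockerd version shipped in the hyperkube image
docker run --rm --entrypoint cri-dockerd rancher/hyperkube:v1.24.4-rancher1 --version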

@jiaqiluo
Member

jiaqiluo commented Sep 8, 2022

I can see that the cri-dockerd from rancher/hyperkube:v1.24.4-rancher1 self-identifies as 0.2.4

The correct fix is applied, because upstream cri-dockerd fixed the metrics performance issue in v0.2.1.

@chfrank-cgn
Author

I can confirm - a fresh RKE 1.24.4 installation (from Rancher 2.6.8) shows the same behavior as before, with very high CPU usage and low performance.

Looking at the system itself, the process consuming all available CPU is dockerd.
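
A sketch of how one might confirm that it really is dockerd consuming the CPU (pidstat comes from the sysstat package; the commands are illustrative):

# Per-process CPU usage for dockerd, sampled every 5 seconds
pidstat -u -p "$(pidof dockerd)" 5
# For contrast, resource usage of the kubelet container
docker stats --no-stream kubelet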

@Raboo

Raboo commented Sep 12, 2022

Well, this makes me afraid of upgrading to k8s v1.24. Last time I enabled enable_cri_dockerd, it brought my cluster to a complete halt.

@zube zube bot removed the [zube]: Done label Oct 4, 2022
@rvdh

rvdh commented Nov 15, 2022

Also confirmed here on Rancher 2.6.8, k8s v1.24.2 - dockerd using 100% CPU.

@jiaqiluo is it possible that 2.6.8 reverted the fix from 2.6.7?

@jympetalmd

Same problem with 2.7.0 and v1.23.10-rancher1-1 with CRI enabled; it's even worse with v1.24.8-rancher1-1.

@immanuelfodor

We should probably track rancher/rancher#38816; it seems to aggregate the abnormal CPU usage issues with the CRI.
