
[BUG] Abnormally high CPU usage on Kubernetes 1.24.4 #38816

Closed · msr-financial-com opened this issue Sep 6, 2022 · 112 comments

Labels: internal JIRA, kind/bug, priority/0, QA/S, release-note, status/release-blocker, team/hostbusters

Comments

msr-financial-com commented Sep 6, 2022

Rancher Server Setup

  • Rancher version: v2.6.8
  • Installation option: Helm Chart
    • RKE1
    • Kubernetes: v1.24.4

Information about the Cluster

  • Kubernetes version: v1.24.4
  • Cluster Type (Local/Downstream): Downstream
    • Custom

User Information

  • What is the role of the user logged in? Admin/Cluster

Describe the bug
When creating a custom downstream cluster with Kubernetes v1.24.x on servers running Oracle Linux 8.6 with the kernel-uek-core 5.15 kernel, CPU usage becomes abnormally high. This issue does not occur with kernel-core 4.18 and Kubernetes 1.24.x. No extra pods need to be deployed; the problem appears immediately after the node is registered into the cluster.

To Reproduce

  • Create a downstream cluster via the Rancher UI with Kubernetes v1.24, or add a node to a v1.24 cluster
  • Add servers with Oracle Linux 8.6 installed and kernel-uek-core 5.15
  • Display CPU usage with top (a minimal sketch follows these steps)
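
A quick way to eyeball dockerd's CPU consumption from a shell on the node (a minimal sketch; it assumes pidof and, for the second command, the sysstat package are available):

# one-off snapshot of dockerd in batch mode
top -b -n 1 -p "$(pidof dockerd)"

# or sample it over time: 1-second interval, 10 samples
pidstat -u -p "$(pidof dockerd)" 1 10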

Result
CPU usage goes abnormally high even though only a minimal set of pods is deployed.
Because of this, pods are being killed due to CPU starvation.

Expected Result
Normal CPU usage.

Screenshots
(five screenshots attached showing the high CPU usage)
Additional context

SURE-5282

@msr-financial-com msr-financial-com added the kind/bug Issues that are defects reported by users or that we know have reached a real release label Sep 6, 2022
@msr-financial-com msr-financial-com changed the title [BUG] Very high CPU usage on Oracle Linux 8.6 with Kernel 5.15 [BUG] Abnormally high CPU usage on Oracle Linux 8.6 with Kernel 5.15 Sep 6, 2022
cite commented Sep 7, 2022

Can confirm. This affects RKE1 clusters, regardless of whether it is an upgrade or a new installation. Depending on the number of pods, and even when there are no state changes, dockerd will consume several tens of CPU cores (I've seen CPU loads as high as 5000%).

The affected 5.15 kernel is called Unbreakable Enterprise Kernel 7 by Oracle.

cite commented Sep 8, 2022

It seems this is not caused by OEL or its kernel, since the same symptoms can be reproduced on a CentOS 7 worker node. Is it possible that the fix for rancher/rke#2938 didn't make it into Rancher 2.6.8 yet? We can see that cri-dockerd within kubelet self-identifies as 0.2.4, but its performance is still really bad. Getting local metrics takes forever and causes an inordinate amount of CPU consumption:

# time curl -sk --key /etc/kubernetes/ssl/kube-apiserver-key.pem --cert /etc/kubernetes/ssl/kube-apiserver.pem 'https://localhost:10250/stats/summary?only_cpu_and_memory=true' >/dev/null

real	0m6.390s
user	0m0.015s
sys	0m0.003s

Additionally, the kubelet log seems to indicate that dockerd is, from time to time, too flooded with connections to respond during stats collection:

time="2022-09-08T12:17:07Z" level=error msg="Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? Failed to get stats from container 08b5ba4c0e1b0361aadc236fc872244e27d81e04a4c9813a76b61ef468f98f81"

@msr-financial-com msr-financial-com changed the title [BUG] Abnormally high CPU usage on Oracle Linux 8.6 with Kernel 5.15 [BUG] Abnormally high CPU usage on Kubernetes 1.24.4 Sep 8, 2022
jiaqiluo (Member) commented Sep 8, 2022

This is a known issue: #38214, where we see unexpectedly high memory usage of dockerd on OL 8.4 and 8.6 with k8s 1.24.

cite commented Sep 8, 2022

@jiaqiluo I don't believe this to be the same issue. In our setup, we don't see any of the error messages your linked report contains, nor is the memory consumption of the Docker daemon increased (see the screenshots from the initial reporter).

EDIT: I stand corrected on the memory consumption, sorry (normally it's around 500-600M):

# systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/docker.service.d
           └─cgroup-driver.conf, http-proxy.conf
   Active: active (running) since Thu 2022-09-08 16:22:49 CEST; 3h 54min ago
     Docs: https://docs.docker.com
 Main PID: 1733 (dockerd)
    Tasks: 115
   Memory: 3.7G
   CGroup: /system.slice/docker.service
           └─1733 /usr/bin/dockerd -H fd:// --exec-opt native.cgroupdriver=systemd

The above is from a 1.23.10 RKE1 cluster with cri-dockerd enabled.

If it's of any interest, these are bare metal nodes, not AWS instances.

cite commented Sep 8, 2022

@jiaqiluo Would it help if we were able to recreate this problem with CentOS 7? If so, I'll set up a new worker node and join it. The above is from a 1.23.10 cluster.

jiaqiluo (Member) commented Sep 8, 2022

Hi @cite, it would be great if you could try it on CentOS 7 with RKE1 k8s 1.24.x. Can you also share the Docker version on the node?

cite commented Sep 9, 2022

@jiaqiluo This is not easy to pin down. I deployed a completely new RKE1 cluster with Rancher 2.6.8: two nodes with OEL 8.6 and its default 4.18 kernel, and one node with CentOS 7.9 and its default 3.10 kernel.

# uname -r; rpm -q containerd.io docker-ce
3.10.0-1160.76.1.el7.x86_64
containerd.io-1.6.6-3.1.el7.x86_64
docker-ce-20.10.17-3.el7.x86_64

After deploying a cluster with version 1.23.10-rancher1 and enabling cri-dockerd, and deploying 400 nginx containers, I immediately noticed very bad metrics performance on all nodes:


real	0m4.728s
user	0m0.050s
sys	0m0.061s

There was no difference between the OEL 8.6 and CentOS 7.9 workers. Furthermore, the kubelet logs showed errors like these:

E0909 10:09:37.686805   11277 remote_runtime.go:1134] "ReopenContainerLog from runtime service failed" err="rpc error: code = Unknown desc = docker does not support reopening container log files" containerID="80faaaf76cf140f5fe00b9615face7d1c38fc9fad12f85cbf565ee318252d93b"
E0909 10:09:37.686836   11277 container_log_manager.go:244] "Container log doesn't exist, reopen container log failed" err="rpc error: code = Unknown desc = docker does not support reopening container log files" containerID="80faaaf76cf140f5fe00b9615face7d1c38fc9fad12f85cbf565ee318252d93b" path="/var/log/pods/default_echo-75f946c956-tq9jl_20de94b8-9589-4a3f-a751-d41759751e60/echo/0.log"
time="2022-09-09T10:09:39Z" level=error msg="Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? Failed to get stats from container 964410c5c8e039140b36ded75cd83500f168cfb65d023b8f33753d513f0ba19d"
[...repeated several 10s of times...]
E0909 10:09:39.066200   11277 remote_runtime.go:351] "ListPodSandbox with filter from runtime service failed" err="rpc error: code = Unknown desc = Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?" filter="nil"
E0909 10:09:39.066233   11277 kuberuntime_sandbox.go:292] "Failed to list pod sandboxes" err="rpc error: code = Unknown desc = Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?"
E0909 10:09:39.066244   11277 generic.go:205] "GenericPLEG: Unable to retrieve pods" err="rpc error: code = Unknown desc = Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?"
time="2022-09-09T10:09:41Z" level=info msg="Using CNI configuration file /etc/cni/net.d/10-canal.conflist"
time="2022-09-09T10:09:46Z" level=info msg="Using CNI configuration file /etc/cni/net.d/10-canal.conflist"
E0909 10:09:48.029518   11277 remote_runtime.go:1134] "ReopenContainerLog from runtime service failed" err="rpc error: code = Unknown desc = docker does not support reopening container log files" containerID="80faaaf76cf140f5fe00b9615face7d1c38fc9fad12f85cbf565ee318252d93b"
E0909 10:09:48.029547   11277 container_log_manager.go:244] "Container log doesn't exist, reopen container log failed" err="rpc error: code = Unknown desc = docker does not support reopening container log files" containerID="80faaaf76cf140f5fe00b9615face7d1c38fc9fad12f85cbf565ee318252d93b" path="/var/log/pods/default_echo-75f946c956-tq9jl_20de94b8-9589-4a3f-a751-d41759751e60/echo/0.log"

Again, there was no difference between the CentOS 7.9 and OEL 8.6 workers. The metrics performance is surprising, because rancher/rke#2938 states that the version of cri-dockerd shipped with Rancher is 0.2.4, which should include Mirantis/cri-dockerd#38.

For all nodes, memory consumption of dockerd was increased by about 600% compared to normal. Also, CPU usage for dockerd was about 10x what we normally see on an idle cluster - with CentOS 7.9 faring about 2x better than OEL 8.6.

When disabling cri-dockerd, the logfiles immediately went back to normal, CPU consumption went back to normal, and memory consumption didn't increase anymore after restarting dockerd. Also, metrics are available about 100x faster:

# time curl -sk -H "Authorization: Bearer $b" 'https://localhost:10250/stats/summary?only_cpu_and_memory=true' >/dev/null

real	0m0.116s
user	0m0.045s
sys	0m0.058s

Upgrading the cluster to 1.24.4 didn't make any difference to the situation of running 1.23.10 with cri-dockerd enabled.

Looking at the code changes for Mirantis/cri-dockerd#38, it seems that metrics are available faster in that case because collection is parallelized by distributing the work across several goroutines. Is it possible that with a large number of pods, the parallel queries overwhelm dockerd (or hit somaxconn on its unix domain socket)?
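
For what it's worth, the somaxconn theory can at least be eyeballed like this (just a sketch of how one might look, not a confirmed diagnosis):

# kernel-wide cap on the listen() backlog
cat /proc/sys/net/core/somaxconn

# listening unix sockets owned by docker; a Recv-Q repeatedly near the backlog
# limit would hint at connections queuing up faster than dockerd accepts them
ss -xlp | grep docker.sock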

Conclusion: Enabling cri-dockerd seems to have a larger impact than the worker nodes' OS. However, there is a definite degradation on OEL 8.6 compared to CentOS 7.9.

Do you want me to perform other tests or ship more data?

@chfrank-cgn

I've observed the same behavior on Ubuntu 20.04 LTS (RKE 1.24.4 from Rancher 2.6.8)

@chfrank-cgn

Same issue, independent of whether I upgrade from 1.23.10 to 1.24.4 or install 1.24.4 right away - CPU usage on the nodes goes through the roof, with dockerd continuously using 290 to 310%.

Stopping kube-metrics-server and Prometheus eases the situation somewhat, but CPU usage remains high.

Right at the time of the upgrade, error messages appear in metrics-server:

E0911 17:00:10.466866 1 scraper.go:140] "Failed to scrape node" err="Get \"https://10.240.0.5:10250/metrics/resource\": dial tcp 10.240.0.5:10250: connect: connection refused" node="rke-f51ad0-0"
E0911 17:00:38.955105 1 scraper.go:140] "Failed to scrape node" err="Get \"https://10.240.0.5:10250/metrics/resource\": context deadline exceeded" node="rke-f51ad0-0"
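
As a temporary measure while debugging, metrics-server can be scaled down (a sketch; the deployment name and namespace below match a default RKE1 install and may differ in other setups):

# stop metrics-server for now; scale back to 1 replica when done
kubectl -n kube-system scale deployment metrics-server --replicas=0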

@chfrank-cgn

(screenshot attached)

@jiaqiluo jiaqiluo added team/hostbusters The team that is responsible for provisioning/managing downstream clusters + K8s version support [zube]: Team Area 2 labels Sep 12, 2022
@jiaqiluo jiaqiluo added this to the v2.7.0 milestone Sep 12, 2022
thaneunsoo (Contributor) commented Sep 13, 2022

I tried to reproduce on 2.6.8

Reproduction Steps
Fresh

  1. Create custom rke downstream cluster with 1.24 k8s version (cri-dockerd is enabled by default)
  2. Install monitoring on the cluster
  3. Verify CPU usage using Grafana
  4. Verify CPU usage by ssh into control plane node and running top command

Upgrade

  1. Create custom rke downstream cluster with 1.23 k8s version using Ubuntu 20.04 (enable cri-dockerd)
  2. Install monitoring on the cluster
  3. Verify CPU usage using Grafana
  4. Verify CPU usage by ssh into control plane node and running top command
  5. Upgrade k8s version to 1.24
  6. Repeat steps 3-4

Result
I was not able to reproduce the behavior seen in this ticket.

Update
I was able to reproduce this issue on Azure instances but not on EC2 instances.

cite commented Sep 13, 2022

@thaneunsoo Were you able to reproduce the metrics timings (with curl)? Why did you SSH into a control plane node and not a worker node that hosts pods? Also, your reproduction steps don't include generating a large number of pods.

EDIT: If the problem is related to metrics (or other interactions between cri-dockerd and dockerd), then looking at a node where not a lot is going on will not necessarily show the problem. The timings I have taken so far are all from a worker node with >100 pods (it doesn't matter what kind of workload, just make sure it isn't creating CPU load by itself). Besides the CPU load and metrics timings, the kubelet error messages are also a sure sign of this problem happening.

@chfrank-cgn

After @thaneunsoo tried to reproduce the issue and failed, I had another go:

Rancher 2.6.8, 3-node custom downstream cluster on Azure VMs with v1.23.10-rancher1-1 on Ubuntu 20.04.5 LTS / Docker 20.10.18 with Kernel: Linux rke-6d3730-0 5.15.0-1019-azure #24~20.04.1-Ubuntu SMP Tue Aug 23 15:52:52 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Top shows minimal CPU consumption, same for Grafana.

(screenshots: top and the Cluster Nodes view before the upgrade)

@chfrank-cgn

Next step, updating from the Rancher UI to v1.24.4-rancher1-1

CPU consumption goes way up, both in top and in Grafana:

(screenshots: top and the Cluster Nodes view after the upgrade)

@chfrank-cgn

Observations:

1/ metrics server complains about:
"Failed to scrape node" err="Get "https://10.241.0.6:10250/metrics/resource\": context deadline exceeded" node="rke-6d3730-0"

2/ docker complains a lot about:
Sep 13 16:33:29 rke-6d3730-0 systemd[1]: run-docker-runtime\x2drunc-moby-c90591c07b9383e658d7d6c7afcb24e84e76f0a2967bb7a86e26155765482065-runc.SY6JlR.mount: Succeeded.
Sep 13 16:33:30 rke-6d3730-0 systemd[30741]: var-lib-docker-overlay2-2da63044debc6b41d909672da6721d19243237777b0a9659abb729feac1467f3\x2dinit-merged.mount: Succeeded.
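
In case it helps anyone reproduce this, the messages above can be watched live with journalctl (a sketch; adjust the match pattern as needed):

# follow the docker unit's own log
journalctl -u docker.service -f

# the transient runc/overlay2 mount messages are logged by systemd itself
journalctl -f | grep -E 'runc|overlay2'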

@chfrank-cgn

dockerd spends most of its time in path/filepath.Clean and runtime.scanobject, according to perf top.

strace shows that it seems to be iterating/looping over content in /var/lib/rancher-data/local-catalogs/, repeatedly calling lgetxattr() and newfstatat(), with repeated SIGURGs in between.
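
For anyone who wants to repeat this, roughly these commands give the same view (a sketch; it assumes perf and strace are installed on the node):

# live profile of where dockerd burns CPU
perf top -p "$(pidof dockerd)"

# trace only the syscalls mentioned above, across all dockerd threads
strace -f -e trace=lgetxattr,newfstatat -p "$(pidof dockerd)"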

@snasovich snasovich modified the milestones: v2.7.2, v2.7-KDM-For-2.6.11 Feb 21, 2023
@snasovich (Collaborator)

Moved this to the v2.7-KDM-For-2.6.11 milestone; the move of the changes to the correct KDM branch is being done per rancher/kontainer-driver-metadata#1089.

@sowmyav27 (Contributor)

@thaneunsoo please validate on 2.7.1 Rancher using KDM branch dev-v2.7-for-2.6.11 for this issue ^

cite commented Feb 23, 2023

@kinarashah Will this fix be included in standalone RKE1 clusters, too? If so, do you know which version of RKE1 the changes are targeted for?

@kinarashah (Member)

@cite Yeah, the standalone RKE versions v1.3.19 and v1.4.3 will also have this fixed version. If you wanted to check it out, v1.3.19-rc2 and v1.4.3-rc7 have v1.24.10-rancher3-1.

thaneunsoo (Contributor) commented Feb 24, 2023

Test Environment:
Rancher version: v2.7.1 with KDM dev-v2.7-for-2.6.11 branch
Rancher cluster type: HA
Docker version: 20.10

Downstream cluster type: rke1


Testing:

Tested this issue with the following steps:

  1. Create an AMI with the overlay / native diff configuration set (a verification sketch follows these steps)
     • set config for overlay/overlay2: add or change the storage driver in /etc/docker/daemon.json
       {
         "storage-driver": "overlay2"
       }
     • restart docker
     • set config for native diff: add the following to /etc/modprobe.d/disable-overlay-redirect-dir.conf
       options overlay metacopy=off redirect_dir=off
     • stop docker, reload the module, and restart docker
       systemctl stop docker
       modprobe -r overlay
       modprobe overlay
       systemctl start docker
  2. Install KVM (was unable to reproduce without KVM for some reason)
     • sudo apt install cpu-checker
  3. Create a downstream rke1 cluster using 1.24
  4. Create multiple pods (created 100 generic pods)
     • for i in {1..100}; do kubectl run test$i --image=nginx --port=80; done
  5. Display CPU usage ("top")
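
One way to confirm which storage driver is active and whether native overlay diff actually ended up disabled is docker info (a quick verification sketch, not part of the original steps):

# "Storage Driver" and "Native Overlay Diff" should reflect the daemon.json
# and modprobe settings applied above
docker info 2>/dev/null | grep -E 'Storage Driver|Native Overlay Diff'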

Side note

  • 1.23 cluster was also created to check for differences/similarities
  • Used AWS node driver and Azure node driver clusters for testing, with 3 etcd, 2 control plane, and 3 worker nodes

Cases covered:

  1. overlay with native diff true
  2. overlay with native diff false
  3. overlay2 with native diff true
  4. overlay2 with native diff false

Result

1. overlay with native diff true

  • CPU usage before the fix: (screenshot)
  • CPU usage after the fix: (screenshot)

2. overlay with native diff false

  • CPU usage before the fix: (screenshot)
  • CPU usage after the fix: (screenshot)

3. overlay2 with native diff true

  • CPU usage before the fix: (screenshot)
  • CPU usage after the fix: (screenshot)

4. overlay2 with native diff false

  • CPU usage before the fix: (screenshot)
  • CPU usage after the fix: (screenshot)

Additional Testing
Tested upgrading the k8s version with the following steps:

  1. Follow the repro steps from above
  2. After reproducing the issue, change the KDM branch from the default release-v2.7 to dev-v2.7-for-2.6.11
  3. Go to the cluster and edit its config
  4. Upgrade the k8s version to 1.24.10-rancher4-1
     • If you don't see 1.24.10-rancher4-1, then the KDM branch has not been updated correctly
  5. Save
  6. Wait for the cluster upgrade to complete
  7. Verify the CPU load on a worker node

Result
Before the fix, the CPU usage was spiking around 160% (apologies for no screenshot, but the results were very similar to the repro images above).
After the k8s upgrade, the CPU load drastically improved: (screenshot)

@kinarashah (Member)

@thaneunsoo @git-ival
As discussed offline, I've updated the version to v1.24.10-rancher4-1 in KDM. It doesn't have any new code changes - just build changes to reflect the right k8s version.

KDM branch: dev-v2.7-for-2.6.11 / dev-v2.6
RKE RC: v1.4.3-rc8/ v1.3.19-rc3

@git-ival

Test Environment:

Rancher Cluster Config:

Rancher version: v2.7.1 with KDM dev-v2.7-for-2.6.11 branch
Rancher cluster type: HA , 3 all-roles nodes, 1 worker node (tainted to run rancher-monitoring)
Docker version: 20.10
K8s version: v1.24.10-rancher3-1

Downstream Cluster Config:

Downstream cluster type: HA RKE1, 5 etcd nodes, 6 control-plane nodes, 10 worker nodes
Installed apps: rancher-monitoring
K8s version: v1.24.10-rancher3-1

Reproduction Results:

I was able to reproduce on overlay and overlay2 for pre-fix K8s versions. I saw CPU usage on the downstream worker nodes climb into the 400%+ range every few minutes. I found it easier to reproduce on overlay2 with native overlay diff off.

Overlay repro: (screenshot)

Overlay2 Native Diff false repro: (screenshot)

Overlay2 Native Diff true repro: (screenshot)

Test Steps:

  1. Create Rancher cluster
    • Ensure the KDM branch is set to the dev-v2.7-for-2.6.11 branch by setting the rke-metadata-config Global Setting url value
  2. Create Downstream cluster (1 for each case below)
    • Create NodeTemplates for each Downstream cluster
      • Set the Engine Option for the Storage Driver as appropriate for the current test case
  3. Wait for Downstream cluster to provision
  4. (Overlay2 only) Set Native Overlay Diff on each cluster node as needed:
    1. Stop docker
    2. Reload the modprobe modules
    3. Set modprobe options for Native Diff
    4. Start docker
      systemctl stop docker && modprobe -r overlay && modprobe overlay metacopy=off redirect_dir=off && systemctl start docker
  5. Scale in a large number of pods (created 900 nginx Deployments with 1 pod each; see the sketch after these steps)
  6. SSH into a control-plane node and a worker node
  7. Run top to monitor CPU usage as the scale-in proceeds
  8. Once the scale-in completes, check the Grafana metrics to confirm the resource usage percentages
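
The scale-in from step 5 can be scripted in one line (a sketch of how one might create the 900 single-pod Deployments; the names are illustrative):

# create 900 Deployments with one nginx pod each
for i in $(seq 1 900); do kubectl create deployment "nginx-$i" --image=nginx; done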

Notes:

  • Provisioned all clusters via Terraform
  • Downstreams were all AWS nodedriver clusters
  • Used grafana metrics for results screenshots since the top screenshots are less useful

Cases covered:

  1. Overlay
  2. Overlay2 with native diff true
  3. Overlay2 with native diff false

Test Case Results:

Overlay - Test Case Results:

Pre Pod Scale-in Phase:

  • 17% CPU usage
  • 6% Memory Usage

Post Pod Scale-in:

  • CPU Usage never reaches > 77%
  • Memory Usage never reaches > 47%

(screenshots: Grafana CPU and memory usage)

Overlay2 Native Diff false - Test Case Results:

Pre Pod Scale-in Phase:

  • ~13% CPU usage
  • ~4% Memory Usage

Post Pod Scale-in:

  • CPU Usage never reaches > 60%
  • Memory Usage never reaches > 51%

(screenshots: Grafana CPU and memory usage)

Overlay2 Native Diff true - Test Case Results:

Pre Pod Scale-in Phase:

  • ~12% CPU usage
  • ~3% Memory Usage

Post Pod Scale-in:

  • CPU Usage never reaches > 70%
  • Memory Usage never reaches > 53%

(screenshots: Grafana CPU and memory usage)

@git-ival

Validations completed, closing.

@zube zube bot closed this as completed Feb 28, 2023
@qeternity

Hi @kinarashah, thanks so much for your hard work. I know it's not GA, but we've deployed 1.25.6-rancher4-1 to our dev cluster and have seen a massive reduction in CPU usage.

@zube zube bot removed the [zube]: Done label May 30, 2023
@immanuelfodor

Can confirm the CPU issue is fixed: upgraded v1.23.16-rancher2-1 -> v1.24.13-rancher2-1 with enable_cri_dockerd: true in RKE1, and the CPU usage is normal.
