What's the condition and behavior of "Task ran out of memory" error? #9734
-
I'm experiencing the same thing in GKE. containerd reports "Task 123abc ran out of memory" for all of the pods on a node, but none of those pods/containers actually get killed or exit. They all seem to continue processing as if nothing happened.
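For what it's worth, that log line appears to correspond to containerd's TaskOOM event, which signals that a container's cgroup hit an OOM condition; the event itself doesn't kill anything. Below is a minimal sketch of watching for these events with the containerd 1.x Go client (the socket path and the mapping of the log line to TaskOOM are my assumptions, not confirmed in this thread):

```go
package main

import (
	"context"
	"log"

	"github.com/containerd/containerd"
	apievents "github.com/containerd/containerd/api/events"
	"github.com/containerd/typeurl"
)

func main() {
	// Connect to the containerd socket (default path; adjust for your node).
	client, err := containerd.New("/run/containerd/containerd.sock")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Subscribe to the full event stream and pick out task OOM events.
	ch, errs := client.Subscribe(context.Background())
	for {
		select {
		case env := <-ch:
			ev, err := typeurl.UnmarshalAny(env.Event)
			if err != nil {
				continue
			}
			if oom, ok := ev.(*apievents.TaskOOM); ok {
				// The event only says the cgroup hit its limit; the task may
				// keep running if the OOM killer reaped a child process.
				log.Printf("task %s reported OOM", oom.ContainerID)
			}
		case err := <-errs:
			log.Fatal(err)
		}
	}
}
```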
-
The OOM killer operates at the process level; it's pretty well explained here. Can you check whether one of your container processes has been killed? The memory limits we configure on our k8s resources only take effect at the process level (and possibly at the container level, if the killed process is tied to the container's life cycle, incrementing the restart count, I suppose).
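One way to confirm this on a node is to look at the cgroup's OOM counters directly: on cgroup v2, the `memory.events` file distinguishes `oom` (the limit was hit) from `oom_kill` (a process was actually killed). A small sketch, assuming a cgroup v2 node and a hypothetical cgroup path (resolve the real path for a container via `/proc/<pid>/cgroup`):

```go
package main

import (
	"fmt"
	"log"
	"os"
	"strings"
)

func main() {
	// Hypothetical path for illustration; the real pod cgroup path on your
	// node will include the pod UID and container scope.
	path := "/sys/fs/cgroup/kubepods.slice/kubepods-pod<uid>.slice/memory.events"

	data, err := os.ReadFile(path)
	if err != nil {
		log.Fatal(err)
	}
	for _, line := range strings.Split(strings.TrimSpace(string(data)), "\n") {
		fields := strings.Fields(line)
		if len(fields) != 2 {
			continue
		}
		switch fields[0] {
		case "oom":
			// Incremented whenever the cgroup hits its memory limit.
			fmt.Printf("oom events (limit hit): %s\n", fields[1])
		case "oom_kill":
			// Incremented only when the kernel actually kills a process; if
			// the victim isn't the container's init process, the container
			// keeps running.
			fmt.Printf("oom kills (process killed): %s\n", fields[1])
		}
	}
}
```

If `oom_kill` increments but the pod's restart count doesn't, the killed process was likely a child rather than the container's PID 1, which matches the "nothing visibly restarted" behavior described above. On cgroup v1 nodes, the analogous counter lives in the cgroup's `memory.oom_control` file.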
-
Hi there!
Over the last few weeks I've hit a few cases where containerd printed the "Task ran out of memory" error, but neither the pod nor the container was killed or restarted.
I could see the pod's memory usage skyrocketing at that moment, and it stabilized right after containerd printed "Task ran out of memory".
I expected to see something like exit code 137 or an increased pod restart counter, but there was nothing.
What is the actual behavior of containerd when it prints the "Task ran out of memory" error, and under what condition is the error emitted?
I'm running an AWS EKS v1.27 cluster with the amazon-eks-node-1.27-v20230711 AMI (ami-00f80984c1a72a9d1).
Kernel version: 5.10.199-190.747.amzn2.aarch64, Kubelet v1.27.7-eks-e71965b.
Can someone help me here?
Thanks!