Pod stuck in Terminating state despite StatefulSet replica adjustment to 0 in Kubernetes v1.26.2 cluster #5811
Comments
Presumably this is due to https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolume-deletion-protection-finalizer - can you confirm which finalizer is in play here? I don't see why the StatefulSet or pods would stick around unless they also had finalizers, or you are using foreground deletion.
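For reference, the deletion propagation can be chosen explicitly on the fabric8 side; a rough sketch (namespace and name are placeholders, not taken from this issue):

```java
import io.fabric8.kubernetes.api.model.DeletionPropagation;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;

public class DeleteWithPropagation {
    public static void main(String[] args) {
        try (KubernetesClient client = new KubernetesClientBuilder().build()) {
            // BACKGROUND removes the StatefulSet right away and lets the garbage
            // collector clean up the dependent pods afterwards; FOREGROUND keeps the
            // StatefulSet around (with a foregroundDeletion finalizer) until all
            // dependent pods are gone.
            client.apps().statefulSets()
                  .inNamespace("my-namespace")      // placeholder
                  .withName("opensearch-master")    // placeholder
                  .withPropagationPolicy(DeletionPropagation.BACKGROUND)
                  .delete();
        }
    }
}
```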
Hi @shawkins Yes, we have the pvc-protection finalizer on our PVCs:
- kubernetes.io/pvc-protection

We also have finalizers on the PV:
- kubernetes.io/pv-protection
- external-attacher/blockvolume-csi-oraclecloud-com

But we don't have any finalizers on the pod.
It seems there might be an issue with unmounting. I have another Pod that doesn't have any PV mounts. If I directly delete the StatefulSet using kubectl, the Pod disappears without getting stuck in the "Terminating" state. The PVC remains, but it's not bound to any Pod. Then I can delete the PVC using the Kubernetes client API.
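For completeness, the finalizers on each object can also be checked from the fabric8 client; a quick sketch (namespace and resource names are placeholders based on the names appearing later in this issue):

```java
import io.fabric8.kubernetes.api.model.PersistentVolume;
import io.fabric8.kubernetes.api.model.PersistentVolumeClaim;
import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;

public class ListFinalizers {
    public static void main(String[] args) {
        try (KubernetesClient client = new KubernetesClientBuilder().build()) {
            String ns = "my-namespace"; // placeholder

            PersistentVolumeClaim pvc = client.persistentVolumeClaims()
                    .inNamespace(ns).withName("opensearch-data-opensearch-data-0").get();
            Pod pod = client.pods().inNamespace(ns).withName("opensearch-master-0").get();
            PersistentVolume pv = client.persistentVolumes().withName("my-pv-name").get(); // placeholder

            // Each list shows which finalizers are still blocking deletion
            System.out.println("PVC finalizers: " + pvc.getMetadata().getFinalizers());
            System.out.println("Pod finalizers: " + pod.getMetadata().getFinalizers());
            System.out.println("PV finalizers:  " + pv.getMetadata().getFinalizers());
        }
    }
}
```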
There are two things going on here. The first is that StatefulSet pods have blockOwnerDeletion set to true, which effectively forces the foreground deletion behavior. The other is the pv-protection finalizer, which keeps the PV around until the PVC is deleted.

The fix for the StatefulSet deletion may be to first scale the StatefulSet to 0. I recall needing to do that in several operators. If the kubectl client is doing that automatically, then there's a case to be made for an enhancement to fabric8 to do the same.

EDIT: I should clarify that this doesn't appear to be strictly required - trying with a simple StatefulSet based upon the examples in the Kubernetes docs, they delete just fine with multiple replicas. Can you try scaling to 0 and see if you still get pods stuck in the Terminating state? If so, this will clarify that there is a general termination issue that is not related to the StatefulSet deletion; if they do terminate successfully, it should be a viable workaround to scale the StatefulSet to 0 before deletion.

Having a pod get stuck in the Terminating state is not something that is clear from what you have above. From your first comment it looks like you may have first attempted to delete the PV without deleting the PVC, but locally that didn't cause the termination issue for me. Based upon the grace period, it will take up to 2 minutes for it to actually be terminated, at which point the pod and StatefulSet will go away. As you reference above, the expected behavior with the delete reclaim policy is that the PV and PVC will remain after the StatefulSet and pod are deleted.
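In fabric8 terms, the suggested workaround might look roughly like this (a sketch only; namespace/name are placeholders and the wait condition is illustrative):

```java
import java.util.Objects;
import java.util.concurrent.TimeUnit;

import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;

public class ScaleDownThenDelete {
    public static void main(String[] args) {
        try (KubernetesClient client = new KubernetesClientBuilder().build()) {
            String ns = "my-namespace";        // placeholder
            String name = "opensearch-master"; // placeholder

            // Scale the StatefulSet to 0; the boolean asks the client to wait
            // until the scale operation has been observed
            client.apps().statefulSets().inNamespace(ns).withName(name).scale(0, true);

            // Wait for the last remaining pod (ordinal 0) to actually terminate;
            // the predicate receives null once the pod is gone
            client.pods().inNamespace(ns).withName(name + "-0")
                  .waitUntilCondition(Objects::isNull, 2, TimeUnit.MINUTES);

            // Only then delete the StatefulSet itself
            client.apps().statefulSets().inNamespace(ns).withName(name).delete();
        }
    }
}
```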
Hi @shawkins I tried the scale-down, but no luck:

```java
client.apps()
      .statefulSets()
      .inNamespace(namespace)
      .withName(statefulSet.getMetadata().getName())
      .scale(0, true);
```

But this time I am able to see an error event on the terminating pod. And this time the PVC is still Bound (the PVC is Terminating if I delete the StatefulSet directly).

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 15m default-scheduler Successfully assigned aaaaaaaavsghlprzxy5jmqpuukn2tq53art2ht2mtlhtxqyntvqxb35bnaoa0/opensearch-master-0 to 10.0.13.199
Normal SuccessfulAttachVolume 15m attachdetach-controller AttachVolume.Attach succeeded for volume "master-aaaaaaaavsghlprzxy5jmqpuukn2tq53art2ht2mtlhtxqyntvqxb35bnaoa0-0"
Normal Pulled 15m kubelet Container image "iad.ocir.io/axoxdievda5j/oci-opensearch:2.3.0.26.16" already present on machine
Normal Created 15m kubelet Created container configure-sysctl
Normal Started 15m kubelet Started container configure-sysctl
Normal Pulled 15m kubelet Container image "iad.ocir.io/axoxdievda5j/oci-opensearch:2.3.0.26.16" already present on machine
Normal Created 15m kubelet Created container opensearch
Normal Started 15m kubelet Started container opensearch
Warning Unhealthy 14m (x4 over 14m) kubelet Readiness probe failed: Waiting for cluster to become ready (request params: "wait_for_status=red&timeout=1s&local=true" )
Cluster is not yet ready (request params: "wait_for_status=red&timeout=1s&local=true" )
Warning NodeNotReady 10m node-controller Node is not ready

We have this command running in the pod. Is it possible this is the reason? But it was running well with K8s 1.25.12 and fabric8 k8s-client 6.3.0.

readinessProbe:
exec:
command:
- sh
- '-c'
- >
#!/usr/bin/env bash -e
# If the node is starting up wait for the cluster to be ready (request params: 'wait_for_status=red&timeout=1s&local=true' )
# Once it has started only check that the node itself is responding
START_FILE=/tmp/.es_start_file
http () {
local path="${1}"
curl -XGET -s -k --fail --insecure https://127.0.0.1:9200${path}
}
if [ -f "${START_FILE}" ]; then
echo 'Cluster is already running, lets check the node is healthy and there are master nodes available'
http "/_cluster/health?timeout=0s&local=true"
else
echo 'Waiting for cluster to become ready (request params: "wait_for_status=red&timeout=1s&local=true" )'
if http "/_cluster/health?wait_for_status=red&timeout=1s&local=true" ; then
touch ${START_FILE}
exit 0
else
echo 'Cluster is not yet ready (request params: "wait_for_status=red&timeout=1s&local=true" )'
exit 1
fi
fi
initialDelaySeconds: 10
timeoutSeconds: 5
periodSeconds: 10
successThreshold: 3
failureThreshold: 3
May I also ask why you think "you may have first attempted to delete the pv without deleting the pvc"? Because in our code we delete the namespace first (which cleans up all the resources in it, including the PVC), and then we delete the PV. Our PV's persistentVolumeReclaimPolicy is Retain.

PV yaml:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-aaaaaaaavsghlprzxy5jmqpuukn2tq53art2ht2mtlhtxqyntvqxb35bnaoa0-0
  uid: d18f781b-1a16-402e-a316-48b318e6a9df
  resourceVersion: '3659822'
  creationTimestamp: '2024-03-21T23:18:53Z'
  finalizers:
    - kubernetes.io/pv-protection
    - external-attacher/blockvolume-csi-oraclecloud-com
  managedFields:
    - manager: fabric8-kubernetes-client
      operation: Update
      apiVersion: v1
      time: '2024-03-21T23:18:53Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:finalizers:
            .: {}
            v:"external-attacher/blockvolume-csi-oraclecloud-com": {}
            v:"kubernetes.io/pv-protection": {}
        f:spec:
          f:accessModes: {}
          f:capacity:
            .: {}
            f:storage: {}
          f:claimRef:
            .: {}
            f:kind: {}
            f:name: {}
            f:namespace: {}
            f:uid: {}
          f:csi:
            .: {}
            f:driver: {}
            f:fsType: {}
            f:volumeAttributes:
              .: {}
              f:vpusPerGB: {}
            f:volumeHandle: {}
          f:nodeAffinity:
            .: {}
            f:required: {}
          f:persistentVolumeReclaimPolicy: {}
          f:storageClassName: {}
          f:volumeMode: {}
    - manager: kube-controller-manager
      operation: Update
      apiVersion: v1
      time: '2024-03-21T23:18:53Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          f:phase: {}
      subresource: status
  selfLink: >-
    /api/v1/persistentvolumes/data-aaaaaaaavsghlprzxy5jmqpuukn2tq53art2ht2mtlhtxqyntvqxb35bnaoa0-0
status:
  phase: Bound
spec:
  capacity:
    storage: 50Gi
  csi:
    driver: blockvolume.csi.oraclecloud.com
    volumeHandle: >-
      ocid1.volume.oc1.iad.abuwcljtsak3qtpgf7rqy2ualvjbrnymtvmz27kbm75xgaotatbenlk3oqoa
    fsType: ext4
    volumeAttributes:
      vpusPerGB: '10'
  accessModes:
    - ReadWriteOnce
  claimRef:
    kind: PersistentVolumeClaim
    namespace: aaaaaaaavsghlprzxy5jmqpuukn2tq53art2ht2mtlhtxqyntvqxb35bnaoa0
    name: opensearch-data-opensearch-data-0
    uid: c86d5a33-79c1-4332-a457-371bb53a759f
  persistentVolumeReclaimPolicy: Retain
  storageClassName: oci-bv
  volumeMode: Filesystem
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: failure-domain.beta.kubernetes.io/zone
              operator: In
              values:
                - US-ASHBURN-AD-3
```
I think that just confirms it's taking a long time for the pod to terminate and it's failing the readiness probe while doing so. The best guess is that this is a symptom that your pod is not responding to termination signals properly and so it's taking until the end of the termination grace period to fully go away.
I don't think this behavior has anything to do with the kubernetes client. The scaling operation is simply an adjustment to the StatefulSet, followed by the client optionally waiting to observe that the operation completed. If the behavior in kubectl is different, then I would guess there is some legacy default there that forces the deletion of pods rather than waiting for natural termination.
Because the PVs were stuck in termination waiting for the deletion of the PVCs.
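In other words, the pv-protection finalizer only clears once the PVC is gone, so the PVC needs to be deleted before the PV; a minimal sketch with the fabric8 client (the PVC name is taken from the claimRef above, the rest are placeholders):

```java
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;

public class DeletePvcThenPv {
    public static void main(String[] args) {
        try (KubernetesClient client = new KubernetesClientBuilder().build()) {
            // Delete the PVC first so kubernetes.io/pv-protection can be released on the PV
            client.persistentVolumeClaims()
                  .inNamespace("my-namespace")                   // placeholder
                  .withName("opensearch-data-opensearch-data-0") // from the claimRef above
                  .delete();

            // With reclaimPolicy Retain the PV is not removed automatically,
            // so delete it explicitly afterwards
            client.persistentVolumes()
                  .withName("my-pv-name")                        // placeholder
                  .delete();
        }
    }
}
```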
Hi @shawkins I added a force delete command for the pods, and now I can successfully delete them. When posting this issue, I see the latest version listed is v1.25.3. Do you still consider v1.26 a development version?
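A force delete of a pod with the fabric8 client can look roughly like this (a sketch; namespace and pod name are placeholders):

```java
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;

public class ForceDeletePod {
    public static void main(String[] args) {
        try (KubernetesClient client = new KubernetesClientBuilder().build()) {
            // Grace period 0 asks the API server to remove the pod immediately,
            // roughly what `kubectl delete pod --force --grace-period=0` does
            client.pods()
                  .inNamespace("my-namespace")     // placeholder
                  .withName("opensearch-master-0") // placeholder
                  .withGracePeriod(0)
                  .delete();
        }
    }
}
```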
You mean delete the StatefulSet, correct? Can you double-check the kubectl source and see if it's defaulting to a forced deletion of the pods? If so, we could do the same in the kubernetes client.

Generally the client is forward compatible with later Kubernetes versions. There is nothing the client would be doing differently here based upon the Kubernetes version. Also note that when the client's built-in model classes are updated for later Kubernetes versions, that is generally non-breaking; but where you rely upon functionality that was deprecated in an older Kubernetes release, a newer client may not provide it.
Describe the bug
I recently updated my Kubernetes version from v1.25.12 to v1.26.2. Previously, everything was running smoothly with Kubernetes version v1.25.12 and fabric8 k8s client 6.3.0.
However, after the update to v1.26.2, I encountered an issue. When using the k8s-client API, pods with PVC mounts get stuck in the Terminating state, whether I delete the StatefulSet directly or scale its replicas down to 0. The PVC is stuck in Terminating as well. There are no events on the pod.
Strangely, if I use kubectl directly in the terminal to delete the StatefulSet, both the StatefulSet and the pods are deleted successfully. I attempted to resolve this by updating my fabric8 k8s client version to 6.10.0, but unfortunately I still faced the same error.
The pods are created via StatefulSets, and finalizers are set on the PVCs. However, I do not expect to have to manually modify or remove finalizers.
Fabric8 Kubernetes Client version
6.10.0
Steps to reproduce
I also attach the description of the terminating pod here:
Describe of pod
Expected behavior
The pod is deleted together with the StatefulSet; there should be no need to manually remove the finalizer from the PVC.
Runtime
Kubernetes (vanilla)
Kubernetes API Server version
next (development version)
Environment
OCI cloud
Fabric8 Kubernetes Client Logs
No response
Additional context
No response