Wrong udp conntrack entries are populated after a pod is killed using --force #105657
Comments
/sig network
/area kube-proxy
Can you share the manifests and the steps to try to reproduce it? The conntrack entry using the node IP depends on the iptables rules you have configured on the host; if you target a service that doesn't exist, most probably the CNI source-NATs the pod to the host IP.
Let me try to describe our setup, as reproducing our deployment would not be possible.
The issue here is between the services "ea1-tassnf" and "ea1-tasftapp". Both these services have 1 pod behind them.
The 'tas-sip-b2bua-tassnf' pod sends periodic heartbeats (SIP protocol, over UDP) to 'tas-sip-b2bua-tasftapp'. The heartbeats are addressed to the service name 'ea1-tasftapp'.
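Since the heartbeats are addressed to the service name, the packets hit the ClusterIP and are DNATed by kube-proxy to the backing pod, which is what conntrack records. A quick way to check which pod currently backs the service and what conntrack holds for it (namespace and service name taken from later in this thread; run conntrack on the node):

kubectl --namespace tas get svc ea1-tasftapp
kubectl --namespace tas get endpoints ea1-tasftapp
conntrack -L -p udp | grep <cluster-ip of ea1-tasftapp>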
most of the conntrack issues with UDP were fixed, I'm surprised this worked before and not now :/
@aojea I have the logs from exactly when I start our test scenario and delete the pod. The verbosity is 9; I hope this is not a problem. Please find them attached! I also attach the exact same test on kubernetes 1.21 (where everything works as expected). I can see some differences but, to be honest, I wasn't able to find what exactly is creating the issue. Since we can always replicate the issue, I can provide whatever else is needed or may help with the troubleshooting.
EDIT: WRONG
@pmarag do I read it correctly that the problem only happens when using --force to delete the pod, and that without force it works well?
@aojea, actually our latest tests show otherwise, let me explain. The
It's not clear to me that this is related to the kubernetes version; the logs are missing some information, but in the 1.22 logs the logic seems correct:
iptables rule created
stale endpoint detected
entries deleted
less than one second later, a new endpoint is added to the service and a stale service is detected
and its entries are deleted
I don't know if you may have a race somewhere, in the application or with the CNI. I suggest you try this with KIND, which has less "networking overhead".
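For reference, the "entries deleted" step above is kube-proxy shelling out to the conntrack tool; the effect is roughly equivalent to running the following on the node by hand (the IPs are the ClusterIP and the deleted pod IP from the report below, used only as an illustration):

# flush UDP entries for the service that were DNATed to the deleted endpoint
conntrack -D -p udp --orig-dst 10.97.122.198 --dst-nat 10.244.0.35
# flush all UDP entries towards the service's ClusterIP (used when the whole service is stale)
conntrack -D -p udp --orig-dst 10.97.122.198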
@aojea We already tried with KIND; actually this is the CNI used on the environment that my colleague @vjhpe is using. Behavior is exactly the same. As I said, I'm always available to provide further logs from any kubernetes version (1.20, 1.21 or 1.22). Right now, behavior is the same using Calico or KIND. The scenario works on 1.20 and 1.21 but not on 1.22.
wow, can you attach the full logs of the kube-proxy? I'd like to check the configuration options and the api calls
@aojea Thank you again for the support and effort. Find below all the information:

Kubernetes 1.21
[root@control-plane ~]# kubectl get nodes -o wide
[root@control-plane ~]# kubectl version
[root@control-plane ~]# kubectl get svc --namespace tas -o wide --selector platform=tas-scif-sip
[root@control-plane ~]# kubectl get pods --namespace tas -o wide --selector platform=tas-scif-sip
[root@control-plane ~]# conntrack -L | grep 10.110.133.170

After force delete:
[root@control-plane ~]# kubectl get pods --namespace tas -o wide --selector platform=tas-scif-sip
[root@control-plane ~]# conntrack -L | grep 10.110.133.170

Kube-proxy logs with verbosity 9: kubernetes_1.21_conntrack_issue.log

Kubernetes 1.22
[root@control-plane ~]# kubectl get nodes -o wide
[root@control-plane ~]# kubectl version
[root@control-plane ~]# kubectl get svc --namespace tas -o wide --selector platform=tas-scif-sip
[root@control-plane ~]# kubectl get pods --namespace tas -o wide --selector platform=tas-scif-sip
[root@control-plane ~]# conntrack -L | grep 10.102.145.128

After force delete:
[root@control-plane ~]# kubectl get pods --namespace tas -o wide --selector platform=tas-scif-sip
[root@control-plane ~]# conntrack -L | grep 10.102.145.128

Kube-proxy logs with verbosity 9: kubernetes_1.22_conntrack_issue.log

This is again from my environment, using Calico, but as I said the behavior is exactly the same with kindnet.
I can't reproduce it :/
1. Install a UDP service with a server
2. Get the new service IP
3. Run another pod polling the UDP service constantly
4. In another terminal, watch the conntrack entries inside the node
5. After doing
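A concrete version of those steps, as a rough sketch (the agnhost image, its flags, and the names below are my assumptions, not commands taken from this thread):

# 1. UDP server behind a ClusterIP service
kubectl create deployment udp-server --image=registry.k8s.io/e2e-test-images/agnhost:2.39 -- /agnhost serve-hostname --udp --port=5060
kubectl expose deployment udp-server --name=udp-svc --port=5060 --protocol=UDP

# 2. the new service IP
SVC_IP=$(kubectl get svc udp-svc -o jsonpath='{.spec.clusterIP}')

# 3. a client pod that polls the service constantly
kubectl run udp-client --image=registry.k8s.io/e2e-test-images/agnhost:2.39 --restart=Never --command -- \
  /bin/sh -c 'while true; do /agnhost connect --protocol=udp --timeout=1s udp-svc:5060; sleep 1; done'

# 4. on the node, watch the conntrack entries for the service IP
watch -n1 "conntrack -L -p udp | grep $SVC_IP"

# 5. force-delete the server pod and check whether the reply tuple points at the new pod IP or at the node IP
kubectl delete pod -l app=udp-server --grace-period=0 --force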
@aojea Thank you again for the effort. We are working to isolate the issue and provide what is needed in order for you to simulate it. We will come back as soon as possible with an update.
@aojea Sorry for coming back late on this. We were trying our best to isolate this problem, and provide you a way to reproduce this. I have written 2 small Java applications, which you can deploy to simulate the problem in your k8s cluster. I have built container images for these 2 applications which you can pull into your environment -
The images are available on docker hub -
I've attached 2 yaml files that will deploy these into k8s -
We can check the logs from
Now we kill the
The options-sender pod soon starts getting timeouts, and this continues -
I monitor the conntrack entries during the time I delete the pod -
For the UNREPLIED udp entries, we still see the wrong IP (the node IP) on the return path, and this is never cleared. I think if you deploy this into your k8s cluster, you should be able to reproduce this problem. And also figure out why adding an
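While reproducing, it may help to watch what kube-proxy is reacting to alongside the conntrack table; a sketch, reusing the service and namespace names from earlier in the thread:

# watch the endpoints object flip when the pod is force-deleted
kubectl --namespace tas get endpoints ea1-tasftapp -w
# on the node, print conntrack events (new/destroyed entries) as they happen
conntrack -E -p udp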
/assign @aojea
is the kubernetes version still making any difference, or was that a red herring?
Using the pods provided by my colleague, the initial statement is still valid. The scenario works fine on kubernetes v1.21:
/triage accepted
great catch, this is something worse than the udp conntrack problem
and cherry-pick to the 1.22 branch #106239
What happened?
On a single node kubernetes cluster, deployed using minikube:
[root@control-plane ~]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
control-plane.minikube.internal Ready control-plane,master 46h v1.22.2 10.45.253.76 Red Hat Enterprise Linux 8.4 (Ootpa) 4.18.0-305.el8.x86_64 docker://20.10.6
We have two pods handling SIP traffic. One pod sends SIP OPTIONS requests to the other on port 5060 through a service.
ea1-tasftapp ClusterIP 10.97.122.198 5060/UDP 46h
tas-sip-b2bua-tasftapp-69c574b88c-7nvsm 1/1 Running 0 13m 10.244.0.35 control-plane.minikube.internal
tas-sip-b2bua-tassnf-78fd487c7d-t9b2j 1/1 Running 0 45h 10.244.0.24 control-plane.minikube.internal
The connection in conntrack looks like this (the first tuple is the original direction from the client pod to the ClusterIP; the second is the expected reply direction, already DNATed to the backing pod 10.244.0.35):
udp 17 119 src=10.244.0.24 dst=10.97.122.198 sport=5060 dport=5060 src=10.244.0.35 dst=10.244.0.24 sport=5060 dport=5060 [ASSURED] mark=0 use=1
If we delete the pod behind the ea1-tasftapp service by using kubectl delete pod, a new one is created:
tas-sip-b2bua-tasftapp-69c574b88c-7rdgm 1/1 Running 0 47s 10.244.0.36 control-plane.minikube.internal
tas-sip-b2bua-tassnf-78fd487c7d-t9b2j 1/1 Running 0 45h 10.244.0.24 control-plane.minikube.internal
and conntrack is updated correctly:
udp 17 119 src=10.244.0.24 dst=10.97.122.198 sport=5060 dport=5060 src=10.244.0.36 dst=10.244.0.24 sport=5060 dport=5060 [ASSURED] mark=0 use=1
If we delete the pod using --force, a new pod is created:
tas-sip-b2bua-tasftapp-69c574b88c-gvx9g 1/1 Running 0 22s 10.244.0.37 control-plane.minikube.internal
tas-sip-b2bua-tassnf-78fd487c7d-t9b2j 1/1 Running 0 45h 10.244.0.24 control-plane.minikube.internal
but the conntrack entry is updated with the node IP instead of the new pod's IP. Because of this, connectivity between the two pods no longer works.
udp 17 29 src=10.244.0.24 dst=10.97.122.198 sport=5060 dport=5060 [UNREPLIED] src=10.97.122.198 dst=10.45.253.76 sport=5060 dport=9817 mark=0 use=1
The problem is solved only by restarting the pod that initiated the connection (tas-sip-b2bua-tassnf-78fd487c7d-t9b2j) or by restarting kube-proxy.
The issue appears only on kubernetes version 1.22; on version 1.21 everything works properly.
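As a stopgap, instead of restarting the sender pod, the stale entry can also be flushed by hand on the node, or kube-proxy can be restarted via its DaemonSet; a sketch, assuming a kubeadm/minikube-style kube-proxy deployment labelled k8s-app=kube-proxy:

# flush the stale UDP entries towards the service's ClusterIP; the next heartbeat recreates them towards the new pod
conntrack -D -p udp --orig-dst 10.97.122.198
# or restart kube-proxy so it rebuilds its state
kubectl -n kube-system delete pod -l k8s-app=kube-proxy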
What did you expect to happen?
After the pod is killed, the stale conntrack entry should be cleared and a new entry created towards the IP of the new pod.
How can we reproduce it (as minimally and precisely as possible)?
Set up UDP traffic from a pod to a service, then force-delete the pod that the service routes the traffic to.
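Concretely, "force delete" here means something like the following (pod name is a placeholder, not quoted from the outputs above):

kubectl delete pod <tasftapp-pod> --grace-period=0 --force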
Anything else we need to know?
No response
Kubernetes version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:38:50Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:32:41Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Cloud provider
On premises deployment.
OS version
NAME="Red Hat Enterprise Linux"
VERSION="8.4 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.4"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.4 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8.4:GA"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/red_hat_enterprise_linux/8/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.4
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.4"
Linux control-plane.minikube.internal 4.18.0-305.el8.x86_64 #1 SMP Thu Apr 29 08:54:30 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
Install tools
Container runtime (CRI) and version (if applicable)
minikube version: v1.23.2
commit: 0a0ad764652082477c00d51d2475284b5d39ceed
Related plugins (CNI, CSI, ...) and versions (if applicable)
[root@control-plane ~]# calicoctl version
Client Version: v3.20.2
Git commit: dcb4b76a
Cluster Version: v3.20.2
Cluster Type: k8s,kdd,bgp,kubeadm