CRI: Sandbox IP not present after containerd restart #7843

Closed
dcantah opened this issue Dec 20, 2022 · 4 comments · Fixed by #7845
Labels
area/cri Container Runtime Interface (CRI) kind/bug

Comments

@dcantah
Member

dcantah commented Dec 20, 2022

Description

First reported on the CNCF Slack: https://cloud-native.slack.com/archives/C4RJZ9Z6Y/p1671470742340569

Before containerd process restart:

# crictl -r unix:///run/k0s/containerd.sock inspectp 94c43ab108db3 | jq .status.network
{
 "additionalIps": [],
 "ip": "10.244.0.24"
}

After restarting containerd:

# crictl -r unix:///run/k0s/containerd.sock inspectp 94c43ab108db3 | jq .status.network
{
 "additionalIps": [],
 "ip": ""
}

This means that kubelet sees the sandbox as changed and thus will restart each pod.

Steps to reproduce the issue

  1. For local testing, create a pod using crictl
  2. Restart containerd
  3. Inspect pod status after restart: crictl -r unix:///run/k0s/containerd.sock inspectp $podID | jq .status.network
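The check in the steps above can be automated. The following is a minimal sketch, not part of the original report: it assumes `crictl inspectp` prints JSON (as the `jq` pipeline in the description implies), and the helper names `sandbox_ip`, `ip_lost`, and `inspect_pod` are hypothetical. The endpoint is the one used in this report and will differ on other installations.

```python
import json
import subprocess

# Endpoint taken from the report above; adjust for your installation.
ENDPOINT = "unix:///run/k0s/containerd.sock"


def sandbox_ip(inspectp_json: str) -> str:
    """Return .status.network.ip from `crictl inspectp` JSON ("" if absent)."""
    status = json.loads(inspectp_json).get("status", {})
    network = status.get("network") or {}
    return network.get("ip", "") or ""


def ip_lost(before_json: str, after_json: str) -> bool:
    """True if the sandbox had an IP before the restart and does not report
    the same IP afterwards."""
    before = sandbox_ip(before_json)
    return bool(before) and sandbox_ip(after_json) != before


def inspect_pod(pod_id: str, endpoint: str = ENDPOINT) -> str:
    """Hypothetical helper: run `crictl inspectp` and return its raw output."""
    result = subprocess.run(
        ["crictl", "-r", endpoint, "inspectp", pod_id],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

To use it, call `inspect_pod(pod_id)` once before restarting containerd and once after, then pass both outputs to `ip_lost`. A `True` result corresponds to the empty `"ip"` field shown in the description.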

Describe the results you received and expected

Expected: containerd correctly preserves the sandbox's IP/networking information across a restart. Received: the sandbox IP is empty after the restart, as shown above. This behavior regressed in 1.6.9 and may be related to #7456

What version of containerd are you using?

1.6.12

Any other relevant information

No response

Show configuration if it is related to CRI plugin.

Default containerd config

@dcantah dcantah added kind/bug area/cri Container Runtime Interface (CRI) labels Dec 20, 2022
@dcantah
Member Author

dcantah commented Dec 20, 2022

cc @samuelkarp @MikeZappa87 as we were discussing on Slack

@qiutongs
Contributor

Let me take a look.

@brandond
Contributor

brandond commented Jan 4, 2023

This needs to be backported to v1.6 ASAP. All releases since 1.6.9 have this critical regression, which breaks containerd's guarantees around non-disruptive restarts. The releases without the regression are affected by CVEs, so users are currently forced to pick between running with known CVEs or having all of their pods restarted whenever containerd restarts.

@klueska

klueska commented Jan 11, 2023

I believe the issue I reported here is also resolved by this fix:
https://cloud-native.slack.com/archives/CGEQHPYF4/p1667586414682319
