Canal 3.26 failing in Ubuntu 22.04 #13236

Closed
xrstf opened this issue Apr 3, 2024 · 6 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/networking Denotes a PR or issue as being assigned to SIG Networking.

@xrstf
Contributor

xrstf commented Apr 3, 2024

What happened?

I created a usercluster on AWS. Nodes have public IPs, and the cluster is IPv4 only. Nothing special. The cluster comes up and seems to work just fine, but the Kubernetes conformance tests get stuck in the BeforeSuite step, which ensures that all Pods in kube-system are ready.

But the canal pods are not. The calico container is crashing:

2024-04-03 09:17:14.125 [WARNING][3065588] felix/ipsets.go 319: Failed to resync with dataplane error=exit status 1 family="inet"
2024-04-03 09:17:14.144 [ERROR][3065588] felix/ipsets.go 569: Bad return code from 'ipset list'. error=exit status 1 family="inet" stderr="ipset v7.11: Kernel and userspace incompatible: settype hash:ip with revision 6 not supported by userspace.\n"
2024-04-03 09:17:10.307 [PANIC][3065426] felix/ipsets.go 352: Failed to update IP sets after multiple retries. family="inet"
panic: (*logrus.Entry) 0xc0001d1e30
goroutine 248 [running]:
github.com/sirupsen/logrus.(*Entry).log(0xc0005cb500, 0x0, {0xc00066e030, 0x30})
        /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/entry.go:260 +0x4d6
github.com/sirupsen/logrus.(*Entry).Log(0xc0005cb500, 0x0, {0xc00082fb58?, 0x5?, 0x0?})
        /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/entry.go:304 +0x4f

Expected behavior

The CNI pods should be ready.

How to reproduce the issue?

  • Create an AWS usercluster.
  • Wait for it to be up and running.
  • Observe the canal pods randomly failing their liveness checks.
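To make the last step easier to check, here is a minimal Python sketch that lists pods whose Ready condition is not "True". It assumes the input is the JSON printed by `kubectl get pods -n kube-system -o json`; the function name `not_ready_pods` is made up for this illustration, and the check does not special-case completed pods.

```python
import json


def not_ready_pods(pod_list_json: str) -> list[str]:
    """Return the names of pods whose Ready condition is not "True".

    Expects the JSON document printed by
    `kubectl get pods -n kube-system -o json` (assumed input format).
    """
    pods = json.loads(pod_list_json).get("items", [])
    names = []
    for pod in pods:
        conditions = pod.get("status", {}).get("conditions", [])
        # A pod counts as ready only if it reports Ready=True.
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            names.append(pod["metadata"]["name"])
    return names
```

One way to use it is to save the kubectl output to a file, read it, and pass the text to `not_ready_pods`; on an affected cluster the canal pods would be expected to show up in the result while the calico container is crashing.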

How is your environment configured?

  • KKP version: 2.24
  • Shared or separate master/seed clusters?: shared

What cloud provider are you running on?

AWS

What operating system are you running in your user cluster?

Ubuntu 22.04.4 is used for the worker nodes in my usercluster.

xrstf added the kind/bug label Apr 3, 2024
@cnvergence
Member

/label sig-networking

@kubermatic-bot
Contributor

@cnvergence: The label(s) /label sig-networking cannot be applied. These labels are supported: blocked by backend, merge-type/merge, merge-type/rebase, needs details, service accounts, Epic, MVP, customer-request, design, feature, proposal, ready-to-challenge, redesign, sig/api, sig/app-management, sig/cluster-management, sig/community, sig/infra, sig/networking, sig/ui, sig/virtualization, sprint, team/marketing, team/ps, lifecycle/frozen, backport-needed, backport-complete, ee, test/require-vsphere, test/require-kubevirt, test/require-vmwareclouddirector, test/require-nutanix. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?

In response to this:

/label sig-networking

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

xrstf added the sig/networking label Apr 3, 2024
@cnvergence
Member

cnvergence commented Apr 3, 2024

Seems to break on kernel version 6.5.0-1016-aws.

modinfo ip_set
filename:       /lib/modules/6.5.0-1016-aws/kernel/net/netfilter/ipset/ip_set.ko
description:    ip_set: protocol 7
alias:          nfnetlink-subsys-6
description:    core IP set support
author:         Jozsef Kadlecsik <kadlec@netfilter.org>
license:        GPL
srcversion:     0B880A446291E801E7C2F36
depends:        nfnetlink
retpoline:      Y
intree:         Y
name:           ip_set
vermagic:       6.5.0-1016-aws SMP mod_unload modversions
sig_id:         PKCS#7
signer:         Build time autogenerated kernel key
sig_key:        26:81:FA:61:12:04:05:A5:60:66:56:76:2A:90:D8:6B:CF:1E:BC:C1
sig_hashalgo:   sha512
signature:      0F:35:FC:3C:09:2C:AB:B5:25:5B:D2:77:9A:0E:3F:A0:D1:C6:6C:3C:
		61:36:A3:12:1B:2B:8E:A2:6F:8C:FA:B6:4A:F1:3B:4C:4C:9E:C9:6B:
		AA:B6:9E:03:4F:0D:7A:4F:D0:2C:6E:2E:86:A8:20:78:1F:EA:03:F8:
		F0:B1:FE:87:75:A9:D3:CE:43:AA:DE:88:0F:F1:3F:BC:21:9B:16:07:
		D8:40:A0:BC:F8:DB:E7:5D:6E:C3:61:E8:39:11:1B:1F:67:CF:DD:0D:
		30:88:3B:6F:05:50:00:B1:35:CE:23:2D:E6:9C:90:33:72:7B:09:D0:
		B6:0C:DD:16:A4:B6:4D:28:8F:52:1D:23:F3:93:2A:C9:78:D5:94:A2:
		BD:75:D4:9B:0F:67:57:72:54:5F:95:99:24:18:71:6D:EE:E3:72:E7:
		EC:58:C2:9E:4F:0D:FA:83:3C:1E:3F:4C:CA:D2:AA:4A:6F:B5:14:C4:
		17:CB:EA:01:5B:41:E4:7F:91:5F:3A:8A:A0:56:DD:3B:41:9F:2C:73:
		AF:63:4C:92:04:17:C7:03:CD:58:0F:65:B7:CA:23:B7:68:76:21:B1:
		8F:FD:45:A8:31:E4:BB:E7:86:A9:E4:DB:86:AE:C0:02:37:F6:0B:2F:
		7B:C8:F9:AD:1B:D5:12:BF:52:D6:9C:04:FF:6A:28:25:95:42:C0:18:
		50:E6:90:5D:DE:C6:04:E6:AE:FF:02:23:78:A9:41:F5:EB:A1:F3:5D:
		25:5E:F0:67:62:F5:C4:99:DB:79:E0:AE:2F:5B:5D:82:99:A6:59:9F:
		04:FE:39:9F:8D:E3:2E:9F:80:D3:61:8B:42:F0:47:F4:B0:B2:96:47:
		ED:89:1C:42:A2:68:88:D3:96:39:55:91:42:F1:FE:0D:7F:93:FE:A7:
		E0:C2:16:AC:42:8C:BB:2F:D4:57:28:28:E3:DD:0C:DB:F7:B9:F3:10:
		DD:71:FF:6A:27:A9:9D:9F:99:C5:CE:30:69:4F:D1:A7:A1:D0:AA:24:
		03:B0:13:4D:70:93:BD:80:E7:E4:11:08:AF:F4:32:39:56:6C:D9:58:
		79:20:C1:4C:52:2B:1F:E2:27:FE:F6:B7:8D:80:69:EA:8E:29:51:E4:
		C9:25:42:71:E6:93:A6:F9:F4:6D:92:A1:D8:13:3B:16:EB:6E:4E:07:
		92:1B:63:02:B9:25:B8:1D:C5:E8:A5:2A:82:42:5B:EE:E2:D4:CC:05:
		FE:05:D7:C5:AF:DB:48:0D:B4:F7:AD:C3:5F:A1:A6:31:C5:AF:7E:B3:
		59:C6:46:ED:10:76:1B:F5:35:48:D1:D4:67:CD:05:8F:B7:B2:19:88:
		CB:94:A4:9F:3C:62:51:EF:42:FF:5A:09
parm:           max_sets:maximal number of sets (int)
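For anyone comparing nodes, here is a rough Python sketch that turns `modinfo` output like the above into a dictionary and pulls the kernel release out of the `vermagic` field. The helper names `parse_modinfo` and `kernel_release` are made up for this example, and the parsing rules (repeated keys are collected in a list, indented lines continue the previous value) are assumptions based on the output shown here, not a documented format.

```python
from typing import Optional


def parse_modinfo(text: str) -> dict[str, list[str]]:
    """Parse `modinfo` output into {field: [values...]}.

    Fields such as `description` can appear more than once, so every
    field maps to a list. Indented lines (the wrapped signature bytes)
    are appended to the previous value.
    """
    fields: dict[str, list[str]] = {}
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace() and last_key is not None:
            # Continuation of the previous field's value.
            fields[last_key][-1] += line.strip()
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        last_key = key.strip()
        fields.setdefault(last_key, []).append(value.strip())
    return fields


def kernel_release(fields: dict[str, list[str]]) -> Optional[str]:
    """Return the first token of vermagic, e.g. the kernel release."""
    values = fields.get("vermagic")
    if not values or not values[0].split():
        return None
    return values[0].split()[0]
```

Feeding it the output above should give `6.5.0-1016-aws` as the kernel release, which makes it easy to compare against nodes where canal stays healthy.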

Calico breaks because it is unable to parse the IP sets; probably something else on the AWS node has created an IP set.

ipset list
ipset v7.15: Kernel and userspace incompatible: settype hash:ip,port with revision 7 not supported by userspace.
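To spot this condition in logs or command output, here is a small Python sketch that extracts the userspace ipset version, the set type, and the unsupported revision from the error text. The regular expression is written against the two messages quoted in this issue only, so treat the exact wording as an assumption; `parse_incompat` and `Incompat` are names invented for this example.

```python
import re
from typing import NamedTuple, Optional


class Incompat(NamedTuple):
    userspace_version: str  # version of the ipset binary that failed
    set_type: str           # e.g. "hash:ip" or "hash:ip,port"
    revision: int           # set type revision the kernel reported


# Pattern based on the messages quoted in this issue (assumed stable).
_PATTERN = re.compile(
    r"ipset v(?P<version>[\d.]+): Kernel and userspace incompatible: "
    r"settype (?P<settype>\S+) with revision (?P<rev>\d+) "
    r"not supported by userspace"
)


def parse_incompat(stderr: str) -> Optional[Incompat]:
    """Return the parsed incompatibility, or None if the text differs."""
    match = _PATTERN.search(stderr)
    if match is None:
        return None
    return Incompat(
        userspace_version=match.group("version"),
        set_type=match.group("settype"),
        revision=int(match.group("rev")),
    )
```

Applied to the two messages in this thread, it would report ipset v7.11 with `hash:ip` revision 6 from inside the calico container and ipset v7.15 with `hash:ip,port` revision 7 from the `ipset list` output above.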

@cnvergence
Member

Fixed by projectcalico/calico#8387, in version 3.27.2.

@embik
Member

embik commented Jun 5, 2024

/close

this has been implemented in the PR and backports linked above.

@kubermatic-bot
Contributor

@embik: Closing this issue.

In response to this:

/close

this has been implemented in the PR and backports linked above.

