Hetzner arm nodes not joining cluster consistently #16491
Comments
Seems like after the 4th full delete and recreate it works again. I wonder if this is related to #15806.
Only one way to find out. Please check the
I will have a look when it fails again. Took me some time to realise the user to connect via is not
For the nodes it fails with a hard error, but for the control plane it seems to work just fine. No errors or issues as far as I have been able to tell. All servers spawn and Kubernetes reports everything as healthy. I have had no workload on the cluster so far, though, so there might be bugs I haven't seen yet. But I am doubtful that there are any.
This time it's a control-plane node. It seems to fail on this:
as the server responds with
TL;DR: As a workaround, deleting the server and running an update to reinitialize the rest might be easiest here.
That is pretty much the path of least resistance.
Be sure to get in touch with Hetzner via support ticket if you get bitten by a blocked IP. The best odds we have of those IPs no longer being blackholed by Google is if Hetzner reaches out to them to see what the deal is.
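For anyone landing here later, a rough sketch of that workaround follows. It assumes the hcloud CLI is installed and authenticated, that the kops state store is already configured in the environment, and that running an update recreates the missing server. The function name `recreate_server` and the `DRY_RUN` switch are hypothetical helpers added for illustration, not part of kops or hcloud.

```shell
# Hypothetical helper: delete one stuck Hetzner server, then let kops recreate it.
# Usage: recreate_server <cluster-name> <server-name>
# Set DRY_RUN=1 to print the commands instead of running them.
recreate_server() {
  cluster="$1"
  server="$2"
  # Expands to "echo" when DRY_RUN is set and non-empty, otherwise to nothing.
  run="${DRY_RUN:+echo}"
  # Remove the server that failed to join.
  $run hcloud server delete "$server"
  # Re-apply the cluster spec so the missing server is created again (assumption).
  $run kops update cluster --name "$cluster" --yes
}
```

Running it with `DRY_RUN=1` first shows exactly which commands would be issued; the server name has to be taken from the Hetzner console or `hcloud server list`.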
/kind bug
1. What kops version are you running? The command kops version will display this information.
Client version: 1.29.0-beta.1 (git-v1.29.0-beta.1-154-g87a0483ca3)
2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag.
3. What cloud provider are you using?
Hetzner
4. What commands did you run? What is the simplest way to reproduce this issue?
kops create cluster --name=cluster-example.k8s.local --ssh-public-key=~/.ssh/id_ed25519.pub --cloud=hetzner --zones=hel1 --networking=cilium --network-cidr=10.10.0.0/16 --node-count=2 --control-plane-count=3 --control-plane-zones=hel1,fsn1 --node-size=cax21 --control-plane-size cax11
5. What happened after the commands executed?
All nodes and resources are created; however, validation fails. One node joined only after 3 recreations. The other one doesn't join at all:
6. What did you expect to happen?
All nodes join
7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.
8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.
9. Anything else we need to know?
Additionally, the SSH key does not seem to get applied. Trying to SSH in only yields a password prompt for the user; the SSH key is not accepted.
This was tried well beyond the 10m mark.