ipv6 Only Servers #588

Open
eric1234 opened this issue Nov 17, 2023 · 4 comments

@eric1234

I'm testing out deploying and running an app on an ipv6 only server with that deployment happening via Kamal. I see on #87 that ipv6 should work but I've encountered a few issues.

I have some thoughts about fixes but wasn't sure of them so I figured I would start with opening an issue. If I can get a bit of feedback on it I would be glad to do the legwork to help resolve these issues. So far there are three issues.

Mismatch Between Host and Configuration String

This prevents the deploy from working because the env files won't push up (it may affect other commands too, but the deploy fails so early that I'm not sure what else breaks). The issue is that when I specify my ipv6 address I must include brackets around the address (similar to what web browsers require). Example:

servers:
  - "[2a01:4f7:c014:160::1]"

You can see these brackets being required in SSHKit here

The problem is both this line:

roles.select { |role| role.hosts.include?(host.to_s) }.map(&:name)

and this line:

config.accessories.select { |accessory| accessory.hosts.include?(host.to_s) }.map(&:name)

assume the configuration string and the hostname are the same. But this is really only true for the SimpleHostParser.

All other host parsers do something with that initialization string. Therefore when the two problem lines call to_s they are only getting the hostname (which does not include the brackets) while the values it's comparing against from the config file do include the brackets.
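The mismatch can be reproduced without SSHKit. The sketch below is only an illustration: `StandInHost` and `host_configured?` are hypothetical names, not part of SSHKit or Kamal. The stand-in borrows just one behaviour from the description above, namely that `to_s` yields the hostname with the brackets stripped.

```ruby
# StandInHost is a hypothetical stand-in for SSHKit::Host, not the real class.
# It only mimics the behaviour described above: to_s returns the bare hostname.
class StandInHost
  attr_reader :hostname

  def initialize(hostname)
    @hostname = hostname
  end

  def to_s
    hostname
  end
end

# Same shape of comparison as the two problem lines: the configured strings
# are compared against whatever host.to_s returns.
def host_configured?(configured_hosts, host)
  configured_hosts.include?(host.to_s)
end

# The config carries the brackets, the parsed host does not, so no match.
puts host_configured?(["[2a01:4f7:c014:160::1]"], StandInHost.new("2a01:4f7:c014:160::1"))

# For a simple host the two strings are identical, so the lookup succeeds.
puts host_configured?(["192.168.0.1"], StandInHost.new("192.168.0.1"))
```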

The additional challenge here is that host isn't always a SSHKit::Host object. When doing an interactive execution it just becomes a string derived from Kamal.primary_host at:

exec KAMAL.app(role: KAMAL.primary_role).execute_in_new_container_over_ssh(cmd, host: KAMAL.primary_host)

With a simple host, just calling to_s on either type of object works (String or SSHKit::Host). But for hosts parsed by any other parser to_s is just the hostname which is incomplete. A bit of a hacky solution I implemented to move past it was to monkey-patch to_s in SSHKit with:

class SSHKit::Host
  module StringifyToFullConfig
    def initialize(configuration)
      @configuration = configuration

      super
    end

    def to_s = @configuration
  end

  prepend StringifyToFullConfig
end

On one hand this seems "good enough" as it works. OTOH what downstream effects will be caused by overwriting to_s in a 3rd party package?

Interactive SSH Host Format Invalid

The next issue is that when I try to do an interactive ssh session I end up with the reverse problem. The ssh CLI tool needs the address without the brackets; if it has brackets, ssh assumes it is a hostname and tries to look it up in DNS. Example:

$ kamal app exec -i bash
Get most recent version available as an image...
Launching interactive command with version latest via SSH from new container on [2a01:4f7:c014:160::1]...
ssh: Could not resolve hostname [2a01:4f7:c014:160::1]: Name or service not known

In this case we want to have the connection string parsed out and just provide the ssh program the hostname. I resolved this by changing run_over_ssh so it converts the host to a SSHKit::Host and then extracts just the hostname:

def run_over_ssh(*command, host:)
+  host = SSHKit::Host.new host

  "ssh".tap do |cmd|
    if config.ssh.proxy && config.ssh.proxy.is_a?(Net::SSH::Proxy::Jump)
      cmd << " -J #{config.ssh.proxy.jump_proxies}"
    elsif config.ssh.proxy && config.ssh.proxy.is_a?(Net::SSH::Proxy::Command)
      cmd << " -o ProxyCommand='#{config.ssh.proxy.command_line_template}'"
    end
-    cmd << " -t #{config.ssh.user}@#{host} -p #{config.ssh.port} '#{command.join(" ")}'"
+    cmd << " -t #{config.ssh.user}@#{host.hostname} -p #{config.ssh.port} '#{command.join(" ")}'"
  end
end

With that change in place I'm able to connect just fine into my container.

Configure Docker for ipv6

My final issue (so far!) is that Docker doesn't set up its default bridge network with ipv6 support by default. They provide instructions that are straightforward: just drop in a config file, restart Docker, and you're good to go. Now containers being run can call other ipv6 services (such as the database server). The result is that to set up a server I really need to:

  1. Run kamal server bootstrap to install docker
  2. Manually drop that file and restart docker on all servers
  3. Run kamal setup to finish the rest of the setup

It would be nice if I could just run kamal setup and it would be ipv6 ready. This starts to seem like a Docker problem. But OTOH Kamal is taking responsibility for installing Docker. So perhaps it should also enable ipv6?

It could maybe do this unconditionally (even if not using ipv6). I don't think it hurts anything. Or perhaps we could pass a -ipv6 switch with kamal setup if we didn't want it global. Another solution might be a hook. Just run a hook on the server before it is bootstrapped, which would allow me to have a shell script that drops the config in the right place.

eric1234 added a commit to eric1234/kamal that referenced this issue Nov 18, 2023
This is just a temp hack while a more robust solution is worked out on:

basecamp#588
eric1234 added a commit to eric1234/kamal that referenced this issue Nov 18, 2023
Probably somewhat close to the final solution but still working that out on:

basecamp#588
@zealot128

I also stumbled upon this issue and wondered why no env was pushed to any server without any message, even after wrapping the IP in brackets.
(Missing the brackets just crashes with:
ERROR (SocketError): Exception while executing on host 2a01: getaddrinfo: Temporary failure in name resolution)

Using IPv6 one could remove the IPv4 address from all containers and save some bucks in the public cloud, like $3-4 on AWS, and $0,5 on Hetzner, and also significantly hide the machine from public scanners.

@djmb
Collaborator

djmb commented Mar 6, 2024

@eric1234 - supporting IPv6 would be great, thanks for the great write-up of the issues! We'll definitely accept fixes for this.

If we don't include the square brackets in servers, but instead made sure to add them before passing them to SSHKit, would that solve the host comparison issue?

Regarding the setup, I think an --ipv6 switch sounds like the way to go.

It would also be nice to get an integration test that does an IPv6 based deployment if possible.

@eric1234
Author

If we don't include the square brackets in servers, but instead made sure to add them before passing them to SSHKit, would that solve the host comparison issue?

Excellent idea! I was stuck on supporting the format as SSHKit specifies it but we are not bound by that. If the Kamal format is without brackets and we just tweak that argument before passing to SSHKit that should allow everything to match up.

I put together a little test app to verify with the following diff on Kamal as a code spike:

diff --git a/lib/kamal/cli/base.rb b/lib/kamal/cli/base.rb
index 3527a21..1f68b65 100644
--- a/lib/kamal/cli/base.rb
+++ b/lib/kamal/cli/base.rb
@@ -1,3 +1,4 @@
+require "resolv"
 require "thor"
 require "dotenv"
 require "kamal/sshkit_with_ext"
@@ -181,5 +182,16 @@ module Kamal::Cli
           execute(*KAMAL.server.ensure_run_directory)
         end
       end
+
+      def on(hosts)
+        hosts = Array(hosts).map do |host|
+          next host unless host.is_a? String
+          next host unless Resolv::IPv6::Regex =~ host
+
+          "[#{host}]"
+        end
+
+        super
+      end
     end
 end

It worked beautifully. I did a full setup, I was able to app exec, etc. without error. It does mean that we are deciding to support ipv6 addresses without brackets, while with brackets is probably more common (although the ssh program shows there are already exceptions). This seems fine to me, I just want to make sure it's an explicit decision, not one that we made merely due to impl convenience.

My impl takes advantage of the fact that Kamal::Cli::Base mixes in SSHKit::DSL which provides the on method to operate on host(s). Therefore we are just going to decorate that method to tweak the arguments to format the host as SSHKit needs. This ensures that any subclass of Kamal::Cli::Base can just use the on method without having to remember to call some sort of method to tweak the arguments manually which will make having an ipv6 failure in the future less likely.

A few notes:

  • It's not always a list of hosts but can be a single host (probably for primary host operations?). Therefore wrapping with Array so we can map either way.
  • It's not always a string from the config but sometimes the on method receives a SSHKit::Host object. Therefore the check on if the host is a String and if not just passes it through.

Determining whether the string is an ipv6 address is not straightforward because of all the various formats of ipv6 (0-compression, etc). My initial instinct was to borrow from SSHKit. But that won't really work as that regexp is too loose. It really only works because it runs the SimpleHostParser first and that parser specifically negates out things that look like ipv6.

My next thought was to ask the oracle, but the answer, while likely correct, looks like a nightmare that Kamal probably doesn't want to maintain. So then I checked if Ruby has already abstracted that nightmare for us. Turns out it has via Resolv::IPv6::Regex, we just need to require resolv first. But it's built into the Ruby stdlib so it's no extra dep.
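The detection step can be tried in isolation. What follows is a minimal sketch that pulls the host mapping out of the `on` override so it runs without Kamal or SSHKit; `bracket_ipv6_hosts` is a hypothetical name used only for this illustration.

```ruby
require "resolv" # Ruby stdlib; provides Resolv::IPv6::Regex

# Hypothetical helper mirroring the host mapping in the spike above.
def bracket_ipv6_hosts(hosts)
  Array(hosts).map do |host|
    # Host objects (anything that is not a String) pass through untouched.
    next host unless host.is_a?(String)
    # IPv4 addresses, DNS names and already bracketed strings do not match.
    next host unless Resolv::IPv6::Regex.match?(host)

    "[#{host}]"
  end
end

p bracket_ipv6_hosts("2a01:4f7:c014:160::1")
# => ["[2a01:4f7:c014:160::1]"]
p bracket_ipv6_hosts(["192.168.0.1", "app.example.com", "::1"])
# => ["192.168.0.1", "app.example.com", "[::1]"]
```

Because the patterns behind `Resolv::IPv6::Regex` are anchored to the whole string, a value that already carries brackets does not match and is not wrapped a second time.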

If this impl seems reasonable to you I can work up a more proper PR (with tests, integration tests as you indicated, etc).


Regarding the -ipv6 flag to enable IPv6 in Docker on setup, I've grown to like it less since I suggested it a few months ago. Most users are going to just follow some sort of "getting started" steps and completely miss the need for that switch. Also, I'm not sure there is one right action. While adding a /etc/docker/daemon.json file with:

{
  "ipv6": true,
  "fixed-cidr-v6": "2001:db8:1::/64",
  "experimental": true,
  "ip6tables": true
}

is the right thing if you have an ipv6 only server, I don't think it's the right thing if you have a server that is supporting both ipv4 and ipv6, since the above makes the default bridge network ipv6. If you are using a mix you might just want:

{
  "experimental": true,
  "ip6tables": true
}

And then add some ipv6 networks to docker. This is starting to feel like we should handle it like we do when the ssh user is not root. Have a note about it in the documentation but it's largely on them to bootstrap Docker with ipv6 enabled. Maybe they do that through a "userdata" file that looks like:

#cloud-config
write_files:
  - path: /etc/docker/daemon.json
    content: |
      {
        "ipv6": true,
        "fixed-cidr-v6": "2001:db8:1::/64",
        "experimental": true,
        "ip6tables": true
      }
    permissions: '0644'

This is what I have been doing since Nov when I originally opened the ticket. Hetzner (and many cloud providers) support providing this userdata file to apply changes to the server when it is created. Another option may be something like running Ansible on the server to enable ipv6 in Docker prior to doing the Kamal setup. You don't even have to install Docker (you can let Kamal still do that). You just need to drop that file in the right place.

Maybe it's best if we don't take on that responsibility but just point them in the right direction in the documentation. Once we get the ipv6 fix merged I could open up a PR on the documentation side to provide this info.

@jeromedalbert

jeromedalbert commented May 19, 2024

Thanks @eric1234 for this great deep dive. I stumbled upon the same issues a year ago but only got halfway through what you did before deciding to shell out an extra $0.5 to Hetzner and just get an IPv4 address. But I would love for Kamal to support IPv6 for sure!

On one hand this seems "good enough" as it works. OTOH what downstream effects will be caused by overwriting to_s in a 3rd party package?

Maybe this concern could be addressed by using a ruby refinement. Still kind of a hack but at least it wouldn't leak elsewhere.
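A rough sketch of what the refinement idea could look like, under some loud assumptions: `StandInHost` is a hypothetical stand-in for SSHKit::Host, and the module names are made up for illustration. One caveat is that a refinement probably cannot capture the original configuration string the way the prepended initialize does, since `new` invokes `initialize` outside the refinement's lexical scope. This sketch therefore rebuilds the brackets from the hostname instead.

```ruby
require "resolv"

# StandInHost is a hypothetical stand-in for SSHKit::Host (not the real class).
class StandInHost
  attr_reader :hostname

  def initialize(hostname)
    @hostname = hostname
  end

  def to_s
    hostname
  end
end

# Refinement that changes to_s only where it is explicitly activated.
module BracketedHostString
  refine StandInHost do
    def to_s
      Resolv::IPv6::Regex.match?(hostname) ? "[#{hostname}]" : hostname
    end
  end
end

module WithRefinement
  using BracketedHostString

  def self.label(host)
    host.to_s # refined here: brackets restored for IPv6 hosts
  end
end

module WithoutRefinement
  def self.label(host)
    host.to_s # original behaviour, untouched by the refinement
  end
end

host = StandInHost.new("2a01:4f7:c014:160::1")
puts WithRefinement.label(host)
puts WithoutRefinement.label(host)
```

Only code that opts in with `using` sees the bracketed form, so callers elsewhere keep getting the bare hostname.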

It worked beautifully. I did a full setup, I was able to app exec, etc. without error. It does mean that we are deciding to support ipv6 addresses without brackets, while with brackets is probably more common (although the ssh program shows there are already exceptions). This seems fine to me, I just want to make sure it's an explicit decision, not one that we made merely due to impl convenience.

If implementation weren't an issue I think it would be nice to allow both syntaxes. Maybe a way to do it would be to strip any brackets when loading the Kamal yaml configuration, then put them back in like you're doing when calling SSHKit, although that would be quite hacky.
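The strip-then-restore idea could look roughly like the sketch below. Both helper names are hypothetical and neither exists in Kamal; the only library feature relied on is `Resolv::IPv6::Regex` from the Ruby stdlib.

```ruby
require "resolv"

# Hypothetical helper: normalise a host read from the YAML config to the
# bare form, so "[addr]" and "addr" end up identical inside Kamal.
def strip_ipv6_brackets(host)
  inner = host[/\A\[(.+)\]\z/, 1] # nil unless the whole string is bracketed
  inner && Resolv::IPv6::Regex.match?(inner) ? inner : host
end

# Hypothetical helper: format a bare host the way SSHKit expects it,
# adding brackets only for IPv6 addresses.
def sshkit_host_string(host)
  Resolv::IPv6::Regex.match?(host) ? "[#{host}]" : host
end

["[2a01:4f7:c014:160::1]", "2a01:4f7:c014:160::1", "app.example.com"].each do |raw|
  bare = strip_ipv6_brackets(raw)
  puts "#{raw} -> #{bare} -> #{sshkit_host_string(bare)}"
end
```

With this shape the bare form is what role and accessory lookups and the ssh CLI would see, and the bracketed form would only exist at the SSHKit boundary.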


I think the simpler fix, if possible, is for SSHKit itself to parse any IPv6 form correctly in the first place, instead of making Kamal work around SSHKit's current limitations.

I submitted an issue and a PR for a potential fix. When I try that PR with Kamal, unbracketed and bracketed IPv6 addresses seem to work well out of the box when deploying to Hetzner, and I didn't need to configure Docker for IPv6.
