
World to host not working (& host to container) #705

Open
MichaelVoelkel opened this issue Jan 1, 2024 · 5 comments

@MichaelVoelkel

Hi,

being on Debian 12, I have switched back to nftables again. The basic configuration is just from the docs, /etc/nftables.conf:

#!/usr/sbin/nft -f

flush ruleset

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        tcp dport 22 accept
    }
    chain forward {
        type filter hook forward priority 0; policy drop;
    }
    chain output {
        type filter hook output priority 0; policy accept;
    }
}

/etc/nftables exists, but I don't use it. I hope that doesn't cause any trouble.

I can connect via SSH and nothing else works; so far, so good.

Now I have Portainer running on 9443, which I could previously reach both from the world and from the host (it has the port mapping 9443:9443, so I should be able to access it via localhost, e.g. nc -vv localhost 9443). Since the connection times out both ways, I'd assume the packet is dropped.

My rules.toml is (yeah, small, nothing else in it):

[[wider_world_to_container.rules]]
network = "portainer_network"
expose_port = 9443
dst_container = "portainer"

I run dfw currently like this to see the logs:

docker run --rm --name=dfw \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -v $PWD/rules.toml:/config/dfw.toml \
  --net host --cap-add=NET_ADMIN \
  pitkley/dfw:1.2.1 --log-level trace --config-path /config

And yes, things like nc I run from a second SSH session, so the first one stays open. :)

So, my problem is simply that I cannot connect, although I would hope/expect to be able to.

Some more information / peculiarities / questions / comments:

  • sudo nft list ruleset does not seem to show any rules that have been created. Is this expected?
table inet filter {
	chain input {
		type filter hook input priority filter; policy drop;
		tcp dport 22 accept
	}

	chain forward {
		type filter hook forward priority filter; policy drop;
	}

	chain output {
		type filter hook output priority filter; policy accept;
	}
}
table inet dfw {
	chain input {
		type filter hook input priority filter - 5; policy accept;
		ct state invalid drop
		ct state { established, related } accept
		iifname "docker0" meta mark set 0x000000df accept
	}

	chain forward {
		type filter hook forward priority filter - 5; policy accept;
		ct state invalid drop
		ct state { established, related } accept
	}
}
table ip dfw {
	chain prerouting {
		type nat hook prerouting priority dstnat - 5; policy accept;
	}

	chain postrouting {
		type nat hook postrouting priority srcnat - 5; policy accept;
	}
}
table ip6 dfw {
	chain prerouting {
		type nat hook prerouting priority dstnat - 5; policy accept;
	}

	chain postrouting {
		type nat hook postrouting priority srcnat - 5; policy accept;
	}
}
  • I'm puzzled why I need to state a network. Before, I just had the default "bridge" network, so I also tried putting that in as the name. Later, I created a new custom bridge network called portainer_network, connected the container to it (which I double-checked via docker inspect), and that is the name you now see in my rules file (rough commands below).
  • I was unsure whether I needed to restart the container first, so I did that. I also stopped and started it (which retained the specific network)... nothing helped here.
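
The rough steps for the custom network were something like this (a sketch from memory):

docker network create portainer_network
docker network connect portainer_network portainer
docker inspect -f '{{json .NetworkSettings.Networks}}' portainer   # double-check the attachment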

Probably I messed up something very basic. I also tried to make sure that the old iptables service is disabled, but sudo systemctl disable iptables told me it doesn't even know any iptables unit.

Hm, when I reset the nft rules to /etc/nftables.conf, the ruleset is back to just the base config; when I then start dfw, it fills up to the ruleset shown above, so dfw does seem to do something.
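
(By "reset" I just mean reloading the base file, roughly:

sudo nft -f /etc/nftables.conf

or restarting the nftables service.)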

@MichaelVoelkel (Author) commented Jan 1, 2024

Hm, after setting the device explicitly to eth0 I now see some rules (yes, I tried other containers too, but nothing works), and nftables seems to reflect them, but there is no change in behaviour. Apart from that, I see that journalctl, despite the log rules, no longer shows the incoming 9443 packets.
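
For reference, "setting the device" means roughly this in the dfw configuration (a sketch; the same key appears in the maintainer's summary further down):

[global_defaults]
external_network_interfaces = ["eth0"]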

table inet filter {
	chain input {
		type filter hook input priority filter; policy drop;
		tcp dport 22 accept
		log
		log
	}

	chain forward {
		type filter hook forward priority filter; policy drop;
	}

	chain output {
		type filter hook output priority filter; policy accept;
	}
}
table inet dfw {
	chain input {
		type filter hook input priority filter - 5; policy accept;
		ct state invalid drop
		ct state { established, related } accept
		iifname "docker0" meta mark set 0x000000df accept
	}

	chain forward {
		type filter hook forward priority filter - 5; policy accept;
		ct state invalid drop
		ct state { established, related } accept
		iifname "docker0" oifname "eth0" meta mark set 0x000000df accept
		tcp dport 9443 ip daddr 172.17.0.2 iifname "eth0" oifname "br-22eb53281a80" meta mark set 0x000000df accept
		tcp dport 8000 ip daddr 172.17.0.2 iifname "eth0" oifname "br-22eb53281a80" meta mark set 0x000000df accept
		tcp dport 9115 ip daddr 172.20.0.3 iifname "eth0" oifname "br-d65ccc79fc1d" meta mark set 0x000000df accept
	}
}
table ip dfw {
	chain prerouting {
		type nat hook prerouting priority dstnat - 5; policy accept;
		tcp dport 9443 iifname "eth0" meta mark set 0x000000df dnat to 172.17.0.2:9443
		tcp dport 8000 iifname "eth0" meta mark set 0x000000df dnat to 172.17.0.2:8000
		tcp dport 9115 iifname "eth0" meta mark set 0x000000df dnat to 172.20.0.3:9115
	}

	chain postrouting {
		type nat hook postrouting priority srcnat - 5; policy accept;
		oifname "eth0" meta mark set 0x000000df masquerade
	}
}
table ip6 dfw {
	chain prerouting {
		type nat hook prerouting priority dstnat - 5; policy accept;
		tcp dport 9443 iifname "eth0" meta mark set 0x000000df
		tcp dport 8000 iifname "eth0" meta mark set 0x000000df
		tcp dport 9115 iifname "eth0" meta mark set 0x000000df
	}

	chain postrouting {
		type nat hook postrouting priority srcnat - 5; policy accept;
		oifname "eth0" meta mark set 0x000000df masquerade
	}
}

@MichaelVoelkel (Author) commented Jan 1, 2024

OK, my logging was in the wrong place, because of course I need to log the FORWARD hook. And there I do see something:

Jan 01 13:25:18 v62887.php-friends.de kernel: IN=eth0 OUT=br-d65ccc79fc1d MAC=<filtered> SRC=<filtered> DST=172.20.0.2 LEN=64 TOS=0x00 PREC=0x00 TTL=50 ID=0 DF PROTO=TCP SPT=54694 DPT=9115 WINDOW=65535 RES=0x00 SYN URGP=0 MARK=0xdf 

Well, this seems fine. The packet is forwarded towards the Docker container, but for some reason nothing happens there, hmm...

OUT is a bit strange though; this is some veth0-style interface, because this whole thing runs on a KVM virtual machine... (not managed by me but by the provider I buy the hosting solution from; that hasn't been a problem so far though).
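
(For reference, the forward log rule was presumably just a bare log statement, added with something like

nft add rule inet filter forward log

analogous to the log rules visible in the input chain above.)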

@MichaelVoelkel (Author) commented Jan 1, 2024

OK, I also needed to add "backend_defaults"... I somehow thought this would not be needed, since I assumed it was the default anyway.
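
Presumably the addition looks something like this (a sketch; the exact setting is shown in the maintainer's reply below):

[backend_defaults]
custom_tables = { name = "filter", chains = ["input", "forward"] }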

Also, in your docs you describe a sample nftables.conf file... That one is REALLY bad, because it also does not allow pinging out or working with established connections. Maybe replacing it with something with more sensible rules would make more sense?

I suggest:

#!/usr/sbin/nft -f

flush ruleset

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        tcp dport 22 accept
        ct state invalid drop
        ct state { established, related } accept
        ip protocol icmp icmp type echo-request accept
        icmpv6 type echo-request accept
    }
    chain forward {
        type filter hook forward priority 0; policy drop;
        ct state { established, related } accept
    }
    chain output {
        type filter hook output priority 0; policy accept;
    }
}

although it's certainly incomplete, because ping6 does not work yet... anyway.

@pitkley (Owner) commented Jan 5, 2024

Hi @MichaelVoelkel, thanks for reaching out. I'll try to go through your various points one-by-one, although some might overlap with others. 🙂


I'm puzzled why I need to state a network. Before I just had "bridge" network, so I also tried putting that as name. Later, I created a new custom bridge network and connected the container to it (which I double-checked via docker inspect) that I called portainer_network and which you see in my rules-file now.

Every Docker container you run has to be attached to some kind of Linux network interface, at least assuming it should be able to connect to the network (which it can unless you specify --network none). When you run a container without specifying --network, Docker uses the default bridge network it creates for itself when it first starts up.

Since a Docker container will thus always be associated with a virtual bridge network interface, nftables has to know which network interface packets are destined for or coming from for the firewalling to work, and DFW therefore has to know as well so it can create rules with the correct constraints.
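
As an aside, you can double-check which networks (and thus which bridge interfaces) a container is attached to, for example like this (using the container name from your post):

docker inspect -f '{{json .NetworkSettings.Networks}}' portainer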

I was unsure whether I needed to restart the container first or not, so I did that. I also stopped and started it (which retained the specific network)... nothing helped here.

Unless you run DFW with the --run-once flag (which you haven't according to your first post), DFW will automatically update the nftables ruleset whenever anything surrounding Docker containers changes. So if you start a container after DFW is already running, and a rule you have defined applies to the container, DFW will automatically roll out this new rule.

If you have started your applications before you started DFW, DFW will still automatically apply all relevant rules, because it also applies all rules whenever it starts up.
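
For completeness, a one-shot run would simply reuse your invocation from the first post with the flag appended, roughly:

docker run --rm --name=dfw \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -v $PWD/rules.toml:/config/dfw.toml \
  --net host --cap-add=NET_ADMIN \
  pitkley/dfw:1.2.1 --log-level trace --config-path /config --run-once

Without that flag, DFW keeps running and reacts to Docker events as described above.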

/etc/nftables exists but I don't use it. I hope it makes no trouble.

I am fairly certain that you are using it, even if you don't think you are: the nftables systemd-service uses this file to apply rules when it launches. You can verify this using this command:

$ cat "$(systemctl show -P FragmentPath nftables.service)" | grep '^Exec'
ExecStart=/usr/sbin/nft -f /etc/nftables.conf
ExecReload=/usr/sbin/nft -f /etc/nftables.conf
ExecStop=/usr/sbin/nft flush ruleset

This means that during start and reload the systemd unit nftables.service simply instructs nftables, through the nft command, to load the ruleset from /etc/nftables.conf. You can verify that the nftables service is actually in use with this command:

$ systemctl show --property ActiveState --property UnitFileState nftables.service
ActiveState=active
UnitFileState=enabled

If the unit is active and enabled, it works as I have described above.

The reason I'm going into so much detail here: my personal suggestion for setting up nftables is to configure your base-rules in /etc/nftables.conf, i.e. primarily rules that are not directly related to the Docker containers you are running, and then have DFW take care of the rest.

Following is the /etc/nftables.conf file that I'm using:

#!/usr/sbin/nft -f

flush ruleset

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;

        # Allow local traffic
        iif lo accept

        # Allow related traffic (-> stateful connection tracking)
        ct state { established, related } accept

        # Setup ICMP and ICMPv6
        icmp type { echo-request, echo-reply, time-exceeded, parameter-problem, destination-unreachable } accept
        icmpv6 type { echo-request, echo-reply, time-exceeded, parameter-problem, destination-unreachable, packet-too-big, nd-router-advert, nd-router-solicit, nd-neighbor-solicit, nd-neighbor-advert, mld-listener-query } accept

        # Configure SSH
        tcp dport 22 accept

        # reject traffic instead of just dropping it
        reject with icmpx type port-unreachable
    }

    chain forward {
        type filter hook forward priority 0; policy drop;
    }

    chain output {
        type filter hook output priority 0; policy accept;
    }
}

A few things to note in the input hook:

  • iif lo accept enables me to access services locally, which the drop policy would otherwise disallow.

  • ct state { established, related } accept is added to allow stateful tracking of traffic, as you have also suggested.

    This is actually not really necessary, because DFW will by default add both ct state invalid drop and ct state { established, related } accept rules to the input hook. The reason for this is that DFW expects stateful tracking to take place, and thus forces the creation of these rules.

    I still add ct state { established, related } accept here though because I want responses to connections to work on server startup even before DFW has run.

  • I allow ICMP.

  • I have a final rule to reject any traffic that didn't match, rather than just dropping it (which I understand to be good hygiene if you aren't trying to hide your host).

(it has the port mapping 9443:9443, so I should be able to access it via localhost, e.g. nc -vv localhost 9443). Since the connection times out both ways, I'd assume the packet is dropped.

You are correct that the packet will be dropped. As shown above, I add the iif lo accept rule to enable this kind of traffic to work. I think adding that to the default documentation would likely make sense, because it is very confusing if local traffic doesn't work.

Ok, I needed to also add "backend_defaults"... I somehow thought this would not be needed as it was default anyways.

Do I understand correctly that things work now after you have added backend_defaults, but didn't before?

One thing I did notice in the rulesets you have posted is that DFW does not hook itself into the filter tables, which it will do if the backend_defaults are set up like this:

[backend_defaults]
custom_tables = { name = "filter", chains = ["input", "forward"] }

You can find more details on this field in the documentation here, but the gist of it is this: DFW has to be able to act on traffic when it traverses any one of the input or forward hooks. This can be achieved in one of three ways:

  1. Have no other tables that hook input or forward, leaving only DFW's table.

    This is not really feasible, because that would leave you with an entirely open firewall, at least until DFW has run.

  2. Ensure that any existing tables that hook input or forward don't drop the traffic before it reaches DFW's tables.

    This is not great if you want your input hook to have a drop policy, which I personally would always want, just to make sure I don't accidentally expose any port I didn't intend to expose.

  3. Let DFW know about any existing tables and chains that hook input or forward.

    This is what the custom_tables setting does, and it gives us the best of both worlds: we can ensure that DFW correctly accepts the traffic it is responsible for, while still being able to drop traffic by default in the input hook.

This is a REALLY bad one because it will also not allow pinging out or working with established connections.

Assuming DFW has run and is instructed to attach to the existing tables, it would work: the output hook does not deny the echo-request (ping), and the input hook would allow related packets, letting the echo-reply (pong) come through. Without DFW having run though, the default would indeed disallow this, yes.

Regarding established/related packets: I agree, this should be part of the default config. Regarding allowing incoming pings: I don't want to prescribe to a user of DFW whether they want their host to be pingable. I think a good middle ground would be to add it with a comment, i.e. indicating to the user that their host won't be pingable unless they enable that rule.
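
A hypothetical sketch of what such a commented-out addition to the sample config could look like:

# Uncomment if you want your host to respond to pings:
# icmp type echo-request accept
# icmpv6 type echo-request accept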


In summary, the final more-than-minimal configuration that works well for me is this:

  • /etc/nftables.conf:

    #!/usr/sbin/nft -f
    
    flush ruleset
    
    table inet filter {
        chain input {
            type filter hook input priority 0; policy drop;
    
            # Allow local traffic
            iif lo accept
    
            # Allow related traffic (-> stateful connection tracking)
            ct state { established, related } accept
    
            # Setup ICMP and ICMPv6
            icmp type { echo-request, echo-reply, time-exceeded, parameter-problem, destination-unreachable } accept
            icmpv6 type { echo-request, echo-reply, time-exceeded, parameter-problem, destination-unreachable, packet-too-big, nd-router-advert, nd-router-solicit, nd-neighbor-solicit, nd-neighbor-advert, mld-listener-query } accept
    
            # Configure SSH
            tcp dport 22 accept
    
            # reject traffic instead of just dropping it
            reject with icmpx type port-unreachable
        }
    
        chain forward {
            type filter hook forward priority 0; policy drop;
        }
    
        chain output {
            type filter hook output priority 0; policy accept;
        }
    }
    
  • dfw/main.toml:

    [global_defaults]
    external_network_interfaces = [
        "eno1",
    ]
    
    [backend_defaults]
    custom_tables = { name = "filter", chains = ["input", "forward"] }
  • dfw/wwtc.toml:

    [[wider_world_to_container.rules]]
    network = "reverseproxy"
    dst_container = "traefik-traefik-1"
    expose_port = 443

@MichaelVoelkel (Author)

Hi! Thanks for your great, thorough answer. Yes, everything works now, maybe apart from accessing containers locally, BUT I will try out your iif lo rule because I don't have that one yet.

Everything you write sounds really interesting. Of course, you're right that the nftables service uses that file as the base. And yes to:

Do I understand correctly that things work now after you have added backend_defaults, but didn't before?

My default policy clearly is drop. And by the way, as for pings, I was only talking about outgoing pings. I agree that incoming pings are a different story.

As for the network thing, I was just thinking that if the Docker container only has one network, dfw could theoretically read it from the container and use it automatically, as a convenience. But yeah, that's not strictly needed.

All in all, I need to say: your solution is great!!

Getting nftables to work nicely alongside Docker is normally not really doable... And a firewall solution should really sit in its own Docker container to keep it maintainable; that seems like the best practice. Your repo offers exactly this solution. So thanks a lot!
