Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PMTUD may be implemented incorrectly #3327

Closed
tobyxdd opened this issue Feb 3, 2022 · 18 comments
Closed

PMTUD may be implemented incorrectly #3327

tobyxdd opened this issue Feb 3, 2022 · 18 comments

Comments

@tobyxdd
Copy link
Contributor

tobyxdd commented Feb 3, 2022

Hi,

We talked about the DF bit on Windows a while ago, and I submitted #3155. But apparently it led to this unforeseen problem #3273 so PMTUD ended up being disabled on Windows as a dirty fix. As I revisited this today I realized that perhaps there is something fundamentally wrong with the way quic-go currently implements PMTUD, and it's not really a Windows issue.

Previously we both assumed that the DF bit is enabled by default on Linux. But that's not true, as the man page says this:

image

The default behavior on Linux (IP_PMTUDISC_WANT) is to only set DF bit when the packet is smaller than the path MTU, and fragment the packet otherwise. This implies that right now even if quic-go sends a packet larger than the path MTU, it will still be fragmented, sent, and received, nullifying PMTUD altogether. And if you explicitly set it to IP_PMTUDISC_DO, you will get a similar behavior as on Windows - errors when trying to send packets larger than the path MTU, instead of discarding them silently.

I wrote a demo and verified this on my Ubuntu 20.04 server, which has an MTU of 1500. When it sends packets smaller than 1500, the DF bit is set. When it sends packets larger than 1500, it fragments them and sends them successfully nonetheless.

image

@tobyxdd
Copy link
Contributor Author

tobyxdd commented Feb 3, 2022

I also wrote a custom PacketConn that adds some padding bytes to every packet to make it larger, and passed it to quic-go. The QUIC payload reaches MaxPacketBufferSize regardless of the MTU and the actual packet size going out.

@marten-seemann
Copy link
Member

Does setting the DF socket option resolve this?

@tobyxdd
Copy link
Contributor Author

tobyxdd commented Feb 6, 2022

That would lead to the same problem as in #3273 - the kernel returns error instead of silently dropping the packet, at least on Linux. Haven't tried it on any other platform.

@tobyxdd
Copy link
Contributor Author

tobyxdd commented Feb 6, 2022

The error looks like this on Linux: write udp [::]:52857->x.x.x.x:6003: sendto: message too long

@marten-seemann
Copy link
Member

We can either catch that error, or just ignore it and let loss recovery kick in after an RTT or so. This is not the main purpose of PMTUD anyway. You really shouldn't be sending packets larger than your local link MTU anyway. The more interesting use for PMTUD is finding the MTU of a path over the internet, for which the current code still works fine (or doesn't it)?

@tobyxdd
Copy link
Contributor Author

tobyxdd commented Feb 6, 2022

I'm not sure how the current code handles this error, I'll have a look tomorrow. But I can confirm that this error does NOT only occur when sending a packet larger than the local MTU. It occurs when sending a packet larger than the path MTU.

@marten-seemann
Copy link
Member

How does the kernel know about the path MTU in the case of UDP?

@tobyxdd
Copy link
Contributor Author

tobyxdd commented Feb 6, 2022

We can either catch that error, or just ignore it and let loss recovery kick in after an RTT or so. This is not the main purpose of PMTUD anyway. You really shouldn't be sending packets larger than your local link MTU anyway. The more interesting use for PMTUD is finding the MTU of a path over the internet, for which the current code still works fine (or doesn't it)?

No it doesn't. quic-go doesn't do anything about DF bit on Linux right now. Since by default it's IP_PMTUDISC_WANT, Linux DOES fragment and send the probe packets.

@tobyxdd
Copy link
Contributor Author

tobyxdd commented Feb 6, 2022

How does the kernel know about the path MTU in the case of UDP?

image

https://man7.org/linux/man-pages/man7/udp.7.html

@marten-seemann
Copy link
Member

I'm not sure how the kernel would do that for UDP. There's no acknowledgements, so the kernel can't know if a packet of a certain size has been delivered (ICMP can't be relied upon on the internet).

@tobyxdd
Copy link
Contributor Author

tobyxdd commented Feb 6, 2022

ICMP can't be relied upon on the internet

But ICMP does work in most cases. I assume that's how Linux implements it (RFC 1191)

@marten-seemann
Copy link
Member

Then the right solution would be to ignore this error message. PMTUD packets don't carry any application data, so there's no benefit in declaring them lost right away. If we wanted to be fancy, we could tell the MTU discoverer right away, but I'm not sure if that's worth the extra complexity.

Do you want to send us a PR?

@tobyxdd
Copy link
Contributor Author

tobyxdd commented Feb 6, 2022

Tried this code on a VPS with a local MTU of 9000 but a path MTU of 1500.

package main

import (
	"fmt"
	"golang.org/x/sys/unix"
	"net"
)

func main() {
	conn, err := net.ListenUDP("udp", nil)
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	rc, err := conn.SyscallConn()
	if err != nil {
		panic(err)
	}
	err = rc.Control(func(fd uintptr) {
		err := unix.SetsockoptInt(int(fd), unix.IPPROTO_IP, unix.IP_MTU_DISCOVER, unix.IP_PMTUDISC_DO)
		if err != nil {
			panic(err)
		}
	})
	if err != nil {
		panic(err)
	}
	bs := make([]byte, 7000)
	fmt.Println(conn.WriteToUDP(bs, &net.UDPAddr{
		IP:   net.IPv4(8, 8, 8, 8),
		Port: 53,
	}))
}

image

First packet went through. Subsequent attempts all returned errors.

@tobyxdd
Copy link
Contributor Author

tobyxdd commented Feb 6, 2022

Then the right solution would be to ignore this error message. PMTUD packets don't carry any application data, so there's no benefit in declaring them lost right away. If we wanted to be fancy, we could tell the MTU discoverer right away, but I'm not sure if that's worth the extra complexity.

Do you want to send us a PR?

Maybe 🤔 I'll see what I can do

@zllovesuki
Copy link
Contributor

Then the right solution would be to ignore this error message. PMTUD packets don't carry any application data, so there's no benefit in declaring them lost right away. If we wanted to be fancy, we could tell the MTU discoverer right away, but I'm not sure if that's worth the extra complexity.
Do you want to send us a PR?

Maybe 🤔 I'll see what I can do

I'm too busy with school to dive into this. If you are able, would greatly appreciate your effort!

@tobyxdd
Copy link
Contributor Author

tobyxdd commented Feb 7, 2022

@marten-seemann Are we ignoring all errors from WritePacket or just MTU-related ones? I'm not sure if the former would have any side effects. As for the latter, the only way to tell is... not so elegant:

	bs := make([]byte, 7000)
	_, err = conn.WriteToUDP(bs, &net.UDPAddr{
		IP:   net.IPv4(8, 8, 8, 8),
		Port: 53,
	})
	if opErr, ok := err.(*net.OpError); ok {
		if syscallErr, ok := opErr.Err.(*os.SyscallError); ok {
			if syscallErrno, ok := syscallErr.Err.(syscall.Errno); ok {
				if syscallErrno == unix.EMSGSIZE {
					// This is what we are looking for.
					fmt.Println("EMSGSIZE")
				}
				fmt.Println(syscallErrno)
				fmt.Println(syscallErrno.Error())
			}
		}
	}

This is undocumented and platform-specific.

@marten-seemann
Copy link
Member

I think it would be nicer to just ignore the MTU-related ones. At least that would be a requirement if we want to feed the information back into the MTU discoverer immediately.

Your code might become a bit more elegant if you use errors.As, but of course that doesn't change the fact that we'll need a platform-specific isMTUError function.

zllovesuki added a commit to zllovesuki/quic-go that referenced this issue Feb 9, 2022
This commit introduces additional platform-dependent checking when the
kernel returns an error. Previously, the session is terminated when
PingFrame sends a discovery packet larger than the limit. With this
commit, the error is checked, and if it is "datagram too large", the
error is ignored.

Fixes quic-go#3273 quic-go#3327
zllovesuki added a commit to zllovesuki/quic-go that referenced this issue Feb 9, 2022
This commit introduces additional platform-dependent checking when the
kernel returns an error. Previously, the session is terminated when
PingFrame sends a discovery packet larger than the limit. With this
commit, the error is checked, and if it is "datagram too large", the
error is ignored.

Additionally, this commit re-enables MTU discovery on Windows unless it
is disable explicitly by user.

Fixes quic-go#3273 quic-go#3327
zllovesuki added a commit to zllovesuki/quic-go that referenced this issue Feb 9, 2022
This commit introduces additional platform-dependent checking when the
kernel returns an error. Previously, the session is terminated when
PingFrame sends a discovery packet larger than the limit. With this
commit, the error is checked, and if it is "datagram too large", the
error is ignored.

Additionally, this commit re-enables MTU discovery on Windows unless it
is disabled explicitly by user.

Fixes quic-go#3273 quic-go#3327
zllovesuki added a commit to zllovesuki/quic-go that referenced this issue Feb 9, 2022
This commit introduces additional platform-dependent checking when the
kernel returns an error. Previously, the session is terminated when
PingFrame sends a discovery packet larger than the limit. With this
commit, the error is checked, and if it is "datagram too large", the
error is ignored.

Additionally, this commit re-enables MTU discovery on Windows unless it
is disabled explicitly by user.

Fixes quic-go#3273 quic-go#3327
zllovesuki added a commit to zllovesuki/quic-go that referenced this issue Feb 9, 2022
This commit introduces additional platform-dependent checking when the
kernel returns an error. Previously, the session is terminated when
PingFrame sends a discovery packet larger than the limit. With this
commit, the error is checked, and if it is "datagram too large", the
error is ignored.

Additionally, this commit re-enables MTU discovery on Windows unless it
is disabled explicitly by user.

Fixes quic-go#3273 quic-go#3327
zllovesuki added a commit to zllovesuki/quic-go that referenced this issue Feb 9, 2022
This commit introduces additional platform-dependent checking when the
kernel returns an error. Previously, the session is terminated when
PingFrame sends a discovery packet larger than the limit. With this
commit, the error is checked, and if it is "datagram too large", the
error is ignored.

Additionally, this commit re-enables MTU discovery on Windows unless it
is disabled explicitly by user.

Undo quic-go#3276
Fixes quic-go#3273 quic-go#3327
zllovesuki added a commit to zllovesuki/quic-go that referenced this issue Feb 9, 2022
This commit introduces additional platform-dependent checking when the
kernel returns an error. Previously, the session is terminated when
PingFrame sends a discovery packet larger than the limit. With this
commit, the error is checked, and if it is "datagram too large", the
error is ignored.

Additionally,
1. This commit re-enables MTU discovery on Windows unless it
is disabled explicitly by user (Undo quic-go#3276),
2. Set IP_DONTFRAGMENT and IPV6_DONTFRAG with error checking on Windows,
   and
3. Set IP_MTU_DISCOVERY to PMTUDISC_DO for both IPv4 and IPv6 on Linux
   so that the kernel will return "message too long".

Fixes quic-go#3273 quic-go#3327
zllovesuki added a commit to zllovesuki/quic-go that referenced this issue Feb 10, 2022
This commit introduces additional platform-dependent checking when the
kernel returns an error. Previously, the session is terminated when
PingFrame sends a discovery packet larger than the limit. With this
commit, the error is checked, and if it is "datagram too large", the
error is ignored.

Additionally,
1. This commit re-enables MTU discovery on Windows unless it
is disabled explicitly by user (Undo quic-go#3276),
2. Set IP_DONTFRAGMENT and IPV6_DONTFRAG with error checking on Windows,
   and
3. Set IP_MTU_DISCOVERY to PMTUDISC_DO for both IPv4 and IPv6 on Linux
   so that the kernel will return "message too long".

Fixes quic-go#3273 quic-go#3327
zllovesuki added a commit to zllovesuki/quic-go that referenced this issue Feb 10, 2022
This commit introduces additional platform-dependent checking when the
kernel returns an error. Previously, the session is terminated when
PingFrame sends a discovery packet larger than the limit. With this
commit, the error is checked, and if it is "datagram too large", the
error is ignored.

Additionally,
1. This commit re-enables MTU discovery on Windows unless it
is disabled explicitly by user (Undo quic-go#3276),
2. Set IP_DONTFRAGMENT and IPV6_DONTFRAG with error checking on Windows,
   and
3. Set IP_MTU_DISCOVERY to PMTUDISC_DO for both IPv4 and IPv6 on Linux
   so that the kernel will return "message too long".

Fixes quic-go#3273 quic-go#3327
nmldiegues pushed a commit to nmldiegues/quic-go that referenced this issue Feb 10, 2022
This commit introduces additional platform-dependent checking when the
kernel returns an error. Previously, the session is terminated when
PingFrame sends a discovery packet larger than the limit. With this
commit, the error is checked, and if it is "datagram too large", the
error is ignored.

Additionally,
1. This commit re-enables MTU discovery on Windows unless it
is disabled explicitly by user (Undo quic-go#3276),
2. Set IP_DONTFRAGMENT and IPV6_DONTFRAG with error checking on Windows,
   and
3. Set IP_MTU_DISCOVERY to PMTUDISC_DO for both IPv4 and IPv6 on Linux
   so that the kernel will return "message too long".

Fixes quic-go#3273 quic-go#3327
marten-seemann pushed a commit that referenced this issue Feb 20, 2022
This commit introduces additional platform-dependent checking when the
kernel returns an error. Previously, the session is terminated when
PingFrame sends a discovery packet larger than the limit. With this
commit, the error is checked, and if it is "datagram too large", the
error is ignored.

Additionally,
1. This commit re-enables MTU discovery on Windows unless it
is disabled explicitly by user (Undo #3276),
2. Set IP_DONTFRAGMENT and IPV6_DONTFRAG with error checking on Windows,
   and
3. Set IP_MTU_DISCOVERY to PMTUDISC_DO for both IPv4 and IPv6 on Linux
   so that the kernel will return "message too long".

Fixes #3273 #3327
@marten-seemann
Copy link
Member

This was fixed in #3328.

nmldiegues pushed a commit to chungthuang/quic-go that referenced this issue Jun 6, 2022
This commit introduces additional platform-dependent checking when the
kernel returns an error. Previously, the session is terminated when
PingFrame sends a discovery packet larger than the limit. With this
commit, the error is checked, and if it is "datagram too large", the
error is ignored.

Additionally,
1. This commit re-enables MTU discovery on Windows unless it
is disabled explicitly by user (Undo quic-go#3276),
2. Set IP_DONTFRAGMENT and IPV6_DONTFRAG with error checking on Windows,
   and
3. Set IP_MTU_DISCOVERY to PMTUDISC_DO for both IPv4 and IPv6 on Linux
   so that the kernel will return "message too long".

Fixes quic-go#3273 quic-go#3327
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants