I use QUIC transport, but sometimes QUIC closes all network connections. #519

Closed
s1377427321 opened this issue Jan 12, 2019 · 9 comments
Labels
status/blocked Unable to be worked further until needs are met

Comments

@s1377427321

source code:
ts, err := c.conn.AcceptStream()

I detected this error:
InternalError: read udp4 0.0.0.0:55289: wsarecvfrom: The connection has been broken due to keep-alive activity detecting a failure while the operation was in progress

@anacrolix anacrolix self-assigned this Jan 23, 2019
@anacrolix
Contributor

Thanks for spotting this. This is a common bug around UDP use with Go on Windows. We should move this issue to either the QUIC transport wrapper, or report it directly to the Go QUIC implementation if the mistake is there. @marten-seemann can you assist?

@marten-seemann
Contributor

marten-seemann commented Jan 23, 2019

We had an issue for this in quic-go: quic-go/quic-go#1737. Being pretty inexperienced with Windows, I assumed that wsarecvfrom was an error code from another library wrapped around quic-go, which is why I closed that issue without action. Unfortunately, I never got any response.

I'm running all unit tests and some of the integration tests on AppVeyor on Windows, and everything seems to work fine there. I'll need some guidance on how to reproduce this issue.

@anacrolix
Contributor

@Stebalien
Member

@djdv, any ideas?

@djdv

djdv commented Jan 24, 2019

Disclaimer: I lack experience with UDP on Windows. Mostly familiar with TCP and Named Pipe connections.

This appears to be
WSAENETRESET (10052)
https://docs.microsoft.com/en-us/windows/desktop/WinSock/windows-sockets-error-codes-2
Edit: removed additional text, I mixed up WSAECONNRESET and WSAENETRESET 👀

I'm lacking context here though. During an accept or send, what's probably happening is that data was buffered, the remote host became unreachable, and the next read call errors. Whatever is wrapping these operations is then picking up on this and closing the socket. But I'm not sure of the inner workings here or the dependency chain.

With context from @anacrolix it looks like we can receive innocuous or potentially incorrect (library fault) errors from Read() that are probably tripping something to close the socket prematurely. In the linked torrent issue it seems appropriate to disregard the error entirely or adjust the buffers (depending on what error was actually received).

My best guess is that if these errors aren't already exposed, we need to explode them out and switch on them using
https://docs.microsoft.com/en-us/windows/desktop/api/winsock2/nf-winsock2-wsarecvfrom
as a reference for what is and isn't fatal.

@anacrolix
Contributor

> With context from @anacrolix it looks like we can receive innocuous or potentially incorrect (library fault) errors from Read() that are probably tripping something to close the socket prematurely. In the linked torrent issue it seems appropriate to disregard the error entirely or adjust the buffers (depending on what error was actually received).

Yep, spot on.

> My best guess is that if these errors aren't already exposed, we need to explode them out and switch on them

I'd love to see how you solve that. It's difficult to pin down specific errors in Go due to the very loose typing inside the error interface.

@djdv

djdv commented Jan 25, 2019

> I'd love to see how you solve that. It's difficult to pin down specific errors in Go due to the very loose typing inside the error interface.

It's true. Go can make this pretty ugly if the libs don't export error variables for us.
Typically you have to do some digging to find the origin of the error.

In this case it likely originates from pkg/net and is likely a net.OpError.
If we had a reproducible case, we could simply have Go tell us the underlying type with a fmt call: fmt.Printf("%T %#v\n", err, err).

Once we know the source and concrete type of the error, it's just a matter of unwrapping it. It's messy but likely necessary if the abstractions don't already provide a way to do something like err == somelib.ErrUDPTTL.

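err := ambiguousErr()
// The net package returns *net.OpError (its methods use pointer receivers),
// so the assertion must be on the pointer type.
if netErr, ok := err.(*net.OpError); ok {
    if netErr.Temporary() {
        // drop err;  break|return
    }
    log.Errorf("Fatal net error: {%T}%#v", netErr.Err, netErr.Err)
    return // don't fall through to the generic handler below
}
log.Errorf("Unexpected/Unhandled error: {%T}%#v", err, err)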

Depending on the source of the error and the level of control needed, you could repeat the pattern to continue unwrapping. In the case of a net.OpError you could get the underlying error from the struct which is likely in itself a syscall.Errno or a wrapper of it.

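...
if netErr.Temporary() {
    ...
}
// netErr.Err may be a bare syscall.Errno or an *os.SyscallError wrapping one.
inner := netErr.Err
if sysErr, ok := inner.(*os.SyscallError); ok {
    inner = sysErr.Err
}
if platformErr, ok := inner.(syscall.Errno); ok {
    if platformErr == SomeConst {
        // drop err;  break|return
    }
}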

Edit: no more code changes 🤞

@djdv

djdv commented Jan 25, 2019

It's looking like the error is generated here:
https://golang.org/src/net/fd_windows.go#L160

func (fd *netFD) readFrom(buf []byte) (int, syscall.Sockaddr, error) {
	n, sa, err := fd.pfd.ReadFrom(buf)
	runtime.KeepAlive(fd)
	return n, sa, wrapSyscallError("wsarecvfrom", err)
}

https://golang.org/src/net/error_posix.go

If needed you could insert this into one of them to see the whole call stack and see who's generating this error when it happens.

var i int
for {
	pc, fn, line, ok := runtime.Caller(i)
	if !ok {
		break
	}
	fmt.Printf("[%d] %s[%s:%d]\n", i, runtime.FuncForPC(pc).Name(), fn, line)
	i++
}

@anacrolix anacrolix removed their assignment Mar 27, 2019
@anacrolix anacrolix added the status/blocked Unable to be worked further until needs are met label Mar 27, 2019
@Stebalien
Member

Closing in favor of quic-go/quic-go#1737.

5 participants