Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UDP Problem on Windows #1737

Open
s1377427321 opened this issue Jan 12, 2019 · 18 comments
Open

UDP Problem on Windows #1737

s1377427321 opened this issue Jan 12, 2019 · 18 comments

Comments

@s1377427321
Copy link

I use quic transmission,but sometimes quic will close all network connections.
source code:
ts, err := c.conn.AcceptStream()
i detecte error
InternalError: read udp4 0.0.0.0:55289: wsarecvfrom: The connection has been broken due to keep-alive activity detecting a failure while the operation was in progress

@marten-seemann
Copy link
Member

What is this error?

wsarecvfrom: The connection has been broken due to keep-alive activity detecting a failure while the operation was in progress

This is not a log message of quic-go. Please enable logging as described in the wiki. Otherwise we can’t see what going on here.

@anacrolix
Copy link

Should this be reopened?

@marten-seemann marten-seemann changed the title quic close all connect UDP Problem on Windows Jan 25, 2019
@anacrolix
Copy link

Is there any update on this? There was some discussion at libp2p/go-libp2p#519.

@marten-seemann
Copy link
Member

To be honest, I don't really understand the problem and I have no idea how to reproduce it, so unfortunately, there's no update for this.

@apmattil
Copy link
Contributor

I have seen all kind of wsa errors when connection is broken (I'm testing this).
all wsa errors comes from windows sockets and really should just be treated as fatal error.. I tried to check the Temporary and retries, they really do not recover.

@djdv
Copy link

djdv commented Sep 20, 2020

I think my former guess was right: libp2p/go-libp2p#519 (comment)
I was messing around in another repo with a similar problem recently: anacrolix/dht#16 (comment)
and that's what I did there.

Potentially around here: https://github.com/lucas-clemente/quic-go/blob/272a2c88e670fcb39fe93b9dc74e5f2d5acc67a6/conn.go#L40
the error should probably be checked to see if it's temporary, in a way similar to the comment linked above. Ideally with the Windows specific code being placed in Go itself. I'll have to see if there's an existing issue for this and will report back later.

In the meantime, if someone has a reproducible, that'd make checking this easier. I think you just need to set up a server and kill a client in the middle of a send to get this error on Windows. Among other things for other errors, like having the network interface lose connection while receiving on the server side.
Without code backing this though, it's just speculation.

@marten-seemann
Copy link
Member

@djdv You mean we should check if err is a net.Error.Temporary, right? We could definitely do that, although I'd feel more comfortable making changes here if we could somehow reproduce the problem (and ideally add a unit test to make sure that we don't reintroduce this bug at some point in the future).

@VinozzZ
Copy link
Contributor

VinozzZ commented Sep 25, 2020

It seems that @djdv has created a screencast for the issue: https://www.youtube.com/watch?v=GAkn9KI4ceY&t=183s.
Is this good enough to be a reproducer?

@marten-seemann
Copy link
Member

Interesting screencast, @djdv. Is there any related Go issue yet?

@djdv
Copy link

djdv commented Sep 26, 2020

@marten-seemann

You mean we should check if err is a net.Error.Temporary, right?

That's right.
I'm not sure what quantifies as a "temporary" error for net though. I believe since this is non fatal that it would count, but I'm not sure. If Go doesn't want to consider it temporary then it has to be checked for manually.

Is there any related Go issue yet?

I found an issue related to file errors (golang/go#35131), but the CL is discussing WSA values as well (https://go-review.googlesource.com/c/go/+/208537/).
I asked @networkimprov in there if they could link the relevant issue. (I can't seem to find it)

@networkimprov
Copy link

Responded on golang/go/issues/35131

@marten-seemann
Copy link
Member

marten-seemann commented Sep 27, 2020

I have a partial fix for this issue in b7f05b5. Partial because, if I understand the screencast correct, not all WSA values are correctly marked as temporary errors by Go.
Not really sure how to continue from here onwards. Any suggestions?

@djdv
Copy link

djdv commented Sep 29, 2020

Not really sure how to continue from here onwards. Any suggestions?

I think the linked patch is good for now. While it won't solve the issue on Windows yet, if these errors get added as "temporary" errors then it will in later Go versions.

I'm in the middle of a different IPFS/libp2p fix right now, but will look into the Go side of things afterwards (if nobody beats me to it).

If Go starts treating these errors as temporary, then nothing further should be necessary.
If not, then we can just check for these errors specifically in a separate patch, which would need to be build constrained to Windows only (with checks being a NOOP on other platforms).

If it's important that this be fixed now, we can do the second option with build constraints to the Go version as well.
So that we can push a fix on Windows/Go ≤1.15, and then if Go gets patched it will be a NOOP on Windows/Go 1.16≥ (if Go gets patched).
Edit: or rather the opposite
We push the patch, and then if/when Go changes, we constrain that patch to the versions that don't treat these as temporary.

@marten-seemann
Copy link
Member

marten-seemann commented Apr 2, 2022

Go 1.18 deprecated the net.Error.Temporary method: https://pkg.go.dev/net#Error.

This means that we should also get rid of the logic to ignore temporary errors: https://github.com/lucas-clemente/quic-go/blob/d1498c360e207522c7f45c3fb30c39b66139b008/packet_handler_map.go#L347-L350
#3367 makes the corresponding change.

Regarding this issue, looks like we will need to finally identify WSA errors properly to actually fix this issue. As I'm not a Windows user, I'll need help with that.

@marten-seemann
Copy link
Member

Actually, we probably shouldn't remove this code. We shouldn't cause a regression. However, probably can't continue relying on Temporary() a lot longer, the standard library will probably stop return those at some point.

@zllovesuki
Copy link
Contributor

Reference here: https://docs.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-recvfrom?redirectedfrom=MSDN

Someone should reproduce the error and see which error code was causing the issues:
WSAECONNRESET: Windows kernel killed the socket
WSAENETRESET: Harmless error, should treat the packet as lost

@thepaul

This comment was marked as outdated.

@marten-seemann
Copy link
Member

marten-seemann commented Jul 14, 2023

@thepaul This seems to be a separate issue. I created a new one in #3956. Let's continue the discussion there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants