Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core/connection/pool: spawning a task does not inform the waker #2483

Closed
elenaf9 opened this issue Feb 7, 2022 · 2 comments
Closed

core/connection/pool: spawning a task does not inform the waker #2483

elenaf9 opened this issue Feb 7, 2022 · 2 comments

Comments

@elenaf9
Copy link
Contributor

elenaf9 commented Feb 7, 2022

While debugging a flaky AutoNAT test, I noticed that once in a while it happens that a pending connection that was added in Pool::add_incoming is not directly upgraded to an "established-connection" (the used TCP transport always directly returns upgrade: futures::ok, so that shouldn't be the issue?). Instead, it seem like after calling Pool::spawn with the pending connection, the waker of Pool::poll isn't waken again, and the upgrade of the connection does not happen. Only when a behaviour event occurred in the AutoNAT test, the network was polled again and hence also the Pool. This lead to some unexpected behaviour for said test:

  • the server successfully connects to the clients address
  • the client receives the incoming connection and the transport reports a ListenerEvent::Upgrade { upgrade: future::ok(incoming.stream).. }
  • the future is added via Pool::add_incoming to the stream of pending futures
  • the client's pool is not polled, therefore the stream stays as "pending"
  • at the same time from the server's PoV the connection was established, therefore is sends back a DialResponse::Ok to the client
  • the client receives the DialResponse::Ok, before being notified about the inbound connection in the first place

In the above case, it is mostly a issue of chronological correctness for the reported events. However, if there are no other events to waken the task at all, I wonder if the connection would never upgrade?
I am not sure what the reason for this behaviour is; on inbound connections the swarm reports a SwarmEvent::IncomingConnection, so I would think after that the pool would be polled again and the pending connection upgraded?

@thomaseizinger also wrote in #2059 (comment) about some weird behaviour of FuturesUnordered, maybe it's related with that?
I'll try to create a minimal example in the next days to reproduce it.

@mxinden
Copy link
Member

mxinden commented Feb 9, 2022

I am not sure what the reason for this behaviour is; on inbound connections the swarm reports a SwarmEvent::IncomingConnection, so I would think after that the pool would be polled again and the pending connection upgraded?

That is what I would expect to be happening as well.

I'll try to create a minimal example in the next days to reproduce it.

Cool. Thanks @elenaf9.

@elenaf9
Copy link
Contributor Author

elenaf9 commented Feb 13, 2022

I tried to reproduce this bug with a simpler, autonat-independent test, but wasn't successful. Even for above linked test autonat::test_client::test_auto_probe it only happens every ~1000 tests (which honestly makes it a real nightmare to debug).
Because this was never reported before and I am not able to reproduce it, and thus I am not even sure where the problem lies, I am closing this issue for now. If this occurs again and/or I have more time to look into it, I am happy to re-open or create a new issue once I figured out what the real problem is.
(Edit: thinking about it now, I am wondering if it is even a bug in the first place, or just due to the fact that the test runs in a single threaded environment, so that the server registers the connection and sends back the DialResponse::Ok while the client's task is still queued.)
For the autonat test in question I will add a fix within #2480 to handle the case that a OutboundProbeEvent::Response is reported before the SwarmEvent::ConnectionEstablished.

@elenaf9 elenaf9 closed this as completed Feb 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants