Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up state machine #90

Open
mehaase opened this issue Nov 16, 2018 · 2 comments
Open

Clean up state machine #90

mehaase opened this issue Nov 16, 2018 · 2 comments
Labels
refactoring This is a refactoring with no change to functionality
Milestone

Comments

@mehaase
Copy link
Contributor

mehaase commented Nov 16, 2018

There are a lot of complex code paths for closing a websocket connection, mostly because the first implementation was overly simplistic and I kept hacking away on it to handle each new problem that arose. This could be cleaned up and made a lot more readable.

@mehaase mehaase added the refactoring This is a refactoring with no change to functionality label Nov 17, 2018
@mehaase mehaase added this to the 1.0 milestone Nov 17, 2018
@mehaase
Copy link
Contributor Author

mehaase commented Nov 17, 2018

Brain dumping some of the issues that make the close code complex:

There are multiple stages of closing. We can say that CLOSING is when we are asked to start the handshake and CLOSED is when the underlying stream is torn down. (Or maybe not—see below.)

Wsproto and WebSocketConnection have different meanings of "closing" and "closed". Wsproto is CLOSING when it writes a close frame to its internal buffer. The websocket state is CLOSING when it somebody calls aclose() or a ConnectionClosed event is received but the response is not yet sent. Wsproto is CLOSED when it has sent and received a close frame. WebSocketConnection is closed when the websocket handshake is complete and the underlying stream has been torn down. We probably should not rely on wsproto's state property at all. Instead, we should observe state from events (like ConnectionClosed) and infer state from sequences like wsproto.close(); self._write_pending().

There are two layers to the protocol: the application layer (WebSocket) and the stream layer (i.e. typically TCP). We generally want to close the application layer with a handshake before closing the stream, but sometimes that's not possible, i.e. if the underlying stream is torn down before the handshake occurs.

I wonder if tearing down the stream inside WebSocketConnection is even the right thing to do? It might be surprising for somebody who calls wrap_client_stream(stream) and finds out later the stream was closed... maybe they wanted to reuse that stream for some other protocol? We could define WebSocketConnection closed state as when the handshake completes, then put the responsibility for stream teardown somewhere else, i.e. inside APIs like open_websocket(…). These APIs are already responsible for creating a stream, why shouldn't they be responsible for tearing it down?

The local and remote endpoints each have a state. E.g. if we close with code 4001 then we send a frame with 4001. The server completes the handshake by sending back a frame, but it can send back whatever code it wants, i.e. 4002. Should the local close code property show the close code we sent or the close code we received... or both?

Closing behavior is different depending on which side initiates. E.g. if the remote endpoint initiates closing, then we keep the internal message channel open until the user has read all of the messages queued in it. If the local endpoint initiates closing, then we close the receive channel and the user cannot read any more messages.

We have to be careful not to block the reader task. One mistake I made several times is trying to call self.aclose() from an event handler. This never works because all of the event handlers run inside the reader task and aclose() waits for the connection closed event, so the reader task is deadlocked. There needs to be some method for internal close that is safe to call from the reader task.

The next step is to try to draw a diagram to identify all of the possible ways the connection can close. This should help me understand how to rewrite this.

@mehaase mehaase self-assigned this May 10, 2019
@mehaase mehaase changed the title Clean up connection close code Clean up state machine May 15, 2019
@mehaase
Copy link
Contributor Author

mehaase commented May 15, 2019

I'm going to broaden the scope of this to look at the entire state machine in the library. I'm sure there are bugs (or undefined behavior) in certain spots, e.g. what happens if you call accept() and reject() on the same request? You'll probably get an obscure wsproto error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactoring This is a refactoring with no change to functionality
Projects
None yet
Development

No branches or pull requests

1 participant