Make client and server to resync active connections #74

Open
wants to merge 16 commits into
base: master

Contributor

@aruiz14 aruiz14 commented Apr 5, 2024

Issue: 44576

Relates to #68
Depends on #78

Problem

remotedialer multiplexes connections between two peers over a single websocket connection by including a connection ID in each message and using separate buffers. The protocol specifies different message types for different actions (Connect, Data, Error, Pause, Resume, etc.). In particular, the Error type tells the other end that a certain connection must be closed. However, depending on the cause of the original error, this message may never be transmitted successfully, as the sender will give up on sending it (#67 adds additional logging for this situation).

When this happens, one of the peers never receives a termination message for that connection, so the underlying buffers stay blocked on Read() forever, causing goroutine and memory leaks.
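To illustrate the failure mode, here is a minimal sketch that uses the standard library's `io.Pipe` as a stand-in for a per-connection buffer (it is not remotedialer's actual buffer type, and `blockedUntilClosed` is a hypothetical helper). The reading goroutine stays parked in `Read()` until something explicitly closes the writing side, which is the role the lost Error message was supposed to play.

```go
package main

import (
	"fmt"
	"io"
	"time"
)

// blockedUntilClosed starts a reader on a pipe that never receives data.
// It reports whether the reader was still blocked after wait, and the error
// Read returned once the writing side was finally closed.
// io.Pipe is only a stand-in for a per-connection buffer (assumption).
func blockedUntilClosed(wait time.Duration) (bool, error) {
	r, w := io.Pipe()
	done := make(chan error, 1)
	go func() {
		_, err := r.Read(make([]byte, 16)) // blocks until data arrives or the pipe is closed
		done <- err
	}()

	blocked := false
	select {
	case <-done:
		// The reader returned on its own; not expected here.
	case <-time.After(wait):
		blocked = true
	}

	// Without this close (the equivalent of a delivered termination message),
	// the goroutine above would leak.
	w.CloseWithError(io.EOF)
	return blocked, <-done
}

func main() {
	blocked, err := blockedUntilClosed(50 * time.Millisecond)
	fmt.Println("reader blocked:", blocked, "error after close:", err)
}
```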

Solution

This PR adds a new message type to the protocol (Resync), whose payload contains a list of connection IDs. Similarly to how clients send Ping control messages, Resync messages periodically tell the receiving peer that any connection not contained in the provided list is no longer needed and can be pruned.
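A rough sketch of the idea, not the code in this PR: the names `encodeResync`, `decodeResync` and `pruneStale`, the `int64` ID type, and the 8-byte big-endian layout are all assumptions made for illustration. The sender serializes the IDs it still considers active, and the receiver closes every local connection that is missing from that list.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sort"
)

// encodeResync packs the sender's active connection IDs into a payload.
// Layout (assumed for this sketch): 8 bytes per ID, big-endian.
func encodeResync(ids []int64) []byte {
	buf := make([]byte, 8*len(ids))
	for i, id := range ids {
		binary.BigEndian.PutUint64(buf[8*i:], uint64(id))
	}
	return buf
}

// decodeResync is the inverse of encodeResync.
func decodeResync(payload []byte) ([]int64, error) {
	if len(payload)%8 != 0 {
		return nil, fmt.Errorf("resync payload length %d is not a multiple of 8", len(payload))
	}
	ids := make([]int64, 0, len(payload)/8)
	for i := 0; i < len(payload); i += 8 {
		ids = append(ids, int64(binary.BigEndian.Uint64(payload[i:])))
	}
	return ids, nil
}

// pruneStale closes and removes every local connection whose ID is absent
// from the peer's active list. It returns the removed IDs in ascending order.
func pruneStale(local map[int64]func(), active []int64) []int64 {
	keep := make(map[int64]struct{}, len(active))
	for _, id := range active {
		keep[id] = struct{}{}
	}
	var removed []int64
	for id, closeConn := range local {
		if _, ok := keep[id]; !ok {
			closeConn() // unblocks any reader waiting on this connection
			delete(local, id)
			removed = append(removed, id)
		}
	}
	sort.Slice(removed, func(i, j int) bool { return removed[i] < removed[j] })
	return removed
}

func main() {
	closed := map[int64]bool{}
	local := map[int64]func(){}
	for _, id := range []int64{1, 2, 3, 4} {
		id := id
		local[id] = func() { closed[id] = true }
	}

	// The peer reports that only connections 1 and 3 are still alive.
	active, err := decodeResync(encodeResync([]int64{1, 3}))
	if err != nil {
		panic(err)
	}
	removed := pruneStale(local, active)
	fmt.Println(removed, len(local), len(closed)) // [2 4] 2 2
}
```

Sending the full list of live IDs rather than individual close notifications makes the message idempotent: a lost Resync is simply corrected by the next one.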

Small caveat: we cannot use websocket Control messages for this purpose, since the websocket protocol limits their payload to 125 bytes, which would impose a tight restriction on the number of connections.
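To put a number on that restriction, here is a back-of-the-envelope sketch. It assumes each connection ID is serialized as a fixed-size 8-byte integer with no framing overhead; the actual encoding may differ, and `maxIDs` is a hypothetical helper.

```go
package main

import "fmt"

// maxControlPayload is the maximum payload of a websocket control frame
// (Ping, Pong, Close), in bytes.
const maxControlPayload = 125

// maxIDs returns how many fixed-size IDs fit in a single control frame.
// idSize is the assumed encoded size of one connection ID, in bytes.
func maxIDs(idSize int) int {
	return maxControlPayload / idSize
}

func main() {
	fmt.Println(maxIDs(8)) // 15
}
```

Under that assumption, a control frame could list at most 15 active connections, far fewer than a busy session may hold.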

CheckList

  • Test

moio previously approved these changes Apr 9, 2024
Contributor

@moio moio left a comment

LGTM now, thanks

Contributor

@tomleb tomleb left a comment

This seems to follow what we discussed during our meeting.

Given the potential deadlock mentioned in the comments below, I think this requires more testing. It should be tested in a custom Rancher build and it would be beneficial imo to have integration tests for this, though I understand that right now remotedialer doesn't have this in place.

6 participants