
2018, [warner]: Home


I've spoken to several people about porting Magic-Wormhole from Python to Rust, so here's a repository to get things started. It's likely to be a long road, and I'm personally pretty new to the language, so don't expect things to be working for a while yet.

Protocols

The core magic-wormhole protocol (named "The Wormhole Protocol") interacts with the "mailbox server" (which I used to call the Rendezvous or Relay server), performs the PAKE negotiation, computes a session key, and then enables the exchange of short queued messages. This involves Websockets, JSON, SPAKE2, libsodium SecretBox, and a dozen-ish state machines. The messages each have a distinct "phase", which can be any string. Numeric phases are delivered (in-order) to the application, while named phases are reserved for library use. The CLI client uses just this protocol for the wormhole send --text=MESSAGE feature.
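For orientation, here is a minimal Rust sketch of that numeric-versus-named phase split; the type and method names are illustrative, not an existing API:

```rust
/// A message "phase" as described above: numeric phases carry application
/// data (delivered in order), while named phases like "pake" or "version"
/// are reserved for the library itself. Sketch only; the real port may
/// model this differently.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Phase {
    /// Application message number N, delivered to the app in order.
    Numeric(u64),
    /// Library-internal phase, e.g. "pake" or "version".
    Named(String),
}

impl Phase {
    /// Split a wire-format phase string into the numeric/named cases.
    fn parse(s: &str) -> Phase {
        match s.parse::<u64>() {
            Ok(n) => Phase::Numeric(n),
            Err(_) => Phase::Named(s.to_string()),
        }
    }
}
```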

The CLI client can also send files. This uses a second protocol named "Transit". It uses the Wormhole protocol to exchange messages which 1: indicate that we want to send a file or a directory (as opposed to just a text message, which doesn't need Transit), 2: share the IP address/port "connection hints" to which the other side should try connecting, and 3: acknowledge the successful receipt of a file. The connection hints are used to make a direct TCP connection between the two sides, and the session key is used to confirm a handshake message (so we know we're talking to the right peer), and to encrypt all subsequent messages. The Transit protocol also knows how to talk to a "Transit Helper" relay server, which has an extra line of setup protocol (both sides announce a token to the server when they connect, and two connections with the same token are glued together, so the relay handshake must complete before the transit handshake can be sent).
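As a rough illustration, a direct-TCP connection hint could be modelled like this; the field names follow my reading of the Python "direct-tcp-v1" hints and should be verified against the real wire format (assumes serde with the derive feature):

```rust
use serde::{Deserialize, Serialize};

/// One kind of connection hint exchanged over the Wormhole messages.
/// Treat this as a sketch to be checked against the Python implementation.
#[derive(Debug, Serialize, Deserialize)]
struct DirectTcpHint {
    hostname: String,
    port: u16,
    /// Hints with higher priority are tried first.
    priority: f64,
}
```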

I'm in the process of replacing the "Transit" protocol with a new one, named "Dilation", in which the wide-bore direct connection is more directly integrated into the Wormhole API (w.dilate()). The new API lets applications create multiple subchannels, in either direction, which get multiplexed onto the same encrypted TCP stream, with flow control in both directions. This will make it much easier for GUI apps to leave the wormhole open for a long time, so users can drag and drop files into the window at any time after startup. In the Python library, you get Twisted "Endpoint" objects from which you can create the subchannels (ep.connect() and ep.listen()). The Dilation protocol doesn't use libsodium directly; instead it uses the Noise protocol for the transport layer.
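A Rust analogue of that subchannel API might eventually look something like the following trait; none of these names exist yet, and the async trait here (which needs tokio for the I/O bounds) is purely illustrative:

```rust
use tokio::io::{AsyncRead, AsyncWrite};

/// Hypothetical sketch of a Rust analogue to the Python Dilation
/// endpoints (ep.connect() / ep.listen()). None of these names exist yet.
trait DilatedWormhole {
    /// A flow-controlled, bidirectional subchannel multiplexed over the
    /// single encrypted connection to the peer.
    type Subchannel: AsyncRead + AsyncWrite + Unpin;

    /// Open a new outbound subchannel (accepted by the peer's listener).
    async fn open(&mut self) -> std::io::Result<Self::Subchannel>;

    /// Accept the next subchannel opened by the peer.
    async fn accept(&mut self) -> std::io::Result<Self::Subchannel>;
}
```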

I'm also thinking about replacing the core Wormhole protocol's encryption format with Noise, but that will depend upon a future version of Noise that has the PAKE phase built-in. In this protocol, all phases would be numbered (they'd just be sequentially-numbered Noise frames), and application data would be included in a sub-field of the top-level message.

I'm thinking that the Rust port should include the Wormhole and Dilation protocols, but maybe it can leave out Transit. That will allow it to interoperate with new versions of the Python wormhole CLI tool, but not with old ones (including the current 0.10.5 release), since Dilation isn't done yet.

Crates To Use

I'm pretty sure we should use Tokio as the core. From what I've seen, it's the moral equivalent of Twisted, although the control flow is even more twisted to my inexperienced eyes. My main concerns are:

  • no shared-state concurrency, too bug-prone
  • make it easy for applications to use wormhole without knowing too much about the internals
  • take advantage of existing libraries like websocket

I'm ok if the Wormhole object that applications see uses threads internally, perhaps to run the event loop while the application blocks for a response, but I want to build the wormhole library around Futures and Streams and such rather than trying to wrangle a bunch of threads.
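One way to get that blocking convenience on top of an async core is a thin facade that owns a Tokio runtime; this is only a sketch (assuming tokio with the rt and sync features), and every name in it is hypothetical:

```rust
use tokio::runtime::Runtime;
use tokio::sync::mpsc;

/// Hypothetical async core: inbound application messages arrive on a channel.
struct AsyncWormhole {
    inbound: mpsc::Receiver<Vec<u8>>,
}

/// Blocking facade: owns a Tokio runtime and drives the async core, so the
/// application never has to know about Futures or Executors.
struct BlockingWormhole {
    rt: Runtime,
    inner: AsyncWormhole,
}

impl BlockingWormhole {
    /// Block the caller until the next application-phase message arrives,
    /// or return None once the connection is gone.
    fn receive(&mut self) -> Option<Vec<u8>> {
        self.rt.block_on(self.inner.inbound.recv())
    }
}
```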

(add other useful crates here)

websocket

python-library architecture

The Python implementation of Magic Wormhole is composed of about 13 state machines, and a front-end API object. These state machines happen to be implemented with Automat, but any reasonable FSM tool would be fine (as would a simple pile of enums and transition methods; see the sketch below). The docs/state-machines/machines.dot file shows the various machines and how they're wired up (use the Makefile in that directory to turn it into a PNG). The frontend object comes in two flavors: a DelegatedWormhole (which runs methods on an app-supplied "delegate" object to deliver events), and the DeferredWormhole (which has methods that return Deferreds which fire upon those same events).
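For a sense of what the "pile of enums and transition methods" option looks like in Rust, here is a sketch loosely modelled on the Nameplate machine; the state and event names are illustrative, not an exact copy of the Python states:

```rust
/// States and events roughly following the Nameplate machine, as an
/// example of the enum-plus-transition-method style.
#[derive(Debug, PartialEq, Eq)]
enum NameplateState {
    Unclaimed,
    Claiming,
    Claimed,
    Releasing,
    Released,
}

#[derive(Debug)]
enum NameplateEvent {
    SetNameplate(String),
    RxClaimed,
    Release,
    RxReleased,
}

impl NameplateState {
    /// Consume an event and return the next state. Invalid transitions are
    /// simply ignored here; the real machines would treat them as errors.
    fn on(self, event: NameplateEvent) -> NameplateState {
        use NameplateEvent::*;
        use NameplateState::*;
        match (self, event) {
            (Unclaimed, SetNameplate(_)) => Claiming,
            (Claiming, RxClaimed) => Claimed,
            (Claimed, Release) => Releasing,
            (Releasing, RxReleased) => Released,
            (state, _) => state,
        }
    }
}
```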

Game Plan

I'm thinking that we start with the Websocket connector: just connect to the Mailbox server, send the BIND message, print out all received messages, driven by a CLI tool that takes no arguments and doesn't attempt to do anything useful.
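A throwaway version of that first step might look like the sketch below. It uses the blocking tungstenite crate (recent versions), serde_json, and anyhow for brevity; the real connector would presumably be async on Tokio (e.g. via tokio-tungstenite), and the server URL and the exact bind fields should be double-checked against the Python client before relying on them:

```rust
use tungstenite::{connect, Message};

fn main() -> anyhow::Result<()> {
    // Default public mailbox server URL used by the Python client; verify
    // this and the `bind` fields against src/wormhole/_rendezvous.py.
    let (mut socket, _response) = connect("ws://relay.magic-wormhole.io:4000/v1")?;

    // The first client->server message binds the connection to an appid
    // and a randomly-chosen "side" identifier.
    let bind = serde_json::json!({
        "type": "bind",
        "appid": "lothar.com/wormhole/text-or-file-xfer",
        "side": "0123456789abcdef",
    });
    socket.send(Message::Text(bind.to_string().into()))?;

    // Print everything the server sends back (welcome message, acks, ...).
    loop {
        let msg = socket.read()?;
        println!("{:?}", msg);
    }
}
```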

Then, we wrap that in the reconnector state machine (the thing provided by Twisted's ClientService). This reacts to a lost connection by waiting a random delay and then starting a new connection attempt. The layer above this turns it on and off, and is notified about new connections becoming available. This is the state machine provided in the Rendezvous class (src/wormhole/_rendezvous.py). Then we add the message parsing and serialization code from that file.
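The skeleton of that retry loop is simple; the sketch below (using rand and tokio::time, with a placeholder for the actual connection logic) omits the on/off control and the "new connection available" notification that the real Rendezvous machine needs:

```rust
use std::time::Duration;

use rand::Rng;
use tokio::time::sleep;

/// Skeleton of the reconnector: whenever the connection is lost, wait a
/// random delay and start a new attempt.
async fn reconnect_loop(url: &str) {
    loop {
        match run_connection(url).await {
            Ok(()) => println!("connection closed"),
            Err(e) => println!("connection lost: {}", e),
        }
        // Random delay before the next attempt, ClientService-style.
        let delay = Duration::from_millis(rand::thread_rng().gen_range(500..5_000));
        sleep(delay).await;
    }
}

/// Placeholder for "connect, send BIND, and pump messages until the
/// connection drops" (the previous sketch, made async).
async fn run_connection(_url: &str) -> anyhow::Result<()> {
    Ok(())
}
```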

Once all the functionality in _rendezvous.py is ported, we move to the next connected state machine, perhaps Allocator or Nameplate. These objects don't interact with the network (only Rendezvous does that), so they should be easier to port.

Then we port the other machines, one at a time.

The last machine to port will be Boss, which is responsible for instantiating the other objects and wiring them together.

Then we build the frontend Wormhole object, and figure out what sort of API would be easiest for applications to use. Should we expose Futures? Should the inbound series of app-relevant messages be delivered as a Stream, with the outbound messages as a Sink? Should we provide a blocking API so apps don't have to know about Tokio or Executors or Futures at all?
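To make the trade-off concrete, here is a hypothetical trait sketching the first two options (plain futures, and Stream/Sink accessors, using the futures crate); the blocking variant would wrap something like this behind a runtime, as in the earlier facade sketch. Nothing here exists yet:

```rust
use futures::{Sink, Stream};

/// Hypothetical shapes for the frontend API; illustration only.
trait WormholeApi {
    type Error;

    // Option 1: plain futures, one call per inbound event.
    async fn receive(&mut self) -> Result<Vec<u8>, Self::Error>;

    // Option 2a: the inbound application messages as a Stream...
    fn inbound(&mut self) -> impl Stream<Item = Vec<u8>> + '_;

    // Option 2b: ...and the outbound side as a Sink.
    fn outbound(&mut self) -> impl Sink<Vec<u8>, Error = Self::Error> + '_;

    // Option 3 (not shown): a blocking wrapper around any of the above.
}
```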

Ideally we'll write tests for each machine as we implement it, before moving on to the next. But we might need to experiment with the interactions between these components, so maybe we should build a couple and see if the architecture makes sense before putting too much time into the unit tests for any individual machine.