Build reliable, traceable, distributed systems with ZeroMQ (ZeroRPC) by Jérôme Petazzoni from dotCloud

Presenter: Jérôme Petazzoni (http://www.dotcloud.com/) (@jpetazzo)

PyCon 2012 presentation page: https://us.pycon.org/2012/schedule/presentation/260/

Slides: https://docs.google.com/presentation/d/1FRh6kb79tcT0Deb4bjYVDvdvJAVMXYjXJK3EFT4pj8w/edit

Video: http://pyvideo.org/video/639/build-reliable-traceable-distributed-systems-wi

Video running time: 36:22

ZeroRPC GitHub repo: https://github.com/dotcloud/zerorpc-python

This talk is about ZeroRPC, a library for doing RPC over ZeroMQ that was recently open-sourced by dotCloud.

Introduction

Why did we build an RPC system?

(00:22 / 36:22 into the video)

  • dotCloud is a PaaS.
    • We deploy, monitor, and scale your apps (in the cloud!)
  • Many moving parts
  • ... On a large distributed cluster

What the architecture looks like

(00:45 / 36:22 into the video)

(architecture diagram from the slides)

Easy Requirements

(01:08 / 36:22 into the video)

  • Expose arbitrary code with minimal modification
    • if we can do import foo; foo.bar(42)
    • we want to be able to do foo = RemoteService(...); foo.bar(42)
  • Self-documented system
    • Want to see methods, signatures, and docstrings

More Difficult Requirements

(02:09 / 36:22 into the video)

  • Propagate exceptions
  • Language-agnostic
  • Brokerless, highly available, fast, support fan-in/fan-out
    • Not necessarily all at the same time!
  • We want to trace & profile nested calls

Why not {x}?

(04:06 / 36:22 into the video)

  • Why not HTTP?
  • Why not AMQP?

What we came up with

(04:48 / 36:22 into the video)

ZeroRPC!

  • Based on ZeroMQ and MessagePack
  • Supports everything we needed!
  • Multiple implementations
    • Internal "reference" implementation in Python
    • Public "alternative" implementation with gevent
    • Internal Node.js implementation (so-so)

Using it

Example: unmodified code

(05:45 / 36:22 into the video)

  • Expose the urllib module over RPC:
$ zerorpc-client --server --bind tcp://127.0.0.1:1234 urllib
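
The same can be done from Python. A minimal sketch using the published zerorpc API (zerorpc.Server exposes the methods of the object it is given; zerorpc-client --server does the module-level equivalent for you):

import urllib
import zerorpc

# Expose an object's methods over RPC; here, a thin wrapper around urllib.
class UrlUtils(object):
    def quote(self, s):
        return urllib.quote(s)

server = zerorpc.Server(UrlUtils())
server.bind('tcp://127.0.0.1:1234')
server.run()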

Example: calling code

(06:14 / 36:22 into the video)

  • From the command-line (for testing):
$ zerorpc-client tcp://127.0.0.1:1234 quote "hello pycon"
connecting to "tcp://127.0.0.1:1234"
'hello%20pycon'
  • From Python code:
>>> import zerorpc
>>> remote_urllib = zerorpc.Client()
>>> remote_urllib.connect('tcp://127.0.0.1:1234')
[None]
>>> remote_urllib.quote('hello pycon')
'hello%20pycon'

Example: introspection

(06:38 / 36:22 into the video)

We can list methods:

$ zerorpc-client tcp://127.0.0.1:1234 | grep ^q
quote                       quote('abc def') -> 'abc%20def'
quote_plus                  Quote the query fragment of a URL; replacing ' ' with '+'

We can see signatures and docstrings:

$ zerorpc-client tcp://127.0.0.1:1234 quote_plus -?
connecting to "tcp://127.0.0.1:1234"

quote_plus(s, safe='')

Quote the query fragment of a URL; replacing ' ' with '+'

Example: exceptions

(06:47 / 36:22 into the video)

$ zerorpc-client tcp://127.0.0.1:1234 quote_plus
connecting to "tcp://127.0.0.1:1234"
Traceback (most recent call last):
...
zerorpc.exceptions.RemoteError: Traceback (most recent call last):
  File "/Users/marca/dev/git-repos/zerorpc-python/zerorpc/core.py", line 201, in _async_task
    functor.pattern.process_call(self._context, socket, event, functor)
  File "/Users/marca/dev/git-repos/zerorpc-python/zerorpc/core.py", line 74, in process_call
    result = context.middleware_call_procedure(functor, *event.args)
  File "/Users/marca/dev/git-repos/zerorpc-python/zerorpc/context.py", line 88, in middleware_call_procedure
    return procedure(*args, **kwargs)
  File "/Users/marca/dev/git-repos/zerorpc-python/zerorpc/core.py", line 55, in __call__
    return self._functor(*args, **kargs)
TypeError: quote_plus() takes at least 1 argument (0 given)

Example: load balancing

(07:07 / 36:22 into the video)

Start a load balancing hub:

$ cat foo.yml
in: "tcp://*:1111"
out: "tcp://*:2222"
type: queue
$ zerohub.py foo.yml

Start (at least) one worker:

$ zerorpc-client --server tcp://localhost:2222 urllib

Now connect to the "in" side of the hub:

$ zerorpc-client tcp://localhost:1111

Example: high availability

(07:30 / 36:22 into the video)

Start a local HAProxy in TCP mode, dispatching requests to 2 or more remote services or hubs:

$ cat haproxy.cfg
listen zerorpc 0.0.0.0:1111
    mode tcp
    server backend_a localhost:2222 check
    server backend_b localhost:3333 check
$ haproxy -f haproxy.cfg

Start (at least) one backend:

$ zerorpc-client --server --bind tcp://0:2222 urllib

Now connect to HAProxy:

$ zerorpc-client tcp://localhost:1111

Non-example: PUB/SUB

(08:01 / 36:22 into the video)

Not in public repo -- yet

  • Broadcast a message to a group of nodes
    • But if a node leaves and rejoins, it will lose messages
  • Send a continuous stream of information
    • But if a speaker or listener leaves and rejoins...

You generally don't want to do this!

Better pattern: ZeroRPC streaming with gevent

Example: streaming

(09:10 / 36:22 into the video)

  • Server code returns an iterator
  • Client code gets an iterator
  • Small messages, high latency? No problem!
    • Server code will pre-push elements
    • Client code will notify server if pipeline runs low
  • Huge messages? No problem!
    • Big data sets can be nicely chunked
    • They don't have to fit entirely in memory
    • Don't worry about timeouts anymore
  • Also supports long polling
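
A minimal sketch of the streaming pattern with the published gevent implementation (the @zerorpc.stream decorator is how zerorpc-python marks streaming methods; the service and method names are illustrative):

import zerorpc

class DataSource(object):
    @zerorpc.stream
    def chunks(self, count):
        # Yielding streams items to the client as they are produced;
        # the full data set never has to fit in memory.
        for i in range(count):
            yield 'chunk %d' % i

server = zerorpc.Server(DataSource())
server.bind('tcp://127.0.0.1:4242')
server.run()

On the client side, the call simply returns an iterator:

client = zerorpc.Client()
client.connect('tcp://127.0.0.1:4242')
for chunk in client.chunks(10):
    print(chunk)  # items arrive as the server produces them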

Example: tracing

(10:15 / 36:22 into the video)

Not in public repo yet

Implementation details

(11:16 / 36:22 into the video)

This will be useful if:

  • You think you might want to use ZeroRPC
  • You think you might want to hack ZeroRPC
  • You want to implement something similar
  • You just happen to love distributed systems

ZeroMQ

(11:50 / 36:22 into the video)

MessagePack

(13:28 / 36:22 into the video)

  • MessagePack: http://msgpack.org/
  • In our tests, msgpack is more efficient than JSON, BSON, YAML:
    • 20-50x faster
    • serialized output is 2x smaller or better
$ pip install msgpack-python
>>> import msgpack
>>> data = {'hello': 'pycon'}
>>> packed = msgpack.dumps(data)  # 'dumps' mirrors the json module's API

Wire format

(14:09 / 36:22 into the video)

Request: (headers, method_name, args)

  • headers dict
    • no mandatory header
    • carries the protocol version number
    • used for tracing in our in-house version
  • args
    • list of arguments
    • no named parameters

Response: (headers, OK|ERR|STREAM, value)
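
To make the framing concrete, here is a hypothetical request/response pair serialized with msgpack (the header keys are illustrative; the protocol only requires that headers carry the version number):

import msgpack

# Hypothetical request: call quote('hello pycon')
request = ({'v': 1}, 'quote', ['hello pycon'])
wire = msgpack.dumps(request)

# Hypothetical response carrying a normal return value
response = ({'v': 1}, 'OK', 'hello%20pycon')
headers, kind, value = msgpack.loads(msgpack.dumps(response))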

Timeouts

(15:20 / 36:22 into the video)

  • 0MQ does not detect disconnections (or rather, it works hard to hide them)
  • You can't know when the remote is gone
  • Original implementation: 30s timeout
  • Published implementation: heartbeat
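
In the published implementation, both knobs are exposed on the client. A sketch (timeout and heartbeat are keyword arguments of zerorpc.Client, in seconds; the values shown are illustrative):

>>> import zerorpc
>>> client = zerorpc.Client(timeout=30, heartbeat=5)
>>> client.connect('tcp://127.0.0.1:1234')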

Introspection

(16:13 / 36:22 into the video)

  • Expose a few special calls:
    • _zerorpc_list to list calls
    • _zerorpc_name to know who you're talking to
    • _zerorpc_ping (redundant with the previous one)
    • _zerorpc_help to retrieve the docstring of a call
    • _zerorpc_args to retrieve the argspec of a call
    • _zerorpc_inspect to retrieve everything at once
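
These are ordinary calls, so they can be invoked with the same command-line client as before. For example (assuming the server from the earlier examples; output elided):

$ zerorpc-client tcp://127.0.0.1:1234 _zerorpc_list
$ zerorpc-client tcp://127.0.0.1:1234 _zerorpc_name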

Naming

(17:10 / 36:22 into the video)

  • Published implementation does not include any kind of naming/discovery
  • In-house version uses a flat YAML file mapping service names to 0MQ addresses and socket types, but we're too ashamed to publish it :-)
  • In progress: use DNS records
    • SRV for host+port
    • TXT for 0MQ socket type (not sure about this!)
  • In progress: registration of services

Security: there is none

(18:00 / 36:22 into the video)

  • No security at all in 0MQ
    • assumes that you are on a private, internal network
  • If you need to run "in the wild", use SSL:
    • bind 0MQ socket on localhost
    • run stunnel (with client cert verification)
  • In progress: authentication layer
  • dotCloud API is actually ZeroRPC, exposed through an HTTP/ZeroRPC gateway
  • In progress: standardization of this gateway
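
A hypothetical stunnel configuration for the pattern above (paths, ports, and the file name are illustrative; verify = 2 makes stunnel require and check a client certificate):

$ cat stunnel.conf
cert = /etc/stunnel/server.pem
CAfile = /etc/stunnel/clients.pem
verify = 2

[zerorpc]
accept = 0.0.0.0:5555
connect = 127.0.0.1:1234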

Tracing (not published yet)

(19:26 / 36:22 into the video)

How it works: all calls and responses are logged to a central place, along with a trace_id unique to each sequence of calls.

Tracing: trace_id

(21:14 / 36:22 into the video)

  • Each call has a trace_id
  • The trace_id is propagated to subcalls
  • The trace_id is bound to a local context (think thread local storage)
  • When making a call:
    • If there is a local trace_id, use it
    • If there is none ("root call"), generate one (GUID)
  • trace_id is passed in all calls and responses

Note: this is not (yet) in the GitHub repository.
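
A minimal sketch of the propagation rule above, assuming the local context is thread-local storage (the in-house code is unpublished, so everything here is illustrative):

import threading
import uuid

_local = threading.local()

def current_trace_id():
    # Reuse the trace_id bound to the local context if there is one;
    # otherwise this is a "root call", so generate a GUID.
    trace_id = getattr(_local, 'trace_id', None)
    if trace_id is None:
        trace_id = str(uuid.uuid4())
        _local.trace_id = trace_id
    return trace_id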

Tracing: trace collection

(21:42 / 36:22 into the video)

  • If a message (sent or received) has a trace_id, we send out the following things:
    • trace_id
    • call name (or, for return values, OK|ERR+exception)
    • current process name and hostname
    • timestamp

Internal details: the collection is built on top of the standard logging module.
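
Since collection sits on top of the standard logging module, a handler along these lines would do the job (a sketch; the class and field names are not dotCloud's actual code):

import logging
import socket

class TraceHandler(logging.Handler):
    def __init__(self, send):
        logging.Handler.__init__(self)
        self.send = send  # callable that ships one trace to the collector

    def emit(self, record):
        self.send({
            'trace_id': getattr(record, 'trace_id', None),
            'call': record.getMessage(),  # call name, or OK|ERR+exception
            'process': record.processName,
            'host': socket.gethostname(),
            'timestamp': record.created,
        })

Each traced call would then be logged with something like logger.info(call_name, extra={'trace_id': trace_id}).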

Tracing: trace storages

(22:10 / 36:22 into the video)

  • Traces are sent to a Redis key/value store
    • each trace_id is associated with a list of traces
    • we keep some per-service counters
    • Redis persistence is disabled
    • entries are given a TTL so they expire automatically
    • entries were initially JSON (for easy debugging)
    • ... then "compressed" with msgpack to save space
    • approximately 16 GB of traces per day

Internal details: the logging handler does not talk directly to Redis; it sends traces to a collector (which itself talks to Redis)
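
A sketch of that storage scheme with the redis-py client (the key layout and TTL value are illustrative):

import msgpack
import redis

r = redis.StrictRedis()

def store_trace(trace_id, trace, ttl=86400):
    key = 'trace:%s' % trace_id
    # Each trace_id is associated with a list of traces,
    # "compressed" with msgpack to save space.
    r.rpush(key, msgpack.dumps(trace))
    # The TTL makes entries expire automatically
    # (Redis persistence is disabled anyway).
    r.expire(key, ttl)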

The problem with being synchronous

(23:19 / 36:22 into the video)

  • Original implementation was synchronous
  • Long-running calls blocked the server
  • Workaround: multiple workers and a hub
  • Wastes resources
  • Does not work well for very long calls
    • Deployment and provisioning of new cluster nodes
    • Deployment and scaling of user apps

Note: this is not specific to ZeroRPC (preforking servers, threaded servers, WSGI...)

First shot at asynchronicity

(24:28 / 36:22 into the video)

  • Send asynchronous events & setup callbacks
  • "Please do foo(42) and send the result to this other place once you're done"
  • We tried this. We failed.
    • distributed spaghetti code
    • trees falling in the forest with no one to hear them
  • Might have worked better if we had...
    • better support in the library
    • better naming system
    • something to make sure we don't lose calls (a kind of distributed FSM, maybe?)

Gevent to the rescue!

(26:02 / 36:22 into the video)

  • Gevent -- http://www.gevent.org/
  • Write synchronous code (i.e., don't rewrite your services)
  • Uses coroutines to achieve concurrency
  • No forks, no threads (no problems? :-))
  • Monkey-patch the standard library (to replace blocking calls with async versions)
  • Achieve "unlimited" concurrency server-side

The version published on GitHub uses gevent.
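
A minimal sketch of the pattern (gevent.monkey.patch_all and gevent.spawn are standard gevent APIs; the URL is illustrative):

from gevent import monkey
monkey.patch_all()  # replace blocking stdlib calls with cooperative ones

import gevent
import urllib2  # now cooperative, thanks to the monkey patching

def fetch(url):
    # Reads synchronously from the caller's point of view, but yields
    # to other greenlets while waiting on network I/O.
    return urllib2.urlopen(url).read()

jobs = [gevent.spawn(fetch, 'http://localhost:8000/') for _ in range(100)]
gevent.joinall(jobs)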

Show me the code!

(27:17 / 36:22 into the video)

https://github.com/dotcloud/zerorpc-python.git

$ pip install git+git://github.com/dotcloud/zerorpc-python.git

Has:

  • zerorpc module
  • zerorpc-client helper
  • exception propagation
  • gevent integration

Doesn't have:

  • tracing
  • naming
  • helpers for PUB/SUB and PUSH/PULL
  • authentication

Questions?

(28:25 / 36:22 into the video)

...