Use dill for serialization #121

Closed
wants to merge 4 commits into from

Member

@chriso chriso commented Mar 10, 2024

This PR introduces dill for serialization of coroutine state, replacing pickle from the standard library.

From the dill README:

dill can pickle the following standard types:
- none, type, bool, int, float, complex, bytes, str,
- tuple, list, dict, file, buffer, builtin,
- Python classes, namedtuples, dataclasses, metaclasses,
- instances of classes,
- set, frozenset, array, functions, exceptions

dill can also pickle more 'exotic' standard types:
- functions with yields, nested functions, lambdas,
- cell, method, unboundmethod, module, code, methodwrapper,
- methoddescriptor, getsetdescriptor, memberdescriptor, wrapperdescriptor,
- dictproxy, slice, notimplemented, ellipsis, quit
 
dill cannot yet pickle these standard types:
- frame, generator, traceback

dill also provides the capability to:
- save and load Python interpreter sessions
- save and extract the source code from functions and classes
- interactively diagnose pickling errors

Dispatch already supports serializing coroutines (including generators) and their frames, so dill's lack of support for frame and generator objects is a non-issue.

The fact that dill can serialize cell vars means that this PR fixes #117.
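As a rough illustration of the limitation being discussed, the sketch below shows the standard library rejecting both a nested function and the closure cell it captures. The names `make_greeter` and `pickle_failure` are made up for this example and are not part of Dispatch. The dill half runs only if dill is installed.

```python
import pickle


def make_greeter(name):
    # `name` is captured in a closure cell by the nested function.
    def greet():
        return f"hello, {name}"

    return greet


def pickle_failure(obj):
    """Return the exception raised by pickle.dumps(obj), or None on success."""
    try:
        pickle.dumps(obj)
    except (AttributeError, TypeError, pickle.PicklingError) as exc:
        return exc
    return None


greet = make_greeter("dispatch")

# Check the standard library first, before dill is imported.
nested_failure = pickle_failure(greet)               # nested function
cell_failure = pickle_failure(greet.__closure__[0])  # the cell itself
print("pickle, nested function:", type(nested_failure).__name__)
print("pickle, cell:", type(cell_failure).__name__)

try:
    import dill  # third-party dependency; optional for this sketch
except ImportError:
    dill = None

if dill is not None:
    # dill serializes the function by value, closure cell included.
    clone = dill.loads(dill.dumps(greet))
    print("dill round trip:", clone())
```

The exact exception type raised by pickle varies between Python versions, which is why the helper catches several.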

One thing I like about dill is the built-in tracing. The DISPATCH_TRACE environment variable can be used to enable dill tracing. Below is an example trace when serializing the state of the functions from #117.

Example trace:
┬ T4: <class 'dispatch.scheduler.State'>
└ # T4 [31 B]
┬ D2: <dict object at 0x7710c08f5d80>
├┬ D2: <dict object at 0x7710bc313000>
│├┬ T4: <class 'dispatch.scheduler.Coroutine'>
││└ # T4 [16 B]
│├┬ D2: <dict object at 0x7710bc313a40>

[DISPATCH] Serializing DurableCoroutine(main.<locals>.main):
function = main.<locals>.main (/home/chris/Documents/dispatch-py/fail.py:16)
code hash = sha256:dbaf58eb0631ab76bf33303a556635379de62063f03365323a3a3d7d5d2c1a83
args = ()
kwargs = {}
wrapped coroutine = None
frame state = -1
IP = 44
SP = 4
stack[0] = <cell at 0x7710bed9e9e0: Function object at 0x7710beda76a0>
stack[1] = NULL
stack[2] = <built-in function print>
stack[3] = DurableCoroutineWrapper(Function._call_async)

││├┬ T4: <class 'dispatch.experimental.durable.function.DurableCoroutine'>
│││└ # T4 [62 B]
││├┬ D2: <dict object at 0x7710bc313c00>
│││├┬ D2: <dict object at 0x7710bc313b80>
││││├┬ D2: <dict object at 0x7710bc312c00>
│││││└ # D2 [2 B]
││││└ # D2 [197 B]
│││├┬ D2: <dict object at 0x7710bc313b40>
││││├┬ Ce1: <cell at 0x7710bed9e9e0: Function object at 0x7710beda76a0>
│││││├┬ F2: <function _create_cell at 0x7710bed5bb00>
││││││└ # F2 [30 B]
│││││├┬ T4: <class 'dispatch.function.Function'>
││││││└ # T4 [33 B]
│││││├┬ D2: <dict object at 0x7710bc313d00>
││││││├┬ Me1: <bound method Function._call_async of <dispatch.function.Function object at 0x7710beda76a0>>
│││││││├┬ T1: <class 'method'>
││││││││├┬ F2: <function _load_type at 0x7710bed5afc0>
│││││││││└ # F2 [17 B]
││││││││└ # T1 [34 B]
│││││││├┬ T4: <class 'dispatch.experimental.durable.function.DurableFunction'>
││││││││└ # T4 [22 B]
│││││││├┬ D2: <dict object at 0x7710bc313d80>
││││││││├┬ T4: <class 'dispatch.experimental.durable.registry.RegisteredFunction'>
│││││││││└ # T4 [64 B]
││││││││├┬ D2: <dict object at 0x7710bc313f00>
│││││││││└ # D2 [172 B]
││││││││└ # D2 [302 B]
│││││││└ # Me1 [371 B]
││││││├┬ T4: <class 'dispatch.client.Client'>
│││││││└ # T4 [29 B]
││││││├┬ D2: <dict object at 0x7710bc321040>
│││││││└ # D2 [83 B]
││││││├┬ D2: <dict object at 0x7710bc313e00>
│││││││├┬ D2: <dict object at 0x7710bc321180>
││││││││└ # D2 [121 B]
│││││││└ # D2 [146 B]
││││││└ # D2 [768 B]
│││││└ # Ce1 [842 B]

[DISPATCH] Serializing DurableCoroutineWrapper(Function._call_async):
function = Function._call_async (/home/chris/Documents/dispatch-py/src/dispatch/function.py:102)
code hash = sha256:16c439fd2da61359756fb7d093d125c96dfba71f90cfdc99e25f806dc4b60d6b
args = (<dispatch.function.Function object at 0x7710beda76a0>,)
kwargs = {}
wrapped coroutine = DurableCoroutine(Function._call_async)
frame state = -1
IP = 102
SP = 4
stack[0] = <dispatch.function.Function object at 0x7710beda76a0>
stack[1] = ()
stack[2] = {}
stack[3] = <types._GeneratorWrapper object at 0x7710bc3135d0>

││││├┬ T4: <class 'dispatch.experimental.durable.function.DurableGenerator'>
│││││└ # T4 [23 B]
││││├┬ D2: <dict object at 0x7710bc320f80>
│││││├┬ D2: <dict object at 0x7710bc321280>
││││││├┬ D2: <dict object at 0x7710bc3124c0>
│││││││└ # D2 [2 B]
││││││└ # D2 [30 B]

[DISPATCH] Serializing DurableCoroutine(Function._call_async):
function = Function._call_async (/home/chris/Documents/dispatch-py/src/dispatch/function.py:102)
code hash = sha256:16c439fd2da61359756fb7d093d125c96dfba71f90cfdc99e25f806dc4b60d6b
args = (<dispatch.function.Function object at 0x7710beda76a0>,)
kwargs = {}
wrapped coroutine = None
frame state = -1
IP = 102
SP = 4
stack[0] = <dispatch.function.Function object at 0x7710beda76a0>
stack[1] = ()
stack[2] = {}
stack[3] = <types._GeneratorWrapper object at 0x7710bc3135d0>

│││││├┬ D2: <dict object at 0x7710bc323600>
││││││├┬ D2: <dict object at 0x7710bc323540>
│││││││├┬ D2: <dict object at 0x7710c0738180>
││││││││└ # D2 [2 B]
│││││││└ # D2 [30 B]
││││││├┬ D2: <dict object at 0x7710bc321300>
│││││││├┬ D2: <dict object at 0x7710bc3125c0>
││││││││└ # D2 [2 B]
│││││││├┬ T4: <class 'types._GeneratorWrapper'>
││││││││└ # T4 [30 B]
│││││││├┬ D2: <dict object at 0x7710bc334340>

[DISPATCH] Serializing DurableGenerator(call):
function = call (/home/chris/Documents/dispatch-py/src/dispatch/coroutine.py:9)
code hash = sha256:4a80fb324f937fc1f3d2f33d15cb96ba73a0311cdd8371375856b5bbe256b16f
args = (Call(function='main.<locals>.sub1', input=Arguments(args=(), kwargs={}), endpoint='http://host.docker.internal:8000/', correlation_id=1),)
kwargs = {}
wrapped coroutine = None
frame state = -1
IP = 8
SP = 1
stack[0] = Call(function='main.<locals>.sub1', input=Arguments(args=(), kwargs={}), endpoint='http://host.docker.internal:8000/', correlation_id=1)

││││││││├┬ D2: <dict object at 0x7710bc334fc0>
│││││││││├┬ D2: <dict object at 0x7710bc334740>
││││││││││├┬ T4: <class 'dispatch.proto.Call'>
│││││││││││└ # T4 [26 B]
││││││││││├┬ D2: <dict object at 0x7710bc335200>
│││││││││││├┬ T4: <class 'dispatch.proto.Arguments'>
││││││││││││└ # T4 [16 B]
│││││││││││├┬ D2: <dict object at 0x7710bc3358c0>
││││││││││││├┬ D2: <dict object at 0x7710c08f5d40>
│││││││││││││└ # D2 [2 B]
││││││││││││└ # D2 [11 B]
│││││││││││└ # D2 [82 B]
││││││││││├┬ D2: <dict object at 0x7710bc313600>
│││││││││││└ # D2 [2 B]
││││││││││└ # D2 [280 B]
│││││││││├┬ D2: <dict object at 0x7710bc334cc0>
││││││││││└ # D2 [35 B]
│││││││││└ # D2 [326 B]
││││││││└ # D2 [401 B]
│││││││└ # D2 [477 B]
││││││└ # D2 [518 B]
│││││├┬ D2: <dict object at 0x7710bc313cc0>
││││││└ # D2 [44 B]
│││││└ # D2 [608 B]
││││└ # D2 [1 MiB]
│││└ # D2 [1 MiB]
││├┬ T4: <class 'dispatch.scheduler.CallFuture'>
│││└ # T4 [17 B]
││├┬ D2: <dict object at 0x7710bc313c40>
│││└ # D2 [22 B]
││└ # D2 [1 MiB]
│└ # D2 [1 MiB]
└ # D2 [2 MiB]

Although the trace reports the size of the outermost object as 2 MiB, the serialized state in this case is ~2 KB, only slightly larger than the equivalent state produced by pickle. It's not a fair comparison, though: pickle cannot serialize cell vars, so I had to move the functions to the top level in order to compare state.

The library also provides tooling for inspecting state offline, which may come in handy in future.
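For readers curious how the DISPATCH_TRACE switch mentioned earlier might be connected to dill, here is a minimal sketch. It is not the code from this PR: the helper names and the set of accepted truthy values are assumptions. The only dill call used is `dill.detect.trace`.

```python
import os


def trace_requested(environ=None):
    """Return True when DISPATCH_TRACE holds a truthy value (assumed format)."""
    environ = os.environ if environ is None else environ
    value = environ.get("DISPATCH_TRACE", "")
    return value.strip().lower() in {"1", "true", "yes", "on"}


def configure_tracing(environ=None):
    """Turn on dill's pickling trace if the environment asks for it."""
    if not trace_requested(environ):
        return False
    import dill.detect  # imported lazily; dill is a third-party package

    dill.detect.trace(True)
    return True


if __name__ == "__main__":
    print("tracing enabled:", configure_tracing())
```

Importing dill lazily keeps the check cheap when tracing is off.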

@chriso chriso self-assigned this Mar 10, 2024
Contributor

@achille-roussel achille-roussel left a comment

Looking good 👍

One question I would like to validate is whether dill supports custom serialization via __reduce__, which would be needed to benefit from changes like encode/httpx#3108.
It's not clear from a quick search, and the documentation doesn't seem to mention it either, so it might be useful to add a test to validate this on our side.
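A test along those lines could look like the sketch below. `Session` and `_rebuild_session` are hypothetical stand-ins, not httpx or Dispatch types. The factory sets a flag so the check can tell a `__reduce__`-driven rebuild from a plain copy of the instance `__dict__`. The dill check is skipped when dill is not installed.

```python
import pickle

try:
    import dill  # third-party; the dill half of the check is skipped without it
except ImportError:
    dill = None


def _rebuild_session(base_url):
    """Factory named by Session.__reduce__; marks objects it rebuilt."""
    session = Session(base_url)
    session.rebuilt = True
    return session


class Session:
    """Hypothetical client-like object with custom serialization."""

    def __init__(self, base_url):
        self.base_url = base_url
        self.rebuilt = False

    def __reduce__(self):
        # Persist only the constructor argument, not the instance __dict__.
        return (_rebuild_session, (self.base_url,))


def check_reduce(serializer):
    """Round-trip a Session and report whether __reduce__ was honored."""
    clone = serializer.loads(serializer.dumps(Session("http://localhost:8000/")))
    return clone.base_url == "http://localhost:8000/" and clone.rebuilt


results = {"pickle": check_reduce(pickle)}
if dill is not None:
    results["dill"] = check_reduce(dill)
print(results)
```

If a serializer ignored `__reduce__` and copied attributes directly, `rebuilt` would stay `False` and the check would report it.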

I would say one advantage of using dill instead of pickle is that it's an open-source project separate from Python; should we ever need to make changes to it, we could do so without being tied to the serialization

Contributor

@achille-roussel achille-roussel left a comment

Another question that came to mind: should we replace other uses of pickle with dill (for example)?

Member Author

chriso commented Mar 13, 2024

One question I would like to validate is if dill supports custom serialization via __reduce__, which would be needed to benefit from changes like encode/httpx#3108

Let's hold off on merging this until we have a better idea of the capabilities and trade-offs of dill, or until we have more users running into serialization issues and need a short-term fix. I'd like to explore whether pickle's dispatch tables could be used to solve #94, or whether dill provides an alternative solution there.
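For context, pickle's dispatch tables let a `Pickler` subclass register a reduction function for a type it would otherwise reject. The sketch below uses `threading.Lock` purely as an illustrative unpicklable type; the objects involved in #94 may be different. `LockAwarePickler`, `_new_lock`, `_reduce_lock`, and `dumps` are made-up names.

```python
import copyreg
import io
import pickle
import threading


def _new_lock():
    """Recreate a fresh, unlocked lock when the state is loaded."""
    return threading.Lock()


def _reduce_lock(lock):
    # Lock state is deliberately dropped; only "a lock lived here" is kept.
    return (_new_lock, ())


class LockAwarePickler(pickle.Pickler):
    # Per-pickler table: the global copyreg registry is left untouched.
    dispatch_table = copyreg.dispatch_table.copy()
    dispatch_table[type(threading.Lock())] = _reduce_lock


def dumps(obj):
    """Serialize obj with the lock-aware pickler."""
    buffer = io.BytesIO()
    LockAwarePickler(buffer).dump(obj)
    return buffer.getvalue()


state = {"name": "worker", "lock": threading.Lock()}
restored = pickle.loads(dumps(state))
print(restored["name"], type(restored["lock"]).__name__)
```

Loading needs no custom unpickler, because the payload only refers to `_new_lock` by name.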

Another question that came to mind: should we replace other uses of pickle with dill (for example)?

Let's stick with pickle for now in the dispatch.proto file, since we're only using it to serialize input and output values, which are less likely to include the "exotic" objects that dill is able to serialize.

Member Author

chriso commented Apr 7, 2024

This is out of sync. I'll reboot if/when necessary.

@chriso chriso closed this Apr 7, 2024
Development

Successfully merging this pull request may close these issues.

Cannot create durable nested function
2 participants