Imported object's method cannot be run properly when submitted to dask client #4233

Closed
jiangtianli91 opened this issue Nov 10, 2020 · 5 comments

@jiangtianli91

What happened:

When I import a class and submit one of its methods to the client, it does not run properly. It runs fine if the class is defined in the same script.

I also checked cloudpickle. Cloudpickle can successfully pickle the object.

>>> print(xor_net.print_name())
[<src.network.Group object at 0x7fbd28da0610>, <src.network.Group object at 0x7fbd28da05e0>, <src.network.Group object at 0x7fbd28da0580>]
>>> print(client.submit(xor_net.print_name).result())
[]

Minimal Complete Verifiable Example:

In this repo

Anything else we need to know?:

Environment:

  • Dask version: dask 2.30.0
  • Python version: python 3.8.5
  • Operating System: macOS
  • Install method (conda, pip, source): conda
@jrbourbeau
Member

Thanks for raising an issue @jiangtianli91. Here's a minimal reproducer:

In [1]: !cat test_module.py

class Foo:
    bar = []

    def append(self, value):
        self.bar.append(value)

    def show(self):
        return self.bar

In [2]: from test_module import Foo

In [3]: foo = Foo()

In [4]: foo.append(124)

In [5]: foo.show()
Out[5]: [124]

In [6]: from distributed import Client

In [7]: client = Client()

In [8]: client.submit(foo.show).result()
Out[8]: []

My guess is this has to do with how cloudpickle handles sending class-level attributes across processes. For example, if in the above snippet we use client = Client(processes=False) then everything works as expected. A quick workaround would be to see if you can use instance-level attributes instead of class-level attributes.
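To illustrate the instance-level workaround, here is a minimal sketch. The class names `SharedState` and `OwnState` are made up for this example, and it uses the standard-library `pickle` in a single process rather than a real Dask cluster, so treat it as an approximation of what happens when an object is shipped to a worker. The point is that only the instance's own `__dict__` travels as the object's state, and a list stored on the class never ends up there.

```python
import pickle


class SharedState:
    bar = []  # class-level: one list shared by every instance

    def append(self, value):
        # Mutates the list stored on the class, not on the instance.
        self.bar.append(value)


class OwnState:
    def __init__(self):
        self.bar = []  # instance-level: stored in this object's __dict__

    def append(self, value):
        self.bar.append(value)

    def show(self):
        return self.bar


shared = SharedState()
shared.append(124)
own = OwnState()
own.append(124)

# The instance __dict__ is what gets serialized as the object's state.
print(vars(shared))  # {}
print(vars(own))     # {'bar': [124]}

# Round-trip through pickle to mimic sending the object to another process.
restored = pickle.loads(pickle.dumps(own))
print(restored.show())  # [124]
```

With the instance-level version, the data is part of the pickled object, so a worker process that re-imports the class should still see `[124]`.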

@jiangtianli91
Author

jiangtianli91 commented Nov 12, 2020

@jrbourbeau Thank you for your explanation! I found the same problem in Ray. Is there a way to make class-level attributes accessible in Dask?

@jrbourbeau
Member

As this appears to be an issue with cloudpickle, I went ahead and opened an issue there: cloudpipe/cloudpickle#398. Let's see where the discussion goes.

@jrbourbeau
Member

@jiangtianli91 this is a known limitation in cloudpickle (see cloudpipe/cloudpickle#398 (comment) for a nice explanation of the behavior you're observing). There is a proposed update to ensure that class-level attributes of classes from imported modules are included by cloudpickle (xref cloudpipe/cloudpickle#391); however, this work is still ongoing.
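For anyone wondering why the class-level data goes missing, here is a rough sketch of pickling a class "by reference". It uses the standard-library `pickle` with `collections.OrderedDict` as a stand-in for any importable class; the assumption is that cloudpickle treats classes from importable modules in a similar way, which matches the behavior reported in this thread. The payload only names the module and the class, so the receiving process re-imports the class and gets its attributes as they are defined in the source file.

```python
import pickle
from collections import OrderedDict

# Pickling an importable class stores a reference (module name plus
# qualified name), not the class body or its current attribute values.
payload = pickle.dumps(OrderedDict)
print(payload)

# Unpickling looks the class up by name in the receiving process.
same_class = pickle.loads(payload)
print(same_class is OrderedDict)  # True
```

In a fresh worker process that lookup returns a freshly imported class, so any values appended to a class-level list on the client side are not there.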

In the meantime there is a temporary workaround where you can set the __module__ attribute on your class to "main" to trick cloudpickle into including class-level attributes:

class Foo:
    __module__ = "main"
    bar = []

    def append(self, value):
        self.bar.append(value)

but be aware that this may also introduce side effects.

@jrbourbeau
Member

Closing as this is more of a cloudpickle issue than a distributed issue. @jiangtianli91 hopefully the temporary workaround helps while progress continues on cloudpipe/cloudpickle#391
