ActorFuture vs pure Future to submit #7115

YarShev · 2021-01-26T11:10:10Z

Hi there,

Why is it forbidden to pass an ActorFuture to submit, whereas a plain Future can be passed to submit successfully?

from distributed.client import Client

c = Client()

# A plain Future can be passed as an argument to submit.
future1 = c.submit(lambda: (1, 2))
futures = [c.submit(lambda l: l[i], future1) for i in range(2)]
futures[0].result()  # 1
futures[1].result()  # 2

class Actor:
    def foo(self):
        return (1, 2)

# The result of an actor method call cannot.
actor_future = c.submit(Actor, actor=True)
actor = actor_future.result()
future2 = actor.foo()
futures = [c.submit(lambda l: l[i], future2) for i in range(2)]
# TypeError: cannot pickle '_thread.lock' object

Thanks in advance!

Answered by martindurant

Feb 8, 2021

YarShev (Author) · 2021-02-02T15:46:14Z

cc @jsignell

1 reply

jsignell Feb 8, 2021
Collaborator

I am not sure if this is intentional or just an oversight. The Actor functionality is fairly new and likely has some rough edges. cc @martindurant who I think was looking into actors recently.

martindurant · 2021-02-08T15:15:43Z

In the dask model, futures denote tasks or values that are held in the cluster or due to be run in the cluster, and their results are stateless (i.e., you would get the same result if you ran the task again, on whichever worker).

Actors sit outside of this model - each is an instance on a specific worker, and maintains internal state. Running methods on an actor uses a completely different code path, because the communication goes directly from the client to the worker, and the scheduler is not involved. This is by design, to give minimum latency and stateful operation - since it's an arbitrary method, calling it twice might give different results. Note that future2 is not a normal Future (check type(future2)).
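As a rough illustration of the TypeError in the question, here is a standard-library sketch. It assumes (this is not confirmed anywhere in the thread) that the object returned by an actor method holds thread-synchronisation state, such as a queue with an internal lock, which pickle cannot serialise. `FakeActorFuture` is a hypothetical stand-in, not the real class from distributed.

```python
import pickle
import queue


class FakeActorFuture:
    """Hypothetical stand-in, NOT the real ActorFuture from distributed."""

    def __init__(self):
        # Assumption: the real object keeps thread-synchronisation state.
        # queue.Queue owns a threading lock internally.
        self.q = queue.Queue()


try:
    # Arguments passed to submit have to be serialised; this mimics that step.
    pickle.dumps(FakeActorFuture())
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__, error)
```

A regular Future, by contrast, is essentially a reference to a key tracked by the scheduler, so it can travel as a task argument.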

Your example could be made to work by passing the actor itself and calling the method within the function you submit - yes, you can call actors from within tasks or even from within other actors. Totally agree that all of this API is niche and incomplete. I have two PRs on the matter, and no one to review them!
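A minimal sketch of that suggestion, under some assumptions: `call_foo` is a hypothetical helper name, `operator.getitem` stands in for the lambda from the question, and `Client(processes=False, dashboard_address=None)` is only there to keep the local run in a single process. It has not been checked against every distributed release.

```python
from operator import getitem

from distributed import Client


class Actor:
    def foo(self):
        return (1, 2)


def call_foo(actor):
    # Hypothetical helper: runs as an ordinary task on a worker, calls the
    # actor method there, and resolves the actor-side future to a plain tuple.
    return actor.foo().result()


# In-process local cluster, used here only to keep the sketch self-contained.
c = Client(processes=False, dashboard_address=None)

actor = c.submit(Actor, actor=True).result()

# Pass the actor itself; pure=False because actor calls are stateful and
# should not be deduplicated with earlier identical submissions.
future2 = c.submit(call_foo, actor, pure=False)

# future2 is a regular Future, so it can be an argument to further tasks.
futures = [c.submit(getitem, future2, i) for i in range(2)]
results = c.gather(futures)
print(results)

c.close()
```

Because `call_foo` returns a plain value, `future2` here is an ordinary Future that the scheduler tracks. The trade-off is that the actor call now goes through a scheduled task instead of going straight from the client to the worker.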

1 reply

YarShev Feb 9, 2021
Author

I saw your PRs on the matter. It would be great to improve Actors' functionality. I also created issue 4488 on the same point.

Regarding making my example work: even if I pass the actor itself to a function, I still can't pass the future that the actor returns to another function, because the same error mentioned above arises.

mrocklin (Maintainer) · 2021-02-08T15:23:08Z

I get the sense that you're trying to make Modin work well on Dask using Actors. I recommend zooming out a bit and first engaging in an architectural discussion on which approach is best. It may be that you are in a rabbit hole here.

1 reply

YarShev Feb 9, 2021
Author

So far, we don't have concrete plans to make Modin work well using Dask actors. But there are other activities that require Dask actors to work well. So I think the actors' functionality should be improved and extended.
