Release Feedback from the SYCL Community #18

Open
ax3l opened this issue Dec 15, 2019 · 7 comments

Comments

@ax3l
Member

ax3l commented Dec 15, 2019

Hi everyone,

We got some feedback on our release from core SYCL developers over on Twitter. They are very interested in our use cases and would like feedback on what could be improved in SYCL task graphs to provide similar control.

I gave some examples, but I would love for you to take a look so that I do not forget anything important.

Full thread: https://twitter.com/axccl/status/1206318925270532097

Branch 1: https://twitter.com/illuhad/status/1206336202204438530
Branch 2: https://twitter.com/illuhad/status/1206338789301456896
Branch 2b: https://twitter.com/axccl/status/1206347291738529792
Branch 3: https://twitter.com/codeandrew/status/1206349235928649728

Quick summary on SYCL tasks:

  • can be host and device
  • one can call MPI in host tasks again [1]
  • one can dynamically append further tasks from within a task [2]

Side note: I realized we do not have an MPI example in the repo / readthedocs yet. Can we please add a good one?

There is also a question about other related work: how does this compare to Legion's task concept?

@michaelsippel
Member

michaelsippel commented Dec 16, 2019

OK, so here are my thoughts:

Both SYCL and Legion solve similar problems as RedGrapes, but have a much broader scope and very specific execution models. RedGrapes is just about tasks and nothing more.

  • SYCL has a host/device model; this is out of scope for RedGrapes
  • SYCL is focused on OpenCL SYCL tasks ;) only; we want to use alpaka
  • As far as I understand, SYCL doesn't allow custom scheduling with domain-specific information.

Regarding Legion: RedGrapes is much simpler, mainly because it's node-local. But distributed scheduling could be easily built on top of it. Legion looks very complicated.

> Side note: I realized we do not have an MPI example in the repo / readthedocs yet. Can we please add a good one?

The MPI abstractions are currently in pmacc only, but I will make a minimal example of how MPI can be used in the next few days.
Basically, in a task we create an MPI request and then register an event which delays the removal of the vertex from the graph. The event gets notified from a polling loop.
Such a mechanism is required because waiting inside the task for the request to finish creates deadlocks.
This problem arises because a receive operation will not finish before its corresponding send is created. Because send and receive may initially operate on separate buffers, they are not dependent on each other, but with non-preemptive tasks this can create deadlocks, as described in the attached slides.
This may be a bit technical, but it shows that asynchronous communication is not trivial. Even if we can run MPI calls inside a SYCL host kernel, we would need something like the mechanism described above to handle the asynchronous aspect. And SYCL is specialized for OpenCL kernels, whereas RedGrapes uses the same mechanism for MPI as for CUDA or whatever asynchronous operation is needed.

[Attached slides: async1, async2]
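
The event mechanism described above can be sketched roughly as follows. This is only an illustration under stated assumptions and not the RedGrapes or pmacc API: `MiniScheduler`, `Context`, `delay_removal_until`, `Channel`, `demo_event_based` and `blocking_recv_completes` are all hypothetical names, and a plain flag stands in for a real MPI request. The receive task only posts its request and returns; a polling loop later notifies the event, and only then does the vertex leave the graph and release its dependents.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical single-worker scheduler with non-preemptive tasks.
// Illustrative only; this is not the RedGrapes API.
class MiniScheduler {
public:
    using TaskId = std::size_t;
    using Poll = std::function<bool()>;  // returns true once the request finished

    // Handle passed to a running task so it can register an event.
    class Context {
    public:
        Context(MiniScheduler& s, TaskId id) : sched_(s), id_(id) {}
        // Delay removal of this task's vertex until poll() returns true.
        void delay_removal_until(Poll poll) {
            sched_.pending_.push_back({id_, std::move(poll)});
        }
    private:
        MiniScheduler& sched_;
        TaskId id_;
    };
    using Body = std::function<void(Context&)>;

    TaskId add_task(Body body, std::vector<TaskId> deps = {}) {
        tasks_.push_back({std::move(body), std::move(deps), State::InGraph});
        return tasks_.size() - 1;
    }

    // Runs all tasks; returns the order in which vertices left the graph.
    std::vector<TaskId> run() {
        std::vector<TaskId> removed;
        bool progress = true;
        while (progress && removed.size() < tasks_.size()) {
            progress = false;
            // Run at most one ready task to completion (non-preemptive).
            for (TaskId id = 0; id < tasks_.size(); ++id) {
                if (tasks_[id].state != State::InGraph || !deps_removed(id)) continue;
                const std::size_t before = pending_.size();
                Context ctx(*this, id);
                tasks_[id].body(ctx);
                assert(pending_.size() <= before + 1);  // sketch: one event per task
                if (pending_.size() == before) retire(id, removed);  // no event
                else tasks_[id].state = State::HeldByEvent;
                progress = true;
                break;
            }
            // Polling loop: notify events whose request has completed.
            for (std::size_t i = 0; i < pending_.size();) {
                if (pending_[i].poll()) {
                    retire(pending_[i].task, removed);
                    pending_.erase(pending_.begin() + static_cast<std::ptrdiff_t>(i));
                    progress = true;
                } else {
                    ++i;
                }
            }
        }
        return removed;
    }

private:
    enum class State { InGraph, HeldByEvent, Removed };
    struct Task { Body body; std::vector<TaskId> deps; State state; };
    struct Pending { TaskId task; Poll poll; };

    bool deps_removed(TaskId id) const {
        for (TaskId d : tasks_[id].deps)
            if (tasks_[d].state != State::Removed) return false;
        return true;
    }
    void retire(TaskId id, std::vector<TaskId>& removed) {
        tasks_[id].state = State::Removed;
        removed.push_back(id);
    }
    std::vector<Task> tasks_;
    std::vector<Pending> pending_;
};

struct Channel { bool sent = false; int value = 0; };  // stand-in for MPI transport
struct DemoResult { std::vector<std::size_t> removal_order; int consumed; };

DemoResult demo_event_based() {
    Channel ch;
    int recv_buffer = 0, consumed = 0;
    MiniScheduler sched;
    // The receive is created first; it only posts the request and returns.
    auto recv = sched.add_task([&](MiniScheduler::Context& ctx) {
        ctx.delay_removal_until([&] {
            if (!ch.sent) return false;  // like a non-blocking test of the request
            recv_buffer = ch.value;
            return true;
        });
    });
    // The send uses a separate buffer, so it does not depend on the receive.
    sched.add_task([&](MiniScheduler::Context&) { ch.value = 42; ch.sent = true; });
    // The consumer depends on the receive vertex, so it waits for the event.
    sched.add_task([&](MiniScheduler::Context&) { consumed = recv_buffer; }, {recv});
    auto order = sched.run();
    return {order, consumed};
}

// Waiting inside the task: with one non-preemptive worker the send never starts.
// The wait is bounded here so that the sketch terminates instead of hanging.
bool blocking_recv_completes(int max_spins) {
    Channel ch;
    bool received = false;
    MiniScheduler sched;
    sched.add_task([&](MiniScheduler::Context&) {
        for (int i = 0; i < max_spins && !received; ++i) received = ch.sent;
    });
    sched.add_task([&](MiniScheduler::Context&) { ch.sent = true; });
    sched.run();
    return received;
}
```

In this sketch, `demo_event_based()` lets the send task run while the receive vertex is still held by its event, so the consumer sees the sent value. `blocking_recv_completes()` shows the other case: the receive task occupies the only worker, so the send is never started while it waits. A real implementation would presumably replace the flag check with a non-blocking completion test on the actual request.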

@ax3l
Member Author

ax3l commented Dec 16, 2019

Thanks for your thoughts! I guess I conveyed the essence of that today.
Minor correction: SYCL is a single-source programming model and is not identical to OpenCL. ;)

@ax3l
Member Author

ax3l commented Jun 6, 2020

Found another one for literature research: taskflow https://github.com/taskflow/taskflow (arxiv)

@michaelsippel
Member

Thanks for reporting! This one was already in the list, but I didn't notice they renamed it.

redGrapes Comparison Table (working branch)

@ax3l
Member Author

ax3l commented Jun 7, 2020

Oh right, thanks! Yes, just the latest release carried the rename.

@ax3l
Member Author

ax3l commented Jul 13, 2020

not sure if relevant for the comparison table, but just came across: https://github.com/sci-visus/BabelFlow

@michaelsippel
Member

> not sure if relevant for the comparison table, but just came across: https://github.com/sci-visus/BabelFlow

Hm, they do something with tasks, but I don't get what they are doing. Where is the DSL that the readme claims?
Given the current state of their documentation, I think a comparison with this project is not very useful right now.
