Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate all partitions of a set from partitions of subset #728

Open
krono-i2 opened this issue Jun 7, 2023 · 2 comments
Open

Generate all partitions of a set from partitions of subset #728

krono-i2 opened this issue Jun 7, 2023 · 2 comments

Comments

@krono-i2
Copy link

krono-i2 commented Jun 7, 2023

Hello guys!
I'm trying to parallelize the generation of all of the partitions of a set S over MPI (using mpi4py).
I know the number of possible partiotions of a set containing N elements is the Bell number evaluated in N.
In my case N=19, so Bell(19)=5832742205057 (huge!).
The first way I tried was to parallelize on the number of subset composing each partition, so that I can start 19 threads each running
set_partitions(S,k=threadID)
with threadID spanning from 1 to 19.
Such a way each thread generates S(19,threadID) partitions, where S(N,k) is the Stirling number.
The problem is that in this way resources are not well allocated, because the number of partitions are not equally distributed along threads, as you can see below.

[1,
 262143,
 193448101,
 11259666950,
 147589284710,
 693081601779,
 1492924634839,
 1709751003480,
 1144614626805,
 477297033785,
 129413217791,
 23466951300,
 2892439160,
 243577530,
 13916778,
 527136,
 12597,
 171,
 1]

As you can see the maximum Stirling number in this list (1709751003480) is about 1/3 of Bell(19), so the improvement would not be useful in practice.
There is a way to better parallize set_partitions in order to obtain better resource balancing?
I've been thinking about to start from the partitions of a subset of the entire set in order to evaluate a subset of the partitions parallely on each thread.
Have you any idea?
Thanks for help.

Ivan

[EDIT]
It seems the problem could be approached using a rooted tree discovering algoritm. How set_partitions could be modified in order to use that approach?

@pochmann
Copy link
Contributor

pochmann commented Jun 7, 2023

Isn't that CPU-bound? Why are you using threads?

@krono-i2
Copy link
Author

krono-i2 commented Jun 7, 2023

I'm using threads beacause I can allocate them on different machines using the MPI protocol.
But it's a conceptual probelm; it does not matter if they are threads, CPUs or nodes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants