Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add _multiprocessing stub #2094

Closed
wants to merge 1 commit into from
Closed

Conversation

hoodmane
Copy link
Member

Different systems feature detect multiprocessing in different ways. There are three different behaviors:

  1. Packages that just import at top level and make no attempt to feature detect
  2. Packages that feature detect by looking for an import error
  3. Packages that feature detect by looking for an import error or a runtime error

The type 3 systems that detect runtime errors are the best behaved (e.g., joblib). Most packages seem to be type 1 (fearless top level imports). We can't simultaneously support both 1 and 2. The only type 2 system is the core Python multiprocessing tests. My thought was it is better to manually xfail the core multiprocessing tests than to patch all type 1 packages.

It might be a good idea to try to convince some of the type 1 packages to do something a bit smarter, or to get rid of the top level import.

@rth
Copy link
Member

rth commented Jan 10, 2022

Is the issue with importing multiprocessing, it's submodules or _multiprocessing? Because importing multiprocessing alone should work fine I think?

@hoodmane
Copy link
Member Author

hoodmane commented Jan 10, 2022

Good point. This is for multiprocessing.connection and multiprocessing.synchronize.
multiprocessing.connection has a top level import _multiprocessing.
multiprocessing.synchronize has a top level from _multiprocessing import SemLock, sem_unlink.

Joblib has from _multiprocessing import SemLock but it is inside of a except block that catches ImportError. Joblib then tests whether it can create a SemLock. If the SemLock constructor raises FileExistsError, AttributeError, or OSError it catches it and switches to serial mode.

@rth
Copy link
Member

rth commented Jan 10, 2022

If the specific issue is for joblib, I would rather be +1 to for now patch joblib, and then try to contribute Pyodide compatible code upstream. I know joblib devs and it's likely they would accept it. There are fairly few parallelism specific libraries such as joblib where this would happen.

I'm a bit wary of patching threading or multiprocessing stdlib code with custom implementation since as #796 showed it can lead to even harder to debug issues down the road. Also I'm not sure that having 4 extra failing stdlib tests (even if technically they were in part skipped before) is worth this potential fix for other packages, until we get at least some bug report that this is indeed an issue people encounter.

@hoodmane
Copy link
Member Author

Also I'm not sure that having 4 extra failing stdlib tests (even if technically they were in part skipped before) is worth this potential fix for other packages

They were entirely skipped before, they have guards at the top of the file that skip the test if _multiprocessing.SemLock isn't available.

@hoodmane
Copy link
Member Author

But I think you are correct that this is the wrong approach. The problem is in joblib, we should patch joblib and submit a PR.

@hoodmane
Copy link
Member Author

Ah well the problem is really with externals.loky. Joblib has in __init__.py:

from .externals.loky import wrap_non_picklable_objects

and loky fails because it tries to import multiprocessing.connection.

@rth
Copy link
Member

rth commented Jan 10, 2022

That's OK, we can patch externals.loky. Loky is done by some of the same people, they just decided to vendor it, so a fix would take more time to propagate..

@hoodmane
Copy link
Member Author

I am going to open a PR on joblib adding a test. For 0.19, maybe we should just revert to the old version of joblib.

@rth
Copy link
Member

rth commented Jan 10, 2022

Yeah, that would be the easiest.

@hoodmane
Copy link
Member Author

Probably we should try to do these package updates some time other than right before the release. If we had a bot or something to periodically whine about out of date packages...

@hoodmane
Copy link
Member Author

Opened a PR on joblib: joblib/joblib#1246

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants