Parallel Processing does not work on Windows #8296

Closed
Blendify opened this issue Oct 5, 2020 · 7 comments

@Blendify
Contributor

Blendify commented Oct 5, 2020

Describe the bug
Using the -j auto option does not enable parallel builds on Windows, although it works on Unix.

To Reproduce
On a test project run:

sphinx-build -j auto -b html input output

Run that command under Windows and Unix.

Expected behavior
Parallel processing should work across all operating systems.

Note that you can tell whether it is working from the "waiting for workers..." log message (

logger.info(bold(__('waiting for workers...')))
), which does not appear on Windows, in addition to the slower build times there.
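
The check described above can be automated by running the build and searching its output for that message. This is only a minimal sketch: it assumes sphinx-build is on PATH and that the status message ends up on stdout or stderr, and the helper names (mentions_workers, build_ran_in_parallel) are hypothetical, not part of Sphinx.

```python
import subprocess

# Text of the status message quoted in this report.
WORKER_MESSAGE = "waiting for workers"


def mentions_workers(build_output: str) -> bool:
    """Return True if the build log contains the parallel-build status message."""
    return WORKER_MESSAGE in build_output


def build_ran_in_parallel(source_dir: str, output_dir: str) -> bool:
    """Run the command from this report and look for the message in its output."""
    result = subprocess.run(
        ["sphinx-build", "-j", "auto", "-b", "html", source_dir, output_dir],
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,
        check=False,  # a failed build should not raise here; we only inspect the log
    )
    return mentions_workers(result.stdout + result.stderr)
```

Calling build_ran_in_parallel("input", "output") on both platforms would give a quick yes/no comparison without reading the logs by hand.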

Your project
Any project works

Environment info

  • OS: Windows 10
  • Python version: 3.8.0
  • Sphinx version: 3.2.1
  • Sphinx extensions: none
  • Extra tools: none
@Blendify
Contributor Author

Blendify commented Oct 6, 2020 via email

@Blendify
Contributor Author

Blendify commented Oct 6, 2020

Looked into it a bit more, and it seems multiprocessing is a hairy topic across platforms.
Sphinx should instead look into the Python threading module: https://docs.python.org/3/library/threading.html
This makes it a lot easier to run work across threads and share memory between them.

@xmo-odoo
Contributor

Looked into it a bit more, and it seems multiprocessing is a hairy topic across platforms.
Sphinx should instead look into the Python threading module: docs.python.org/3/library/threading.html
This makes it a lot easier to run work across threads and share memory between them.

Multithreading causes much trickier synchronisation issues, and while Sphinx does some I/O during reading and writing, I expect most of its work is actual processing (parsing, transformations, etc.), as it pegs cores during compilation.

The GIL means it would get essentially no speedup from multithreading, as only one thread can execute Python code at a time.
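
To make the GIL point concrete, here is a small sketch that runs the same CPU-bound jobs serially and on a thread pool. The busy_sum function is a hypothetical stand-in for parsing or transforming a document, not Sphinx code, and the comparison assumes a standard CPython build with the GIL enabled. Timings vary by machine, so none are quoted here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def busy_sum(n: int) -> int:
    """Pure-Python CPU-bound loop (hypothetical stand-in for per-document work)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_serial(jobs):
    """Run every job one after another in the calling thread."""
    return [busy_sum(n) for n in jobs]


def run_threaded(jobs, workers=4):
    """Run the jobs on a thread pool.

    The threads share memory, but with the GIL only one of them executes
    Python bytecode at any moment.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(busy_sum, jobs))


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    jobs = [300_000] * 4
    serial_result, serial_time = timed(run_serial, jobs)
    threaded_result, threaded_time = timed(run_threaded, jobs)
    assert serial_result == threaded_result  # same answers either way
    print(f"serial:   {serial_time:.3f}s")
    print(f"threaded: {threaded_time:.3f}s")
```

Both paths produce the same results. The interesting part is how close the two timings are for work that never releases the GIL.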

@Blendify
Contributor Author

Yes, looking more into this, I agree: there would be too much overhead in trying to split the work across multiple threads. Instead, we should stick with the current multiprocessing approach, but instead of using os.fork() we should use https://docs.python.org/3/library/multiprocessing.html

I haven't had a deep look but I don't think it would be that much work to use that library instead.

@Blendify
Contributor Author

I will do some tests to see if I can get this working when I have the free time.

@xmo-odoo
Contributor

FWIW you may want to use #6881 for further discussion.

It doesn't provide a ready-made solution (or even the embryo of one), but it was specifically created by Komiya-san to track the eventual reimplementation of parallel building.

Also, Sphinx already uses multiprocessing; it doesn't perform raw forks. However, the way it currently uses multiprocessing implies forking, so you have your work cut out for you: as Komiya-san notes in #9092, the entire feature needs to be removed and reimplemented, as the spawning model is a completely different beast from the forking one.
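
For readers unfamiliar with the difference, here is a rough sketch of what the spawning model asks of worker code, using only the standard multiprocessing API. The names render_document and build_with_spawn are hypothetical illustrations and do not reflect how Sphinx is or would be structured.

```python
import multiprocessing


def fork_supported() -> bool:
    """True where the 'fork' start method exists (POSIX); Windows offers only 'spawn'."""
    return "fork" in multiprocessing.get_all_start_methods()


def render_document(docname: str) -> str:
    """Hypothetical stand-in for per-document work.

    It lives at module level so that a spawned child can import it by name.
    """
    return f"<h1>{docname}</h1>"


def build_with_spawn(docnames, workers=2):
    """Map render_document over docnames using spawned worker processes.

    A spawned child starts from a fresh interpreter and inherits no in-memory
    state from the parent, so the worker function and its arguments must be
    picklable. A forked child, by contrast, starts with a copy of the
    parent's memory.
    """
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(processes=workers) as pool:
        return pool.map(render_document, docnames)
```

With spawn, build_with_spawn has to be called from under an if __name__ == "__main__": guard, because each child re-imports the main module. Code that relies on workers seeing state the parent built up before forking cannot simply switch start methods, which is one way to read the "completely different beast" remark.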

@Blendify
Contributor Author

Using #6881 instead

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 10, 2022