contrib.concurrent: Use python default max_workers #1543

Open
wants to merge 2 commits into master
Conversation

@aripollak commented Dec 28, 2023

I noticed that contrib.concurrent defines its own default for max_workers. That default matches CPython's default for ThreadPoolExecutor in 3.8+, but differs from the default for ProcessPoolExecutor:

If max_workers is None or not given, it will default to the number of processors on the machine... If max_workers is None, then the default chosen will be at most 61, even if more processors are available.

This was surprising to me. If the running code is CPU-bound, which is usually why you'd use process_map instead of thread_map, spawning more processes than there are available CPUs rarely yields a noticeable speedup. In my case, the cpu_count + 4 default ended up using more RAM than necessary for the maximum speed improvement: I'm using a package that loads an 800 MB index into memory per process, so the four extra processes cost an unnecessary 3.2 GB by default.
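For reference, the two stdlib defaults contrasted above can be sketched as follows. Reading the private `_max_workers` attribute is a CPython implementation detail, used here only to illustrate what each executor picks when max_workers is left as None:

```python
import os
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

# CPython 3.8+ defaults when max_workers is None:
#   ThreadPoolExecutor  -> min(32, cpu_count + 4)  (extra threads help I/O-bound work)
#   ProcessPoolExecutor -> cpu_count               (one process per core; capped at 61 on Windows)
with ThreadPoolExecutor() as tpe:
    print("thread default:", tpe._max_workers)

with ProcessPoolExecutor() as ppe:
    print("process default:", ppe._max_workers)
```

With this change, calling process_map without an explicit max_workers would inherit ProcessPoolExecutor's cpu_count default instead of cpu_count + 4; callers who want the old behavior can still pass max_workers explicitly.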

Obsoletes #1530.

@aripollak aripollak changed the title Use python default max_workers for concurrent.futures executors contrib.concurrent: Use python default max_workers Dec 28, 2023