contrib.concurrent: Use python default max_workers #1543
I noticed that contrib.concurrent defines its own default for `max_workers`, which matches CPython's default for `ThreadPoolExecutor` in 3.8+ but differs from the default for `ProcessPoolExecutor`:
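A minimal sketch of the two stdlib defaults involved (CPython 3.8+); the former contrib.concurrent default is assumed to match the `ThreadPoolExecutor` formula, as described above:

```python
import os

cpu_count = os.cpu_count() or 1

# ThreadPoolExecutor default in CPython 3.8+ (and, per the description,
# the former contrib.concurrent default for both thread_map and process_map):
thread_default = min(32, cpu_count + 4)

# ProcessPoolExecutor default: one worker per available CPU.
process_default = cpu_count

print(thread_default, process_default)  # e.g. 12 vs 8 on an 8-core machine
```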
This was surprising to me. If the running code is CPU-bound, which is usually why you'd use `process_map` instead of `thread_map`, there usually isn't a noticeable speed improvement from spawning more processes than the number of available CPUs. In my case, `cpu_count + 4` ended up using more RAM than necessary for maximum speed improvement: I'm using a package that loads an index into memory at 800 MB per process, so that's an unnecessary extra 3.2 GB used by default.

Obsoletes #1530.
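A minimal sketch of the direction of the change (not the actual tqdm diff): leaving `max_workers` as `None` lets each executor class fall back to its own stdlib default instead of a single hand-rolled one.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

# Hypothetical helper for illustration only. With max_workers=None,
# ThreadPoolExecutor picks min(32, cpu_count + 4) and
# ProcessPoolExecutor picks cpu_count, i.e. no extra processes
# (and no extra per-process RAM) beyond the available CPUs.
def run(fn, iterable, pool_cls=ProcessPoolExecutor, max_workers=None):
    with pool_cls(max_workers=max_workers) as ex:
        return list(ex.map(fn, iterable))
```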