Replies: 10 comments
-
Hey @hadim 👋, we also offer priority support for our sponsors.
-
If you are up to pushing it and it fits our Celery design, we are OK with it.
-
I am happy to help with it, but I likely lack expertise in the area, so any guidance would be helpful. Also, it's worth bringing @linar-jether into the conversation since he proposed a workaround recently at #4551 (comment) consisting of patching:

```python
def _patch_joblib_loky_backend():
    import joblib._parallel_backends
    from joblib._parallel_backends import mp, cpu_count

    def effective_n_jobs(self, n_jobs):
        """Determine the number of jobs which are going to run in parallel."""
        if n_jobs == 0:
            raise ValueError('n_jobs == 0 in Parallel has no meaning')
        elif mp is None or n_jobs is None:
            # multiprocessing is not available or disabled, fall back
            # to sequential mode
            return 1
        elif n_jobs < 0:
            n_jobs = max(cpu_count() + 1 + n_jobs, 1)
        return n_jobs

    # Monkey-patch to allow a daemonic thread to spawn processes
    joblib._parallel_backends.LokyBackend.effective_n_jobs = effective_n_jobs

_patch_joblib_loky_backend()
```

But it raises the following error:

Note also that without this patch, the Celery worker works just fine, it's just that
-
If you believe that issue is more Edit: ticket opened on the
-
@hadim I believe the other issue you've encountered is this one: #1709, i.e. `current_process()._config['daemon'] = False`
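For anyone landing here, the effect of that daemon flag can be reproduced with the stdlib alone. This is a standalone illustration, not Celery code: `_try_spawning` and `demo` are made-up names, and the `fork` start method is assumed (POSIX) to keep it deterministic.

```python
import multiprocessing

# "fork" keeps this illustration deterministic on POSIX.
ctx = multiprocessing.get_context("fork")

def _noop():
    pass

def _try_spawning(q):
    # Runs inside a daemonic process, mimicking a Celery prefork
    # worker child. Spawning a subprocess from it normally fails.
    try:
        child = ctx.Process(target=_noop)
        child.start()
        child.join()
        q.put("spawned without patch")
    except AssertionError:
        # "daemonic processes are not allowed to have children"
        q.put("blocked")
    # The workaround from #1709: flip the private daemon flag.
    multiprocessing.current_process()._config['daemon'] = False
    child = ctx.Process(target=_noop)
    child.start()
    child.join()
    q.put("spawned after patch")

def demo():
    q = ctx.Queue()
    worker = ctx.Process(target=_try_spawning, args=(q,), daemon=True)
    worker.start()
    worker.join()
    return [q.get(timeout=5), q.get(timeout=5)]

if __name__ == "__main__":
    print(demo())  # ['blocked', 'spawned after patch']
```

Note that `_config` is a private attribute of `multiprocessing`'s process object, so this workaround relies on an implementation detail of CPython.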
-
An actual solution to this problem would be to provide a Loky process pool, since I'm assuming this will always work.
-
@thedrow what would be required to implement a new worker pool class?
-
Well, we'll need to integrate it with Celery's current event loop.
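For a rough sense of the shape such a pool class could take, here is a standalone sketch. The class name and the `on_apply`/`on_stop` method names are only loosely modeled on Celery's pool interface and are assumptions, not real Celery integration; loky's reusable executor could stand in for `ProcessPoolExecutor`, since both implement `concurrent.futures.Executor`.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

class LokyLikePool:
    """Toy stand-in for a process-backed worker pool."""

    def __init__(self, limit=2):
        # "fork" keeps the example deterministic on POSIX; a real
        # pool would pick the start method more carefully.
        self._executor = ProcessPoolExecutor(
            max_workers=limit,
            mp_context=multiprocessing.get_context("fork"),
        )

    def on_apply(self, target, args=(), kwargs=None, callback=None):
        # Hand the task to a real OS process and hook a completion
        # callback, which is roughly where the worker's event loop
        # would be notified in a genuine pool implementation.
        future = self._executor.submit(target, *args, **(kwargs or {}))
        if callback is not None:
            future.add_done_callback(lambda f: callback(f.result()))
        return future

    def on_stop(self):
        self._executor.shutdown(wait=True)

def square(x):
    return x * x

if __name__ == "__main__":
    pool = LokyLikePool()
    print(pool.on_apply(square, args=(7,)).result())  # 49
    pool.on_stop()
```

The hard part this sketch skips is exactly the event-loop integration mentioned above: a real Celery pool must cooperate with the worker's hub rather than block on futures.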
-
I gave it a quick try and so far it seems to be working nicely. Thanks.
-
To be clear, in my context,
-
Checklist

- I have checked the issues list for similar or identical feature requests.
- I have checked the pull requests list for existing proposed implementations of this feature.
- I have checked the commit log to find out if the same feature was already implemented in the master branch.
- I have included all related issues and possible duplicate issues in this issue (If there are none, check this box anyway).
Related Issues and Possible Duplicates
Related Issues
Possible Duplicates
Brief Summary

Currently, the only way to use `joblib` and `loky` (and to some extent `multiprocessing` too) is to use `-P threads` instead of `-P processes`.

Since `-P threads` uses `ThreadPoolExecutor` from the stdlib under the hood, tasks in the same worker are not really running in parallel, only concurrently. This is because of the Python GIL.

The problem becomes even more important if the workload is split between the main code and the subprocesses (executed by `joblib`). Only the subprocesses are executed in parallel, not the main code. This adds a clear bottleneck that is not ideal.
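The GIL bottleneck described above is easy to reproduce outside Celery entirely. A toy illustration (`busy` and `timed` are made-up helper names): the same CPU-bound function is mapped over a thread pool and a process pool.

```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def busy(n):
    # Pure-Python CPU-bound loop: holds the GIL for its whole run.
    total = 0
    for i in range(n):
        total += i * i
    return total

def timed(executor_cls, n, workers=4):
    with executor_cls(max_workers=workers) as ex:
        start = time.perf_counter()
        results = list(ex.map(busy, [n] * workers))
    return time.perf_counter() - start, results

if __name__ == "__main__":
    t_threads, r1 = timed(ThreadPoolExecutor, 2_000_000)
    t_procs, r2 = timed(ProcessPoolExecutor, 2_000_000)
    assert r1 == r2
    # On a multi-core machine, the thread version takes roughly
    # `workers` times longer: the GIL serializes the loops, while
    # processes really run in parallel.
    print(f"threads: {t_threads:.2f}s  processes: {t_procs:.2f}s")
```

This is exactly the situation of `-P threads`: the tasks interleave, but the total CPU throughput is that of a single core.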
Architectural Considerations
I am not comfortable enough with Celery to propose an implementation. Feel free to throw your ideas below in this ticket.
Potential workaround

A potential workaround is to run plenty of Celery worker replicas that each execute only one task at a time (a concurrency of 1). The task then has all the CPU time available to it and is free to use `joblib` at its convenience.

But running every Celery worker with a concurrency of 1 also adds some overhead in the underlying infrastructure, which is not ideal.
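As a concrete sketch of this workaround (command shape only: `proj`, the replica count, and the worker names are placeholders):

```shell
# Start four identical workers, each limited to one task at a time,
# so every running task gets a whole worker and joblib is free to
# parallelize inside it.
for i in 1 2 3 4; do
  celery -A proj worker --concurrency=1 -n "worker$i@%h" &
done
```

In a container orchestrator, the same idea maps to running many single-concurrency worker replicas.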