FIX parse pre-dispatch with AST instead of calling eval #1327
Conversation
Codecov Report
```diff
@@            Coverage Diff             @@
##           master    #1327      +/-  ##
==========================================
- Coverage   93.91%   93.83%    -0.08%
==========================================
  Files          50       52        +2
  Lines        7275     7301       +26
==========================================
+ Hits         6832     6851       +19
- Misses        443      450        +7
```
Thanks for the follow-up PR @adrinjalali! Here is some feedback:
Could you please add a new test for a pre_dispatch expression which includes the n_jobs placeholder in it?

Other than that and the linting issue, LGTM!
Doesn't this already cover it?

```python
@parametrize('n_tasks, n_jobs, pre_dispatch, batch_size',
             [(2, 2, 'all', 'auto'),
              (2, 2, 'n_jobs', 'auto'),
              (10, 2, 'n_jobs', 'auto'),
              (517, 2, 'n_jobs', 'auto'),
              (10, 2, 'n_jobs', 'auto'),
              (10, 4, 'n_jobs', 'auto'),
              (200, 12, 'n_jobs', 'auto'),
              (25, 12, '2 * n_jobs', 1),
              (250, 12, 'all', 1),
              (250, 12, '2 * n_jobs', 7),
              (200, 12, '2 * n_jobs', 'auto')])
def test_dispatch_race_condition(n_tasks, n_jobs, pre_dispatch, batch_size):
    # Check that using (async-)dispatch does not yield a race condition on the
    # iterable generator that is not thread-safe natively.
    # This is a non-regression test for the "Pool seems closed" class of error
    params = {'n_jobs': n_jobs, 'pre_dispatch': pre_dispatch,
              'batch_size': batch_size}
    expected = [square(i) for i in range(n_tasks)]
    results = Parallel(**params)(delayed(square)(i) for i in range(n_tasks))
    assert results == expected
```
Indeed, my
Thanks for the fix @adrinjalali!
I'd like to vote for reverting this, as it does not address a real issue, as noted in #1128 (comment)
Release 1.2.0

- Fix a security issue where `eval(pre_dispatch)` could potentially run arbitrary code. Now only basic numerics are supported. joblib/joblib#1327
- Make sure that joblib works even when multiprocessing is not available, for instance with Pyodide. joblib/joblib#1256
- Avoid unnecessary warnings when workers and main process delete the temporary memmap folder contents concurrently. joblib/joblib#1263
- Fix memory alignment bug for pickles containing numpy arrays. This is especially important when loading the pickle with `mmap_mode != None`, as the resulting `numpy.memmap` object would not be able to correct the misalignment without performing a memory copy. This bug would cause invalid computation and segmentation faults with native code that directly accesses the underlying data buffer of a numpy array, for instance C/C++/Cython code compiled with older GCC versions or some old OpenBLAS written in platform-specific assembly. joblib/joblib#1254
- Vendor cloudpickle 2.2.0, which adds support for PyPy 3.8+.
- Vendor loky 3.3.0, which fixes several bugs, including: robustly and forcibly terminating worker processes in case of a crash (joblib/joblib#1269); avoiding leaking worker processes in case of nested loky parallel calls; reliably spawning the correct number of reusable workers.
I realized #1321 doesn't actually fix the issue, since

```python
eval("exec('import os')", {}, {})
```

would still work. This PR uses AST and limits the operations to the very basic ones.
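The bypass can be checked directly in plain Python (no joblib needed): passing empty globals and locals dicts does not sandbox `eval`, because Python re-inserts `__builtins__` into the globals mapping, so `exec` remains reachable.

```python
# Even with empty globals/locals dicts, eval() re-inserts __builtins__
# into the globals mapping, so exec() is still reachable and the
# statement runs instead of raising.
eval("exec('import os')", {}, {})
print("exec ran inside eval without error")
```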
This should be an actual fix for #1128
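A rough sketch of the AST-based idea described above. This is a hypothetical illustration, not joblib's actual implementation: the helper name `eval_expr` and the exact set of allowed operators are assumptions. The point is that walking the parsed tree and whitelisting only numbers, the `n_jobs` name, and basic arithmetic makes strings like `"exec('import os')"` fail instead of executing.

```python
import ast
import operator

# Whitelist of allowed binary operators (an assumed minimal set).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
}

def eval_expr(expr, n_jobs):
    """Safely evaluate a restricted pre_dispatch-style expression.

    Only numeric constants, the name ``n_jobs``, and the operators in
    _OPS are allowed; anything else (calls, attributes, etc.) raises.
    Note: the special string 'all' would be handled separately by the
    caller in a real setting.
    """
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name) and node.id == "n_jobs":
            return n_jobs
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError(f"unsupported expression: {expr!r}")

    return _eval(ast.parse(expr, mode="eval"))

print(eval_expr("2 * n_jobs", 4))  # → 8
```

With this approach, `eval_expr("exec('import os')", 2)` raises `ValueError` because a function call node is not in the whitelist.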
cc @ogrisel