quickfix: mmap_mode=None should set max_nbytes to None #1325

Draft · wants to merge 4 commits into base: main
4 changes: 4 additions & 0 deletions CHANGES.rst
@@ -4,6 +4,10 @@ Latest changes
 Development version
 -------------------
 
+- Fix argument 'max_nbytes' being used when explicitly set to something
+  other than None even though 'mmap_mode' was set to None.
+  https://github.com/joblib/joblib/pull/1325
+
 - Make sure that joblib works even when multiprocessing is not available,
   for instance with Pyodide
   https://github.com/joblib/joblib/pull/1256
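The behavior described by this changelog entry can be sketched as a small normalization helper. This is a reviewer's sketch, not joblib code: the name `normalize_max_nbytes` is hypothetical, and the string parsing only approximates joblib's `memstr_to_bytes` (assuming 1024-based units and single-letter suffixes).

```python
def normalize_max_nbytes(mmap_mode, max_nbytes):
    """Hypothetical sketch of the fix: mmap_mode=None disables
    memmapping entirely, so any explicit max_nbytes is discarded."""
    if mmap_mode is None:
        return None
    if isinstance(max_nbytes, str):
        # Rough approximation of joblib's memstr_to_bytes ('1M' -> 1 MiB);
        # the real helper supports more formats than this.
        units = {'K': 2 ** 10, 'M': 2 ** 20, 'G': 2 ** 30}
        return int(float(max_nbytes[:-1]) * units[max_nbytes[-1].upper()])
    return max_nbytes
```

With this normalization, `mmap_mode=None` wins over any explicitly passed threshold, which is exactly the inconsistency the PR title describes.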
10 changes: 7 additions & 3 deletions joblib/parallel.py
@@ -495,9 +495,9 @@ class Parallel(Logger):
         parallel_backend.
     verbose: int, optional
         The verbosity level: if non zero, progress messages are
-        printed. Above 50, the output is sent to stdout.
+        printed. If above 50, the output is sent to stdout.
         The frequency of the messages increases with the verbosity level.
-        If it more than 10, all iterations are reported.
+        If above 10, all iterations are reported.
     timeout: float, optional
         Timeout limit for each task to complete. If any task takes longer
         a TimeOutError will be raised. Only applied when n_jobs != 1
@@ -540,7 +540,8 @@ class Parallel(Logger):
         Threshold on the size of arrays passed to the workers that
         triggers automated memory mapping in temp_folder. Can be an int
         in Bytes, or a human-readable string, e.g., '1M' for 1 megabyte.
-        Use None to disable memmapping of large arrays.
+        Use None to disable automatic memmapping of large arrays (small
+        arrays can ignore this argument).
         Only active when backend="loky" or "multiprocessing".
     mmap_mode: {None, 'r+', 'r', 'w+', 'c'}, default: 'r'
         Memmapping mode for numpy arrays passed to workers. None will
@@ -688,6 +689,9 @@ def __init__(self, n_jobs=None, backend=None, verbose=0, timeout=None,
         self._id = uuid4().hex
         self._reducer_callback = None
 
+        if mmap_mode is None:
+            max_nbytes = None
+
         if isinstance(max_nbytes, str):
             max_nbytes = memstr_to_bytes(max_nbytes)
 
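The net effect of the three lines added to `__init__` is that the size-threshold check short-circuits whenever memmapping is disabled. A minimal sketch of that decision, under stated assumptions (the helper name `should_memmap` is hypothetical, and a plain byte count stands in for a numpy array):

```python
def should_memmap(nbytes, mmap_mode, max_nbytes):
    """Hypothetical sketch of the memmapping decision for an argument
    of size `nbytes`, given Parallel's mmap_mode and max_nbytes."""
    if mmap_mode is None:
        # The quickfix applied in Parallel.__init__: disabling the
        # mode also clears any explicitly passed threshold.
        max_nbytes = None
    if max_nbytes is None:
        # No threshold means memmapping is disabled outright.
        return False
    return nbytes > max_nbytes
```

Before this PR, passing `mmap_mode=None` together with an explicit `max_nbytes` left the threshold active, so large arrays could still be dumped to the temp folder; after the fix the first branch forces the disabled path.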
29 changes: 29 additions & 0 deletions joblib/test/test_memmapping.py
@@ -888,6 +888,35 @@ def test_memmapping_pool_for_large_arrays_disabled(factory, tmpdir):
     del p
 
 
+@with_numpy
+@with_multiprocessing
+@parametrize("factory", [MemmappingPool, TestExecutor.get_memmapping_executor],
+             ids=["multiprocessing", "loky"])
+def test_memmapping_pool_disabled_by_mmap_mode_to_None(factory, tmpdir):
+    """Check that memmapping is effectively disabled by setting
+    mmap_mode to None, even when max_nbytes is explicitly set to
+    some value."""
+    # Set mmap_mode to None to disable the memmapping feature
+    p = factory(3, mmap_mode=None, max_nbytes=40, temp_folder=tmpdir.strpath)
+    try:
+
+        # Check that the temp folder is empty
+        assert os.listdir(tmpdir.strpath) == []
+
+        # Try with an array larger than the memmap threshold of 40 bytes
+        large = np.ones(100, dtype=np.float64)
+        assert large.nbytes == 800
+        p.map(check_array, [(large, i, 1.0) for i in range(large.shape[0])])
+
+        # Check that the temp folder is still empty
+        assert os.listdir(tmpdir.strpath) == []
+
+    finally:
+        # Cleanup open file descriptors
+        p.terminate()
+        del p
+
+
 @with_numpy
 @with_multiprocessing
 @with_dev_shm
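The array size the new test relies on is worth double-checking: `np.ones(100, dtype=np.float64)` occupies 800 bytes, well above the 40-byte threshold, so any leak through the memmapping path would show up in the temp folder. A pure-Python sanity check of that arithmetic (no numpy needed):

```python
# float64 is 8 bytes per element, so 100 elements occupy 800 bytes.
FLOAT64_ITEMSIZE = 8
nbytes = 100 * FLOAT64_ITEMSIZE
assert nbytes == 800

# 800 bytes is far above the max_nbytes=40 threshold passed to the
# pool, so without the fix this array would have been memmapped and
# the temp folder would not have stayed empty.
assert nbytes > 40
```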