Validation step fails when using shared memory with `multiprocessing.managers.BaseManager` #28899

ElenaKhaustova · 2024-04-26T11:28:48Z

Describe the bug

Relates to #28781

We use multiprocessing managers to work with shared memory for pipeline parallelisation. After this validation step was added, we get `ValueError: cannot set WRITEABLE flag to True of this array` when objects are retrieved from shared memory and passed to scikit-learn functions that include this validation step, for example `fit`.

The only workaround that works for us so far is making a deep copy of the objects before passing them to those methods, which is not a desirable solution.
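
For readers who need a stopgap, the deep-copy workaround can be limited to the arrays that are actually read-only. The sketch below assumes plain NumPy arrays and uses a hypothetical helper, `ensure_writeable`, that is not part of scikit-learn. pandas objects would first need to be copied or converted on the caller's side.

```python
import numpy as np


def ensure_writeable(arr):
    """Return arr unchanged if it is writeable, else a writeable copy.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    arr = np.asarray(arr)
    return arr if arr.flags.writeable else arr.copy()


# A read-only array standing in for data received from another process.
read_only = np.arange(5, dtype=np.float64)
read_only.flags.writeable = False

safe = ensure_writeable(read_only)  # copied, because the input is read-only
zeros = np.zeros(3)
same = ensure_writeable(zeros)      # returned as-is, no copy is made
```

This avoids copying data that is already writeable, but it still pays for a copy in the failing case, so it does not remove the underlying cost described above.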

Steps/Code to Reproduce

Some findings:

The result depends on n_samples. When n_samples is relatively small (~100), the error does not occur, so this may be related to the joblib auto-memmapping discussed in "ColumnTransformer throws error with n_jobs > 1 input dataframes and joblib auto-memmapping (regression in 1.4.1.post1)" #28781 (comment).
Replacing pd.Series with pd.DataFrame for y_train avoids the error, but we don't know why.

from concurrent.futures import ProcessPoolExecutor
from multiprocessing.managers import BaseManager
import traceback

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


class MemoryDataset:
    def __init__(self):
        self._ds = None

    def save(self, ds):
        self._ds = ds

    def load(self):
        return self._ds


def train_model(dataset: MemoryDataset) -> LinearRegression:
    regressor = LinearRegression()
    X_train, y_train = dataset.load()
    try:
        regressor.fit(X_train, y_train)
    except Exception:
        print(traceback.format_exc())
    return regressor


class MyManager(BaseManager):
    pass


MyManager.register("MemoryDataset", MemoryDataset, exposed=("save", "load"))


def main():
    rng = np.random.default_rng()
    n_samples = 1000
    X_train = pd.DataFrame(rng.random((n_samples, 4)), columns=list('ABCD'))
    y_train = pd.Series(rng.random(n_samples))
    # Replacing pd.Series with pd.DataFrame solves the issue
    # y_train = pd.DataFrame(rng.random((n_samples, 1)), columns=list('E'))

    futures = set()

    manager = MyManager()
    manager.start()
    dataset = manager.MemoryDataset()
    dataset.save((X_train, y_train))

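    # Leaving the `with` block waits for the submitted task to finish.
    with ProcessPoolExecutor(max_workers=1) as pool:
        futures.add(pool.submit(train_model, dataset))


# The guard is required when worker processes are started with "spawn".
if __name__ == "__main__":
    main()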

Expected Results

No error is thrown.

Actual Results

Traceback (most recent call last):
  File "/pr-scikit-learn/main.py", line 48, in train_model
    regressor.fit(X_train, y_train)
  File "/lib/python3.11/site-packages/sklearn/base.py", line 1473, in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.11/site-packages/sklearn/linear_model/_base.py", line 609, in fit
    X, y = self._validate_data(
           ^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.11/site-packages/sklearn/base.py", line 650, in _validate_data
    X, y = check_X_y(X, y, **check_params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.11/site-packages/sklearn/utils/validation.py", line 1282, in check_X_y
    y = _check_y(y, multi_output=multi_output, y_numeric=y_numeric, estimator=estimator)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.11/site-packages/sklearn/utils/validation.py", line 1292, in _check_y
    y = check_array(
        ^^^^^^^^^^^^
  File "/lib/python3.11/site-packages/sklearn/utils/validation.py", line 1100, in check_array
    array.flags.writeable = True
    ^^^^^^^^^^^^^^^^^^^^^
ValueError: cannot set WRITEABLE flag to True of this array

Versions

System:
    python: 3.11.9 (main, Apr 19 2024, 11:44:45) [Clang 14.0.6 ]
executable: /opt/miniconda3/envs/paraller-runner-scikit-learn-env/bin/python
   machine: macOS-10.16-x86_64-i386-64bit

Python dependencies:
      sklearn: 1.5.dev0
          pip: 23.3.1
   setuptools: 68.2.2
        numpy: 1.26.4
        scipy: 1.13.0
       Cython: None
       pandas: 2.2.2
   matplotlib: None
       joblib: 1.4.0
threadpoolctl: 3.4.0

Built with OpenMP: False

threadpoolctl info:
       user_api: blas
   internal_api: openblas
    num_threads: 10
         prefix: libopenblas
       filepath: /opt/miniconda3/envs/paraller-runner-scikit-learn-env/lib/python3.11/site-packages/numpy/.dylibs/libopenblas64_.0.dylib
        version: 0.3.23.dev
threading_layer: pthreads
   architecture: Nehalem

       user_api: blas
   internal_api: openblas
    num_threads: 10
         prefix: libopenblas
       filepath: /opt/miniconda3/envs/paraller-runner-scikit-learn-env/lib/python3.11/site-packages/scipy/.dylibs/libopenblas.0.dylib
        version: 0.3.26.dev
threading_layer: pthreads
   architecture: Nehalem


jeremiedbb · 2024-04-26T13:11:03Z

Thanks for the report @ElenaKhaustova.

We should not try to force the array to be writeable, because that is not always possible. The ongoing discussion in #28824 should clarify when we need a writeable array and how to obtain one properly.
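
A minimal sketch of why forcing the flag can fail, independent of scikit-learn. It assumes an array that views an immutable `bytes` buffer, which is only a stand-in and not necessarily how the arrays in this report become read-only:

```python
import numpy as np

# The array views an immutable bytes object, so it does not own its memory.
buf = bytes(4 * 8)
arr = np.frombuffer(buf, dtype=np.float64)

# NumPy refuses to make a view of read-only memory writeable.
try:
    arr.flags.writeable = True
    flag_error = None
except ValueError as exc:
    flag_error = exc

# A copy allocates new memory owned by the array, which is writeable.
arr_copy = arr.copy()
```

Here `flag_error` holds the `ValueError`, while `arr_copy` can be modified freely, which is why copying works as a workaround where setting the flag does not.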

ElenaKhaustova added the Bug and Needs Triage labels Apr 26, 2024

jeremiedbb removed the Needs Triage label Apr 26, 2024
