This repository has been archived by the owner on Oct 14, 2018. It is now read-only.

dask/dask-searchcv

dask-searchcv

Dask-SearchCV is now included in Dask-ML.

Further development of Dask-SearchCV is occurring in the Dask-ML repository. Please post issues and make pull requests there.

Tools for performing hyperparameter search with Scikit-Learn and Dask.

Highlights

  • Drop-in replacement for Scikit-Learn's GridSearchCV and RandomizedSearchCV.
  • Hyperparameter optimization can run in parallel using threads or processes, or be distributed across a cluster.
  • Works well with Dask collections. Dask arrays, dataframes, and delayed objects can be passed to fit.
  • Candidate estimators with identical parameters and inputs are fit only once. For composite estimators such as Pipeline, this can be significantly more efficient because it avoids expensive repeated computations.
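
To make the last point concrete, here is a minimal sketch in plain Python that only counts fits; it is not how Dask-SearchCV is implemented. The two-step pipeline, the parameter names (`pca__n_components`, `svc__C`), and the cache are hypothetical and chosen purely for illustration. It compares how often the first pipeline step would be fit with and without sharing identical fits.

```python
from itertools import product

# Hypothetical grid for a two-step pipeline: a transformer, then a classifier.
grid = {
    "pca__n_components": [8, 16, 32],
    "svc__C": [0.1, 1.0, 10.0, 100.0],
}
n_splits = 3

# Every combination of parameter values is one candidate (3 * 4 = 12 here).
candidates = [dict(zip(grid, values)) for values in product(*grid.values())]

# Naive search: each candidate refits the transformer on every split.
naive_transformer_fits = len(candidates) * n_splits

# Shared search: a transformer fit is identified by (split, transformer params),
# so candidates that differ only in the classifier reuse the same fit.
cache = {}
for split in range(n_splits):
    for params in candidates:
        key = (split, params["pca__n_components"])
        cache.setdefault(key, "fitted")  # placeholder for a fitted transformer

shared_transformer_fits = len(cache)

print(naive_transformer_fits, shared_transformer_fits)
```

In this toy grid the transformer would be fit 36 times naively but only 9 times when identical fits are shared; the saving grows with the number of parameters that belong to later steps.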

For more information, check out the documentation.

Install

Dask-SearchCV is available via conda or pip:

# Install with conda
$ conda install dask-searchcv -c conda-forge

# Install with pip
$ pip install dask-searchcv

Example

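from sklearn.datasets import load_digits
from sklearn.svm import SVC
import dask_searchcv as dcv
import numpy as np

# Load the handwritten digits dataset that ships with Scikit-Learn
digits = load_digits()

# Grid of hyperparameter values to search over
param_space = {'C': np.logspace(-4, 4, 9),
               'gamma': np.logspace(-4, 4, 9),
               'class_weight': [None, 'balanced']}

# Same constructor arguments as Scikit-Learn's GridSearchCV
model = SVC(kernel='rbf')
search = dcv.GridSearchCV(model, param_space, cv=3)

# Fit every candidate with 3-fold cross-validation
search.fit(digits.data, digits.target)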
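
To get a feel for how much work the search above involves, the following sketch counts the candidates in the same grid using only the standard library. It assumes each candidate is fit once per cross-validation fold and ignores any final refit on the full data; the variable names `n_candidates` and `n_fits` are illustrative and not part of the library.

```python
from itertools import product

# Same grid shape as the example above, written without NumPy:
# np.logspace(-4, 4, 9) yields the nine powers of ten from 1e-4 to 1e4.
param_space = {
    "C": [10.0 ** e for e in range(-4, 5)],
    "gamma": [10.0 ** e for e in range(-4, 5)],
    "class_weight": [None, "balanced"],
}
cv = 3  # number of cross-validation folds, as in the example

# One candidate per combination of parameter values.
n_candidates = len(list(product(*param_space.values())))

# Each candidate is fit once per fold.
n_fits = n_candidates * cv

print(n_candidates, n_fits)
```

With 9 × 9 × 2 = 162 candidates and 3 folds, that is 486 model fits, which is the kind of workload that benefits from running in parallel.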