sklearn

Version 1.2.0

In Development

Changed models

The following estimators and functions, when fit with the same data and parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures.

  • The default eigen_tol for cluster.SpectralClustering, manifold.SpectralEmbedding, cluster.spectral_clustering, and manifold.spectral_embedding is now None when using the 'amg' or 'lobpcg' solvers. This change improves numerical stability of the solver, but may result in a different model.
  • linear_model.GammaRegressor, linear_model.PoissonRegressor and linear_model.TweedieRegressor can reach higher precision with the lbfgs solver, in particular when tol is set to a tiny value. Moreover, verbose is now properly propagated to L-BFGS-B. 23619 by Christian Lorentzen <lorentzenchr>.
  • The default value of eps in metrics.log_loss has changed from 1e-15 to "auto". "auto" sets eps to np.finfo(y_pred.dtype).eps. 24354 by Safiuddin Khaja <Safikh> and gsiisg <gsiisg>.
  • Make sign of components_ deterministic in decomposition.SparsePCA. 23935 by Guillaume Lemaitre <glemaitre>.
  • The components_ signs in decomposition.FastICA might differ. It is now consistent and deterministic with all SVD solvers. 22527 by Meekail Zain <micky774> and Thomas Fan.
  • The condition for early stopping has now been changed in linear_model._sgd_fast._plain_sgd which is used by linear_model.SGDRegressor and linear_model.SGDClassifier. The old condition did not disambiguate between training and validation set and had an effect of overscaling the error tolerance. This has been fixed in 23798 by Harsh Agrawal <Harsh14901>.
  • For model_selection.GridSearchCV and model_selection.RandomizedSearchCV ranks corresponding to nan scores will all be set to the maximum possible rank. 24543 by Guillaume Lemaitre <glemaitre>.
  • The default value of tol was changed from 1e-3 to 1e-4 for linear_model.ridge_regression, linear_model.Ridge and linear_model.RidgeClassifier. 24465 by Christian Lorentzen <lorentzenchr>.

Changes impacting all modules

  • The set_output API has been adopted by all transformers. Meta-estimators that contain transformers such as pipeline.Pipeline or compose.ColumnTransformer also define a set_output. For details, see SLEP018. 23734 and 24699 by Thomas Fan.
  • Low-level routines for reductions on pairwise distances for dense float32 datasets have been refactored. The following functions and estimators now benefit from improved performances in terms of hardware scalability and speed-ups:

    • sklearn.metrics.pairwise_distances_argmin
    • sklearn.metrics.pairwise_distances_argmin_min
    • sklearn.cluster.AffinityPropagation
    • sklearn.cluster.Birch
    • sklearn.cluster.MeanShift
    • sklearn.cluster.OPTICS
    • sklearn.cluster.SpectralClustering
    • sklearn.feature_selection.mutual_info_regression
    • sklearn.neighbors.KNeighborsClassifier
    • sklearn.neighbors.KNeighborsRegressor
    • sklearn.neighbors.RadiusNeighborsClassifier
    • sklearn.neighbors.RadiusNeighborsRegressor
    • sklearn.neighbors.LocalOutlierFactor
    • sklearn.neighbors.NearestNeighbors
    • sklearn.manifold.Isomap
    • sklearn.manifold.LocallyLinearEmbedding
    • sklearn.manifold.TSNE
    • sklearn.manifold.trustworthiness
    • sklearn.semi_supervised.LabelPropagation
    • sklearn.semi_supervised.LabelSpreading

    For instance sklearn.neighbors.NearestNeighbors.kneighbors and sklearn.neighbors.NearestNeighbors.radius_neighbors can respectively be up to ×20 and ×5 faster than previously on a laptop.

    Moreover, implementations of those two algorithms are now suitable for machines with many cores, making them usable for datasets consisting of millions of samples.

    23865 by Julien Jerphanion <jjerphan>.

  • Finiteness checks (detection of NaN and infinite values) in all estimators are now significantly more efficient for float32 data by leveraging NumPy's SIMD optimized primitives. 23446 by Meekail Zain <micky774>.
  • Finiteness checks (detection of NaN and infinite values) in all estimators are now faster by utilizing a more efficient stop-on-first second-pass algorithm. 23197 by Meekail Zain <micky774>.
  • Support for combinations of dense and sparse dataset pairs for all distance metrics and for float32 and float64 datasets has been added or has seen its performance improved for the following estimators:

    • sklearn.metrics.pairwise_distances_argmin
    • sklearn.metrics.pairwise_distances_argmin_min
    • sklearn.cluster.AffinityPropagation
    • sklearn.cluster.Birch
    • sklearn.cluster.SpectralClustering
    • sklearn.neighbors.KNeighborsClassifier
    • sklearn.neighbors.KNeighborsRegressor
    • sklearn.neighbors.RadiusNeighborsClassifier
    • sklearn.neighbors.RadiusNeighborsRegressor
    • sklearn.neighbors.LocalOutlierFactor
    • sklearn.neighbors.NearestNeighbors
    • sklearn.manifold.Isomap
    • sklearn.manifold.TSNE
    • sklearn.manifold.trustworthiness

    23604 and 23585 by Julien Jerphanion <jjerphan>, Olivier Grisel <ogrisel>, and Thomas Fan, 24556 by Vincent Maladière <Vincent-Maladiere>.

  • Systematically check the sha256 digest of dataset tarballs used in code examples in the documentation. 24617 by Olivier Grisel <ogrisel> and Thomas Fan. Thanks to Sim4n6 for the report.

Changelog

sklearn.base

  • Introduces base.OneToOneFeatureMixin and base.ClassNamePrefixFeaturesOutMixin mixins that define get_feature_names_out for common transformer use cases. 24688 by Thomas Fan.

sklearn.calibration

  • Rename base_estimator to estimator in calibration.CalibratedClassifierCV to improve readability and consistency. The parameter base_estimator is deprecated and will be removed in 1.4. 22054 by Kevin Roice <kevroi>.

sklearn.cluster

  • cluster.KMeans with algorithm="lloyd" is now faster and uses less memory. 24264 by Vincent Maladiere <Vincent-Maladiere>.
  • The predict and fit_predict methods of cluster.OPTICS now accept sparse data type for input data. 14736 by Hunt Zhan <huntzhan>, 20802 by Brandon Pokorny <Clickedbigfoot>, and 22965 by Meekail Zain <micky774>.
  • cluster.Birch now preserves dtype for numpy.float32 inputs. 22968 by Meekail Zain <micky774>.
  • cluster.KMeans and cluster.MiniBatchKMeans now accept a new 'auto' option for n_init which changes the number of random initializations to one when using init='k-means++' for efficiency. This begins deprecation for the default values of n_init in the two classes and both will have their defaults changed to n_init='auto' in 1.4. 23038 by Meekail Zain <micky774>.
  • cluster.SpectralClustering and cluster.spectral_clustering now propagate the eigen_tol parameter to all choices of eigen_solver. Includes a new option eigen_tol="auto" and begins deprecation to change the default from eigen_tol=0 to eigen_tol="auto" in version 1.3. 23210 by Meekail Zain <micky774>.
  • cluster.KMeans now supports readonly attributes when predicting. 24258 by Thomas Fan
  • The affinity attribute is now deprecated for cluster.AgglomerativeClustering and will be renamed to metric in v1.4. 23470 by Meekail Zain <micky774>.

sklearn.datasets

  • Introduce the new parameter parser in datasets.fetch_openml. parser="pandas" uses the CPU- and memory-efficient pandas.read_csv parser to load dense ARFF formatted dataset files. It is possible to pass parser="liac-arff" to use the old LIAC parser. When parser="auto", dense datasets are loaded with "pandas" and sparse datasets are loaded with "liac-arff". Currently, the default is parser="liac-arff"; it will change to parser="auto" in version 1.4. 21938 by Guillaume Lemaitre <glemaitre>.
  • datasets.dump_svmlight_file is now accelerated with a Cython implementation, providing 2-4x speedups. 23127 by Meekail Zain <micky774>.
  • Path-like objects, such as those created with pathlib are now allowed as paths in datasets.load_svmlight_file and datasets.load_svmlight_files. 19075 by Carlos Ramos Carreño <vnmabus>.
  • Make sure that datasets.fetch_lfw_people and datasets.fetch_lfw_pairs internally crops images based on the slice_ parameter. 24951 by Guillaume Lemaitre <glemaitre>.

sklearn.decomposition

  • decomposition.FastICA.fit has been optimised w.r.t its memory footprint and runtime. 22268 by MohamedBsh <Bsh>.
  • decomposition.SparsePCA and decomposition.MiniBatchSparsePCA now implement an inverse_transform function. 23905 by Guillaume Lemaitre <glemaitre>.
  • decomposition.FastICA now allows the user to select how whitening is performed through the new whiten_solver parameter, which supports svd and eigh. whiten_solver defaults to svd although eigh may be faster and more memory efficient in cases where num_features > num_samples. 11860 by Pierre Ablin <pierreablin>, 22527 by Meekail Zain <micky774> and Thomas Fan.
  • decomposition.LatentDirichletAllocation now preserves dtype for numpy.float32 input. 24528 by Takeshi Oura <takoika> and Jérémie du Boisberranger <jeremiedbb>.
  • Make sign of components_ deterministic in decomposition.SparsePCA. 23935 by Guillaume Lemaitre <glemaitre>.
  • The n_iter parameter of decomposition.MiniBatchSparsePCA is deprecated and replaced by the parameters max_iter, tol, and max_no_improvement to be consistent with decomposition.MiniBatchDictionaryLearning. n_iter will be removed in version 1.3. 23726 by Guillaume Lemaitre <glemaitre>.
  • The n_features_ attribute of decomposition.PCA is deprecated in favor of n_features_in_ and will be removed in 1.4. 24421 by Kshitij Mathur <Kshitij68>.

sklearn.discriminant_analysis

  • discriminant_analysis.LinearDiscriminantAnalysis now supports the Array API for solver="svd". Array API support is considered experimental and might evolve without being subjected to our usual rolling deprecation cycle policy. See array_api for more details. 22554 by Thomas Fan.
  • Validate parameters only in fit and not in __init__ for discriminant_analysis.QuadraticDiscriminantAnalysis. 24218 by Stefanie Molin <stefmolin>.

sklearn.ensemble

  • ensemble.HistGradientBoostingClassifier and ensemble.HistGradientBoostingRegressor now support interaction constraints via the argument interaction_cst of their constructors. 21020 by Christian Lorentzen <lorentzenchr>. Using interaction constraints also makes fitting faster. 24856 by Christian Lorentzen <lorentzenchr>.
  • Interaction constraints for ensemble.HistGradientBoostingClassifier and ensemble.HistGradientBoostingRegressor can now be specified as strings for two common cases: "no_interactions" and "pairwise" interactions. 24849 by Tim Head <betatim>.
  • Adds class_weight to ensemble.HistGradientBoostingClassifier. 22014 by Thomas Fan.
  • Improve runtime performance of ensemble.IsolationForest by avoiding data copies. 23252 by Zhehao Liu <MaxwellLZH>.
  • ensemble.StackingClassifier now accepts any kind of base estimator. 24538 by Guillem G Subies <GuillemGSubies>.
  • Make it possible to pass the categorical_features parameter of ensemble.HistGradientBoostingClassifier and ensemble.HistGradientBoostingRegressor as feature names. 24889 by Olivier Grisel <ogrisel>.
  • ensemble.StackingClassifier now supports multilabel-indicator target. 24146 by Nicolas Peretti <nicoperetti>, Nestor Navarro <nestornav>, Nati Tomattis <natitomattis>, and Vincent Maladiere <Vincent-Maladiere>.
  • ensemble.HistGradientBoostingClassifier and ensemble.HistGradientBoostingRegressor now accept their monotonic_cst parameter to be passed as a dictionary in addition to the previously supported array-like format. Such a dictionary has feature names as keys and one of -1, 0, 1 as value to specify monotonicity constraints for each feature. 24855 by Olivier Grisel <ogrisel>.
  • Fixed the issue where ensemble.AdaBoostClassifier outputs NaN in feature importance when fitted with very small sample weight. 20415 by Zhehao Liu <MaxwellLZH>.
  • ensemble.HistGradientBoostingClassifier and ensemble.HistGradientBoostingRegressor no longer error when predicting on categories encoded as negative values and instead consider them a member of the "missing category". 24283 by Thomas Fan.
  • ensemble.HistGradientBoostingClassifier and ensemble.HistGradientBoostingRegressor, with verbose>=1, print detailed timing information on computing histograms and finding best splits. The time spent in the root node was previously missing and is now included in the printed information. 24894 by Christian Lorentzen <lorentzenchr>.
  • Rename the constructor parameter base_estimator to estimator in the following classes: ensemble.BaggingClassifier, ensemble.BaggingRegressor, ensemble.AdaBoostClassifier, ensemble.AdaBoostRegressor. base_estimator is deprecated in 1.2 and will be removed in 1.4. 23819 by Adrian Trujillo <trujillo9616> and Edoardo Abati <EdAbati>.
  • Rename the fitted attribute base_estimator_ to estimator_ in the following classes: ensemble.BaggingClassifier, ensemble.BaggingRegressor, ensemble.AdaBoostClassifier, ensemble.AdaBoostRegressor, ensemble.RandomForestClassifier, ensemble.RandomForestRegressor, ensemble.ExtraTreesClassifier, ensemble.ExtraTreesRegressor, ensemble.RandomTreesEmbedding, ensemble.IsolationForest. base_estimator_ is deprecated in 1.2 and will be removed in 1.4. 23819 by Adrian Trujillo <trujillo9616> and Edoardo Abati <EdAbati>.

sklearn.feature_selection

  • Fix a bug in feature_selection.mutual_info_regression and feature_selection.mutual_info_classif, where the continuous features in X should be scaled to unit variance regardless of whether the target y is continuous or discrete. 24747 by Guillaume Lemaitre <glemaitre>.

sklearn.gaussian_process

  • Fix gaussian_process.kernels.Matern gradient computation with nu=0.5 for PyPy (and possibly other non CPython interpreters). 24245 by Loïc Estève <lesteve>.
  • The fit method of gaussian_process.GaussianProcessRegressor will not modify the input X in case a custom kernel is used, with a diag method that returns part of the input X. 24405 by Omar Salman <OmarManzoor>.

sklearn.impute

  • Added keep_empty_features parameter to impute.SimpleImputer, impute.KNNImputer and impute.IterativeImputer, preventing removal of features containing only missing values when transforming. 16695 by Vitor Santa Rosa <vitorsrg>.

sklearn.kernel_approximation

  • kernel_approximation.RBFSampler now preserves dtype for numpy.float32 inputs. 24317 by Tim Head <betatim>.
  • kernel_approximation.SkewedChi2Sampler now preserves dtype for numpy.float32 inputs. 24350 by Rahil Parikh <rprkh>.
  • kernel_approximation.RBFSampler now accepts 'scale' option for parameter gamma. 24755 by Gleb Levitski <GLevV>

sklearn.linear_model

  • linear_model.LogisticRegression, linear_model.LogisticRegressionCV, linear_model.GammaRegressor, linear_model.PoissonRegressor and linear_model.TweedieRegressor got a new solver solver="newton-cholesky". This is a 2nd order (Newton) optimisation routine that uses a Cholesky decomposition of the hessian matrix. When n_samples >> n_features, the "newton-cholesky" solver has been observed to converge both faster and to a higher precision solution than the "lbfgs" solver on problems with one-hot encoded categorical variables with some rare categorical levels. 24637 and 24767 by Christian Lorentzen <lorentzenchr>.
  • linear_model.GammaRegressor, linear_model.PoissonRegressor and linear_model.TweedieRegressor can reach higher precision with the lbfgs solver, in particular when tol is set to a tiny value. Moreover, verbose is now properly propagated to L-BFGS-B. 23619 by Christian Lorentzen <lorentzenchr>.
  • linear_model.SGDClassifier and linear_model.SGDRegressor will raise an error when all the validation samples have zero sample weight. 23275 by Zhehao Liu <MaxwellLZH>.
  • linear_model.SGDOneClassSVM no longer performs parameter validation in the constructor. All validation is now handled in fit() and partial_fit(). 24433 by Yogendrasingh <iofall>, Arisa Y. <arisayosh> and Tim Head <betatim>.
  • Fix average loss calculation when early stopping is enabled in linear_model.SGDRegressor and linear_model.SGDClassifier. Also updated the condition for early stopping accordingly. 23798 by Harsh Agrawal <Harsh14901>.
  • The default value for the solver parameter in linear_model.QuantileRegressor will change from "interior-point" to "highs" in version 1.4. 23637 by Guillaume Lemaitre <glemaitre>.
  • String option "none" is deprecated for penalty argument in linear_model.LogisticRegression, and will be removed in version 1.4. Use None instead. 23877 by Zhehao Liu <MaxwellLZH>.
  • The default value of tol was changed from 1e-3 to 1e-4 for linear_model.ridge_regression, linear_model.Ridge and linear_model.RidgeClassifier. 24465 by Christian Lorentzen <lorentzenchr>.

sklearn.manifold

  • Adds option to use the normalized stress in manifold.MDS. This is enabled by setting the new normalized_stress parameter to True. 10168 by Łukasz Borchmann <Borchmann>, 12285 by Matthias Miltenberger <mattmilten>, 13042 by Matthieu Parizy <matthieu-pa>, 18094 by Roth E Conrad <rotheconrad> and 22562 by Meekail Zain <micky774>.
  • Adds eigen_tol parameter to manifold.SpectralEmbedding. Both manifold.spectral_embedding and manifold.SpectralEmbedding now propagate eigen_tol to all choices of eigen_solver. Includes a new option eigen_tol="auto" and begins deprecation to change the default from eigen_tol=0 to eigen_tol="auto" in version 1.3. 23210 by Meekail Zain <micky774>.
  • manifold.Isomap now preserves dtype for np.float32 inputs. 24714 by Rahil Parikh <rprkh>.
  • Added an "auto" option to the normalized_stress argument in manifold.MDS and manifold.smacof. Note that normalized_stress is only valid for non-metric MDS, therefore the "auto" option enables normalized_stress when metric=False and disables it when metric=True. "auto" will become the default value for normalized_stress in version 1.4. 23834 by Meekail Zain <micky774>.

sklearn.metrics

  • metrics.ConfusionMatrixDisplay.from_estimator, metrics.ConfusionMatrixDisplay.from_predictions, and metrics.ConfusionMatrixDisplay.plot accept a text_kw parameter which is passed to matplotlib's text function. 24051 by Thomas Fan.
  • metrics.class_likelihood_ratios is added to compute the positive and negative likelihood ratios derived from the confusion matrix of a binary classification problem. 22518 by Arturo Amor <ArturoAmorQ>.
  • Add metrics.PredictionErrorDisplay to plot residuals vs predicted and actual vs predicted to qualitatively assess the behavior of a regressor. The display can be created with the class methods metrics.PredictionErrorDisplay.from_estimator and metrics.PredictionErrorDisplay.from_predictions. 18020 by Guillaume Lemaitre <glemaitre>.
  • metrics.roc_auc_score now supports micro-averaging (average="micro") for the One-vs-Rest multiclass case (multi_class="ovr"). 24338 by Arturo Amor <ArturoAmorQ>.
  • Adds an "auto" option to eps in metrics.log_loss. This option will automatically set the eps value depending on the data type of y_pred. In addition, the default value of eps is changed from 1e-15 to the new "auto" option. 24354 by Safiuddin Khaja <Safikh> and gsiisg <gsiisg>.
  • Allows csr_matrix as input for the y_true parameter of the metrics.label_ranking_average_precision_score metric. 23442 by Sean Atukorala <ShehanAT>.

  • metrics.ndcg_score will now trigger a warning when the y_true value contains a negative value. Users may still use negative values, but the result may not be between 0 and 1. Starting in v1.4, passing in negative values for y_true will raise an error. 22710 by Conroy Trinh <trinhcon> and 23461 by Meekail Zain <micky774>.
  • metrics.log_loss with eps=0 now returns a correct value of 0 or np.inf instead of nan for predictions at the boundaries (0 or 1). It also accepts integer input. 24365 by Christian Lorentzen <lorentzenchr>.
  • The parameter sum_over_features of metrics.pairwise.manhattan_distances is deprecated and will be removed in 1.4. 24630 by Rushil Desai <rusdes>.

sklearn.model_selection

  • Added the class model_selection.LearningCurveDisplay, which makes it easy to plot learning curves obtained by the function model_selection.learning_curve. 24084 by Guillaume Lemaitre <glemaitre>.
  • For all SearchCV classes and scipy >= 1.10, rank corresponding to a nan score is correctly set to the maximum possible rank, rather than np.iinfo(np.int32).min. 24141 by Loïc Estève <lesteve>.
  • In both model_selection.HalvingGridSearchCV and model_selection.HalvingRandomSearchCV parameter combinations with a NaN score now share the lowest rank. 24539 by Tim Head <betatim>.
  • For model_selection.GridSearchCV and model_selection.RandomizedSearchCV ranks corresponding to nan scores will all be set to the maximum possible rank. 24543 by Guillaume Lemaitre <glemaitre>.

sklearn.multioutput

  • Added boolean verbose flag to classes: multioutput.ClassifierChain and multioutput.RegressorChain. 23977 by Eric Fiegel <efiegel>, Chiara Marmo <cmarmo>, Lucy Liu <lucyleeow>, and Guillaume Lemaitre <glemaitre>.

sklearn.naive_bayes

  • Add the method predict_joint_log_proba to all naive Bayes classifiers. 23683 by Andrey Melnik <avm19>.
  • A new parameter force_alpha was added to naive_bayes.BernoulliNB, naive_bayes.ComplementNB, naive_bayes.CategoricalNB, and naive_bayes.MultinomialNB, allowing users to set the parameter alpha to a very small number greater than or equal to 0; such values were previously changed automatically to 1e-10. 16747 by arka204, 18805 by hongshaoyang, 22269 by Meekail Zain <micky774>.

sklearn.neighbors

  • Adds new function neighbors.sort_graph_by_row_values to sort a CSR sparse graph such that each row is stored with increasing values. This is useful to improve efficiency when using precomputed sparse distance matrices in a variety of estimators and avoid an EfficiencyWarning. 23139 by Tom Dupre la Tour.
  • neighbors.NearestCentroid is faster and requires less memory as it better leverages CPUs' caches to compute predictions. 24645 by Olivier Grisel <ogrisel>.
  • neighbors.KernelDensity bandwidth parameter now accepts definition using Scott's and Silverman's estimation methods. 10468 by Ruben <icfly2> and 22993 by Jovan Stojanovic <jovan-stojanovic>.
  • neighbors.NeighborsBase now accepts Minkowski semi-metric (i.e. when 0 < p < 1 for metric="minkowski") for algorithm="auto" or algorithm="brute". 24750 by Rudresh Veerkhare <RudreshVeerkhare>
  • neighbors.NearestCentroid now raises an informative error message at fit-time instead of failing with a low-level error message at predict-time. 23874 by Juan Gomez <2357juan>.
  • Set n_jobs=None by default (instead of 1) for neighbors.KNeighborsTransformer and neighbors.RadiusNeighborsTransformer. 24075 by Valentin Laurent <Valentin-Laurent>.
  • neighbors.LocalOutlierFactor now preserves dtype for numpy.float32 inputs. 22665 by Julien Jerphanion <jjerphan>.

sklearn.pipeline

  • pipeline.FeatureUnion.get_feature_names_out can now be used when one of the transformers in the pipeline.FeatureUnion is "passthrough". 24058 by Diederik Perdok <diederikwp>
  • The pipeline.FeatureUnion class now has a named_transformers attribute for accessing transformers by name. 20331 by Christopher Flynn <crflynn>.

sklearn.preprocessing

  • preprocessing.FunctionTransformer will always try to set n_features_in_ and feature_names_in_ regardless of the validate parameter. 23993 by Thomas Fan.
  • preprocessing.LabelEncoder correctly encodes NaNs in transform. 22629 by Thomas Fan.
  • The sparse parameter of preprocessing.OneHotEncoder is now deprecated and will be removed in version 1.4. Use sparse_output instead. 24412 by Rushil Desai <rusdes>.

sklearn.svm

  • The class_weight_ attribute is now deprecated for svm.NuSVR, svm.SVR, svm.OneClassSVM. 22898 by Meekail Zain <micky774>.

sklearn.tree

  • tree.plot_tree and tree.export_graphviz now use a lower case x[i] to represent feature i. 23480 by Thomas Fan.

sklearn.utils

  • A new module exposes development tools to discover estimators (i.e. utils.discovery.all_estimators), displays (i.e. utils.discovery.all_displays) and functions (i.e. utils.discovery.all_functions) in scikit-learn. 21469 by Guillaume Lemaitre <glemaitre>.
  • utils.extmath.randomized_svd now accepts an argument, lapack_svd_driver, to specify the lapack driver used in the internal deterministic SVD used by the randomized SVD algorithm. 20617 by Srinath Kailasa <skailasa>
  • utils.validation.column_or_1d now accepts a dtype parameter to specify y's dtype. 22629 by Thomas Fan.
  • utils.multiclass.type_of_target now properly handles sparse matrices. 14862 by Léonard Binet <leonardbinet>.
  • HTML representation no longer errors when an estimator class is a value in get_params. 24512 by Thomas Fan.
  • utils.estimator_checks.check_estimator now takes into account the requires_positive_X tag correctly. 24667 by Thomas Fan.
  • The extra keyword parameters of utils.extmath.density are deprecated and will be removed in 1.4. 24523 by Mia Bajic <clytaemnestra>.

Code and Documentation Contributors

Thanks to everyone who has contributed to the maintenance and improvement of the project since version 1.1, including:

TODO: update at the time of the release.