
RFC New parameters for penalties in LogisticRegression #28711

Open · lorentzenchr opened this issue Mar 27, 2024 · 5 comments
lorentzenchr (Member) commented Mar 27, 2024:

Based on this comment in #28706:

Currently, LogisticRegression uses C as the inverse penalization strength, penalty to select the type of penalty, and l1_ratio to control the ratio between the l1 and l2 penalties.
I propose the following:

  1. Add alpha (as in Ridge, ElasticNet, PoissonRegressor ...) instead of C.
    Fail if both are given at the same time.
  2. Deprecate C.
  3. Deprecate penalty, which is redundant: alpha and l1_ratio are enough.
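To make points 1 and 2 concrete, a minimal sketch of what the parameter resolution could look like during a deprecation period. Note that `resolve_penalty_strength` and the `alpha` parameter are hypothetical; neither exists in scikit-learn's LogisticRegression today, and the `alpha = 1 / C` conversion below ignores any per-sample scaling convention.

```python
import warnings


def resolve_penalty_strength(C="deprecated", alpha=None):
    """Hypothetical helper sketching proposals 1 and 2.

    Returns the penalty strength expressed as alpha. During the
    deprecation period, C is still accepted but warns; passing both
    C and alpha at the same time is an error.
    """
    if alpha is not None and C != "deprecated":
        raise ValueError("Pass either `alpha` or `C`, not both.")
    if C != "deprecated":
        warnings.warn("`C` is deprecated; use `alpha` instead.", FutureWarning)
        return 1.0 / C  # C is the inverse penalization strength
    # Neither given: keep the behavior of the current default C=1.0.
    return 1.0 if alpha is None else alpha
```

This mirrors how other scikit-learn deprecations have used a `"deprecated"` sentinel default to detect whether the user explicitly set the old parameter.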
(github-actions bot added the Needs Triage label, Mar 27, 2024)
(lorentzenchr changed the title to "RFC New parameters for penalties in LogisticRegression", Mar 27, 2024)
jeremiedbb (Member) commented:

ping @scikit-learn/core-devs for more visibility.

(jeremiedbb added the API label and removed the Needs Triage label, Mar 27, 2024)
ogrisel (Member) commented Mar 28, 2024:

If we are to do such a big renaming, wouldn't it make sense to use a more explicit name such as l2_regularization or l2_reg?

Also note: the alpha in Lasso / the GLMs and the one in Ridge do not have the same meaning. One is scaled by the sum of sample weights (or n_samples) while the other is not. If we are to use a more explicit name such as l2_reg(ularization), we could use that opportunity to also make this scaling uniform across all linear models.
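This scaling difference can be illustrated with a pure-NumPy ridge solver (a sketch using the closed-form solution, not scikit-learn's implementation): under the unscaled convention, duplicating every sample changes the solution for a fixed alpha, while under the per-sample-scaled convention used by Lasso and the GLMs it does not.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
alpha = 1.0


def ridge_unscaled(X, y, alpha):
    # Ridge-style objective: ||y - Xw||^2 + alpha * ||w||^2  (no 1/n factor)
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


def ridge_scaled(X, y, alpha):
    # GLM/Lasso-style objective: (1/n) ||y - Xw||^2 + alpha * ||w||^2
    n, n_features = X.shape
    return np.linalg.solve(X.T @ X / n + alpha * np.eye(n_features), X.T @ y / n)


# Duplicate every sample: the "same" dataset, seen twice.
X2, y2 = np.vstack([X, X]), np.concatenate([y, y])

# Unscaled convention: duplication effectively halves the regularization.
assert not np.allclose(ridge_unscaled(X, y, alpha), ridge_unscaled(X2, y2, alpha))
# Scaled convention: the solution is invariant to duplicating the data.
assert np.allclose(ridge_scaled(X, y, alpha), ridge_scaled(X2, y2, alpha))
```

The invariance argument is purely algebraic: duplicating the data doubles both X.T @ X and X.T @ y, which the 1/n factor cancels exactly.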

We discussed in the past whether this difference is intentional, since one parametrization can make hyperparameter tuning more efficient. But I think we can decouple the choice of the default parameter value (which can stay estimator specific) from the choice of the parametrization, which we could make uniform across all linear models to simplify the message in the narrative docs, the reference docs, and the various examples.

We could also decide to move progressively: first rename C to l2_reg, and defer the decision on renaming/uniformizing alpha to a later time (if ever).

jeremiedbb (Member) commented:

But C controls both l1 and l2 regularization (as alpha does for ElasticNet), their relative strength being controlled by l1_ratio. I'd find it confusing to rename it l2_reg and then enable l1 regularization by setting l2_reg=<some value> and l1_ratio=1.
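The shared role of the strength parameter is easiest to see in the elastic-net penalty itself, written here as a small sketch following the convention documented for ElasticNet's objective (the function name is illustrative, not a scikit-learn API):

```python
import numpy as np


def elastic_net_penalty(w, alpha, l1_ratio):
    # Elastic-net penalty as in scikit-learn's ElasticNet objective:
    #   alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2
    # A single strength parameter (alpha, or 1/C in LogisticRegression)
    # scales both terms; l1_ratio only sets their relative weight.
    w = np.asarray(w, dtype=float)
    l1 = np.sum(np.abs(w))
    l2 = np.sum(w ** 2)
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)


w = [1.0, -2.0, 0.5]  # ||w||_1 = 3.5, ||w||_2^2 = 5.25
# l1_ratio=1 -> pure l1 penalty; l1_ratio=0 -> pure l2; alpha scales both.
assert elastic_net_penalty(w, alpha=2.0, l1_ratio=1.0) == 2.0 * 3.5
assert elastic_net_penalty(w, alpha=2.0, l1_ratio=0.0) == 0.5 * 2.0 * 5.25
```

With this parametrization, a name tied to one norm (like l2_reg) for the overall strength would indeed be misleading.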

ogrisel (Member) commented Mar 28, 2024:

Indeed l2_reg(ularization) would be a bad name: reg(ularization)_strength or just regularization might be a better name.

lorentzenchr (Member, Author) commented:

> Indeed l2_reg(ularization) would be a bad name: reg(ularization)_strength or just regularization might be a better name.

I would prefer a consistent name among all linear models and alpha seems to be used most often as penalty strength.

As of version 1.4:

| estimator | type | penalty strength parameter | additional penalty parameters |
|---|---|---|---|
| LogisticRegression | classifier | C | penalty, l1_ratio |
| PassiveAggressiveClassifier | classifier | C | |
| RidgeClassifier | classifier | alpha | |
| SGDClassifier | classifier | alpha | penalty, l1_ratio |
| Ridge | regressor | alpha | |
| SGDRegressor | regressor | alpha | penalty, l1_ratio |
| ElasticNet | regressor | alpha | l1_ratio |
| Lasso | regressor | alpha | |
| LassoLars | regressor | alpha | |
| OrthogonalMatchingPursuit | regressor | n_nonzero_coefs | |
| HuberRegressor | regressor | alpha | |
| QuantileRegressor | regressor | alpha | |
| PoissonRegressor | regressor | alpha | |
| TweedieRegressor | regressor | alpha | |
| GammaRegressor | regressor | alpha | |
| PassiveAggressiveRegressor | regressor | C | |
