
RFC New parameters for penalties in LogisticRegression #28711

Open · lorentzenchr opened this issue Mar 27, 2024 · 5 comments
lorentzenchr (Member) commented Mar 27, 2024:

Based on this comment in #28706:

Currently, LogisticRegression uses C as the inverse penalization strength, penalty to select the type of penalty, and l1_ratio to control the ratio between the l1 and l2 penalties.
I propose the following:

  1. Add alpha (as in Ridge, ElasticNet, PoissonRegressor ...) instead of C.
    Fail if both are given at the same time.
  2. Deprecate C.
  3. Deprecate penalty, which is redundant: alpha and l1_ratio are enough.
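To make points 1 and 2 concrete, a minimal sketch of what the parameter resolution could look like during a deprecation period. Note that `resolve_penalty_strength` and the `alpha` parameter are hypothetical; neither exists in scikit-learn's LogisticRegression today, and the `alpha = 1 / C` conversion below ignores any per-sample scaling convention.

```python
import warnings


def resolve_penalty_strength(C="deprecated", alpha=None):
    """Hypothetical helper sketching proposals 1 and 2.

    Returns the penalty strength expressed as alpha. During the
    deprecation period, C is still accepted but warns; passing both
    C and alpha at the same time is an error.
    """
    if alpha is not None and C != "deprecated":
        raise ValueError("Pass either `alpha` or `C`, not both.")
    if C != "deprecated":
        warnings.warn("`C` is deprecated; use `alpha` instead.", FutureWarning)
        return 1.0 / C  # C is the inverse penalization strength
    # Neither given: keep the behavior of the current default C=1.0.
    return 1.0 if alpha is None else alpha
```

This mirrors how other scikit-learn deprecations have used a `"deprecated"` sentinel default to detect whether the user explicitly set the old parameter.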
(github-actions bot added the Needs Triage label, Mar 27, 2024)
(lorentzenchr changed the title to "RFC New parameters for penalties in LogisticRegression", Mar 27, 2024)
jeremiedbb (Member) commented:

ping @scikit-learn/core-devs for more visibility.

(jeremiedbb added the API label and removed the Needs Triage label, Mar 27, 2024)
ogrisel (Member) commented Mar 28, 2024:

If we are to do such a big renaming, wouldn't it make sense to use a more explicit name such as l2_regularization or l2_reg?

Also note: the alpha in Lasso / the GLMs and the one in Ridge do not have the same meaning. One is scaled by the sum of sample weights (or n_samples) while the other is not. If we are to use a more explicit name such as l2_reg(ularization), we could use that opportunity to also make this scaling uniform across all linear models.
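This scaling difference can be illustrated with a pure-NumPy ridge solver (a sketch using the closed-form solution, not scikit-learn's implementation): under the unscaled convention, duplicating every sample changes the solution for a fixed alpha, while under the per-sample-scaled convention used by Lasso and the GLMs it does not.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
alpha = 1.0


def ridge_unscaled(X, y, alpha):
    # Ridge-style objective: ||y - Xw||^2 + alpha * ||w||^2  (no 1/n factor)
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


def ridge_scaled(X, y, alpha):
    # GLM/Lasso-style objective: (1/n) ||y - Xw||^2 + alpha * ||w||^2
    n, n_features = X.shape
    return np.linalg.solve(X.T @ X / n + alpha * np.eye(n_features), X.T @ y / n)


# Duplicate every sample: the "same" dataset, seen twice.
X2, y2 = np.vstack([X, X]), np.concatenate([y, y])

# Unscaled convention: duplication effectively halves the regularization.
assert not np.allclose(ridge_unscaled(X, y, alpha), ridge_unscaled(X2, y2, alpha))
# Scaled convention: the solution is invariant to duplicating the data.
assert np.allclose(ridge_scaled(X, y, alpha), ridge_scaled(X2, y2, alpha))
```

The invariance argument is purely algebraic: duplicating the data doubles both X.T @ X and X.T @ y, which the 1/n factor cancels exactly.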

We discussed in the past whether this difference is intentional, since one parametrization can make hyperparameter tuning more efficient. But I think we can decouple the choice of the default parameter value (which can stay estimator specific) from the choice of the parametrization, which we could make uniform across all linear models to simplify the message in the narrative docs, the reference docs, and the various examples.

We could also decide to move progressively: first rename C to l2_reg, and defer the decision on renaming/uniformizing alpha to a later time (if ever).

jeremiedbb (Member) commented:

But C controls both l1 and l2 regularization (as alpha does for ElasticNet), their relative strength being controlled by l1_ratio. I'd find it confusing to rename it l2_reg and then enable l1 regularization by setting l2_reg=<some value> and l1_ratio=1.
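The shared role of the strength parameter is easiest to see in the elastic-net penalty itself, written here as a small sketch following the convention documented for ElasticNet's objective (the function name is illustrative, not a scikit-learn API):

```python
import numpy as np


def elastic_net_penalty(w, alpha, l1_ratio):
    # Elastic-net penalty as in scikit-learn's ElasticNet objective:
    #   alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2
    # A single strength parameter (alpha, or 1/C in LogisticRegression)
    # scales both terms; l1_ratio only sets their relative weight.
    w = np.asarray(w, dtype=float)
    l1 = np.sum(np.abs(w))
    l2 = np.sum(w ** 2)
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)


w = [1.0, -2.0, 0.5]  # ||w||_1 = 3.5, ||w||_2^2 = 5.25
# l1_ratio=1 -> pure l1 penalty; l1_ratio=0 -> pure l2; alpha scales both.
assert elastic_net_penalty(w, alpha=2.0, l1_ratio=1.0) == 2.0 * 3.5
assert elastic_net_penalty(w, alpha=2.0, l1_ratio=0.0) == 0.5 * 2.0 * 5.25
```

With this parametrization, a name tied to one norm (like l2_reg) for the overall strength would indeed be misleading.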

ogrisel (Member) commented Mar 28, 2024:

Indeed l2_reg(ularization) would be a bad name: reg(ularization)_strength or just regularization might be a better name.

lorentzenchr (Member, Author) commented:

> Indeed l2_reg(ularization) would be a bad name: reg(ularization)_strength or just regularization might be a better name.

I would prefer a consistent name among all linear models and alpha seems to be used most often as penalty strength.

As of version 1.4:

| estimator | type | penalty strength parameter | additional penalty parameters |
|---|---|---|---|
| LogisticRegression | classifier | C | penalty, l1_ratio |
| PassiveAggressiveClassifier | classifier | C | |
| RidgeClassifier | classifier | alpha | |
| SGDClassifier | classifier | alpha | penalty, l1_ratio |
| Ridge | regressor | alpha | |
| SGDRegressor | regressor | alpha | penalty, l1_ratio |
| ElasticNet | regressor | alpha | l1_ratio |
| Lasso | regressor | alpha | |
| LassoLars | regressor | alpha | |
| OrthogonalMatchingPursuit | regressor | n_nonzero_coefs | |
| HuberRegressor | regressor | alpha | |
| QuantileRegressor | regressor | alpha | |
| PoissonRegressor | regressor | alpha | |
| TweedieRegressor | regressor | alpha | |
| GammaRegressor | regressor | alpha | |
| PassiveAggressiveRegressor | regressor | C | |
