Make it possible to specify interaction_cst and monotonic_cst with feature names. #24852

Closed
ogrisel opened this issue Nov 7, 2022 · 5 comments
@ogrisel
Member

ogrisel commented Nov 7, 2022

Describe the workflow you want to enable

Instead of passing an array of monotonicity constraints (-1 for a decreasing constraint, +1 for an increasing constraint, or 0 for no constraint) indexed by feature position in the training set, it would be more convenient to pass a dict that specifies constraints only for the relevant feature names (when those are available as str values in the dataset columns). For instance:

from sklearn.datasets import load_diabetes
from sklearn.ensemble import HistGradientBoostingRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)

reg = HistGradientBoostingRegressor(
    monotonic_cst={"bmi": +1, "s3": -1}
)
reg.fit(X, y)

Note that here X has column names because it is a pandas dataframe.

See #24845 for a similar feature for interaction_cst, which would accept a list of tuples of str names instead.
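To make the interaction_cst idea concrete, here is a rough sketch of how groups of feature names could be mapped to groups of column indices. The helper name `resolve_interaction_cst` is hypothetical and is not part of scikit-learn; the column names are those of the diabetes dataset used in the example above.

```python
def resolve_interaction_cst(interaction_cst, feature_names_in):
    """Hypothetical helper: map groups of feature names to sets of column indices."""
    if feature_names_in is None:
        raise ValueError(
            "interaction_cst was given feature names, but the training data "
            "has no feature names (feature_names_in_ is not defined)."
        )
    index_of = {name: i for i, name in enumerate(feature_names_in)}
    resolved = []
    for group in interaction_cst:
        # Reject names that do not match any training column.
        unknown = sorted(name for name in group if name not in index_of)
        if unknown:
            raise ValueError(f"Unknown feature names in interaction_cst: {unknown}")
        resolved.append({index_of[name] for name in group})
    return resolved


# Column names of the diabetes dataset when loaded with as_frame=True.
diabetes_columns = ["age", "sex", "bmi", "bp", "s1", "s2", "s3", "s4", "s5", "s6"]

groups = resolve_interaction_cst(
    [("age", "sex"), ("bmi", "bp", "s5")], diabetes_columns
)
print(groups)
```

Sets are used for each group because the order of features within an interaction group carries no meaning.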

Describe your proposed solution

This requires updating the fit method, docstring and examples to accept a dict of constraints with feature names as keys.

If feature_names_in_ is not defined in fit (for instance because X is a plain NumPy array), then a ValueError with a helpful error message must be raised.
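The conversion could look roughly like the following. This is only a sketch under assumptions: `resolve_monotonic_cst` is a hypothetical helper name, not the function that ended up in scikit-learn, and features missing from the dict are assumed to default to 0 (no constraint).

```python
def resolve_monotonic_cst(monotonic_cst, feature_names_in):
    """Hypothetical helper: turn a {feature name: constraint} dict into a positional list."""
    if feature_names_in is None:
        raise ValueError(
            "monotonic_cst was passed as a dict keyed by feature names, but the "
            "training data has no feature names (feature_names_in_ is not "
            "defined). Pass a pandas dataframe or use an array of constraints."
        )
    names = list(feature_names_in)
    # Keys that do not match any training column are most likely typos.
    unknown = sorted(set(monotonic_cst) - set(names))
    if unknown:
        raise ValueError(f"monotonic_cst contains unknown feature names: {unknown}")
    for name, value in monotonic_cst.items():
        if value not in (-1, 0, 1):
            raise ValueError(
                f"monotonic_cst[{name!r}] must be -1, 0 or +1, got {value!r}."
            )
    # Features not mentioned in the dict get 0, i.e. no constraint (assumption).
    return [monotonic_cst.get(name, 0) for name in names]


# Column names of the diabetes dataset when loaded with as_frame=True.
diabetes_columns = ["age", "sex", "bmi", "bp", "s1", "s2", "s3", "s4", "s5", "s6"]

cst = resolve_monotonic_cst({"bmi": +1, "s3": -1}, diabetes_columns)
print(cst)  # [0, 0, 1, 0, 0, 0, -1, 0, 0, 0]
```

Rejecting unknown names explicitly, rather than ignoring them, means a misspelled column name fails loudly instead of silently dropping a constraint.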

Describe alternatives you've considered, if relevant

No response

Additional context

Once #13649 is finalized and merged, a similar treatment should be applied to it for the sake of consistency.

@ogrisel ogrisel self-assigned this Nov 7, 2022
@ogrisel
Member Author

ogrisel commented Nov 7, 2022

I will give it a quick shot.

@ogrisel
Member Author

ogrisel commented Nov 10, 2022

I am working on a similar UX improvement for categorical_features.

@lorentzenchr
Member

If feature_names_in_ is defined in fit, then a value error with a helpful error message must be raised.

Did you mean „if … is NOT defined, …“?

@ogrisel
Member Author

ogrisel commented Nov 15, 2022

monotonic_cst was implemented in:

and something similar for categorical_features in:

but we still need to do the same for interaction_cst, maybe after #24849 has been reviewed and merged.

@lorentzenchr
Member

I guess this can be closed with #24855, #24889 and #24849 merged.
