XGBoost support #4204

Open
3 of 5 tasks
mikaelmv opened this issue Mar 26, 2021 · 3 comments
Labels
area/models MLmodel format, model serialization/deserialization, flavors area/tracking Tracking service, tracking client APIs, autologging bug Something isn't working

Comments

@mikaelmv

Thank you for submitting an issue. Please refer to our issue policy for additional information about bug reports. For help with debugging your code, please refer to Stack Overflow.

Please fill in this bug report template to ensure a timely and thorough response.

Willingness to contribute

The MLflow Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the MLflow code base?

  • Yes. I can contribute a fix for this bug independently.
  • Yes. I would be willing to contribute a fix for this bug with guidance from the MLflow community.
  • No. I cannot contribute a bug fix at this time.

System information

  • Have I written custom code (as opposed to using a stock example script provided in MLflow): Custom
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
  • MLflow installed from (source or binary): source
  • MLflow version (run mlflow --version): 1.14.1
  • Python version: 3.7
  • npm version, if running the dev UI:
  • Exact command to reproduce:

Describe the problem

Hi,

I am using a Scikit-Learn wrapper for XGBoost (XGBRegressor).

When I log the model with mlflow (mlflow.xgboost.log_model), it crashes, as the official documentation anticipates:

https://www.mlflow.org/docs/latest/python_api/mlflow.xgboost.html

I noticed that if I use mlflow.sklearn.log_model instead, the artifact gets saved as expected.

  1. What are the implications of using mlflow.sklearn.log_model with an XGBoost model? Not clear to me.
  2. What alternatives do I have to log my XGBRegressor model correctly with mlflow?

Obviously if I instead use a RandomForest, then mlflow.sklearn.log_model works perfectly as expected, so again my issue is to log the XGBRegressor model.

Code to reproduce issue

Provide a reproducible test case that is the bare minimum necessary to generate the problem.

mlflow.xgboost.log_model(pipeline, "model") (where pipeline combines transformers and the XGBRegressor)

Other info / logs

    151
    152     # Save an XGBoost model
--> 153     xgb_model.save_model(model_data_path)
    154
    155     conda_env_subpath = "conda.yaml"

AttributeError: 'Pipeline' object has no attribute 'save_model'
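
The traceback shows mlflow.xgboost calling save_model on whatever object it is given, which a scikit-learn Pipeline does not have. A minimal sketch of guarding against that by picking the flavor from the object itself is below. The helper names choose_flavor and log_with_matching_flavor and the two stand-in classes are hypothetical, introduced only for illustration. The attribute check is a heuristic that mirrors the failing call, and it does not settle whether a given MLflow version accepts scikit-learn-style XGBoost wrappers in mlflow.xgboost.

```python
def choose_flavor(model):
    """Return "xgboost" if the object exposes a callable save_model, else "sklearn".

    Heuristic only: it mirrors the xgb_model.save_model(...) call in the traceback.
    """
    has_save_model = callable(getattr(model, "save_model", None))
    return "xgboost" if has_save_model else "sklearn"


def log_with_matching_flavor(model, artifact_path="model"):
    """Log the model with the flavor chosen above (hypothetical helper).

    MLflow is imported lazily so choose_flavor can be used without it installed.
    Assumes an active or implicitly created MLflow run.
    """
    flavor = choose_flavor(model)
    if flavor == "xgboost":
        import mlflow.xgboost
        mlflow.xgboost.log_model(model, artifact_path)
    else:
        import mlflow.sklearn
        mlflow.sklearn.log_model(model, artifact_path)
    return flavor


class FakeBooster:
    """Stand-in for an object with save_model, such as xgboost.Booster."""

    def save_model(self, path):
        pass


class FakePipeline:
    """Stand-in for sklearn.pipeline.Pipeline, which has no save_model."""


print(choose_flavor(FakeBooster()))
print(choose_flavor(FakePipeline()))
```

With the real objects, log_with_matching_flavor(pipeline) would route the Pipeline to mlflow.sklearn.log_model, which is the call reported above as saving the artifact successfully.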

What component(s), interfaces, languages, and integrations does this bug affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
@mikaelmv mikaelmv added the bug Something isn't working label Mar 26, 2021
@github-actions github-actions bot added area/artifacts Artifact stores and artifact logging area/model-registry Model registry, model registry APIs, and the fluent client calls for model registry labels Mar 26, 2021
@dmatrix dmatrix added area/tracking Tracking service, tracking client APIs, autologging area/models MLmodel format, model serialization/deserialization, flavors and removed area/artifacts Artifact stores and artifact logging area/model-registry Model registry, model registry APIs, and the fluent client calls for model registry labels Apr 1, 2021
@dmatrix
Contributor

dmatrix commented Apr 1, 2021

@mikaelmv Thanks for filing this. Can you give me a small example that I can run to reproduce this? I know that for MLflow autologging we don't support the scikit-learn API.

@dmatrix dmatrix added the needs author feedback Issue is waiting for the author to respond label Apr 1, 2021
@chedikouki

Is there any solution?
mlflow.xgboost.save_model(pipeline,'Model')
AttributeError: 'Pipeline' object has no attribute 'save_model'

@stale stale bot removed the needs author feedback Issue is waiting for the author to respond label Apr 12, 2022
@harupy
Member

harupy commented Jun 3, 2022

#4954 fixed this issue.

@harupy harupy closed this as completed Jun 3, 2022
@harupy harupy reopened this Jun 3, 2022

4 participants