
Evaluate Api examples #5186

Merged · 12 commits · Jan 14, 2022
27 changes: 27 additions & 0 deletions examples/evaluation/README.md
@@ -0,0 +1,27 @@
### MLflow Evaluation Examples

The examples in this directory illustrate how you can use the `mlflow.evaluate` API to evaluate a PyFunc model on a
specified dataset using the built-in default evaluator, and log the resulting metrics and artifacts to MLflow Tracking.

- Example `evaluate_on_binary_classifier.py` evaluates an XGBoost `XGBClassifier` model on the dataset loaded by
  `shap.datasets.adult`.
- Example `evaluate_on_multiclass_classifier.py` evaluates a scikit-learn `LogisticRegression` model on a dataset
  generated by `sklearn.datasets.make_classification`.
- Example `evaluate_on_regressor.py` evaluates a scikit-learn `LinearRegression` model on the dataset loaded by
  `sklearn.datasets.fetch_california_housing`.

#### Prerequisites

```
pip install scikit-learn xgboost shap matplotlib
```

#### How to run the examples

Run the scripts from this directory with Python:

```
python evaluate_on_binary_classifier.py
python evaluate_on_multiclass_classifier.py
python evaluate_on_regressor.py
```
31 changes: 31 additions & 0 deletions examples/evaluation/evaluate_on_binary_classifier.py
@@ -0,0 +1,31 @@
import xgboost
import shap
import mlflow
from sklearn.model_selection import train_test_split

# train XGBoost model
X, y = shap.datasets.adult()

num_examples = len(X)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

model = xgboost.XGBClassifier().fit(X_train, y_train)

eval_data = X_test
eval_data["label"] = y_test

with mlflow.start_run() as run:
    mlflow.sklearn.log_model(model, "model")
    model_uri = mlflow.get_artifact_uri("model")
    result = mlflow.evaluate(
        model_uri,
        eval_data,
        targets="label",
        model_type="classifier",
        dataset_name="adult",
        evaluators=["default"],
    )

print(f"metrics:\n{result.metrics}")
print(f"artifacts:\n{result.artifacts}")
@harupy (Member) commented on Jan 14, 2022:

This line's output looks like:

{'roc_curve_plot': <mlflow.models.evaluation.artifacts.ImageEvaluationArtifact object at 0x7f5d98244950>,
 'precision_recall_curve_plot': <mlflow.models.evaluation.artifacts.ImageEvaluationArtifact object at 0x7f5d982890d0>,
 'lift_curve_plot': <mlflow.models.evaluation.artifacts.ImageEvaluationArtifact object at 0x7f5d985bc250>,
 'confusion_matrix': <mlflow.models.evaluation.artifacts.ImageEvaluationArtifact object at 0x7f5d9826bc10>,
 'shap_beeswarm_plot': <mlflow.models.evaluation.artifacts.ImageEvaluationArtifact object at 0x7f5d98123fd0>,
 'shap_summary_plot': <mlflow.models.evaluation.artifacts.ImageEvaluationArtifact object at 0x7f5d2c18fd50>,
 'shap_feature_importance_plot': <mlflow.models.evaluation.artifacts.ImageEvaluationArtifact object at 0x7f5d96eb1990>}

We might want to implement `__str__` for `EvaluationArtifact` for better string representation (e.g. `<EvaluationArtifact file_name.png>`).

cc @dbczumar

@WeichenXu123 (Collaborator, Author) replied:

We can do this in follow-up updates.

@WeichenXu123 (Collaborator, Author) replied:

I added a `__repr__` for evaluation artifacts; the format is like `ImageEvaluationArtifact(uri='...')`.
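For illustration, a minimal sketch of such a `__repr__` (the class structure and the `_uri` attribute name here are assumptions for the sketch, not necessarily MLflow's actual internals):

```
class EvaluationArtifact:
    """Sketch of an artifact base class; MLflow's real class holds more state."""

    def __init__(self, uri):
        self._uri = uri  # assumed attribute name

    def __repr__(self):
        # Subclasses inherit this, so an image artifact prints as
        # ImageEvaluationArtifact(uri='...').
        return f"{type(self).__name__}(uri='{self._uri}')"


class ImageEvaluationArtifact(EvaluationArtifact):
    pass


print(ImageEvaluationArtifact("runs:/abc/roc_curve_plot.png"))
# ImageEvaluationArtifact(uri='runs:/abc/roc_curve_plot.png')
```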

27 changes: 27 additions & 0 deletions examples/evaluation/evaluate_on_multiclass_classifier.py
@@ -0,0 +1,27 @@
import mlflow
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

mlflow.sklearn.autolog()

X, y = make_classification(n_samples=10000, n_classes=10, n_informative=5, random_state=1)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

with mlflow.start_run() as run:
    model = LogisticRegression(solver="liblinear").fit(X_train, y_train)
    model_uri = mlflow.get_artifact_uri("model")
    result = mlflow.evaluate(
        model_uri,
        X_test,
        targets=y_test,
        model_type="classifier",
        dataset_name="multiclass-classification-dataset",
        evaluators="default",
        evaluator_config={"log_model_explainability": True, "explainability_nsamples": 1000},
    )

print(f"run_id={run.info.run_id}")
print(f"metrics:\n{result.metrics}")
print(f"artifacts:\n{result.artifacts}")
30 changes: 30 additions & 0 deletions examples/evaluation/evaluate_on_regressor.py
@@ -0,0 +1,30 @@
import mlflow
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

mlflow.sklearn.autolog()

california_housing_data = fetch_california_housing()

X_train, X_test, y_train, y_test = train_test_split(
    california_housing_data.data, california_housing_data.target, test_size=0.33, random_state=42
)

with mlflow.start_run() as run:
    model = LinearRegression().fit(X_train, y_train)
    model_uri = mlflow.get_artifact_uri("model")

    result = mlflow.evaluate(
        model_uri,
        X_test,
        targets=y_test,
        model_type="regressor",
        dataset_name="california_housing",
        evaluators="default",
        feature_names=california_housing_data.feature_names,
        evaluator_config={"explainability_nsamples": 1000},
    )

print(f"metrics:\n{result.metrics}")
print(f"artifacts:\n{result.artifacts}")
2 changes: 2 additions & 0 deletions mlflow/models/evaluation/lift_curve.py
@@ -6,6 +6,7 @@
def _cumulative_gain_curve(y_true, y_score, pos_label=None):
"""
This method is copied from scikit-plot package.
See https://github.com/reiinakano/scikit-plot/blob/2dd3e6a76df77edcbd724c4db25575f70abb57cb/scikitplot/helpers.py#L157

This function generates the points necessary to plot the Cumulative Gain

@@ -77,6 +78,7 @@ def plot_lift_curve(
):
"""
This method is copied from scikit-plot package.
See https://github.com/reiinakano/scikit-plot/blob/2dd3e6a76df77edcbd724c4db25575f70abb57cb/scikitplot/metrics.py#L1133

Generates the Lift Curve from labels and scores/probabilities

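For context, the points that `_cumulative_gain_curve` produces follow the standard cumulative gain definition, which can be sketched as follows (a simplified illustration, not the scikit-plot implementation verbatim; the function name here is hypothetical):

```
import numpy as np


def cumulative_gain_points(y_true, y_score, pos_label=1):
    """Fraction of all positives captured vs. fraction of samples, ranked by score."""
    y_true = np.asarray(y_true) == pos_label
    # Rank samples from highest to lowest predicted score.
    order = np.argsort(y_score)[::-1]
    y_true = y_true[order]
    # Share of all positives found within the top-k ranked samples.
    gains = np.cumsum(y_true) / y_true.sum()
    percentages = np.arange(1, len(y_true) + 1) / len(y_true)
    return percentages, gains
```

The lift curve drawn by `plot_lift_curve` is essentially the ratio of these `gains` to the baseline `percentages`, i.e. how much better the model does than random selection at each cutoff.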