Suggestion: Remove prediction from plot_confusion_matrix and just pass predicted labels #15880

Closed · 4 of 5 tasks
jhennrich opened this issue Dec 13, 2019 · 64 comments

@jhennrich

jhennrich commented Dec 13, 2019

The signature of plot_confusion_matrix is currently:

sklearn.metrics.plot_confusion_matrix(estimator, X, y_true, labels=None, sample_weight=None, normalize=None, display_labels=None, include_values=True, xticks_rotation='horizontal', values_format=None, cmap='viridis', ax=None)

The function takes an estimator and raw data and cannot be used with already-predicted labels. This has some downsides:

  • If a confusion matrix should be plotted but the predictions are also needed elsewhere (e.g. for calculating accuracy_score), the prediction has to be performed several times. That takes longer and can yield different values if the estimator is randomized.
  • If no estimator is available (e.g. predictions loaded from a file), the plot cannot be used at all.

Suggestion: allow passing predicted labels y_pred to plot_confusion_matrix, to be used instead of estimator and X. In my opinion the cleanest solution would be to remove the prediction step from the function and use a signature similar to that of accuracy_score, e.g. (y_true, y_pred, labels=None, sample_weight=None, ...). However, in order to maintain backwards compatibility, y_pred could be added as an optional keyword argument.
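For illustration, a minimal sketch of how the proposed y_pred keyword could be used (hypothetical call, not the released API; clf, X_test and y_test are assumed to exist):

from sklearn.metrics import accuracy_score, plot_confusion_matrix

y_pred = clf.predict(X_test)            # predict only once
print(accuracy_score(y_test, y_pred))   # reuse the predictions, e.g. for accuracy

# hypothetical backwards-compatible call: y_pred passed instead of estimator and X
plot_confusion_matrix(None, None, y_test, y_pred=y_pred)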

TODO:

@NicolasHug (Member)

We should definitely stay backward compatible, but adding a y_pred keyword arg sounds reasonable to me. We should raise an error if y_pred is passed while X or estimator is also passed.
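A minimal sketch of the kind of mutual-exclusion check this implies (the helper name is illustrative, not the merged implementation):

def _check_plot_inputs(estimator=None, X=None, y_pred=None):
    # hypothetical helper: y_pred is mutually exclusive with (estimator, X)
    if y_pred is not None and (estimator is not None or X is not None):
        raise ValueError("Pass either (estimator, X) or y_pred, not both.")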

Would you want to submit a PR, @jhennrich?

@jhennrich (Author)

I submitted a PR, but I think there is currently a problem with the CI so it has not passed yet.

@qinhanmin2014 (Member)

I agree that we should support plot_XXX(y_true, y_pred) to avoid calculating the predictions multiple times.
We also have similar issues in plot_roc_curve and plot_precision_recall_curve.
Adding y_pred seems acceptable, but honestly I don't think it's a good solution.
For those functions which accept **kwargs (e.g., plot_precision_recall_curve), it seems impossible to stay backward compatible?

qinhanmin2014 added this to the 0.22.1 milestone Dec 13, 2019
@NicolasHug (Member)

Why is it impossible to keep backward compatibility? It seems to me that the proposal in #15883 is OK.

@qinhanmin2014 (Member)

Why is it impossible to keep backward compatibility? It seems to me that the proposal in #15883 is OK.

because we do not support **kwargs in plot_confusion_matrix. @NicolasHug

@NicolasHug (Member)

Why is kwargs a problem?

@qinhanmin2014 (Member)

Hmm, so there's another annoying thing: we support **kwargs in plot_roc_curve and plot_precision_recall_curve (and plot_partial_dependence), but we do not support it in plot_confusion_matrix.

@qinhanmin2014 (Member)

Why is kwargs a problem?

if we add the new parameter before **kwargs, we can keep backward compatibility, right?

@jhennrich (Author)

jhennrich commented Dec 13, 2019

The changes in my PR are backwards compatible and **kwargs can still be added. But I agree with @qinhanmin2014, a much much cleaner solution would be to throw out estimator and X and use positional arguments (y_true, y_pred, ...) that are consistent with most of the other sklearn stuff.

@NicolasHug (Member)

if we add the new parameter before **kwargs, we can keep backward compatibility, right?

yes

a much much cleaner solution....

Unfortunately that would require a deprecation cycle (unless we make it very fast in the bugfix release but I doubt it...)

@thomasjpfan , any reason to pass the estimator as input instead of the predictions?

@qinhanmin2014 (Member)

Thanks, let's add y_pred first; **kwargs is another issue.

@qinhanmin2014 (Member)

Unfortunately that would require a deprecation cycle (unless we make it very fast in the bugfix release but I doubt it...)

This seems impossible, sigh

@thomasjpfan , any reason to pass the estimator as input instead of the predictions?

I agree that we need to reconsider our API design. Also pinging @amueller.

@thomasjpfan (Member)

thomasjpfan commented Dec 13, 2019

If a user wants to handle the plotting part themselves and provide their own confusion matrix:

from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

cm = confusion_matrix(...)   # precomputed confusion matrix
display_labels = [...]

disp = ConfusionMatrixDisplay(confusion_matrix=cm,
                              display_labels=display_labels)
disp.plot(...)

This can similarly be done for the other metric plotting functions.

plot_confusion_matrix is kind of designed like the scorers, which are able to handle the output of estimators nicely. In other words, it is a convenience wrapper for interacting with ConfusionMatrixDisplay and the estimator.
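Roughly, that wrapper boils down to something like the following (a simplified sketch with a hypothetical name, not the actual implementation):

from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

def plot_confusion_matrix_sketch(estimator, X, y_true, **plot_kwargs):
    # predict, build the matrix, then delegate the drawing to the Display object
    y_pred = estimator.predict(X)
    cm = confusion_matrix(y_true, y_pred, labels=estimator.classes_)
    disp = ConfusionMatrixDisplay(confusion_matrix=cm,
                                  display_labels=estimator.classes_)
    return disp.plot(**plot_kwargs)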

@thomasjpfan (Member)

thomasjpfan commented Dec 13, 2019

By accepting the estimator first, there is a uniform interface for the plotting functions. For example, plot_partial_dependence does all the computation needed for creating the partial dependence plots and passes it to PartialDependenceDisplay. A user can still create the PartialDependenceDisplay themselves, but in that case it would be more involved.

@thomasjpfan (Member)

thomasjpfan commented Dec 13, 2019

Although I am open to having a "fast path": allowing y_pred to be passed into the metrics-related plotting functions, where it would be passed directly to confusion_matrix, letting it deal with validation.

@NicolasHug (Member)

NicolasHug commented Dec 13, 2019

The computation of the predictions needed to build PDPs is quite complex. Also, these predictions are typically unusable in e.g. a scorer or a metric; they're only useful for plotting the PDP. So it makes sense in this case to only accept the estimator in plot_partial_dependence.

OTOH for confusion matrix, the predictions are really just est.predict(X).

I don't think we want a uniform interface here. These are two very different use-cases.

EDIT: In addition, the tree-based PDPs don't even need predictions at all.

@thomasjpfan (Member)

thomasjpfan commented Dec 13, 2019

There are other things we will run into without the estimator. For example, if plot_precision_recall_curve were to accept y_pred, it would need pos_label because it cannot be inferred anymore. In this case, I would prefer to use PrecisionRecallDisplay directly and have the user calculate the parameters needed to reconstruct the plot.
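For instance, a rough sketch of that direct route, with pos_label supplied explicitly (y_true and y_score are assumed to exist; the estimator_name is just an illustrative label):

from sklearn.metrics import (average_precision_score, precision_recall_curve,
                             PrecisionRecallDisplay)

precision, recall, _ = precision_recall_curve(y_true, y_score, pos_label=1)
ap = average_precision_score(y_true, y_score, pos_label=1)
disp = PrecisionRecallDisplay(precision=precision, recall=recall,
                              average_precision=ap, estimator_name="my classifier")
disp.plot()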

This comes down to what kind of question we are answering with this API. The current interface revolves around evaluating an estimator, thus using the estimator as an argument. It is motivated by answering "how does this trained model behave with this input data?"

If we accept y_true, y_pred, the question becomes "how does this metric behave with this data?" This data may or may not have been generated by a model.

@NicolasHug (Member)

It's true that in this specific case, @jhennrich you could directly be using the ConfusionMatrixDisplay.

One drawback is that you need to specify display_labels since it has no default.

@thomasjpfan do you think we could in general provide sensible defaults for the Display objects, thus still making the direct use of the Display objects practical?

@thomasjpfan (Member)

For some parameters, like display_labels, there is a reasonable default. The other Display object parameters can have reasonable defaults as well. Some parameters must be provided though: for example, confusion_matrix must be provided for ConfusionMatrixDisplay, or precision and recall for PrecisionRecallDisplay.
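A small sketch of what that could look like if display_labels were given a default (hypothetical at the time of this comment; y_test and y_pred are assumed to exist):

from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

cm = confusion_matrix(y_test, y_pred)
disp = ConfusionMatrixDisplay(confusion_matrix=cm)   # display_labels left to a hypothetical default
disp.plot()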

@jnothman (Member)

One classic pattern for this kind of thing is defining:

ConfusionMatrixDisplay.from_estimator(...)
ConfusionMatrixDisplay.from_predictions(...)

but this is not very idiomatic to scikit-learn.
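In terms of usage, the pattern would read something like this (a sketch of the proposal; these classmethods did not exist yet when this was written, and clf, X_test, y_test are assumed):

from sklearn.metrics import ConfusionMatrixDisplay

# from a fitted estimator and raw data
disp = ConfusionMatrixDisplay.from_estimator(clf, X_test, y_test)

# from precomputed predictions
disp = ConfusionMatrixDisplay.from_predictions(y_test, clf.predict(X_test))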

@qinhanmin2014 (Member)

qinhanmin2014 commented Dec 16, 2019

I'm starting to get confused. The goal of the current API is to avoid computing multiple times if users want to plot multiple times, but if we accept y_true and y_pred, users still don't need to compute multiple times? (I know that things are different in PDP)

@thomasjpfan (Member)

@jnothman That API is pretty nice looking!

@qinhanmin2014 Passing estimator, X, y or y_true, y_pred both satisfy the "do not calculate multiple times" requirement. In both cases, the confusion matrix is computed and stored in the Display object.

The difference between them is where the calculation of the confusion matrix starts. One can think of passing y_pred as providing the "precomputed" output of the estimator.

@qinhanmin2014 (Member)

So I think y_true, y_pred is better than estimator, X, y (not in PDP of course), because sometimes (often?) users not only want to plot the predictions, they also want to analyze them. With the current API, they'll need to calculate the predictions multiple times.

@thomasjpfan (Member)

For metrics, I can see the preference toward using y_true, y_pred over estimator, X, y. Imagine if the plotting for metrics supported only y_true, y_pred

est = ...  # a fitted estimator

plot_partial_dependence(est, X, ...)

# if plot_confusion_matrix accepts `y_true, y_pred`
y_pred = est.predict(X)
plot_confusion_matrix(y_true, y_pred, ...)

# if plot_roc_curve supports `y_true, y_score`
y_score = est.predict_proba(X)[:, 1]
plot_roc_curve(y_true, y_score, ...)
plot_precision_recall_curve(y_true, y_score, ...)

Currently the API looks like:

est = ...  # a fitted estimator
plot_partial_dependence(est, X, ...)
plot_confusion_matrix(est, X, y, ...)
plot_roc_curve(est, X, y, ...)

# this will call `predict_proba` again
plot_precision_recall_curve(est, X, y, ...)

I would prefer to have an API that supports both options (somehow).

@qinhanmin2014 (Member)

For metrics, I can see the preference toward using y_true, y_pred over estimator, X, y. Imagine if the plotting for metrics supported only y_true, y_pred

Yes, this is what I mean.

I would prefer to have an API that supports both options (somehow).

I think this is a practical solution. An annoying thing is that we can only add y_pred at the end (i.e., plot_confusion_matrix(estimator, X, y_true, ..., y_pred))
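Concretely, keeping the current positional order means the new keyword can only be appended after the parameters of the signature quoted at the top of the issue, e.g. (hypothetical sketch, not the merged API):

def plot_confusion_matrix(estimator, X, y_true, labels=None, sample_weight=None,
                          normalize=None, display_labels=None, include_values=True,
                          xticks_rotation='horizontal', values_format=None,
                          cmap='viridis', ax=None, y_pred=None):
    ...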

@NicolasHug (Member)

The blocker label is for release blockers (things that absolutely need to be fixed before a release), not for PR blockers

@lucyleeow (Member)

Ahh good to know.

@pzelasko

@pzelasko @jhennrich how do you feel about having two classmethods or two functions? Or would you prefer a single function, which is a bit messy in Python?

And if you prefer two functions or two classmethods, do you see any benefit besides discoverability? Discoverability might be enough of a reason to do classmethods though; I don't see a strong argument for having two functions.

I like the two-classmethods approach the most, especially the from_xxx pattern, something like what @thomasjpfan proposed:

CalibrationDisplay.from_estimator(...)
CalibrationDisplay.from_predictions(...)

@NicolasHug (Member)

Looks like there's no strong opposition to using two class methods, so let's do that. We'll need to:

  • Introduce the class methods for the currently existing plots:

    • ConfusionMatrixDisplay
    • PrecisionRecallDisplay
    • RocCurveDisplay
    • DetCurveDisplay
    • PartialDependenceDisplay. For this one, we don't want to introduce the from_predictions classmethod because it would not make sense; we only want from_estimator.
  • For all Displays listed above, deprecate their corresponding plot_... function. We don't need to deprecate plot_det_curve because it hasn't been released yet; we can just remove it.

  • For new PRs like ENH Add CalibrationDisplay plotting class #17443 and FEA add PredictionErrorDisplay #18020, we can implement the class methods right away instead of introducing a plot function.

This is a bit of work but I think we can get this done before 0.24 so that #17443 and #18020 can move forward already.

Any objection @thomasjpfan @jnothman @amueller @glemaitre ?

@jhennrich @pzelasko , would you be interested in submitting a PR to introduce the class methods in one of the Display objects?

@lucyleeow (Member)

lucyleeow commented Oct 3, 2020

Thanks for making the decision @NicolasHug! I'll get onto #17443 (after waiting for objections)

@thomasjpfan (Member)

I have no objections.

@glemaitre (Member)

No objection as well.

@glemaitre (Member)

I will take care of the other classes then and advance my stalled PR.
@lucyleeow in case I don't get to all of those and you are looking for some PRs, ping me :)

@pzelasko

pzelasko commented Oct 6, 2020

I'd love to contribute but I'm engaged in too many projects at this time. Thanks for listening to the suggestions!

@amueller (Member)

amueller commented Oct 7, 2020

Sounds good :)

@adrinjalali (Member)

Seems like there's been good progress here. Removing the milestone, but pinging y'all in case you think something needs to get in before the release.

adrinjalali removed this from the 1.0 milestone Aug 22, 2021
@glemaitre (Member)

glemaitre commented Aug 22, 2021 via email

@thomasjpfan (Member)

I am closing this issue because all the tasks in the original issue have been merged.

Meeting Issues automation moved this from To do to Done Apr 15, 2022