Replies: 1 comment
-
For the case where the metric is undefined, we should have a well-defined fallback. @fabseb60 Feel free to take over the PR; it would greatly improve the current state of this corner case.
-
Dear all,
A number of colleagues and I would like to use the balanced accuracy score (https://scikit-learn.org/stable/modules/model_evaluation.html#balanced-accuracy-score) in our single-label multiclass classification work. The definition of this score that you give, though, is problematic, since it does not account for the case in which one of the classes is empty, which may indeed happen in many cases of practical interest. E.g., what happens, in the equation at the top of 3.3.2.4, when both TP and FN are zero? (That is, when the data contains no positive examples?) Some may be tempted to say that in this case we should consider TP/(TP+FN) to be 1, while others might be tempted to say that in this case we should simply omit the TP/(TP+FN) term from the average; note that these two alternatives return different results.
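To make the ambiguity concrete, here is a minimal sketch (the helper name and the toy labels are made up for illustration, not taken from scikit-learn) that computes macro-averaged recall over a dataset in which one class has no true examples, under the two conventions mentioned above:

```python
# Sketch of the two conventions for a class with no true examples,
# where per-class recall is 0/0. The function name and the toy data
# below are hypothetical, chosen only to illustrate the point.

def macro_recall(y_true, y_pred, classes, empty_class_recall=None):
    """Macro-averaged recall. For a class with no true examples
    (TP + FN == 0), use `empty_class_recall` as its recall (e.g. 1.0),
    or skip the class entirely when it is None."""
    recalls = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        if tp + fn == 0:           # empty class: recall is 0/0
            if empty_class_recall is None:
                continue           # convention B: ignore the class
            recalls.append(empty_class_recall)  # convention A
        else:
            recalls.append(tp / (tp + fn))
    return sum(recalls) / len(recalls)

# Three classes, but class "c" never occurs in y_true.
y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
classes = ["a", "b", "c"]

conv_a = macro_recall(y_true, y_pred, classes, empty_class_recall=1.0)
conv_b = macro_recall(y_true, y_pred, classes)  # skip empty classes
print(conv_a)  # (0.5 + 1.0 + 1.0) / 3 ≈ 0.833
print(conv_b)  # (0.5 + 1.0) / 2 = 0.75
```

The two conventions give different scores on the same predictions, which is exactly why the definition needs to specify what happens when a class is empty.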
Of course, this problem is relevant both in the binary case and in the single-label multiclass case (in which, as you say, balanced accuracy “is the macro-average of recall scores per class” – in this case, if one of the classes is empty, recall for that class is equal to 0/0). A 2015 paper of mine [1] discusses a solution; in that paper I present a version of balanced accuracy (that I call K) that is equivalent to balanced accuracy in unproblematic cases and is also defined for the above problematic case. K is defined for the binary case (Equation 5 in that paper), for the single-label multiclass case (Equation 10), and for the ordinal case, i.e., the single-label multiclass case in which the classes are ordered, as in e.g., Disastrous, Poor, OK-ish, Good, Excellent (Equations 10 and 11). You might consider replacing balanced accuracy with K.
Best FS
[1] Fabrizio Sebastiani: An Axiomatically Derived Measure for the Evaluation of Classification Algorithms. ICTIR 2015: 11-20. http://nmis.isti.cnr.it/sebastiani/Publications/ICTIR2015.pdf