Calibration: added expected calibration error (ECE) #28831
Draft
What does this implement/fix? Explain your changes.
This PR implements the Expected Calibration Error (ECE) as a function in the `sklearn.calibration` module. ECE is a widely used metric for quantifying the calibration of probabilistic machine learning models, particularly in binary classification tasks. It measures the discrepancy between predicted probabilities and empirical probabilities across different confidence levels. It is used in both "classical" machine learning and deep learning, for example:
"Obtaining Well Calibrated Probabilities Using Bayesian Binning", see PDF at https://ojs.aaai.org/index.php/AAAI/article/view/9602
The widely cited "On Calibration of Modern Neural Networks", see PDF at https://arxiv.org/pdf/1706.04599.pdf. The ECE is defined in equations 2 and 3.
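For reference, the binned ECE in the binary case can be sketched roughly as below. This is an illustrative sketch, not the code in this PR: `bin_stats` and `ece_sketch` are hypothetical names, and it assumes uniform-width bins on [0, 1] and the binary formulation, where each bin's fraction of positives is compared with its mean predicted probability.

```python
import numpy as np


def bin_stats(y_true, y_prob, n_bins=10):
    """Hypothetical helper: per-bin fraction of positives, mean probability, counts."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    # Uniform-width bins on [0, 1]. Only interior edges are passed to digitize,
    # so a probability of exactly 1.0 falls into the last bin.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(y_prob, edges[1:-1])
    counts = np.bincount(bin_ids, minlength=n_bins)
    sum_true = np.bincount(bin_ids, weights=y_true, minlength=n_bins)
    sum_prob = np.bincount(bin_ids, weights=y_prob, minlength=n_bins)
    nonempty = counts > 0  # drop empty bins to avoid division by zero
    frac_pos = sum_true[nonempty] / counts[nonempty]
    mean_prob = sum_prob[nonempty] / counts[nonempty]
    return frac_pos, mean_prob, counts[nonempty]


def ece_sketch(y_true, y_prob, n_bins=10):
    """Weighted mean absolute gap between observed and predicted frequency."""
    frac_pos, mean_prob, counts = bin_stats(y_true, y_prob, n_bins)
    weights = counts / counts.sum()
    return float(np.sum(weights * np.abs(frac_pos - mean_prob)))
```

The per-bin statistics are the same quantities a calibration curve plots, so a single helper can serve both the curve and the scalar summary; the ECE is just the count-weighted average of the per-bin gaps.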
Computing the ECE follows the same initial steps as the calibration curve, so I placed the common code into a new function called `calibration_stats`, which is then called by both `calibration_curve` and `expected_calibration_error`.
Any other comments?
I have not added unit tests yet. Sending the PR to get some feedback before proceeding.