MAINT Clean up deprecations for 1.5: in NearestCentroid #28813

jeremiedbb
Member

Removed the deprecated metrics from NearestCentroid (everything but euclidean and manhattan).

I also reworked the docstring a bit because it was wrong:

  • for euclidean, the mean doesn't minimize the sum of L2 distances (this would be the geometric median) but the sum of squared L2 distances.
  • for manhattan, the median is ambiguous. We take the feature-wise median which minimizes the sum of L1 distances, not to be confused with the geometric median which minimizes the sum of L2 distances.
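The two points above can be checked numerically. Below is a minimal sketch in plain Python (standard library only), not the scikit-learn implementation: the data points, the helper names (`feature_wise`, `sum_sq_l2`, `sum_l2`, `sum_l1`) and the candidate grid are all made up for illustration.

```python
import math
from statistics import mean, median


def feature_wise(points, reducer):
    """Apply `reducer` to each feature (column) independently."""
    return tuple(reducer(column) for column in zip(*points))


def sum_sq_l2(points, c):
    """Sum of squared Euclidean (L2) distances from each point to c."""
    return sum(sum((x - ci) ** 2 for x, ci in zip(p, c)) for p in points)


def sum_l2(points, c):
    """Sum of plain Euclidean (L2) distances from each point to c."""
    return sum(
        math.sqrt(sum((x - ci) ** 2 for x, ci in zip(p, c))) for p in points
    )


def sum_l1(points, c):
    """Sum of Manhattan (L1) distances from each point to c."""
    return sum(sum(abs(x - ci) for x, ci in zip(p, c)) for p in points)


# Toy class with one outlier (made-up data).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]

mean_c = feature_wise(points, mean)      # (2.75, 2.75)
median_c = feature_wise(points, median)  # (0.5, 0.5)

# Coarse grid of alternative centroids to compare against.
candidates = [(a / 2, b / 2) for a in range(-4, 25) for b in range(-4, 25)]

# The mean is at least as good as every candidate for squared L2 distances.
assert all(
    sum_sq_l2(points, mean_c) <= sum_sq_l2(points, c) + 1e-9 for c in candidates
)
# The feature-wise median is at least as good as every candidate for L1.
assert all(
    sum_l1(points, median_c) <= sum_l1(points, c) + 1e-9 for c in candidates
)
# The mean does not minimize the plain (non-squared) L2 sum:
# here even the feature-wise median does better.
assert sum_l2(points, median_c) < sum_l2(points, mean_c)
```

The grid comparison is only a sanity check, not a proof. The last assertion shows that the mean is not the minimizer of the sum of L2 distances on this data; it does not claim that the feature-wise median is the geometric median.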

jeremiedbb added the "No Changelog Needed" and "Quick Review" labels Apr 11, 2024

github-actions bot commented Apr 11, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: f68fe40. Link to the linter CI: here

Member

ogrisel left a comment

Some feedback. Otherwise, LGTM.

If `metric="euclidean"`, the centroid for the samples corresponding to each
class is the arithmetic mean, which minimizes the sum of squared L2 distances.
If `metric="manhattan"`, the centroid is the feature-wise median, which
minimizes the sum of L1 distances.

.. deprecated:: 1.3
Support for metrics other than `euclidean` and `manhattan` and for
Member

This section should be updated as a versionchanged (as done for the metric='precomputed' case in 0.19) or just removed (but then the previous versionchanged should also be removed for consistency).

@@ -111,12 +103,7 @@ class NearestCentroid(ClassifierMixin, BaseEstimator):
_valid_metrics = set(_VALID_METRICS) - {"mahalanobis", "seuclidean", "wminkowski"}
Member

This attribute can probably also be deleted, no?

jeremiedbb added this to the 1.5 milestone Apr 25, 2024
Member

glemaitre left a comment

nitpicks. Otherwise LGTM.

sklearn/neighbors/_nearest_centroid.py (outdated, resolved)
sklearn/neighbors/_nearest_centroid.py (outdated, resolved)
jeremiedbb and others added 2 commits April 26, 2024 17:51
Co-authored-by: Guillaume Lemaitre <guillaume@probabl.ai>
Co-authored-by: Guillaume Lemaitre <guillaume@probabl.ai>
adrinjalali merged commit 4c89b3b into scikit-learn:main May 2, 2024
30 checks passed
4 participants