DEPR: remove inplace arg in Categorical methods #49321

Merged 8 commits on Oct 29, 2022.
Changes from 5 commits
45 changes: 2 additions & 43 deletions doc/source/user_guide/categorical.rst
@@ -353,11 +353,6 @@ Renaming categories is done by using the

In contrast to R's ``factor``, categorical data can have categories of other types than string.

-.. note::
-
-Be aware that assigning new categories is an inplace operation, while most other operations
-under ``Series.cat`` per default return a new ``Series`` of dtype ``category``.
-
Categories must be unique or a ``ValueError`` is raised:

.. ipython:: python
@@ -952,7 +947,6 @@ categorical (categories and ordering). So if you read back the CSV file you have
relevant columns back to ``category`` and assign the right categories and categories ordering.

.. ipython:: python
-:okwarning:

import io

@@ -969,8 +963,8 @@ relevant columns back to ``category`` and assign the right categories and catego
df2["cats"]
# Redo the category
df2["cats"] = df2["cats"].astype("category")
-df2["cats"].cat.set_categories(
-["very bad", "bad", "medium", "good", "very good"], inplace=True
+df2["cats"] = df2["cats"].cat.set_categories(
+["very bad", "bad", "medium", "good", "very good"]
)
df2.dtypes
df2["cats"]
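
The hunk above replaces the in-place call with reassignment. As a minimal sketch of why the reassignment is needed (the Series `s` and the list `order` are illustrative names, not part of the PR), `set_categories` returns a new object and leaves the original untouched:

```python
import pandas as pd

# Hypothetical data; categories are inferred from the values.
s = pd.Series(["good", "bad", "medium"], dtype="category")
order = ["very bad", "bad", "medium", "good", "very good"]

# set_categories returns a new Series; `s` itself is not modified.
result = s.cat.set_categories(order)
print(list(s.cat.categories))       # the inferred categories
print(list(result.cat.categories))  # the categories given in `order`

# Assign the result back to keep the change, as the updated docs do.
s = s.cat.set_categories(order)
```

Without the assignment, the call has no lasting effect on `s`.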
@@ -1153,38 +1147,3 @@ Setting the index will create a ``CategoricalIndex``:
df.index
# This now sorts by the categories order
df.sort_index()
-
-Side effects
-~~~~~~~~~~~~
-
-Constructing a ``Series`` from a ``Categorical`` will not copy the input
-``Categorical``. This means that changes to the ``Series`` will in most cases
-change the original ``Categorical``:
-
-.. ipython:: python
-:okwarning:
-
-cat = pd.Categorical([1, 2, 3, 10], categories=[1, 2, 3, 4, 10])
-s = pd.Series(cat, name="cat")
-cat
-s.iloc[0:2] = 10

Review comment (Member): won't this still affect the original Categorical?

Reply (Member Author): oops, you're right. Added back.

-cat
-df = pd.DataFrame(s)
-df["cat"].cat.categories = [1, 2, 3, 4, 5]
-cat
-
-Use ``copy=True`` to prevent such a behaviour or simply don't reuse ``Categoricals``:
-
-.. ipython:: python
-
-cat = pd.Categorical([1, 2, 3, 10], categories=[1, 2, 3, 4, 10])
-s = pd.Series(cat, name="cat", copy=True)
-cat
-s.iloc[0:2] = 10
-cat
-
-.. note::
-
-This also happens in some cases when you supply a NumPy array instead of a ``Categorical``:
-using an int array (e.g. ``np.array([1,2,3,4])``) will exhibit the same behavior, while using
-a string array (e.g. ``np.array(["a","b","c","a"])``) will not.
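
A small sketch of the `copy=True` case from the "Side effects" section, reusing its variable names. With an explicit copy, writes to the `Series` should leave the original `Categorical` untouched. Whether the uncopied case still shares data may depend on the pandas version, so it is not shown here.

```python
import pandas as pd

cat = pd.Categorical([1, 2, 3, 10], categories=[1, 2, 3, 4, 10])

# copy=True gives the Series its own copy of the categorical data.
s = pd.Series(cat, name="cat", copy=True)

# 10 is an existing category, so this assignment is valid.
s.iloc[0:2] = 10

print(list(s))    # values of the modified Series
print(list(cat))  # values of the original Categorical
```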
3 changes: 1 addition & 2 deletions doc/source/whatsnew/v0.15.0.rst
@@ -70,7 +70,6 @@ For full docs, see the :ref:`categorical introduction <categorical>` and the
:ref:`API documentation <api.arrays.categorical>`.

.. ipython:: python
-:okwarning:

df = pd.DataFrame({"id": [1, 2, 3, 4, 5, 6],
"raw_grade": ['a', 'b', 'b', 'a', 'a', 'e']})
@@ -79,7 +78,7 @@ For full docs, see the :ref:`categorical introduction <categorical>` and the
df["grade"]

# Rename the categories
-df["grade"].cat.categories = ["very good", "good", "very bad"]
+df["grade"] = df["grade"].cat.rename_categories(["very good", "good", "very bad"])

# Reorder the categories and simultaneously add the missing categories
df["grade"] = df["grade"].cat.set_categories(["very bad", "bad",
4 changes: 2 additions & 2 deletions doc/source/whatsnew/v0.19.0.rst
@@ -271,12 +271,12 @@ Individual columns can be parsed as a ``Categorical`` using a dict specification
such as :func:`to_datetime`.

.. ipython:: python
-:okwarning:

df = pd.read_csv(StringIO(data), dtype="category")
df.dtypes
df["col3"]
-df["col3"].cat.categories = pd.to_numeric(df["col3"].cat.categories)
+new_categories = pd.to_numeric(df["col3"].cat.categories)
+df["col3"] = df["col3"].cat.rename_categories(new_categories)
df["col3"]

.. _whatsnew_0190.enhancements.union_categoricals:
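
Both whatsnew hunks swap direct assignment to `.cat.categories` for `rename_categories`. A hedged sketch of that pattern with made-up data (the Series `s` is hypothetical and stands in for `df["col3"]`):

```python
import pandas as pd

# Hypothetical column of numeric strings stored as a categorical.
s = pd.Series(["1", "2", "10", "2"], dtype="category")

# Convert the category labels, not the values, to numbers.
new_categories = pd.to_numeric(s.cat.categories)

# rename_categories returns a new Series, so assign it back.
s = s.cat.rename_categories(new_categories)

print(s.cat.categories)
print(list(s))
```

The labels are renamed position by position, so the category order stays the lexical order of the original strings rather than becoming numeric order.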