Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Overflow warning when using Series.diff() on Series with Periods #58447

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

Aloqeely
Copy link
Contributor

- Bug in :meth:`DataFrame.to_dict` raises unnecessary ``UserWarning`` when columns are not unique and ``orient='tight'``. (:issue:`58281`)
- Bug in :meth:`DataFrame.to_excel` when writing empty :class:`DataFrame` with :class:`MultiIndex` on both axes (:issue:`57696`)
- Bug in :meth:`DataFrame.to_string` that raised ``StopIteration`` with nested DataFrames. (:issue:`16098`)
- Bug in :meth:`read_csv` raising ``TypeError`` when ``index_col`` is specified and ``na_values`` is a dict containing the key ``None``. (:issue:`57547`)

Period
^^^^^^
-
- Bug in :meth:`Series.diff` showing ``RuntimeWarning`` of overflowing when the Series has :class:`Period` (:issue:`58320`)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I mention this happens on Windows only?

@@ -104,11 +102,7 @@ def check_reduce(self, ser: pd.Series, op_name: str, skipna: bool):

@pytest.mark.parametrize("periods", [1, -2])
def test_diff(self, data, periods):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this whole test function can be removed if its just calling super


new_data = add_overflowsafe(
self.asi8, np.asarray(-other_i8, dtype="i8")
).astype(object)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a great solution if this always needs to be cast to object

Copy link
Contributor Author

@Aloqeely Aloqeely Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not? The comment says "we return an object-dtype ndarray" so it makes sense to cast it to object.
I thought this is better than creating a new array which will be object dtype (old behavior)

I could add an optional argument dtype to add_overflowsafe, would that be better?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this seems suboptimal, and that comment seems pretty far detached from where the object conversion happens right? I agree this is a bit suspect; the old code seems to use primitive types so overall this is adding a good deal of overhead.

Do you know where the cast to object is actually occurring?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new_data = np.array([self.freq.base * x for x in new_i8_data])

Here, self.freq.base is what causes new_data to be object dtype

@mroeschke mroeschke added Period Period data type Warnings Warnings that appear or should be added to pandas labels Apr 29, 2024

new_data = add_overflowsafe(
self.asi8, np.asarray(-other_i8, dtype="i8")
).astype(object)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this seems suboptimal, and that comment seems pretty far detached from where the object conversion happens right? I agree this is a bit suspect; the old code seems to use primitive types so overall this is adding a good deal of overhead.

Do you know where the cast to object is actually occurring?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Period Period data type Warnings Warnings that appear or should be added to pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: Using Series.diff() on a Series with Periods on Windows shows an overflow error
3 participants