[BUG] Index.repeat is failing for DatetimeIndex with a frequency #15720

Closed
galipremsagar opened this issue May 10, 2024 · 0 comments · Fixed by #15722
Labels: bug (Something isn't working), cudf.pandas (Issues specific to cudf.pandas)

Describe the bug
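Calling `Index.repeat` on a `DatetimeIndex` that has a frequency set returns an index that still carries the original `freq`. Displaying the result then raises a `ValueError`, because the repeated values no longer conform to that frequency when the index is converted to pandas for its repr.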

Steps/Code to reproduce bug

In [6]: idx = cudf.date_range("2021-01-01", periods=3, freq="D")

In [7]: idx
Out[7]: DatetimeIndex(['2021-01-01', '2021-01-02', '2021-01-03'], dtype='datetime64[ns]', freq='D')

In [9]: idx._freq
Out[9]: <DateOffset: days=1>

In [10]: idx.repeat(2)
Out[10]: ---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File /nvme/0/pgali/envs/cudfdev/lib/python3.11/site-packages/pandas/core/arrays/datetimelike.py:2124, in TimelikeOps._validate_frequency(cls, index, freq, **kwargs)
   2123     if not np.array_equal(index.asi8, on_freq.asi8):
-> 2124         raise ValueError
   2125 except ValueError as err:

ValueError: 

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
File /nvme/0/pgali/envs/cudfdev/lib/python3.11/site-packages/IPython/core/formatters.py:711, in PlainTextFormatter.__call__(self, obj)
    704 stream = StringIO()
    705 printer = pretty.RepresentationPrinter(stream, self.verbose,
    706     self.max_width, self.newline,
    707     max_seq_length=self.max_seq_length,
    708     singleton_pprinters=self.singleton_printers,
    709     type_pprinters=self.type_printers,
    710     deferred_pprinters=self.deferred_printers)
--> 711 printer.pretty(obj)
    712 printer.flush()
    713 return stream.getvalue()

File /nvme/0/pgali/envs/cudfdev/lib/python3.11/site-packages/IPython/lib/pretty.py:411, in RepresentationPrinter.pretty(self, obj)
    408                         return meth(obj, self, cycle)
    409                 if cls is not object \
    410                         and callable(cls.__dict__.get('__repr__')):
--> 411                     return _repr_pprint(obj, self, cycle)
    413     return _default_pprint(obj, self, cycle)
    414 finally:

File /nvme/0/pgali/envs/cudfdev/lib/python3.11/site-packages/IPython/lib/pretty.py:779, in _repr_pprint(obj, p, cycle)
    777 """A pprint that just redirects to the normal repr function."""
    778 # Find newlines and replace them with p.break_()
--> 779 output = repr(obj)
    780 lines = output.splitlines()
    781 with p.group():

File /nvme/0/pgali/envs/cudfdev/lib/python3.11/site-packages/nvtx/nvtx.py:116, in annotate.__call__.<locals>.inner(*args, **kwargs)
    113 @wraps(func)
    114 def inner(*args, **kwargs):
    115     libnvtx_push_range(self.attributes, self.domain.handle)
--> 116     result = func(*args, **kwargs)
    117     libnvtx_pop_range(self.domain.handle)
    118     return result

File /nvme/0/pgali/envs/cudfdev/lib/python3.11/site-packages/cudf/core/index.py:1393, in Index.__repr__(self)
   1391         output = output.replace("'", "")
   1392 else:
-> 1393     output = repr(preprocess.to_pandas())
   1395 # Fix and correct the class name of the output
   1396 # string by finding first occurrence of "(" in the output
   1397 index_class_split_index = output.find("(")

File /nvme/0/pgali/envs/cudfdev/lib/python3.11/site-packages/nvtx/nvtx.py:116, in annotate.__call__.<locals>.inner(*args, **kwargs)
    113 @wraps(func)
    114 def inner(*args, **kwargs):
    115     libnvtx_push_range(self.attributes, self.domain.handle)
--> 116     result = func(*args, **kwargs)
    117     libnvtx_pop_range(self.domain.handle)
    118     return result

File /nvme/0/pgali/envs/cudfdev/lib/python3.11/site-packages/cudf/core/index.py:2158, in DatetimeIndex.to_pandas(self, nullable, arrow_type)
   2152 else:
   2153     freq = (
   2154         self._freq._maybe_as_fast_pandas_offset()
   2155         if self._freq is not None
   2156         else None
   2157     )
-> 2158     return pd.DatetimeIndex(result, name=self.name, freq=freq)

File /nvme/0/pgali/envs/cudfdev/lib/python3.11/site-packages/pandas/core/indexes/datetimes.py:370, in DatetimeIndex.__new__(cls, data, freq, tz, normalize, closed, ambiguous, dayfirst, yearfirst, dtype, copy, name)
    367         data = data.copy()
    368     return cls._simple_new(data, name=name)
--> 370 dtarr = DatetimeArray._from_sequence_not_strict(
    371     data,
    372     dtype=dtype,
    373     copy=copy,
    374     tz=tz,
    375     freq=freq,
    376     dayfirst=dayfirst,
    377     yearfirst=yearfirst,
    378     ambiguous=ambiguous,
    379 )
    380 refs = None
    381 if not copy and isinstance(data, (Index, ABCSeries)):

File /nvme/0/pgali/envs/cudfdev/lib/python3.11/site-packages/pandas/core/arrays/datetimes.py:394, in DatetimeArray._from_sequence_not_strict(cls, data, dtype, copy, tz, freq, dayfirst, yearfirst, ambiguous)
    391     result = result.as_unit(unit)
    393 validate_kwds = {"ambiguous": ambiguous}
--> 394 result._maybe_pin_freq(freq, validate_kwds)
    395 return result

File /nvme/0/pgali/envs/cudfdev/lib/python3.11/site-packages/pandas/core/arrays/datetimelike.py:2088, in TimelikeOps._maybe_pin_freq(self, freq, validate_kwds)
   2084 elif self._freq is None:
   2085     # We cannot inherit a freq from the data, so we need to validate
   2086     #  the user-passed freq
   2087     freq = to_offset(freq)
-> 2088     type(self)._validate_frequency(self, freq, **validate_kwds)
   2089     self._freq = freq
   2090 else:
   2091     # Otherwise we just need to check that the user-passed freq
   2092     #  doesn't conflict with the one we already have.

File /nvme/0/pgali/envs/cudfdev/lib/python3.11/site-packages/pandas/core/arrays/datetimelike.py:2135, in TimelikeOps._validate_frequency(cls, index, freq, **kwargs)
   2129     raise err
   2130 # GH#11587 the main way this is reached is if the `np.array_equal`
   2131 #  check above is False.  This can also be reached if index[0]
   2132 #  is `NaT`, in which case the call to `cls._generate_range` will
   2133 #  raise a ValueError, which we re-raise with a more targeted
   2134 #  message.
-> 2135 raise ValueError(
   2136     f"Inferred frequency {inferred} from passed values "
   2137     f"does not conform to passed frequency {freq.freqstr}"
   2138 ) from err

ValueError: Inferred frequency None from passed values does not conform to passed frequency D

In [11]: idx.to_pandas().repeat(2)
Out[11]: 
DatetimeIndex(['2021-01-01', '2021-01-01', '2021-01-02', '2021-01-02',
               '2021-01-03', '2021-01-03'],
              dtype='datetime64[ns]', freq=None)
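
The final `ValueError` in the traceback can be reproduced with pandas alone, which suggests the problem is the stale `freq` rather than the repeated values themselves. The snippet below is a minimal sketch under that assumption: it takes the values pandas produces for `repeat(2)` and passes them back to `pd.DatetimeIndex` together with `freq="D"`, mirroring what `DatetimeIndex.to_pandas` does in the traceback. The variable names are illustrative only.

```python
import pandas as pd

idx = pd.date_range("2021-01-01", periods=3, freq="D")

# pandas drops the frequency when repeating, since duplicated
# timestamps are no longer one day apart.
repeated = idx.repeat(2)
print(repeated.freq)

# Re-attaching the old frequency to the repeated values fails pandas'
# frequency validation, as in the traceback above.
try:
    pd.DatetimeIndex(repeated.values, freq="D")
except ValueError as err:
    message = str(err)
else:
    message = None

print(message)
```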

Expected behavior
`idx.repeat(2)` should match the pandas result shown above: a `DatetimeIndex` with each value repeated and `freq=None`, whose repr displays without raising.

galipremsagar added the bug and cudf.pandas labels on May 10, 2024
galipremsagar self-assigned this on May 10, 2024
rapids-bot pushed a commit that referenced this issue on May 14, 2024
Fixes: #15720 

This PR fixes `Index.repeat` where the `freq` of `DatetimeIndex` needs to be reset.

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Matthew Roeschke (https://github.com/mroeschke)

URL: #15722
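
The idea of resetting `freq` on `repeat` might be sketched as follows. This is not the actual change from #15722: `FreqTrackingIndex` is a hypothetical stand-in, backed by NumPy and pandas, for an index type that stores its values and its frequency separately. It only illustrates why the result of `repeat` should not inherit the stored frequency.

```python
import numpy as np
import pandas as pd


class FreqTrackingIndex:
    """Hypothetical index that keeps values and a frequency separately."""

    def __init__(self, values, freq=None):
        self._values = np.asarray(values, dtype="datetime64[ns]")
        self._freq = freq

    def repeat(self, repeats):
        # Repeated timestamps are no longer evenly spaced, so the stored
        # frequency is dropped instead of being copied to the result.
        return FreqTrackingIndex(np.repeat(self._values, repeats), freq=None)

    def to_pandas(self):
        # pandas validates the values against freq when one is passed.
        return pd.DatetimeIndex(self._values, freq=self._freq)


idx = FreqTrackingIndex(
    pd.date_range("2021-01-01", periods=3, freq="D"), freq="D"
)
result = idx.repeat(2).to_pandas()
print(result)
```

With the frequency reset, the conversion to pandas succeeds and `result.freq` is `None`, matching `idx.to_pandas().repeat(2)` in the report.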