Errors out when calling .strftime('%+') #16271

Open
2 tasks done
marc-at-brightnight opened this issue May 16, 2024 · 2 comments
Labels
A-exceptions Area: exception handling A-timeseries Area: date/time functionality P-low Priority: low python Related to Python Polars

Comments


marc-at-brightnight commented May 16, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import datetime

import polars as pl

data = {'date': [datetime.date(2021, 4, 1),
  datetime.date(2021, 4, 2),
  datetime.date(2021, 4, 3)]}

df = pl.DataFrame(data)

df.with_columns(date_str=pl.col('date').dt.strftime('%+')) # fails

dt_df = df.cast({pl.Date: pl.Datetime})

dt_df.with_columns(date_str=pl.col('date').dt.strftime('%+')) # also fails, though different error

Log output

`df.with_columns(date_str=pl.col('date').dt.strftime('%+'))`

thread '<unnamed>' panicked at crates/polars-core/src/chunked_array/temporal/date.rs:46:50:
called `Result::unwrap()` on an `Err` value: Error
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "/Users/mnhmbp/PycharmProjects/app/venv/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-ed1cdc278abd>", line 14, in <module>
    df.with_columns(date_str=pl.col('date').dt.strftime('%+')) # fails
  File "/Users/mnhmbp/PycharmProjects/app/venv/lib/python3.10/site-packages/polars/dataframe/frame.py", line 8310, in with_columns
    return self.lazy().with_columns(*exprs, **named_exprs).collect(_eager=True)
  File "/Users/mnhmbp/PycharmProjects/app/venv/lib/python3.10/site-packages/polars/lazyframe/frame.py", line 1816, in collect
    return wrap_df(ldf.collect(callback))
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: Error

`dt_df.with_columns(date_str=pl.col('date').dt.strftime('%+'))`

Traceback (most recent call last):
  File "/Users/mnhmbp/PycharmProjects/app/venv/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-6-d81b3df6bd78>", line 1, in <module>
    dt_df.with_columns(date_str=pl.col('date').dt.strftime('%+')) # also fails, though different error
  File "/Users/mnhmbp/PycharmProjects/app/venv/lib/python3.10/site-packages/polars/dataframe/frame.py", line 8310, in with_columns
    return self.lazy().with_columns(*exprs, **named_exprs).collect(_eager=True)
  File "/Users/mnhmbp/PycharmProjects/app/venv/lib/python3.10/site-packages/polars/lazyframe/frame.py", line 1816, in collect
    return wrap_df(ldf.collect(callback))
polars.exceptions.ComputeError: cannot format NaiveDateTime with format '%+'

Issue description

My use case is simply converting a date column to ISO format, similar to calling datetime.datetime.utcnow().isoformat(). As such, it would be nice to have a convenience function .isoformat() that works with both pl.Date and pl.Datetime.

According to the docs, using strftime('%+') seems like the most straightforward way to get this done, but I could be missing something.

Expected behavior

df.with_columns(date_str=pl.Series([d.isoformat() for d in df['date'].to_list()]))

┌────────────┬────────────┐
│ date       ┆ date_str   │
│ ---        ┆ ---        │
│ date       ┆ str        │
╞════════════╪════════════╡
│ 2021-04-01 ┆ 2021-04-01 │
│ 2021-04-02 ┆ 2021-04-02 │
│ 2021-04-03 ┆ 2021-04-03 │
└────────────┴────────────┘

dt_df.with_columns(date_str=pl.Series([d.isoformat() for d in dt_df['date'].to_list()]))

┌─────────────────────┬─────────────────────┐
│ date                ┆ date_str            │
│ ---                 ┆ ---                 │
│ datetime[μs]        ┆ str                 │
╞═════════════════════╪═════════════════════╡
│ 2021-04-01 00:00:00 ┆ 2021-04-01T00:00:00 │
│ 2021-04-02 00:00:00 ┆ 2021-04-02T00:00:00 │
│ 2021-04-03 00:00:00 ┆ 2021-04-03T00:00:00 │
└─────────────────────┴─────────────────────┘

Installed versions

--------Version info---------
Polars:               0.20.26
Index type:           UInt32
Platform:             macOS-14.2-arm64-arm-64bit
Python:               3.10.13 (main, Aug 24 2023, 12:59:26) [Clang 15.0.0 (clang-1500.0.40.1)]
----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          3.0.0
connectorx:           <not installed>
deltalake:            <not installed>
fastexcel:            <not installed>
fsspec:               2024.2.0
gevent:               <not installed>
hvplot:               <not installed>
matplotlib:           3.8.3
nest_asyncio:         1.6.0
numpy:                1.23.0
openpyxl:             3.1.2
pandas:               1.5.3
pyarrow:              15.0.1
pydantic:             2.6.3
pyiceberg:            <not installed>
pyxlsb:               <not installed>
sqlalchemy:           1.4.52
torch:                <not installed>
xlsx2csv:             <not installed>
xlsxwriter:           <not installed>
@marc-at-brightnight marc-at-brightnight added bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars labels May 16, 2024
@MarcoGorelli MarcoGorelli added the A-exceptions Area: exception handling label May 16, 2024
@MarcoGorelli
Collaborator

hey - looks like you need to set a time zone to be able to use '%+'

In [36]: df['date'].cast(pl.Datetime('us', 'Europe/London')).dt.strftime('%+')
Out[36]:
shape: (3,)
Series: 'date' [str]
[
        "2021-04-01T01:00:00+01:00"
        "2021-04-02T01:00:00+01:00"
        "2021-04-03T01:00:00+01:00"
]

The error message for the Date one should be improved though
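For context, the '%+' output above carries a UTC offset (+01:00), which a Date or a naive Datetime has no way to supply. Python's own isoformat() draws the same line between the three kinds of value; a minimal stdlib-only sketch (no Polars involved, variable names are illustrative):

```python
from datetime import date, datetime, timezone

d = date(2021, 4, 1)                                # date only
naive = datetime(2021, 4, 1)                        # date and time, no time zone
aware = datetime(2021, 4, 1, tzinfo=timezone.utc)   # date and time with a zone

print(d.isoformat())      # 2021-04-01
print(naive.isoformat())  # 2021-04-01T00:00:00
print(aware.isoformat())  # 2021-04-01T00:00:00+00:00
```

Only the time-zone-aware value produces the offset suffix that '%+' expects to write.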

@MarcoGorelli MarcoGorelli added P-low Priority: low A-timeseries Area: date/time functionality and removed bug Something isn't working needs triage Awaiting prioritization by a maintainer labels May 16, 2024
@marc-at-brightnight
Author

Thanks! Yes, using the below suits my use case. Having a convenience function like df['date'].dt.isoformat() for the below might be something to think about.
df['date'].cast(pl.Datetime('us', 'UTC')).dt.strftime('%+')

Appreciate the help!!

Projects
Status: Ready
Development

No branches or pull requests

2 participants