Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Paniced at to_timestamp_micros function when the timestamp is too large. #3832

Closed
Tracked by #3148
ZuoTiJia opened this issue Oct 14, 2022 · 10 comments
Closed
Tracked by #3148
Labels
bug Something isn't working

Comments

@ZuoTiJia
Copy link
Contributor

Describe the bug
Paniced at to_timestamp_micros function.

To Reproduce

DataFusion CLI v13.0.0
❯ SELECT to_timestamp_micros(9065525203050843594);
thread 'main' panicked at 'invalid or out-of-range datetime', 
.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.22/src/naive/datetime/mod.rs:131:18

Additional context
DataFusion 13.0

@ZuoTiJia ZuoTiJia added the bug Something isn't working label Oct 14, 2022
@retikulum
Copy link
Contributor

retikulum commented Oct 14, 2022

Hi. I think this problem is about chrono's timestamp_nano function because it uses i64. Datafusion convert microsecond to nanosecond and passed it to timestamp_nano function. 9065525203050843594 microsecond is 9065525203050843594000 and it doesn't fit into i64 and overflowed. Moreover, it is commented that it is panicked with big numbers at chrono 0.4.22

It seems interesting case but I am not sure about the solution. Even if new version chrono is used, your input is too big to store in i64 so it will overflow eventually. You can also see an example in playground with newer version.

@waitingkuo
Copy link
Contributor

https://github.com/chronotope/chrono/blob/main/src/naive/datetime/mod.rs#L117

from_timestamp will be deprecated since 0.4.23.
i think we could change it to from_timestamp_opt and notify the user it's out-of-range

@waitingkuo
Copy link
Contributor

i submitted a pr here apache/arrow-rs#2892

@retikulum
Copy link
Contributor

retikulum commented Oct 18, 2022

Hi. I am asking this because I want to understand the codebase better. I build datafusion-cli with debug, run SELECT to_timestamp_micros(9065525203050843594); and get the error. However part of the call stack is shown as follows:

  19:     0x7ff66b198d9e - chrono::naive::datetime::NaiveDateTime::from_timestamp
                               at .cargo\registry\src\github.com-1ecc6299db9ec823\chrono-0.4.22\src\naive\datetime\mod.rs:131
  20:     0x7ff66b1d14db - arrow_array::temporal_conversions::timestamp_us_to_datetime
                               at \.cargo\registry\src\github.com-1ecc6299db9ec823\arrow-array-24.0.0\src\temporal_conversions.rs:123
  21:     0x7ff66b1d3bb5 - arrow_array::temporal_conversions::as_datetime<arrow_array::types::TimestampMicrosecondType>
                               at i\.cargo\registry\src\github.com-1ecc6299db9ec823\arrow-array-24.0.0\src\temporal_conversions.rs:181
  22:     0x7ff66a9be6ea - arrow_array::array::primitive_array::PrimitiveArray<arrow_array::types::TimestampMicrosecondType>::value_as_datetime<arrow_array::types::TimestampMicrosecondType>
                               at \.cargo\registry\src\github.com-1ecc6299db9ec823\arrow-array-24.0.0\src\array\primitive_array.rs:496

So arrow is using from_timestamp function in temporal_conversions which means there is a need for a change in this part of the code in arrow.

I think it is not releated to this issue but datafusion uses from_timestamp function in following line:https://github.com/apache/arrow-datafusion/blob/f93642fd7782b0d530a5bff0fecec020b34e9bd1/datafusion/physical-expr/src/expressions/datetime.rs#L304

So we also need to fix it. If I am wrong, could you also suggest me tips about how to debug/analyze these bugs?

@waitingkuo
Copy link
Contributor

I think it is not releated to this issue but datafusion uses from_timestamp function in following line:

https://github.com/apache/arrow-datafusion/blob/f93642fd7782b0d530a5bff0fecec020b34e9bd1/datafusion/physical-expr/src/expressions/datetime.rs#L304

So we also need to fix it. If I am wrong, could you also suggest me tips about how to debug/analyze these bugs?

@retikulum yes, i think we need to fix it as well.

@alamb @avantgardnerio
do you think we should eventually move these datetime arithmetic functions into arrow-rs?

@avantgardnerio
Copy link
Contributor

When chrono:0.4.23 is out we can kill all these:

        ScalarValue::IntervalDayTime(Some(i)) => add_day_time(prior, *i, sign),
        ScalarValue::IntervalYearMonth(Some(i)) => shift_months(prior, *i * sign),
        ScalarValue::IntervalMonthDayNano(Some(i)) => add_m_d_nano(prior, *i, sign),

@avantgardnerio
Copy link
Contributor

Presently we can't remove them yet because there is no Timestamp + Months, only Date + months in 0.4.22.

@alamb
Copy link
Contributor

alamb commented Oct 20, 2022

do you think we should eventually move these datetime arithmetic functions into arrow-rs?

Yes, I do think eventually the date/time arithmetic should be moved into arrow-rs kernels

@Jefffrey
Copy link
Contributor

Issue seems to be resolved now, doesn't panic but returns error:

DataFusion CLI v15.0.0
❯ SELECT to_timestamp_micros(9065525203050843594);
+-----------------------------------------------+
| totimestampmicros(Int64(9065525203050843594)) |
+-----------------------------------------------+
| ERROR CONVERTING DATE                         |
+-----------------------------------------------+
1 row in set. Query took 0.002 seconds.
❯

@alamb
Copy link
Contributor

alamb commented Dec 10, 2022

Thank you for checking @Jefffrey

@alamb alamb closed this as completed Dec 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

6 participants