Future of leap second handling #954

Open
Ekleog opened this issue Feb 1, 2023 · 12 comments

Comments

@Ekleog
Contributor

Ekleog commented Feb 1, 2023

I'm writing this issue to try and summarize all the options for leap second handling in chrono; hopefully this will help in making a decision :) (I can't find a checkbox to allow edits by maintainers, so feel free to ping me if you want anything added here!)

Option 1: change nothing

The current behavior has the following problems:

  • The current API allows "invalid" leap seconds (outside the end of a minute), which causes issues with json round-tripping and understanding the Debug representation: discussed at Json round-tripping changes leap seconds #944
  • Due to this API, client code needs to be ready for leap seconds at any time (eg. inside a minute), and not just when leap seconds actually happen
  • The current API makes for imprecise Duration accounting when there was a leap second: discussed at UTC seems to be treated as canonical; hazardous in the presence of leap seconds #21
  • Other theoretical issues, like the fact that there could be leap seconds that "accelerate" a "second", instead of making it "longer"; but that probably won't happen before 2135 so it's not an emergency to handle.

Option 2: count leap seconds as seconds=60, nanoseconds<1B

Pros:

  • The API would stop allowing "invalid" leap seconds, and thus no longer have issues with json round-tripping

Cons:

  • JSON serialization would still emit :60 seconds, which is likely to crash other consumers
  • Duration stays imprecise
  • It's still very easy to create a DateTime that will never exist, even though adding a timezone to NaiveDateTime tries to prevent that (a randomly generated DateTime would have a roughly 1/61 chance of being one that will never exist, which is admittedly already better than the current 1/2)
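To make the 1/2 and 1/61 figures concrete, here is a minimal sketch of the two validity rules. It is a simplified model and not chrono's actual code: the function names are hypothetical, and the "current" rule assumes a leap second is flagged by a nanosecond field in the range 1e9..2e9 on any second of the minute.

```rust
// Hypothetical stand-ins for illustration; these are not chrono's types or functions.
const NANOS_PER_SEC: u32 = 1_000_000_000;

/// Current-style rule (assumed model): a leap second is flagged by
/// nanos in 1e9..2e9, and the flag is accepted on any second 0..=59.
pub fn valid_current(sec: u32, nanos: u32) -> bool {
    sec < 60 && nanos < 2 * NANOS_PER_SEC
}

/// Option 2 rule: the leap second is its own second, numbered 60,
/// and nanos always stays below one billion.
pub fn valid_option2(sec: u32, nanos: u32) -> bool {
    sec <= 60 && nanos < NANOS_PER_SEC
}

/// (leap values, all values) per minute under the current-style rule.
pub fn leap_share_current() -> (u64, u64) {
    let n = NANOS_PER_SEC as u64;
    (60 * n, 60 * 2 * n)
}

/// (leap values, all values) per minute under the option 2 rule.
pub fn leap_share_option2() -> (u64, u64) {
    let n = NANOS_PER_SEC as u64;
    (n, 61 * n)
}
```

Note that option 2 still accepts second 60 in every minute, which is where the remaining 1/61 of never-existing values comes from.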

Option 3: assume the host clock is UTC-SLS, remove leap second support from chrono

Pros:

  • The code becomes much simpler, as it no longer needs to even think about leap seconds
  • No more worries about serialization

Cons:

  • Duration handling stays wrong in the presence of leap seconds. The worst offset is probably ~42 seconds by 2135. It's at most one second a day, and UTC-SLS smooths the transition, so it's maybe not a big deal.
  • If the host clock is not actually doing UTC-SLS but plain UTC, then we will end up with a "time freeze" of one second at the end of the leap second.
  • How should chrono deserialize leap seconds that were serialized with :60?

As a further idea, to mitigate the "what if the host clock is actually UTC" issue, we could reserve the last microsecond of each minute to store milliseconds in case of a leap second. This would make time flow like:

Real time            -> Return value of Utc::now()
00:00:59.999 999 000 -> 00:00:59.999 999 000
00:00:59.999 999 500 -> 00:00:59.999 999 000
00:00:60.123 456 789 -> 00:00:59.999 999 123

Note that this would have to happen during the 59th second of each minute, as it's hard to predict whether the host clock is actually running UTC-SLS or not, and whether there is going to be a leap second if not.

However, chrono would then keep, in normal times, the property that it returns time precise to the microsecond (which is the case on Windows). It would only lose one microsecond of precision once a minute, so that times far apart do not collapse into exactly equal values during a leap second.
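The mapping in the table above can be sketched as a small pure function. This is only an illustration under assumptions: `squash` is a hypothetical name, and the input is taken to be the host's (second, nanosecond) reading within the minute, with second 60 standing for the leap second.

```rust
const NANOS_PER_MILLI: u32 = 1_000_000;
/// First nanosecond of the reserved last microsecond of second 59.
const RESERVED_START: u32 = 999_999_000;

/// Maps a host reading within a minute to the (sec, nanos) that would be reported.
/// Hypothetical helper, not part of chrono.
pub fn squash(sec: u32, nanos: u32) -> (u32, u32) {
    match sec {
        // Leap second: report second 59 and store the elapsed milliseconds
        // (0..=999) in the reserved microsecond.
        60 => (59, RESERVED_START + nanos / NANOS_PER_MILLI),
        // Last microsecond of every second 59: clamp, so the reserved
        // range is never used by ordinary readings.
        59 if nanos > RESERVED_START => (59, RESERVED_START),
        // Everything else passes through unchanged.
        _ => (sec, nanos),
    }
}
```

The reported values never go backwards across the leap second, and readings inside the leap second stay distinguishable at millisecond granularity.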

Option 4: like option 3, but also introduce a leap second table (potentially fetched from the system) to compute durations

Pros:

  • Like option 3, plus
  • Has exact durations. Basically, the behavior would be correct as per the UTC definition

Cons:

  • Like option 3, plus
  • The code becomes even more complex than options 5 or 6

TBH, IMO this is definitely not a good solution, we should pick another option. I'm having it here for completeness.

Option 5: introduce a TimePoint type, counted in TAI, that can convert from/to DateTime (probably in addition to option 2)

Pros:

  • Duration accounting becomes trivial between TimePoints
  • Serialization to UTC would still possibly have leap seconds, but it would become easy to serialize in TAI, which should definitely be preferred anyway as it's actually possible to reason about future dates and intervals

Cons:

  • Needs a way to actually know the offset between UTC and TAI to implement conversions. This has been planned for a while, but seems not to be implemented yet.
  • Conversion from/to DateTime is not great, in that DateTime can represent times that are not actually valid TimePoints, like invalid leap seconds.

Overall, I think the con is not actually a downside. It hits basically the same problem as regular timezone rule changes, which happen almost as regularly as leap second insertions: since 2012 (start of the IANA archives), there have been 3 leap seconds and 86 tzdata changes, so at least 83 timezone changes that were recorded.

I think if chrono can stomach being probably at least one hour off half the year at 83 places, then it can probably stomach being 3 seconds off. And if it cannot, then tzdata should be provided by the OS and not hardcoded in chrono, which would fix both issues.
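As a rough illustration of what option 5 involves, here is a sketch of a TAI-counted `TimePoint` with conversions from and to Unix (UTC) timestamps. The type, its methods, and the hardcoded table are assumptions made for the example. The table holds only the four most recent TAI−UTC offsets, whereas a real implementation would need the full list and a way to update it.

```rust
/// (Unix timestamp at which the offset takes effect, TAI-UTC in seconds).
/// Only the four most recent entries; the full table starts in 1972.
const TAI_MINUS_UTC: [(i64, i64); 4] = [
    (1_230_768_000, 34), // 2009-01-01
    (1_341_100_800, 35), // 2012-07-01
    (1_435_708_800, 36), // 2015-07-01
    (1_483_228_800, 37), // 2017-01-01
];

/// Hypothetical type: seconds on the TAI scale, counted from the Unix epoch label.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct TimePoint {
    tai_secs: i64,
}

impl TimePoint {
    /// Returns None for timestamps older than the abbreviated table.
    pub fn from_unix_utc(unix: i64) -> Option<Self> {
        let (_, offset) = TAI_MINUS_UTC
            .iter()
            .rev()
            .copied()
            .find(|&(start, _)| unix >= start)?;
        Some(TimePoint { tai_secs: unix + offset })
    }

    /// Inverse conversion. The inserted second itself has no Unix timestamp,
    /// so it maps to the same value as the second that follows it.
    pub fn to_unix_utc(self) -> Option<i64> {
        let (_, offset) = TAI_MINUS_UTC
            .iter()
            .rev()
            .copied()
            .find(|&(start, offset)| self.tai_secs >= start + offset)?;
        Some(self.tai_secs - offset)
    }

    /// Exact elapsed SI seconds: a plain subtraction on the TAI scale.
    pub fn seconds_since(self, earlier: TimePoint) -> i64 {
        self.tai_secs - earlier.tai_secs
    }
}
```

With this table, the interval from 2016-12-31 23:59:59 UTC to 2017-01-01 00:00:00 UTC comes out as 2 seconds, because the inserted leap second is counted.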

Option 6: like option 5, but make struct DateTime<Tz>(TimePoint, Tz)

Pros:

  • Like option 5, plus
  • Literally all DateTimes would become valid times that actually exist as per the current knowledge of timezones (though that may still change in the future, like currently it's possible to encode a time in a timezone but maybe by then that timezone will have changed and introduce summer time that would make this time not exist)
  • Code becomes much simpler (and probably faster too), as there is no longer a need to deal with converting from/to TimePoint and dealing with leap seconds there, it can simply be handled once at the time of taking Utc::now(), and once at the time of formatting

Cons:

  • Needs a way to actually know the offset between UTC and TAI to figure out what TimePoint Utc::now() is (see option 5)
  • It becomes impossible to represent a DateTime with a leap second in the future that chrono does not know about, even if the user were to know about it.
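A minimal sketch of the option 6 layout, assuming a TAI-based `TimePoint` as in option 5. Every name here is a hypothetical stand-in, and the timezone types are reduced to bare markers.

```rust
/// Hypothetical: seconds on the TAI scale.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct TimePoint(pub i64);

/// Simplified timezone markers, for illustration only.
#[derive(Debug, Clone, Copy)]
pub struct Utc;
#[derive(Debug, Clone, Copy)]
pub struct FixedOffset(pub i32);

/// The instant is stored once, in TAI; the timezone is only a tag
/// consulted when formatting or extracting calendar fields.
#[derive(Debug, Clone, Copy)]
pub struct DateTime<Tz>(pub TimePoint, pub Tz);

impl<Tz> DateTime<Tz> {
    /// Elapsed SI seconds; no timezone or leap second logic is involved.
    pub fn seconds_since<Other>(&self, earlier: &DateTime<Other>) -> i64 {
        (self.0).0 - (earlier.0).0
    }

    /// Changing the timezone leaves the stored instant untouched.
    pub fn with_timezone<Other>(&self, tz: Other) -> DateTime<Other> {
        DateTime(self.0, tz)
    }
}
```

In this layout, leap seconds only matter at the two boundaries mentioned above: when reading the clock and when formatting.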

For full disclosure, I'm personally partial towards option 6, based on the pros and cons listed above.

What do you think?

For people who may want to express their opinion about which ideas they think are good/bad without adding new information to the debate, I'm going to add one message per option, feel free to up/downvote (keeping all options in the top-post so that maintainers can hopefully edit it)

@Ekleog
Contributor Author

Ekleog commented Feb 1, 2023

Option 1: change nothing

@Ekleog
Contributor Author

Ekleog commented Feb 1, 2023

Option 2: count leap seconds as seconds=60, nanoseconds<1B

@Ekleog
Contributor Author

Ekleog commented Feb 1, 2023

Option 3: assume the host clock is UTC-SLS, remove leap second support from chrono

@Ekleog
Contributor Author

Ekleog commented Feb 1, 2023

Option 4: like option 3, but also introduce a leap second table (potentially fetched from the system) to compute durations

@Ekleog
Contributor Author

Ekleog commented Feb 1, 2023

Option 5: introduce a TimePoint type, counted in TAI, that can convert from/to DateTime (probably in addition to option 2)

@Ekleog
Contributor Author

Ekleog commented Feb 1, 2023

Option 6: like option 5, but make struct DateTime<Tz>(TimePoint, Tz)

@djc
Contributor

djc commented Feb 1, 2023

I don't think I will have time to digest this in the near future, sorry.

@Ekleog
Contributor Author

Ekleog commented Feb 1, 2023

No problem! I tried to make the top-post as easy to digest as possible and requiring no context so it's possible to read it without knowledge of the past discussions and understand the tradeoffs, so it shouldn't be harder to read later with less context.

I just hope you get a chance to look into it long enough before releasing chrono 0.5 that I can get a chance to try and implement the required things to get them in, if 0.6 is another 5+ years down the road :)

@pitdicker
Collaborator

pitdicker commented Sep 11, 2023

@Ekleog Thank you so much for digging into this issue! And you put your money where your mouth is with your kine crate.

As a first note: while I personally think leap seconds are cool (I love exotic little complexities like this), most library code really doesn't want to think about leap seconds. If code adds chrono::Duration::hours(5), it rarely expects to be one second off when there happened to be a leap second.

My opinion on your options:

  1. change nothing
    I am all for improving a bit 😄
  2. count leap seconds as seconds=60, nanoseconds<1B
    You probably mean to change the API interface, and not the internal representation?
    I may agree that it is nice for methods such as NaiveTime::from_hms_opt to accept 60 as a valid value for seconds.
  3. assume the host clock is UTC-SLS, remove leap second support from chrono
    'UTC with Smoothed Leap Seconds'
    Chrono is much more than an interface to the system clock. We have no choice but to do something with leap seconds.
  4. like option 3, but also introduce a leap second table (potentially fetched from the system) to compute durations
    ...
  5. introduce a TimePoint type, counted in TAI, that can convert from/to DateTime (probably in addition to option 2)
    With this point you seem to be thinking about a well-defined time scale, somewhat like hifitime provides?
    It somewhat matches with the idea that is currently part of the documentation of NaiveTime:

    If you cannot tolerate this behavior, you must use a separate TimeZone for the International Atomic Time (TAI). TAI is like UTC but has no leap seconds, and thus slightly differs from UTC. Chrono does not yet provide such implementation, but it is planned.

  6. like option 5, but make struct DateTime<Tz>(TimePoint, Tz)
    The strengths of chrono lie, in my opinion, in its current representation that is optimal for working with dates.
    I don't think changing the backing storage of DateTime matches well with the priorities of chrono as a crate.

What I would propose is:

  • It would be cool if chrono could improve to the point where it can do computations that respect leap seconds. I.e., make DateTime<Tz>::checked_add_signed and DateTime<Tz>::checked_sub_signed more accurate. Or provide two similar methods, so users can choose whether they care.
  • It would be cool if we would ship our own leap second table, but were also able to update it with information from the OS. We now have a TZif parser that can be used on Unix, and apparently Windows also stores some info in the registry.
  • A TAI timezone seems like a good addition.
  • If there is an OS that can return clock values in TAI, that may be interesting to switch to.
  • If someone creates a DateTime with a non-existing leap second, the default reaction of chrono should be (again, in my opinion): 'ok, you probably know better, I'll work with that'. Although we could offer some validation method.
  • We should change some methods to be more critical of invalid leap seconds (Json round-tripping changes leap seconds #944 (comment))

@pitdicker
Collaborator

Adding a Tai timezone as suggested in the documentation, so that it is possible to convert between DateTime<Utc> and DateTime<Tai>, seems like a good first step. Then arithmetic can work with the regular methods in chrono.

@demurgos
Contributor

demurgos commented Sep 13, 2023

If code adds chrono::Duration::hours(5), it rarely expects to be one second off when there happened to be a leap second.

This feels like the core of the issue to me: what is the meaning of Duration::hours(5), or more plainly: what's the unit?

I'm sure you're all very aware of this, but there is not a single definition of what is a "second" and this is the root of this problem:

  • SI second: 9,192,631,770 Caesium 133 transitions; always the same duration regardless of where and when it is measured
  • historical second (or "non-leap seconds" in chrono's documentation): 1/86400th of a day; and a day is defined based on the movement of the Earth, this duration fluctuates over time (a historical second measured 100 years ago or 100 years in the future will not have the same duration)

The duration of a mean solar day is 86,400 historical seconds by definition, but 86,400.0013 SI seconds by measurement. Calendars try to remain in sync with solar days and the discrepancy between both units (SI and historical) is what ultimately leads to leap seconds. (and the differences between UTC/TAI/UT1/UT2, ...) UTC uses SI seconds but tries to approximate historical seconds by introducing discontinuities through leap seconds.

What I find disappointing is that instead of containing this complexity to the conversion between a timestamp measured in SI seconds and a calendar representation, we instead ended up with timestamps measured in historical seconds.

The current situation feels like an unhappy middle-ground: duration computations with timestamps are broken and you still have to go through timezones for display anyway. But retro-compatibility trumps all :/

(Sorry for the long preface, but I really feel that framing the issue as a unit problem is a helpful perspective)

This brings me back to the meaning of adding chrono::Duration::hours(5). Is it "add 18,000 historical seconds" or is it "add 18,000 SI seconds"? Depending on the decision, there's always something surprising. There was a leap second on December 31st 2016, so if I add 5 hours to 2016-12-31 22:00:00 I get either 03:00:00 with historical seconds (surprising, since 18,001 SI seconds actually elapsed) or 02:59:59 with SI seconds (surprising, since the clock reading is one second short). There's no way around breaking expectations, but picking the latter option avoids breaking duration math.
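The two readings of "add 5 hours" can be worked through with plain Unix timestamps. This is a sketch under assumptions: the helper names are hypothetical, durations are non-negative, and only the single leap second at the end of 2016 is modelled.

```rust
/// 2017-01-01T00:00:00Z, the instant right after 2016-12-31T23:59:60Z.
const LEAP_END_UNIX: i64 = 1_483_228_800;

/// "Historical seconds": plain arithmetic on Unix timestamps.
pub fn add_historical(start: i64, secs: i64) -> i64 {
    start + secs
}

/// "SI seconds": the inserted second consumes one of the added seconds.
/// Returns None when the result is 23:59:60 itself, which has no Unix timestamp.
pub fn add_si(start: i64, secs: i64) -> Option<i64> {
    if start >= LEAP_END_UNIX {
        return Some(start + secs);
    }
    let to_leap = LEAP_END_UNIX - start; // SI seconds until 23:59:60 begins
    if secs < to_leap {
        Some(start + secs)
    } else if secs == to_leap {
        None
    } else {
        Some(start + secs - 1)
    }
}

/// SI seconds that actually elapse between two Unix timestamps (from <= to).
pub fn si_elapsed(from: i64, to: i64) -> i64 {
    let crossed = from < LEAP_END_UNIX && to >= LEAP_END_UNIX;
    (to - from) + if crossed { 1 } else { 0 }
}

/// Time of day (UTC) of a Unix timestamp, formatted as HH:MM:SS.
pub fn hms(unix: i64) -> String {
    let s = unix.rem_euclid(86_400);
    format!("{:02}:{:02}:{:02}", s / 3600, s % 3600 / 60, s % 60)
}
```

Starting from 1_483_221_600 (2016-12-31 22:00:00 UTC), `add_historical` with 18,000 lands on 03:00:00 after 18,001 elapsed SI seconds, while `add_si` with 18,000 lands on 02:59:59.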


In general, I feel that this situation is similar to the difficulties around Unicode. Rust requires users to be somewhat aware of the related complexity. Time and calendars are also seemingly simple but get pretty complex, and I hope that chrono helps with providing clearer semantics.


I would strongly prefer if options 5 or 6 were supported by Chrono; and a Tai timezone may be a good start.

@pitdicker
Collaborator

1/86400th of a day; and a day is defined based on the movement of the Earth, this duration fluctuates over time (a historical second measured 100 years ago or 100 years in the future will not have the same duration)

In general, I feel that this situation is similar to the difficulties around Unicode.

Well written, the complexities of reality that do not perfectly fit a clean mathematical model 😄.

This brings me back to the meaning of adding chrono::Duration::hours(5). Is it "add 18,000 historical seconds" or is it "add 18,000 SI seconds"?

@demurgos #1282 is about creating a new CalendarDuration that can encode separate components. Months can take into account that months have different numbers of days, and days can take into account that days are not always 24 hours (because of DST and other timezone transitions). What do you think of the plan for seconds to encode the question of how to count a duration in the presence of leap seconds in that duration type? #1282 (comment)

4 participants