Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename NaiveDate and potentially remove Date<Tz> #820

Open
esheppa opened this issue Sep 13, 2022 · 24 comments
Open

Rename NaiveDate and potentially remove Date<Tz> #820

esheppa opened this issue Sep 13, 2022 · 24 comments
Milestone

Comments

@esheppa
Copy link
Collaborator

esheppa commented Sep 13, 2022

First some past commentary:

Chrono RFC's: https://github.com/chronotope/chrono-rfcs/blob/master/notes/api-issues.md
chronotope/chrono-rfcs#1
chronotope/chrono-rfcs#3

#204

...which will allow us to deprecate Date and rename NaiveDate to Date, over a couple releases.

I think #204 has the right approach here, and this could be combined or kept separate from other renaming (eg potential renaming of NaiveTime):

  • Remove Date<Tz>
  • Rename NaiveDate to Date

We now have functions like and_local_timezone which offer an alternative way of constructing literals in a given timezone, however I think that making literals in a given timezone like Utc.ymd(2018, 1, 1).and_hms(1, 2, 3) is probably not the common case anyway, as they would often either be parsed, be the current time, or be some function of one of the previous.

We could also use LocalDate instead of Date as mentioned in chronotope/chrono-rfcs#3 (comment) which is what java.time / NodaTime use - however my preference is just Date.

@esheppa esheppa added this to the 0.5 milestone Sep 13, 2022
@djc
Copy link
Contributor

djc commented Sep 13, 2022

I am very on board with ditching Date<Tz> and renaming NaiveDate to Date.

@mqudsi

This comment was marked as outdated.

@djc
Copy link
Contributor

djc commented Oct 6, 2022

I like just Date much better than either LocalDate and DateOnly. I think there are probably substantial cultural differences in naming between Rust and the Java/.NET ecosystems even beyond the type system.

@mqudsi
Copy link
Contributor

mqudsi commented Oct 6, 2022

Fair enough.

@esheppa
Copy link
Collaborator Author

esheppa commented Oct 7, 2022

My thinking around the naming is as follows

  • Some types are prefixed with Naive to both signify that they are missing information, but also gently encourage use of non-Naive types, however we don't want to discourage use of a date datatype where that would be most suitable from a correctness perspective.
  • However, I often find myself using NaiveDate, and in the cases where I'm using it, DateTime<Tz> wouldn't necessarily make sense, but I don't think that DateOnly or LocalDate would necessarily be ideal either, as it seems like especially DateOnly is named that way only due to being added much later than the original DateTime type.
  • In these cases where I'm using NaiveDate, the dates aren't Local in a sense of a Local vs zoned datetime, and the ability to specify just a date, rather than a datetime with a sentinel value for the time part is very valuable for API simplicity and correctness
  • For NaiveTime the same arguments apply as for NaiveDate, namely that a Tz parameter can't make sense here.
  • NaiveDateTime has the desired effect of nudging me to use DateTime<Tz> where possible, however LocalDateTime could also be a great option here.

All this being said, I'm keen to explore any use cases where specified dates are Local in the same sense as a local timestamp, and whether those justify a fully separate type, or else some compromise in the renaming of NaiveDate

@dtolnay
Copy link
Contributor

dtolnay commented Nov 13, 2022

Hello, I am here because the 0.4.23 release notes request feedback on the deprecations in that release.

I prefer the current implementation of NaiveDate and Date<Tz> in chrono 0.4 over the API changes suggested in this issue. I don't find any justification for the Date<Tz> deprecation here or in #851 that is compelling to me, and I think it would be good for it to be reversed.

For what I think is a typical correct usage of Date<Tz>, see https://docs.rs/db-dump/0.4.0/db_dump/version_downloads/struct.Row.html. The database dumps published nightly by crates.io (https://crates.io/data-access) contain a table of download counts per day for each version of each crate in the registry. The bucketing of the download counts is aligned to UTC's midnight so the correct representation for the date for each row is Date<Utc>. If the bucketing were aligned to a different time zone, that difference would be reflected in the type. DateTime<Tz> is not suitable here because we are talking about downloads for a whole day, not downloads at a particular instant. NaiveDate (under whatever name) is not suitable because it is not naive, it refers to a specific date in a specific time zone; meaningfully interpreting the information requires knowledge of the time zone.

@esheppa
Copy link
Collaborator Author

esheppa commented Nov 13, 2022

Thanks very much for the feedback @dtolnay. This is helpful as we don't have a way of surveying the ecosystem, so putting in depredations for our future plans is really the only option we have to see how the APIs are being used. Myself and @djc are keen to deprecate and eventually remove Date<Tz> but given a compelling reason to keep the existing AP we could un-deprecate it.

From my perspective, one potentially compelling reason for deprecation is that the actual current offset of a Date<Tz> is not well defined, in the case where the Tz has a daylight saving or other transition on that date. This means that the offset method, and the naive_utc methods are potentially problematic. (however we could just remove these methods as one way of satisfying this).

The other reason I am partial to is that the naming pushes users towards avoiding types beginning with Naive but admittedly that could be solved by renaming the Naive* types to Local* instead.

The usage in db-dump looks like a good use of Date<Utc> but it would be interesting to explore whether this same usage would work for timezones with daylight savings.

However it looks like the intended usage in db-dump, which is a reporting kind of task, is closely related to some use cases discussed in #722 relating to periods of months, quarters and years, but could also reasonably be extended to periods of days, hours, minutes, etc, potentially in a naive and non-naive flavor. Currently the discussion had landed around that kind of functionality being in a separate crate that builds on chronos functionality, as the reporting style functionality has the potential to be somewhat opinionated and potentially not as generally useful as the rest of the crate (however this usage is particularly important to my usual use of chrono) so one potential option is providing a zoned Day period in another crate, following a similar API as the other period lengths, including methods for conversion to other timeseries resolutions and methods for the start and end time in chrono types and other resolutions.

If a crate were to be published with that kind of functionality, would that be satisfactory for your use case to replace the current use of Date<Tz>?

Alternatively, would a very minimal Date<Tz> be a good option (potentially storing the TimeZone itself rather than the offset, which precludes Local but allows Utc, FixedOffset and chrono_tz::Tz), with just a few methods, eg for going to and from NaiveDate as well as extracting the Date<Tz> from a DateTime<Tz>?

@dtolnay
Copy link
Contributor

dtolnay commented Nov 13, 2022

Storing the Tz instead of a single offset in Date<Tz> sounds good to me. That will allow accurately obtaining the start and end of the date as a DateTime<Tz> even in the presence of daylight saving time zones. In my case it doesn't make a difference because I am using a time zone without daylight saving, but I can see how this approach is more general than what 0.4 can support.

Reducing the public API of Date<Tz> is also okay with me as long as it still has a Datelike impl.

@Ten0
Copy link

Ten0 commented Nov 13, 2022

that could be solved by renaming the Naive* types to Local* instead

I just want to add my voice to the

I like just Date much better than either LocalDate and DateOnly. I think there are probably substantial cultural differences in naming between Rust and the Java/.NET ecosystems even beyond the type system.

above.
NaiveDate is an absolutely accurate and clear naming, while Local seems to suggest that this has something to do with the local time of the machine, and Only seems semantically meaningless.

@djc
Copy link
Contributor

djc commented Nov 14, 2022

@dtolnay thanks for the feedback!

I've been thinking about your feedback quite a bit but ultimately I'm not sure I find it compelling. Your use case seems to be about storing what amounts to an interval of timestamps that are aligned to date boundaries. This works reasonably well with UTC but would pretty much fall down in the case of most other timezones that have something like daylight savings time, where you would have 2 25-hour days and 2 23-hour days per year. Because it only works well with UTC and other fixed-offset timezones, it seems to me that this could equally well be represented with the NaiveDate type (potentially renamed to Date) or maybe a custom wrapper struct TzDate { date: NaiveDate, tz: Utc } if you want to be explicit in the type system about which time zone is being used.

Another reason this probably only makes sense for Utc is that Utc is a ZST, whereas other TimeZone types will have considerable state attached to them. As such, keeping a whole range of Date<Tz> around doesn't generalize well to other zones.

As @esheppa explains, I think the Date type in chrono should be seen more as a point in the calendar and less as a range of timestamps, and confusing these two roles makes the API more complex.

@djc djc pinned this issue Nov 14, 2022
@dtolnay
Copy link
Contributor

dtolnay commented Nov 14, 2022

This works reasonably well with UTC but would pretty much fall down in the case of most other timezones

I don't agree with the "pretty much falls down" characterization, since it works equally well with UTC and daylight-saving-having time zones as long as it's not a single offset that is stored, but the tz itself, like @esheppa mentions at the bottom of #820 (comment).

Ignoring that option, I still don't agree that something supporting only fixed-offset time zones means it should use NaiveDate. This seems like a case where a trait bound restricting to fixed-offset tz would be a good fit, or just documentation, with or without runtime checks, much like how chrono already documents that certain month numbers and day numbers are not valid.

Utc is a ZST, whereas other TimeZone types will have considerable state attached to them. As such, keeping a whole range of Date<Tz> around doesn't generalize well to other zones

That is not how I would handle that case in the db-dump crate. If crates.io's download count dates were a different time zone, I would define a zero-sized time zone and impl chrono::offset::TimeZone for it with a zero-sized offset type, to represent crates.io database time. I think it's especially important in this case that the crate wouldn't just be using NaiveDate.

I think the Date type in chrono should be seen more as a point in the calendar and less as a range of timestamps,

I don't think anyone was confusing these until your change. It's clear to me that "point in the calendar" is the point of NaiveDate, while "range of timestamps from midnight to adjacent midnight" is Date<Tz>.

@djc
Copy link
Contributor

djc commented Nov 15, 2022

So I guess maybe we should rename Date<Tz> to Day<Tz> in 0.5 to make this more obvious.

We've definitely seen people use this in surprising ways in the issues.

@esheppa
Copy link
Collaborator Author

esheppa commented Nov 15, 2022

What if we add Day<Tz> rather than renaming the existing type - this would allow us to put it in 0.4.24 as well as updating the deprecation notice on Date<Tz> to point users to the new type, making migration easier? The semantics are somewhat different so it might make more sense that way. It could have methods like:

impl<Tz: TimeZone + Copy> Day<Tz> {
    fn start_time(&self) -> DateTime<Tz>;
    fn date(&self) -> NaiveDate;
    fn zone(&self) -> Tz;
}

impl<Tz: TimeZone + Copy> From<DateTime<Tz>> for Day<Tz> {
...
}

Reducing the public API of Date is also okay with me as long as it still has a Datelike impl.

@dtolnay would it be satisfactory if Datelike methods could be accessed via .date()?

Subsequently it could be kept in chrono or moved to a separate crate if it seemed like it belonged better with the potential work described in #722 .

(The start_time method here is somewhat involved because not all days necessarily start at 00:00, however it would probably lean on logic from #716 and #830 and/or some heuristics on the reasonable possible offsets by trying each 15 minutes after 00:00 in order)

@djc
Copy link
Contributor

djc commented Nov 16, 2022

Yes, I think that's the right direction. Maybe Tz: Copy is unnecessarily restrictive, or it's doing double duty as some kind of semantic bound ("must have a fixed offset") that becomes very implicit. And I would maybe rename start_time() to just start() -- not much potential for confusion here especially given the signature? Should we add end() as well?

@esheppa
Copy link
Collaborator Author

esheppa commented Nov 16, 2022

Maybe Tz: Copy is unnecessarily restrictive,

While it seems so from the code above, my intention here is to allow FixedOffset, Utc and chrono_tz::Tz while disallowing Local, and Copy happens to be implemented by the first three and not the latter

Should we add end() as well?

This is problematic due to the possibility of leap seconds.

@esheppa
Copy link
Collaborator Author

esheppa commented Nov 19, 2022

I've pushed up a branch with a basic implementation of this: #883. If the API is generally suitable then I'll go ahead and flesh out the documentation and add more tests

@djc
Copy link
Contributor

djc commented Nov 20, 2022

Maybe Tz: Copy is unnecessarily restrictive,

While it seems so from the code above, my intention here is to allow FixedOffset, Utc and chrono_tz::Tz while disallowing Local, and Copy happens to be implemented by the first three and not the latter

That's what I meant by "it's doing double duty as some kind of semantic bound". We should either document how/why Copy is used here or maybe introduce another trait (which could have Copy has a supertrait) with more precise/semantic naming.

Should we add end() as well?

This is problematic due to the possibility of leap seconds.

Okay, makes sense.

@mqudsi
Copy link
Contributor

mqudsi commented Nov 21, 2022

I swear I'm not trying to bikeshed and I like the idea of being able to backport this change to 0.4, but the parallel between Day<Tz> and NaiveDate is jarring/discontinuous. IMHO, it makes more sense to say Date has a Tz unless you don't know or can't specify what that timezone is then use a NaiveDate instead (and you can basically think of it as pub type NaiveDate = Date<!>).

More pedantically, a date is a combination of day, month, and year while day is part of a date but is not associated with a date. You have "day of the week," "day of the month," and "day of the year." But since you can go from the proposed Day<Tz> to month or year, you specifically have a date and not a day.

Less pedantically and more intuitively, careful users are going to wonder what the semantic difference between "day" and "date" is, when in fact there is none and it's just a choice of names. If anything, the names could be reversed since Date<Tz> has clear "formal, strongly-typed date" connotations, whereas Day is just a day. I would hate for the crate to be stuck long-term with a subpar naming option just so we can ship this functionality in 0.4 (where Date would conflict) instead of delaying it 0.5.


On a completely different topic,

Should we add end() as well?

This is problematic due to the possibility of leap seconds.

It doesn't have to be if end_time() is exclusive (I don't know if that's more intuitive than inclusive end.)

@esheppa
Copy link
Collaborator Author

esheppa commented Nov 23, 2022

@mqudsi - thanks for the feedback. Interestingly, it should be possible to put the impl of Day<Tz> behind the API of Date<Tz> without breaking semver compatibility with the 0.4.x series. My main reasoning for not doing this was:

  • The semantics are quite different, with the Day storing the timezone itself vs Date storing the offset.
  • We would have to return something from the Offset method (this could be the offset from the DateTime returned from the start() function)
  • The docs for Date<Tz> mention:
    • You almost certainly want to be using a NaiveDate instead of this type.

I would hate for the crate to be stuck long-term with a subpar naming option just so we can ship this functionality in 0.4

Definitely agreed on this, and I think the choice of whether the type (if merged) is ultimately called Date or Day is an important one. The idea was to differentiate between an instant and a period, however NaiveDate can happily fulfill both roles (eg it could/should have a start_time() method too). I'm keen to free up the name Date for NaiveDate in 0.5 which means we would need an alternative name for the Date<Tz>/Day<Tz> anywhere, so I think the options are:

  • NaiveDate -> Date, Date<Tz> -> Day<Tz>
  • NaiveDate -> <other_prefix>Date, Date<Tz> unchanged
  • NaiveDate -> Date, Date<Tz> -> <other_prefix>Date<Tz> (ZonedDate<Tz>?)
  • NaiveDate -> deprecated, Date<Tz> -> Date<Tz = ()> (details below)

(and you can basically think of it as pub type NaiveDate = Date<!>).

This is an interesting idea, but it has implications for methods like start() and offset() - it would be nice to avoid implicitly choosing Utc, but perhaps we make those methods only available when the Tz: TimeZone and not in other cases? Then we could have pub struct Date<Tz = ()> { ... } allowing Date to be used in both cases? (Interestingly this would allow the NaiveDate -> Date transition in the 0.4.x series!

It doesn't have to be if end_time() is exclusive (I don't know if that's more intuitive than inclusive end.)

Yes this makes sense, now implemented:

pub fn exclusive_end(&self) -> Option<DateTime<Tz>> {

@Ten0
Copy link

Ten0 commented Nov 23, 2022

My vote goes to:
NaiveDate -> Date + NaiveDate type alias
Date<Tz> -> Day<Tz>

@esheppa
Copy link
Collaborator Author

esheppa commented Nov 24, 2022

@mqudsi - Thanks for the suggestion with the alternative Tz type, I've created an example at: #894.

It potentially remains to be seen whether the NotSpecified type should implement Tz but it may be the only option without specialization/mutually-exclusive impls.

This could be extended to DateTime<Tz> as well

My vote goes to:
NaiveDate -> Date + NaiveDate type alias
Date -> Day

Thanks for these thoughts @Ten0. What do you think about #894?

@Ten0
Copy link

Ten0 commented Nov 24, 2022

What do you think about #894?

Unless I'm mistaken, some operations are naturally going to end up being faillible in Tz contexts and not in naive contexts (or the opposite).

Implementing it this way seems dangerous with regards to flexibility (I suspect one would quickly hit "can't be specialized" errors), readability, as well as overall correctness.

For instance, correctness:

impl Offset for NotSpecified {
    fn fix(&self) -> FixedOffset {
        FixedOffset::ZERO
    }
}

with offset being defined as systematically a way to remap to UTC, would now be incorrect for that type, so that would have to be made unavailable in the API somehow.

flexibility:
NaiveDate::and_time is infallible, but Day::and_time isn't, so Day<NotSpecified> should be disallowed, or its functions' signatures should be such that they are infallible in one case and not in the other... That seems like a footgun.

readability:
All the prevention/allowance/specialization on these structs would likely hinder readability of the documentation.
I doubt that would allow reducing the amount of code within Chrono, and even if that was the case it probably wouldn't be worth it given how many more people are going to read chrono's doc than write its code.

Overall I strongly believe that keeping on considering NaiveDateTime and DateTime<Tz> as two fundamentally different objects is going to be the best approach here.


To expand a little on why I agree with making the Date/Day<Tz> distinction the initially proposed way: unless I'm mistaken it is simply the most semantically correct naming we have so far:

In my experience, mapping to the most semantically correct naming is always going to be what makes it most natural for users to understand the underlying concepts.

@djc
Copy link
Contributor

djc commented Nov 24, 2022

Yes, I think we should stick with NaiveDate -> Date and Date<Tz> -> Day<Tz>. We can add good documentation to address potential confusion issues to clarify the intended semantic distinction.

@esheppa
Copy link
Collaborator Author

esheppa commented Nov 26, 2022

We can add good documentation to address potential confusion issues to clarify the intended semantic distinction.

NaiveDate::and_time is infallible, but Day::and_time isn't

Thanks @Ten0 this is an excellent point. I'm keen to put words to this effect in the documentation itself to make a clear argument for having both types

The dictionary definitions are good, perhaps we can include them as well, maybe from wiktionary (the license looks like it is compatible if we don't alter the definitions)

@djc djc unpinned this issue Mar 3, 2023
@pitdicker pitdicker mentioned this issue Sep 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants