Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Offer tz-rs implementation #493

Closed
Sytten opened this issue Jul 30, 2022 · 11 comments
Closed

Feature: Offer tz-rs implementation #493

Sytten opened this issue Jul 30, 2022 · 11 comments
Labels
A-local-offset Area: local offset C-enhancement Category: an enhancement with existing code E-hard Significant experience needed.

Comments

@Sytten
Copy link

Sytten commented Jul 30, 2022

Hi!
The current implementation for local_offset_at for unix platforms uses the libc localtime_r which is problematic due to race conditions with setenv and currently can only be used for single threaded applications. Chrono moved to using tz-rs and it would be nice to offer a feature for time that also uses this crate. Happy to participate to an implementation if that is of any interest.
Thanks :)

@jhpratt
Copy link
Member

jhpratt commented Jul 30, 2022

Chrono is also explicitly ignoring the ability for the TZ environment variable to change for performance reasons. There are also a number of other things that can silently change, potentially causing issues.

If this were implemented, it would require parsing the full tzdb on every invocation to ensure correctness. This would be far too much overhead for what I hope is obvious reasons.

I think it's worth keeping in mind that what time is doing is not actually unsound. We're relying on the guarantee that getenv and similar syscalls are thread-safe. It's setenv that is not, and it's std that violates this guarantee. Ultimately there is work happening upstream in the standard library that will render the check for the number of running threads pointless. More specifically, a new concept of "deprecated safe" is being worked on. If/when this hits stable, that would be sufficient reason for me to lift the restriction.

@Sytten
Copy link
Author

Sytten commented Jul 30, 2022

Thanks for the detailed response, very insightful. Maybe then the feature could be to provide a lock so it can work on multiple threads like some implementations are doing if the user is aware that they must not use the call outside the lib.

I am really just trying to find a solution that is as safe as possible but works on multi-thread too...

@jhpratt
Copy link
Member

jhpratt commented Jul 30, 2022

The standard library uses a lock. The problem is that not everything uses a lock, and when they do, it's not the same lock. This is an extremely difficult problem to solve. Don't get me wrong — I appreciate the desire to make things work — but it's something that's been under discussion for a long time. It goes back to a libc function that has existed since the 1980s. Someone (I'd have to pull up who) is actually trying to change setenv to make it thread-safe. Needless to say that's not so much an uphill battle, but rather climbing the side of El Capitan.

@Sytten
Copy link
Author

Sytten commented Jul 30, 2022

Make sense, I guess I will disable the thread count check then. We just had a problem in our latest release on macos aarch64 as it looks like the lib cant count the threads on that platform. So it is definitely an issue for our users. And we dont use setenv so that should be fine.

@jhpratt
Copy link
Member

jhpratt commented Jul 30, 2022

Yeah, as long as you can guarantee with absolute certainty that nowhere in your dep tree is mutating the environment, then using RUSTFLAGS="--cfg unsound_local_offset" is sound. If it's done in even one place, then using the cfg is UB.

@Sytten
Copy link
Author

Sytten commented Jul 30, 2022

Considering we were using chrono before that I imagine did call localtime_r without checks and it worked well, it should be fine right?
I think a lot of people are in a similar situation of moving from chrono to time and hitting cases that chrono handled (probably badly) before.

@jhpratt
Copy link
Member

jhpratt commented Jul 30, 2022

Not necessarily. Even when put in a loop, it takes a number of iterations before the reproduction actually triggers the error.

Ultimately this is a decision that was made by me. I believe avoiding a segmentation fault is far more important than getting the system's UTC offset. The bug wasn't discovered via an audit, but by a user running into it in a real-world situation.

@jhpratt jhpratt added E-hard Significant experience needed. C-enhancement Category: an enhancement with existing code A-local-offset Area: local offset labels Aug 1, 2022
@Sytten
Copy link
Author

Sytten commented Aug 8, 2022

For reference, I replaced the usage of the local_offset_at with a mix of tz-db, tz-rs and iana-time-zone. It works great so far and it's not that complicated. I added a bit of caching similar to what chrono does:

use std::cell::RefCell;

use time::{OffsetDateTime, UtcOffset};

thread_local! {
  static TZ_NAME: RefCell<Option<Option<String>>> = Default::default();
}

pub fn current_local_offset() -> Option<UtcOffset> {
    let now = OffsetDateTime::now_utc();
    local_offset_at(now)
}

pub fn local_offset_at(datetime: OffsetDateTime) -> Option<UtcOffset> {
    match get_timezone() {
        Some(tz_name) => match tzdb::tz_by_name(&tz_name) {
            Some(tz) => match tz.find_local_time_type(datetime.unix_timestamp()) {
                Ok(tm) => match UtcOffset::from_whole_seconds(tm.ut_offset()) {
                    Ok(offset) => Some(offset),
                    Err(e) => {
                        log::warn!(
                            "Failed to convert seconds ({}) to utc offset: {}",
                            tm.ut_offset(),
                            e
                        );
                        None
                    }
                },
                Err(e) => {
                    log::warn!("Failed to convert unix time to local time: {}", e);
                    None
                }
            },
            None => {
                log::warn!("Unknown timezone: {}", tz_name);
                None
            }
        },
        None => None,
    }
}

fn get_timezone() -> Option<String> {
    // TODO: We should make this unblock / async since it can read from disk
    // TODO: We should check the TZ environment variable too
    TZ_NAME.with(|cached| {
        cached
            .borrow_mut()
            .get_or_insert_with(|| match iana_time_zone::get_timezone() {
                Ok(tz_name) => Some(tz_name),
                Err(e) => {
                    log::error!("Failed to get timezone: {}", e);
                    None
                }
            })
            .clone()
    })
}

smallstepman pushed a commit to dfinity/sdk that referenced this issue Aug 9, 2022
this required a small rework, because it's not possible to obtain
completely sound local time offset value in multithreaded environment
- time-rs/time#427
- time-rs/time#500
- time-rs/time#493
@jhpratt
Copy link
Member

jhpratt commented Aug 10, 2022

I do not foresee myself using tz-rs to work around the limitation of local_offset_at. Without caching, the performance would be terrible. With caching, it is not guaranteed to be accurate (the time zone can change). Even with measures taken to accommodate the potential for the time zone changing, it's not guaranteed to be correct. This isn't to mention the significant effort it would take to do this.

When I do get around to #193, I likely won't be using tz-rs. This is because I have my own vision for what the API should look like, I would want it to work on all platforms, and because I need to control the MSRV.

@jhpratt jhpratt closed this as not planned Won't fix, can't repro, duplicate, stale Aug 10, 2022
@Sytten
Copy link
Author

Sytten commented Aug 11, 2022

In case you didnt know, Tzdb crate is based on tz-rs. It is just a statically linked version of the timezone data. As per caching it is an acceptable tradeoff IMO. Did you see the implementation I did? I dont think it would be super complex to generalize to the library, let me know if you need help or want to bounce ideas.

@jhpratt
Copy link
Member

jhpratt commented Aug 11, 2022

When I say tzdb, I mean the actual tzdb, as in the raw data. I have no intent of relying on a tz-rs (directly or indirectly) for this for the reasons I stated.

As per caching it is an acceptable tradeoff IMO.

Not if you need accurate data. It is more than reasonable that a user expects a value returned to be accurate, even if the value has changed since the last call. Caching data violates this expectation, and the measures necessary to invalidate the cache for every potential way the value could change is far more effort than it's worth.

You're free to use the implementation you provided, and I encourage you to do so if it works for your use case. But there are the drawbacks that I've stated, and I don't see the situation changing to the extent that I would be comfortable adding tz-rs as a dependency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-local-offset Area: local offset C-enhancement Category: an enhancement with existing code E-hard Significant experience needed.
Projects
None yet
Development

No branches or pull requests

2 participants