Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use thread safe libtz instead of libc to get local timezone offset on unix. #543

Closed
wants to merge 1 commit into from

Conversation

caldwell
Copy link

Fixes #293 without needing to disable the feature on multithreaded programs.

I wrote the Rust libtz/libtz-sys crates to address this problem. They wrap IANA's libtz (repo, homepage), which is maintained in the same repo with their timezone and leapsecond datasets.

The libtz-sys crate has 2 interfaces, one that wraps localtime_r and another that wraps localtime_rz. Both should be thread safe:

  • localtime_r is built such that its call to getenv() is routed to a small Rust function that wraps std::env::var_os() which locks (against other Rust env manipulation).
  • localtime_rz is a NetBSD interface that doesn't call getenv() at all (you have to "alloc" a timezone storage struct, which takes the timezone name as a parameter).

The libtz crate provides a slightly more idiomatic interface to the libtz-sys crate. Currently it only uses libtz-sys's localtime_rz interface because that seems a little cleaner (though it does use std::env::var_os() to read the TZ env var, since that's the standard way to override the timezone for a process).

I set up a repo with the multithreaded POC in the original bug report.

 git clone https://github.com/caldwell/bad-localtime
 cd bad-localtime
 cargo run --bin time

Its Cargo.toml is currently pointing to this pull request branch to show that it doesn't crash.

Things I'm worried about:

  • Testing. I've only really run this on a Debian x86_64 box and an aarch64 Mac. The libtz code looks like it supports a lot of OSes, but I can't really vouch for it. I don't know how to test on a wide variety of systems easily. Recommendations are welcome.

  • Speed. The localtime_tz call that's currently being used requires calls to tzalloc() and tzfree() which allocate and free the timezone data (it also reads and parses the binary tzdata file on alloc). Because local_offset_at() is a static function it has to alloc and free on every call.

    I am worried this might be slow. It looks like there is some benchmarking code in time. I've not done benchmarking in Rust before—is it easy to test this one section of the code?

    There are 2 potential fixes for this if it is a problem:

    1. Surface the localtime_t interface in the Rust libtz crate and use that instead.
      The localtime_t call stores the timezone data in a static buffer (protected by a pthread mutex). This only requires it to parse the timezone binary file once.

    2. Change this PR to use something like local_static or thread_local! to keep the Timezone around to avoid the need to alloc and free it each time.

I would love to get localtime support into multithreaded apps (pretty much anything async can't use it at the moment!).

Thoughts?

@codecov
Copy link

codecov bot commented Jan 26, 2023

Codecov Report

Merging #543 (3b93cc3) into main (c45264c) will increase coverage by 0.3%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main    #543     +/-   ##
=======================================
+ Coverage   95.3%   95.5%   +0.3%     
=======================================
  Files         78      78             
  Lines       8563    8533     -30     
=======================================
- Hits        8157    8151      -6     
+ Misses       406     382     -24     
Impacted Files Coverage Δ
time/src/sys/local_offset_at/unix.rs 100.0% <100.0%> (+66.7%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@jclulow
Copy link
Contributor

jclulow commented Jan 26, 2023

I wrote the Rust libtz/libtz-sys crates to address this problem. They wrap IANA's libtz (repo, homepage), which is maintained in the same repo with their timezone and leapsecond datasets.

The libtz-sys crate has 2 interfaces, one that wraps localtime_r and another that wraps localtime_rz. Both should be thread safe:

The libtz crate provides a slightly more idiomatic interface to the libtz-sys crate. Currently it only uses libtz-sys's localtime_rz interface because that seems a little cleaner (though it does use std::env::var_os() to read the TZ env var, since that's the standard way to override the timezone for a process).

Presumably on systems where this is a problem, that won't help with getenv() or setenv() or tzset() use by code that is not Rust code; e.g., threads managed by libraries, or if you are building a shared library with Rust?

In crates that adopt libtz like you're suggesting here, on operating systems that do not have a defective setenv()/tzset() (i.e., where all of those functions are actually thread-safe), is it easy or possible to just use the system routines instead of the extra copy of the IANA library and the extra timezone determination logic?

@caldwell
Copy link
Author

Presumably on systems where this is a problem, that won't help with getenv() or setenv() or tzset() use by code that is not Rust code; e.g., threads managed by libraries, or if you are building a shared library with Rust?

I don't think other threads doing C tzset() or getenv() will affect anything, since time nor libtz ever setenv() or std::env::var_set().

Definitely another thread using an unsafe C setenv() would break things. However one might argue that another thread doing setenv() is a bug in that code, since it is not specced to be a thread safe call. Staying in the Rust realm, std::env::var_set() is thread safe and will not interact (since this library uses std::env::var_os()).

It would also possible to remove the lookup of the TZ env var completely. I don't really like that as it's the only way for a user on a POSIX system to change the timezone away from the system default for a specific process and it would violate user expectations if it weren't there. But I don't know how often it is used it practice (when it doesn't exist, /etc/localtime is used). I know I never use it. Perhaps it could be disabled via a feature flag?

In crates that adopt libtz like you're suggesting here, on operating systems that do not have a defective setenv()/tzset() (i.e., where all of those functions are actually thread-safe), is it easy or possible to just use the system routines instead of the extra copy of the IANA library and the extra timezone determination logic?

In theory that should be possible. I'm not sure how to detect it though. Are you thinking some sort of explicit list of good OSes? I guess it's really the C library version, not the OS per se. Do you know of any implementations that are thread safe?

@jclulow
Copy link
Contributor

jclulow commented Jan 27, 2023

It would also possible to remove the lookup of the TZ env var completely. I don't really like that as it's the only way for a user on a POSIX system to change the time zone away from the system default for a specific process and it would violate user expectations if it weren't there. But I don't know how often it is used it practice (when it doesn't exist, /etc/localtime is used). I know I never use it. Perhaps it could be disabled via a feature flag?

In crates that adopt libtz like you're suggesting here, on operating systems that do not have a defective setenv()/tzset() (i.e., where all of those functions are actually thread-safe), is it easy or possible to just use the system routines instead of the extra copy of the IANA library and the extra timezone determination logic?

In theory that should be possible. I'm not sure how to detect it though. Are you thinking some sort of explicit list of good OSes? I guess it's really the C library version, not the OS per se. Do you know of any implementations that are thread safe?

FWIW, I suspect the Linux ecosystem is somewhat unusual in housing a menagerie of libc implementations. I expect most other UNIX environments (e.g., illumos, the BSDs, commercial/proprietary systems) provide libc along with the kernel as part of one large consolidation developed together.

The illumos implementations of tzset(3C), localtime_r(3C), setenv(3C), and getenv(3C), all have an MT-Level of Safe or MT-Safe; MT-Level and other attributes are described in attributes(7). We inherited them a decade ago when we forked from Solaris, so I would expect that Solaris is also thread safe for all of these routines.

In addition to thread safety, our time zone configuration is a bit different from a Linux system, though most of this is hidden from a consuming process behind the standard entrypoints like localtime_r(), etc.

A process determines the local time zone in roughly this way:

  • if TZ is set in the environment, we'll use it uncritically to select the zone
  • if TZ is not set, we'll read the TIMEZONE(5) file: a text file which is allowed to contain a TZ= line specifying the time zone
  • otherwise, we'll default to GMT0

Iff a process does not have TZ set in its environment, we cache the value read from /etc/TIMEZONE the first time we try to establish a local time, and the zone information from the zoneinfo file. This cache can be purged in all processes on the system by a privileged user with the tzreload(8) command. One can, thus, reconfigure the local system time zone while software is running. If a process does have TZ set in its environment then tzreload won't change the time zone for that process, but it will still invalidate the cache of the nominated zone file, which might have been patched in a package update to reflect an updated set of time zone information without needing to restart all the software on the system.

Historically systems like Java and Go and so on have created challenges for operators by ignoring the local time zone database, or at least some of the system facilities for configuring and refreshing it. It would be really great if we could avoid having that happen with Rust, where there are truly superlative facilities for using system library routines through FFI, etc!

@caldwell
Copy link
Author

caldwell commented Jan 27, 2023

In addition to thread safety, our time zone configuration is a bit different from a Linux system…

Interesting. Though it seems once it has the timezone name it still parses it the same way, either as a POSIX inline definition or as a tzdata file compiled by zic.

Historically systems like Java and Go and so on have created challenges for operators by ignoring the local time zone database, or at least some of the system facilities for configuring and refreshing it. It would be really great if we could avoid having that happen with Rust, where there are truly superlative facilities for using system library routines through FFI, etc!

I agree. FWIW libtz currently doesn't override the database, it uses system database files. But it does replace the system library, so I decided to poke around existing implementations that I can find:

  • the Illumos code is quite nice. I liked the way they checked /var/run/tzsync with a memmap and automatically tzset() when it changed.
  • glibc doesn't appear to do anything above and beyond what libtz does (it just caches the first thing it reads forever as long as the name doesn't change)—long running processes won't pick up tz database changes like Illumos.
  • freebsd doesn't appear to lock getenv/setenv. They have the same libtz code in their contrib/ and their libc uses that for localtime, however they have modified it to reload the tz file if it was last loaded over 61 seconds ago.
  • netbsd has locks on getenv and setenv. It has a copy of the IANA code, it doesn't reload the db ever.
  • openbsd doesn't lock getenv/setenv. Their localtime is a heavily modified version of some ancestor of the IANA code. It does not periodically reload the database.
  • macos locks getenv/setenv. Their localtime looks closest to the OpenBSD code (despite being in a FreeBSD directory)—it has been modified to listen for updates on notifyd with a fallback that checks the filesystem with lstat() (on every invocation of localtime().

It does appear from that list that really the only OSes that need to be worked-around are linux/glibc, freebsd, and openbsd. Of those only freebsd would have an inferior tz implementation (in the sense that the IANA libtz code doesn't reload itself ever).

But I think I agree that this should probably be more of a surgical workaround than what it currently is. I'll see if I can come up with something tomorrow.

@jhpratt
Copy link
Member

jhpratt commented Jan 27, 2023

While this approach in general is tenable, there are a number of problems here. I won't list the ones that I noticed within ~5 minutes of looking at this, as to be honest I think it would be somewhat rude (feel free to reach out privately). By far the most important, however, is that the repository you linked does not compile on my machine. I get a linker error, which is a dealbreaker right off the bat. My setup isn't anything crazy, either — Fedora on a ThinkPad. For this reason, I am closing this pull request.

With regard to it being effectively unusable in async code, that is simply due to the safety requirements. The next release of time will permit the end user to explicitly opt out of soundness checks at runtime, though doing is unsafe, of course. My ideal solution is to get #[deprecated_safe] going again on the language-level, as work on it stopped last year when the author unexpectedly deleted their GitHub account.

@jhpratt jhpratt closed this Jan 27, 2023
@jhpratt
Copy link
Member

jhpratt commented Jan 27, 2023

And as a side note, if there are operating systems that have thread-safe environment mutation, please let me know with references. I should be able to opt specific OS's out of the multi-threaded check quite easily.

@caldwell
Copy link
Author

And as a side note, if there are operating systems that have thread-safe environment mutation, please let me know with references. I should be able to opt specific OS's out of the multi-threaded check quite easily.

It's mostly here in this message, albeit more verbosely, so I'll summarize here (and add illumos links):

  • illumos locks setenv. It is built such that getenv() can be lockless and still multithread safe.
  • netbsd has locks on getenv and setenv.
  • macos has locks on getenv and setenv.

@jhpratt
Copy link
Member

jhpratt commented Jan 28, 2023

Thanks! I'll look that over and add opt-outs as appropriate.

@jhpratt
Copy link
Member

jhpratt commented Feb 19, 2023

illumos, NetBSD, and MacOS now bypass the check on main.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The call to localtime_r may be unsound
3 participants