Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove iana-time-zone dependency for clock feature. #1018

Closed
wants to merge 1 commit into from

Conversation

dshmatkou
Copy link

@dshmatkou dshmatkou commented Apr 17, 2023

Background

I would like to bump chrono's version in Android project. Unfortunately, iana-time-zone dependency of the clock feature doesn't allow me do it. In case of Android iana-time-zone uses android_system_properties crate to look up system properties (this is not a stable API).

Solution

Make iana-time-zone dependency default for the crate, but remove it from the clock feature requirements.

Alternative

Use libc's localtime functionality. This functionality was removed from the crate in pull #499 due to potential SEGFAULT.

@djc
Copy link
Contributor

djc commented Apr 17, 2023

Unfortunately, iana-time-zone dependency of the clock feature doesn't allow me do it.

Sorry, can you explain this a bit more?

@dshmatkou
Copy link
Author

Sure, I updated the background section.
TL;TR: It uses unstable API.

@djc
Copy link
Contributor

djc commented Apr 17, 2023

What do you mean by unstable API?

@mgeisler
Copy link

Hi Dirkjan, long time no see 😄

What do you mean by unstable API?

I work with Android these days: the problem is that Android is not going to vendor iana-time-zone to our external/rust/crates/ directory because the crate uses an unstable (internal) Android API.

This in turn means that we cannot bump the version of chrono in Android as long as it has a hard dependency on iana-time-zone. So it would be helpful for us if the dependency could be removed or made optional like @dshmatkou did here.

@djc
Copy link
Contributor

djc commented Apr 18, 2023

Hey Martin, good to see you're doing Rust work inside Google these days!

So, now that I have you here -- is there some stable API we could use to get the current timezone on Android?

(For context, this was implemented in #756 in order to deal with the complaints from #755. Does the Android libc mktime() return a UTC timestamp or a local timestamp?)

@dshmatkou
Copy link
Author

It seems that local timestamp should be the first option and UTC is a fallback.

@dshmatkou dshmatkou marked this pull request as ready for review April 19, 2023 13:12
@mgeisler
Copy link

Hey Martin, good to see you're doing Rust work inside Google these days!

Yeah, it's been a lot of fun!

So, now that I have you here -- is there some stable API we could use to get the current timezone on Android?

I'll have to ask around to find out what the correct way is to do this.

Does the Android libc mktime() return a UTC timestamp or a local timestamp?

If you're talking about the time_t returned, then that's always the number of seconds since 1970 in UTC. But you're probably actually asking how mktime interprets the fields of the tm struct given to it?

From what I can read in man mktime (on Linux), the function always assumes that the tm struct is using the local time. People should use gmtime instead if they have a tm struct with UTC field values.

@mgeisler
Copy link

mgeisler commented May 5, 2023

Hey all,

I talked with people about what the recommended way is to handle timezones on Android. The main takeaways are

  • Parsing files under /system/usr/share/zoneinfo is not supported: their format changes over time and there are no stability promises for these files (plus, .
    • Further to this, the /system partition does not receive updates via the Mainline project (OS component updates shipped via the Play store). This means that the timezone information becomes state: as an example, I learn that Mexico stopped observing daylight saving time recently. The update to the timezone data does not make it to the /system partition, so it's fundamentally wrong to read data from there.
  • Reading the system property is not an officially supported API and we would prefer not to have platform code do this.
  • Calling into localtime and similar libc functions is the preferred way for native code. Ultimately, it depends on what you use the timezone name for. If you use it to lookup timezone data in your own static timezone database, then you're back to the problem of relying on files under /system (i.e., your application will sometimes to wrong).

Does this help a bit? @esheppa, I saw that you were heavily involved in the #756 and so you will probably be interested in this too.

@dshmatkou
Copy link
Author

Hi @djc!
Could you please have another look on this pull request?

@djc
Copy link
Contributor

djc commented May 11, 2023

@mgeisler is the Android libc susceptible to issues like RUSTSEC-2020-0159? If so, you're asking me to move chrono back to an approach where some ways of using chrono can segfault? I think the timeline here is that:

  • Prior to 0.4.20 (as in, basically forever), we supported getting the local timezone on Android
  • In 0.4.20 we broke that because we started parsing tzinfo files in Rust in order to fix RUSTSEC-2020-0159 and we didn't have that setup correctly for Android
  • In 0.4.21 we fixed it by relying on, basically, unsupported APIs and filesystem locations so we went back to supporting getting the local timezone

If Android libc is not susceptible to RUSTSEC-2020-0159, we can revert to using libc calls to get the local timezone offset on Android only. If it is, however, I'm inclined to take on responsibility for the use of unstable platform API/surface and consider any (hopefully temporary) breakage from platform changes the fault of the Android platform for not providing a robust way to get the information we need.

For the proposed change, it seems to me that people using chrono with default-features = false, features = ["clock"] (which might admittedly be a bit silly) on Android will suddenly not get proper timezone support, instead falling back to UTC. As such, I would consider this change technically semver-incompatible. I honestly don't have a good sense of how much chrono is getting used in Android environments, but maybe we could try this out?

@esheppa @pitdicker any thoughts?

@pitdicker
Copy link
Collaborator

I read the discussion at the time, but never really understood how the problem was fixed. Or rather: Is it really fixed or do we just pretend it is?

As I understand it, calling get_env is only 100% safe to call in a single-threaded environment, or if there is guaranteed no non-Rust code. Both are something a library can not guarantee.

I'll paste some relevant comments.

@pitdicker
Copy link
Collaborator

rust-lang/rust#27970 (comment)

For the moment at least, I do think the right solution is to take the environment lock around calls to getaddrinfo. This doesn't solve the general problem, since non-Rust code won't use the Rust environment lock, but it'll at least mean purely Rust programs won't have any issues here, which is an improvement over the current situation.

rust-lang/rust#27970 (comment)

I'm not arguing that the general state of environment handling in threaded programs is anything less than awful.

It is worth pointing out that glibc, at least, seems to have its own internal locking, such that concurrent calls to setenv and getenv are not inherently racy. And since Rust calls those functions internally rather than reimplementing environment handling itself, the combination of Rust and glibc shouldn't in general have races. (There are exceptions, notably around fork/exec, and those need further work.)

rust-lang/rust#27970 (comment)

Furthermore, there is currently no way to safely call C libraries that read environment variables. See #27970 (comment) for cases where this is a problem.

I think this comment understates the problem. openssl calls getenv during certificate validation, because it calls gmtime_r*. If the community is going to take this family of problems seriously, I think the chrono/time advisories are just the start.

  • well, it calls getenv if linked against glibc, and gmtime_r is MT-Safe env so any implementation may but I don't think musl does.

rust-lang/rust#27970 (comment)

It's worth noting that there's significant (and ongoing) discussion about this https://internals.rust-lang.org/t/synchronized-ffi-access-to-posix-environment-variable-functions/15475. I think we're going to take this seriously, but probably the solution is probably not to report a CVE against nearly every rust library that does FFI.

rust-lang/rust#27970 (comment)

Patching POSIX/libc has been proposed in the IRLO thread linked above. I don't think it's a good solution because this takes the ability to fix things out of the hands of the Rust programmers, and in many cases leaves us at the behest of whatever distro we happen to be using (there is at least one comment to this effect in the thread). And in addition, this does nothing to help with programs or libraries that access the environment through environ/the third argument to main (this is mentioned multiple times in the IRLO thread).

Moreover, I don't think it's at all fair to put this on libc maintainers. Rust messed up here. Or, if not the standard library, loads of library authors messed up. And even if there were libc patches, I struggle to imagine prioritizing this as a backported security fix. IMO it's a pretty clear case of someone didn't follow the very old and well-documented interface (and spooky though this bug is, it would be tricky to exploit). I can't imagine that if this had been brought up around 1.0 that all of std::env would look like what it does today. To that end: #92365

It would be nice if POSIX or libc authors decided to make getenv and setenv both thread-safe, because it would mitigate the impact of this kind of mistake, as it has been made outside of Rust too: https://sourceware.org/bugzilla/show_bug.cgi?id=15607#c3

rust-lang/rust#27970 (comment)

richfelker would you be interested in changing this in musl?

No. At has been explained above and in the other tracker item of this form I took part in, this is purely a bug in the Rust specification, and needs to be fixed there. setenv (the POSIX function) is not thread-safe, and for every good reason should not be required to be. "Not thread-safe" means you are essentially not allowed to use it in a multi-threaded program at all. (There may be some subtle exceptions to that, but it's a good enough approximation.)

Why is it important that setenv not be thread-safe? Because the interface for accessing the environment, getenv, provides the caller a pointer to storage whose lifetime could end when setenv is called, but the caller of getenv has no way to "release" its reference to this storage. The other sanctioned interface, accessing environ[] directly, would have data races under concurrent setenv, which is just as bad. So if setenv were thread-safe, the operations of reading the environment would necessarily be non-thread-safe. Present issues of POSIX do not require that getenv be thread-safe, but basically everything depends on it being thread-safe, and all implementations I'm aware of make it so (which requires no work, since it's accessing read-only data). And at least for implementation-internal access to the environment, it's a requirement that it be thread-safe. And future issues of POSIX will require it to be.

For what it's worth, getaddrinfo is not one of the functions which POSIX permits to inspect the environment, and the musl implementation does not, so this particular manifestation of the Rust issue should not be one you encounter on musl.

rust-lang/rust#27970 (comment)

The problem is that most people don't know about this rule and so their code doesn't follow it, even in otherwise high-quality codebases. As a test, I hacked up glibc to print a message to syslog whenever setenv or putenv is called while multiple threads running; I installed this in a Debian VM and started up some random desktop sessions. Here's what I got:

https://gist.github.com/comex/9fa0899293d1c6e748faa81b875b25ac

It's inevitable that Rust code will end up in the same address space as code that does this, whether it's a Rust binary using C libraries or vice versa. But this leads to the extreme conclusion that portable Rust code cannot safely access the environment at all, not even just to read it. It can't even safely copy the environment at initialization time, not if built into a library that might be dlopened by C code while threads are already running.

Perhaps all that code should be fixed. (After all, some of these instances probably represent dormant race conditions in the C code itself, even without Rust coming into the picture.) But the above link represents only a small sample of Linux desktop applications, and Linux desktop applications represent a tiny fraction of the environments people run Rust code in. I suspect that unchecked setenv use can be found throughout the entire ecosystem of multithreaded C programs (and Python programs, and Ruby programs, et cetera). Fixing it all is impossible.

rust-lang/rust#27970 (comment)

Indeed, because each standard library (NSPR, Rust libstd, Ruby's, Python's) first attempts to fix the problem by adding their own lock, which works for things in the process that use that standard library, but fail for things in the process that use a different standard library.

rust-lang/rust#27970 (comment)

It's worse than that: Rust libraries used by Firefox must avoid calling env::get_var because that could race with a setenv from NSPR! Likewise, Firefox must not link with any C library that calls getenv, since that call would also fail to be protected by the lock.

The NSPR lock is just as bad as the Rust lock when it comes to compositionality with general-purpose C code. Both of them only work if they are used by 100% of the code that reads or mutates the environment. How does Firefox avoid the getaddrinfo problem that started this thread? The getenv inside getaddrinfo or localtime_r is just as unsafe in Firefox as it is in a Rust FFI situation.

rust-lang/rust#27970 (comment)

Besides the three issues I noted above, there is a fourth issue: getenv is also not thread-safe:

richfelker wrote:

And future issues of POSIX will require it to be.

This is the change in that doc:

At line 33878 change:

The getenv() function is inherently not thread-safe because it
returns a value pointing to static data.

to:

Some earlier versions of this standard did not require getenv()
to be thread-safe because it was allowed to return a value
pointing to an internal buffer. However, this behaviour allowed
by the C Standard is no longer allowed by POSIX.1. POSIX.1
requires the environment data to be available through environ[],
so there is no reason why getenv() can't return a pointer to the
actual data instead of a copy. Therefore getenv() is now required
to be thread-safe (except when another thread modifies the
environment).

We could either keep using getenv on the assumption that it is thread-safe even though current and older POSIX says it isn't thread-safe, or we could just change std::env to use environ directly with pure Rust code and avoid getenv() completely; i.e. RiiR. I prefer to RiiR, though I think it should be a lower priority.

@pitdicker
Copy link
Collaborator

I work with Android these days: the problem is that Android is not going to vendor iana-time-zone to our external/rust/crates/ directory because the crate uses an unstable (internal) Android API.

This in turn means that we cannot bump the version of chrono in Android as long as it has a hard dependency on iana-time-zone. So it would be helpful for us if the dependency could be removed or made optional like @dshmatkou did here.

I can understand how a crate that depends on internal APIs can be problematic.

Solution

Make iana-time-zone dependency default for the crate, but remove it from the clock feature requirements.

In effect on Android Local would become equivalent to Utc without the clock feature? That seems like a surprising bug to run into.

  • Further to this, the /system partition does not receive updates via the Mainline project (OS component updates shipped via the Play store). This means that the timezone information becomes state: as an example, I learn that Mexico stopped observing daylight saving time recently. The update to the timezone data does not make it to the /system partition, so it's fundamentally wrong to read data from there.

This is very interesting information! And libc localtime would get its data from somewhere else?

If Android libc is not susceptible to RUSTSEC-2020-0159, we can revert to using libc calls to get the local timezone offset on Android only.

If I understand it right, there would be no way to fully fix this on the libc side. Does the Rust standard library use the same environment lock as libc (haven't checked yet)?


Sorry for the rambling, I am just trying to understand the surrounding issues.

For the reason behind this PR: Would putting iana-time-zone behind a default feature like android_unstable_apis be possible and acceptable?

@pitdicker
Copy link
Collaborator

In effect on Android Local would become equivalent to Utc without the clock feature? That seems like a surprising bug to run into.

Sorry for the wrong comment. Local is only available with the clock feature.

@djc
Copy link
Contributor

djc commented May 12, 2023

I've decided I'm going to close this. I don't think it's feasible to make whether Local is actually local vs falling back to UTC a separate feature flag at this point, after Local has historically returned actual local time for a long time. I'm open to continuing to discuss other options in an issue, though.

Alternative (partial) solutions:

  • Document that Local uses unstable and potentially obsolete data on Android, so that it should be avoided
  • On 0.5, don't expose Local at all on Android platforms

I read the discussion at the time, but never really understood how the problem was fixed. Or rather: Is it really fixed or do we just pretend it is?

I just checked and there are still two calls to env::var("TZ") in src/offset/local/unix.rs, so I suppose it might not be entirely fixed? I'm not sure if this is as bad as it was before.

@djc djc closed this May 12, 2023
@esheppa
Copy link
Collaborator

esheppa commented May 13, 2023

@mgeisler - are you using Local at all? The easiest solution may be able to just turn the clock feature off. Also do you envisage the android_system_properties method ever becoming stable? How risky do you think usage of android_system_properties crate is for applications on android?

@djc - I like your first dot point - we could even point users toward chrono_tz - which I think provides a good solution, as the tz files are then updated any time the application is updated to the latest version of chrono_tz.

This can work well if the following cases seem reasonable:

  • Android project - uses chrono without the clock feature to avoid unstable internal APIs (only works if it doesn't need Local)
  • Android applications - use chrono, chrono-tz. First uses iana-timezone-name to get the name of the timezone, and then parses the name with chrono-tz. The risk here is the use of unstable APIs meaning the application could break, so we could document that users handle failure of iana-timezone-name in their application code

@mgeisler
Copy link

This is very interesting information! And libc localtime would get its data from somewhere else?

Yes, it's my understanding that the C library would "somehow" know how to do this correctly... it knows the local timezone and can give you up to date information about it.

@mgeisler - are you using Local at all?

I am not sure what we use or not use in AOSP, sorry. All I know is that iana-time-zone was not imported because we don't have android_system_properties imported. We have at least two libraries for reading system properties, one in Rust (rustutils) and one which generates bindings for C++ and Java (sysprop).

In Android, we would like to standardize the library used for system properties and avoid pulling in duplicate functionality for this (duplicate dependencies end up increasing the size of the Android binaries).

@djc
Copy link
Contributor

djc commented May 30, 2023

Is the rustutils crate available on crates.io? If there is an interface that other crates can use then maybe iana-time-zone would be open to changing the dependency they use.

@mgeisler
Copy link

Is the rustutils crate available on crates.io? If there is an interface that other crates can use then maybe iana-time-zone would be open to changing the dependency they use.

It's not yet available there and open sourcing it was exactly what I was hoping @dshmatkou could help us with. Dmitry works for Android as a 20% contributor and I'm trying to help him make small tactical fixes to the Rust ecosystem 😄

@qwandor
Copy link

qwandor commented Jun 13, 2023

Following up a bit more on this, I've had a chat to @nfuller from the Android time team. It seems there are three supported APIs for accessing timezone data on Android: the standard Java ones, ICU4J (also Java), and the libc functions (e.g. localtime_r) from Bionic. Reading the tzdata file directly is not a supported API, and the file format may change in ways that break non-platform code that relies on it.

There are also some parts of the Android platform that use chrono::Local.

However, the good news is that it looks like #499 mostly doesn't apply to Android. localtime_r does end up calling getenv("TZ"), (via tzset_unlocked), but this environment variable is not set on Android (unless the application sets it itself), and I believe that in this case getenv will safely return null. The thread safety issue of getenv only happens when it returns a non-null pointer which the caller may try to read after another thread has invalidated it. When getenv returns null, tzset_unlocked then reads the appropriate system property and uses that instead, which is what we want.

With this in mind I suggest that we return chrono to using localtime_r to get the local timezone on Android, which should give the correct result and is threadsafe except in the case that the application tries explicitly sets TZ, which shouldn't normally happen.

@djc
Copy link
Contributor

djc commented Jun 13, 2023

With this in mind I suggest that we return chrono to using localtime_r to get the local timezone on Android, which should give the correct result and is threadsafe except in the case that the application tries explicitly sets TZ, which shouldn't normally happen.

That sounds okay to me. Can you submit a PR to this effect?

@qwandor
Copy link

qwandor commented Jun 13, 2023

Thanks, will do.

@enh-google
Copy link

[Android's libc maintainer here...]

The thread safety issue of getenv only happens when it returns a non-null pointer which the caller may try to read after another thread has invalidated it.

that's not necessarily true. if someone's calling setenv() at the same time (let alone messing with environ!), getenv() is inherently unsafe, no?

i think if you want to make a "this is fine" argument, it should be along the lines of "messing with environment variables in a multi-threaded program is inherently unsafe already, and calling localtime_r() is just exposing an existing bug rather than being problematic in and of itself".

fwiw, i think it's time i just bit the bullet and added the tzalloc()/localtime_rz()/tzfree() functions from NetBSD (https://man.netbsd.org/NetBSD-6.1.4/localtime_rz.3) to Android. i've started a thread with maintainers of other libcs to see if they have other/similar ideas (because obviously something that's in Android's bionic, Apple's libc, and glibc is far more useful that something that's only in one of them). but even if there's no interest from them, it's probably time to just go with NetBSD's existing practice.

i don't know how easy it is in rust to say "call this API if it exists", nor whether rust supports NetBSD at all, but this is actually an alternative implementation you can try today if so. (and having a signed-up user would be further motivation for me :-) )

@djc
Copy link
Contributor

djc commented Jun 16, 2023

@enh-google thanks for getting involved here. Getting a safer API (from NetBSD) out to Android sounds like a good way forward -- would be awesome if you can get other libcs on board. In terms of adopting it in chrono, how do newer libc implementations roll out to the Android installed base?

@qwandor
Copy link

qwandor commented Jun 16, 2023

The thread safety issue of getenv only happens when it returns a non-null pointer which the caller may try to read after another thread has invalidated it.

that's not necessarily true. if someone's calling setenv() at the same time (let alone messing with environ!), getenv() is inherently unsafe, no?

Is that true even if the environment variable you're trying to read isn't set beforehand, and isn't affected by the setenv call? I might have missed some detail in tracing through the calls, but it looked to me like getenv would safely return null in that case.

i think if you want to make a "this is fine" argument, it should be along the lines of "messing with environment variables in a multi-threaded program is inherently unsafe already, and calling localtime_r() is just exposing an existing bug rather than being problematic in and of itself".

The trouble with that approach is that it's not the decision the Rust standard library has made. It exposes safe wrappers around getenv and setenv (e.g. var and set_var). These ensure thread safety on Unix-like platforms by guarding the underlying getenv/setenv calls with a lock, which is fine as long as nothing else in the process tries to access the environment without that lock. Unfortunately given that chrono is not part of the standard library, there's no way for it to take the lock, and give that this is just a library we shouldn't really assume that some other part of the program isn't using the standard library in ways that are supposed to be safe.

@enh-google
Copy link

@enh-google thanks for getting involved here. Getting a safer API (from NetBSD) out to Android sounds like a good way forward -- would be awesome if you can get other libcs on board. In terms of adopting it in chrono, how do newer libc implementations roll out to the Android installed base?

as they upgrade to newer OS releases. you can find out what release you're on with the NDK's android_get_device_api_level(), though i'm assuming rust has some equivalent of dlsym() or weak symbols + attribute availability so that you can just check whether the function exists at runtime anyway?

Is that true even if the environment variable you're trying to read isn't set beforehand, and isn't affected by the setenv call? I might have missed some detail in tracing through the calls, but it looked to me like getenv would safely return null in that case.

yes (though obviously "someone called putenv() and then messed with the pointer" is going to be the most reliable way to break things).

These ensure thread safety on Unix-like platforms by guarding the underlying getenv/setenv calls with a lock...

which is another way of saying "they're not actually safe". that would only be safe in a world where everything in your process was written in rust. which -- although it can be the case for a daemon that's part of the OS on Android -- can never be the case for an Android app, because app code is always dlopen()ed into an existing zygote process. so calling the "safe" rust functions isn't safe in reality either.

@djc
Copy link
Contributor

djc commented Jun 19, 2023

as they upgrade to newer OS releases. you can find out what release you're on with the NDK's android_get_device_api_level(), though i'm assuming rust has some equivalent of dlsym() or weak symbols + attribute availability so that you can just check whether the function exists at runtime anyway?

So having looked into the NetBSD API a bit more, it would allow us to allocate a timezone_t type given the timezone name. However, what would be the recommended way to access the current timezone name on modern Android? IIRC this issue got started because there's no good API to access the current timezone name, either.

@qwandor
Copy link

qwandor commented Jun 19, 2023

So having looked into the NetBSD API a bit more, it would allow us to allocate a timezone_t type given the timezone name. However, what would be the recommended way to access the current timezone name on modern Android? IIRC this issue got started because there's no good API to access the current timezone name, either.

The NetBSD man page for tzalloc says "A NULL pointer may be passed to tzalloc() instead of a timezone name, to refer to the current system timezone.". This seems like something Bionic could implement to read the appropriate system property.

@djc
Copy link
Contributor

djc commented Jun 19, 2023

Fair enough, though it would probably also be useful to have a way to get to the current timezone's name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants