Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewite caching in local::unix module #1457

Draft
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

pitdicker
Copy link
Collaborator

Existing code

At a high level, the unix module tries to find the current time zone, load the data for it and cache the result.

  1. To detect changes the caching code looks if the TZ environment variable has changed or if the modification time of the /etc/localtime symlink has changed.
  2. To update the time zone info, the first step is in tz_info::timezone::TimeZone::local. If the TZ variable is set it can load TZ data from the following sources:
  • a POSIX TZ string
  • a named time zone, loaded from the ZONE_INFO_DIRECTORIES
  • an absolute path
  • the /etc/localtime symlink
  • an empty TZ variable is an error
  1. If the TZ variable is not set it uses the /etc/localtime symlink.
  2. unix::current_zone then specifies unix::fallback as a third fallback.
    This gets the time zone name using the iana_time_zone crate, and tries to load it from TZDB_LOCATION (which is currently not the same as ZONE_INFO_DIRECTORIES).
  3. A final fallback is to default to UTC.

Problems with this setup are that the logic lives in different places, that the caching code does not detect changes except the most limited form, and that the logic to look up the zoneinfo directory is implemented in two different ways.

New code

Everything related to loading the time zone data is implemented as methods on a CachedTzInfo type.
Caching and updating happens with the methods:

  • CachedTzInfo::tz_info()
    • refresh_cache()
      • needs_update()
        • tz_env_var_changed()
        • symlink_changed()
        • tz_name_changed()
      • read_tz_info()
        • read_from_tz_env()
        • read_from_symlink()
        • read_with_tz_name()
          • tzdb_dir()

I kept methods such as read_with_tz_name() and tz_name_changed() close together so it is easier to see they work similar.

The CachedTzInfo type remembers the last source that was succesfull, and stores all information needed to check the cache is up to date: TZ variable, time zone name, path, and zoneinfo directory.

New functionality is that we cache the directory with the time zone database, and respect the TZDIR environment variable (fixes #1265).

All potential error sources are returned with a Result</* */, ()> type.


This is a rewrite, I can't pretend it to be a refactor.

Copy link

codecov bot commented Feb 24, 2024

Codecov Report

Attention: Patch coverage is 44.63277% with 98 lines in your changes are missing coverage. Please review.

Project coverage is 91.41%. Comparing base (e05ba8b) to head (b485ac2).

Files Patch % Lines
src/offset/local/unix.rs 44.88% 97 Missing ⚠️
src/offset/local/tz_info/timezone.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1457      +/-   ##
==========================================
- Coverage   91.82%   91.41%   -0.42%     
==========================================
  Files          40       40              
  Lines       18345    18382      +37     
==========================================
- Hits        16846    16804      -42     
- Misses       1499     1578      +79     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@djc
Copy link
Contributor

djc commented Feb 24, 2024

This is a rewrite, I can't pretend it to be a refactor.

That makes it very tricky to review. Pretty sure there are ways to break it down into smaller pieces.

@pitdicker
Copy link
Collaborator Author

That makes it very tricky to review.

Yes, I realize that 😞.

Maybe I can split it in:

  • Drop the existing caching logic, and move all time zone loading logic into CachedTzInfo.
  • Add new caching code.

And maybe that can be divided in more than two commits, hard to say.

Does that sound better reviewable?

@pitdicker
Copy link
Collaborator Author

I'll see what I can do to split it up more.

@pitdicker
Copy link
Collaborator Author

Setting to draft for now. This is only tangentially related to the work of converting the API to Results.
I'll concentrate on the time_delta and parsing modules first.

@pitdicker pitdicker marked this pull request as draft February 25, 2024 19:19
@pitdicker pitdicker force-pushed the unix_tz_search branch 2 times, most recently from d5a7efb to 68553d5 Compare March 19, 2024 11:50
@pitdicker pitdicker force-pushed the unix_tz_search branch 4 times, most recently from 6508743 to e6b5574 Compare March 20, 2024 07:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Respect TZDIR environment variable
2 participants