improved parsing of timezone abbreviations #1065

Open
rtphokie opened this issue Nov 7, 2021 · 0 comments · May be fixed by #1103

Issue Description

The documentation advises against using abbreviated timezone strings (e.g. EST, UTC, etc.):

timezone names from tz database provided via dateutil package, note that abbreviations such as MST, PDT, BRST are unlikely to parse due to ambiguity. Use the full IANA zone name instead (Asia/Shanghai, Europe/London, America/Chicago etc)

Arrow can successfully parse each of the standard and daylight timezone abbreviations in the United States (except Alaska) and eastern & western Europe (including GMT and UTC): ['CET', 'EDT', 'EET', 'EST', 'GMT', 'HST', 'MET', 'MST', 'UTC', 'WET']

Arrow can and should do better. It exists to improve on datetime and dateutil, so why not improve on parsing as well?

Timezone abbreviations are not as ambiguous as the documentation claims. They are well defined in the IANA timezone database, specifically in to2050.tzs (see the IANA timezone database, downloadable from the reference link below), which could easily be used to create a mapping from abbreviations to offsets.
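
As a rough illustration of the mapping idea, here is a minimal sketch using only the standard library. The table `ABBREVIATION_OFFSETS` and the helper `parse_with_abbreviation` are hypothetical names, and the few offsets shown are filled in by hand rather than generated from to2050.tzs. This is not Arrow's API, only a sketch of what a lookup-based approach could look like.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical, hand-written subset for illustration only.
# A real table would be generated from the tz database rather than typed in.
ABBREVIATION_OFFSETS = {
    "AKST": timedelta(hours=-9),
    "AKDT": timedelta(hours=-8),
    "PST": timedelta(hours=-8),
    "PDT": timedelta(hours=-7),
    "JST": timedelta(hours=9),
    "NZST": timedelta(hours=12),
    "NZDT": timedelta(hours=13),
}


def parse_with_abbreviation(text, fmt="%Y-%m-%d %H:%M"):
    """Parse a string like '2021-11-07 12:00 AKST' into an aware datetime.

    The last whitespace-separated token is treated as the abbreviation and
    looked up in ABBREVIATION_OFFSETS; the rest is parsed with `fmt`.
    """
    stamp, _, abbr = text.rpartition(" ")
    if abbr not in ABBREVIATION_OFFSETS:
        raise ValueError(f"unknown timezone abbreviation: {abbr!r}")
    naive = datetime.strptime(stamp, fmt)
    # Attach a fixed-offset tzinfo that also remembers the abbreviation.
    return naive.replace(tzinfo=timezone(ABBREVIATION_OFFSETS[abbr], abbr))


if __name__ == "__main__":
    dt = parse_with_abbreviation("2021-11-07 12:00 AKST")
    print(dt.isoformat(), dt.tzname())
```

A fixed-offset lookup like this ignores historical rule changes, so a production version would still need to decide how to treat abbreviations whose offset has varied over time.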

Additional timezone abbreviations that could be supported (including some historical ones such as war and peace time used in the US during 1942) are:

ACDT, ACST, ADDT, ADT, AEDT, AEST, AHDT, AHST, AKDT, AKST, AMT, APT, AST, AWDT, AWST, AWT, BDST, BDT, BMT, BST, CAST, CAT, CDDT, CDT, CEMT, CEST, CMT, CPT, CST, CWT, ChST, DMT, EAT, EDDT, EEST, EMT, EPT, EWT, FFMT, FMT, GDT, GST, HDT, HKST, HKT, HKWT, HMT, HPT, HWT, IDDT, IDT, IMT, IST, JDT, JMT, JST, KDT, KMT, KST, LMT, LST, MDDT, MDST, MDT, MEST, MMT, MPT, MSD, MSK, MWT, NDDT, NDT, NPT, NST, NWT, NZDT, NZMT, NZST, PDDT, PDT, PKST, PKT, PLMT, PMMT, PMT, PPMT, PPT, PST, PWT, QMT, RMT, SAST, SDMT, SET, SJMT, SMT, SST, TBMT, TMT, WAST, WAT, WEMT, WEST, WIB, WIT, WITA, WMT, YDDT, YDT, YPT, YST, YWT

Reference: https://www.iana.org/time-zones

System Info

  • 🖥 OS name and version: MacOS 11.6.1
  • 🐍 Python version: Python 3.9.1
  • 🏹 Arrow version: 1.2.1
@rtphokie rtphokie added the bug label Nov 7, 2021
@jamesboi951 jamesboi951 linked a pull request Apr 18, 2022 that will close this issue