TIMEZONE setting unable to handle two-digit UTC offset #1192

Open
cwfoo opened this issue Oct 16, 2023 · 2 comments
cwfoo commented Oct 16, 2023

These UTC offsets behave as expected:

import dateparser
dateparser.parse('tomorrow', settings={'TIMEZONE': 'UTC+8'})
dateparser.parse('tomorrow', settings={'TIMEZONE': 'UTC+08'})
dateparser.parse('tomorrow', settings={'TIMEZONE': 'UTC+0800'})
dateparser.parse('tomorrow', settings={'TIMEZONE': 'UTC+8:00'})
dateparser.parse('tomorrow', settings={'TIMEZONE': 'UTC+08:00'})
dateparser.parse('tomorrow', settings={'TIMEZONE': '+0800'})
dateparser.parse('tomorrow', settings={'TIMEZONE': '+08:00'})

However, if the offset omits both the "UTC" prefix and the minutes, dateparser raises pytz.exceptions.UnknownTimeZoneError. Example:

import dateparser
dateparser.parse('tomorrow', settings={'TIMEZONE': '+08'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/dateparser/conf.py", line 92, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/dateparser/__init__.py", line 61, in parse
    data = parser.get_date_data(date_string, date_formats)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/dateparser/date.py", line 451, in get_date_data
    parsed_date = _DateLocaleParser.parse(
                  ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/dateparser/date.py", line 200, in parse
    return instance._parse()
           ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/dateparser/date.py", line 204, in _parse
    date_data = self._parsers[parser_name]()
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/dateparser/date.py", line 224, in _try_freshness_parser
    return freshness_date_parser.get_date_data(self._get_translated_date(), self._settings)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/dateparser/freshness_date_parser.py", line 156, in get_date_data
    date, period = self.parse(date_string, settings)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/dateparser/freshness_date_parser.py", line 91, in parse
    now = apply_timezone(utc_dt, settings.TIMEZONE)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/dateparser/utils/__init__.py", line 119, in apply_timezone
    new_datetime = apply_tzdatabase_timezone(date_time, tz_string)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/dateparser/utils/__init__.py", line 94, in apply_tzdatabase_timezone
    usr_timezone = timezone(pytz_string)
                   ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/pytz/__init__.py", line 201, in timezone
    raise UnknownTimeZoneError(zone)
pytz.exceptions.UnknownTimeZoneError: '+08'

Dateparser should support two-digit UTC offsets because the Python standard library sometimes returns such offsets as time zone names. For example:

$ TZ=:Asia/Singapore python3
Python 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime
>>> datetime.datetime.now().astimezone().tzname()
'+08'

Please fix dateparser so that the TIMEZONE setting is able to handle two-digit UTC offsets such as '+08'.
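
In the meantime, one possible caller-side workaround is to expand a bare two-digit offset into one of the forms listed above as working. The following is only a sketch: the helper name `normalize_utc_offset` is hypothetical and not part of dateparser, and it only handles the exact `+HH` / `-HH` shape.

```python
import re

# Matches a bare signed two-digit hour offset such as '+08' or '-05'.
_BARE_HOUR_OFFSET = re.compile(r'[+-]\d{2}')


def normalize_utc_offset(tz_string):
    """Expand '+08' to '+08:00'; return any other string unchanged.

    Hypothetical helper for illustration, not part of dateparser.
    """
    if _BARE_HOUR_OFFSET.fullmatch(tz_string):
        return tz_string + ':00'
    return tz_string


print(normalize_utc_offset('+08'))             # +08:00
print(normalize_utc_offset('Asia/Singapore'))  # Asia/Singapore
```

The result could then be passed as `settings={'TIMEZONE': normalize_utc_offset('+08')}`, which reduces to the `'+08:00'` case shown above.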

anarcat (Contributor) commented Oct 16, 2023

I suggested @cwfoo open this bug report, but after investigating this issue a little further, it does seem like an odd timezone parameter...

I have a patch for undertime here that tries to work around that issue:

https://gitlab.com/anarcat/undertime/-/merge_requests/22

I'm not sure what the right way to go is here. The best would be for dateparser to accept actual tzinfo objects instead of having to pass them as a string in the environment.
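
Until something like that exists, a caller could build the string from the tzinfo object's numeric offset rather than from `tzname()`. A minimal sketch using only the standard library; `offset_string` is a hypothetical helper name, and the `+HH:MM` format is chosen because the issue description reports it as working. It assumes offsets that are whole minutes.

```python
from datetime import datetime, timedelta, timezone


def offset_string(dt):
    """Format an aware datetime's UTC offset as '+HH:MM'.

    Hypothetical helper for illustration, not part of dateparser.
    """
    offset = dt.utcoffset()
    if offset is None:
        raise ValueError('datetime must be timezone-aware')
    # Whole minutes east of UTC (negative when west of UTC).
    total_minutes = offset // timedelta(minutes=1)
    sign = '+' if total_minutes >= 0 else '-'
    hours, minutes = divmod(abs(total_minutes), 60)
    return f'{sign}{hours:02d}:{minutes:02d}'


# A fixed UTC+8 zone stands in for the local zone in the report.
utc_plus_8 = timezone(timedelta(hours=8))
print(offset_string(datetime(2023, 10, 16, tzinfo=utc_plus_8)))  # +08:00
```

For the local zone, `offset_string(datetime.now().astimezone())` would avoid depending on whatever name `tzname()` happens to return.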

Gallaecio (Member) commented

> but after investigating this issue a little further, it does seem like an odd timezone parameter...

Indeed.

> The best would be for dateparser to accept actual tzinfo objects instead of having to pass them as a string in the environment.

Sounds like a valid enhancement.

Maybe you could edit the title and description of the issue to be about this enhancement. Or close this issue and open a new one about the enhancement.
