Feature/datetime conversions #174

Closed
wants to merge 4 commits into from

Conversation

Oracen
Contributor

@Oracen Oracen commented Oct 30, 2022

What was changed

The date parsing system in temporalio.converter:decode_search_attributes is switched from datetime.datetime.fromisoformat to dateutil.parser.isoparse. This provides support for ISO 8601-compliant datetime strings with timezones attached.

Why?

Python's datetime library is not timezone aware by default, with issues on Stack Overflow going back as far as 2008. Prior to Python 3.11, datetime.datetime.fromisoformat was not designed to parse arbitrary ISO 8601 strings. It was only intended to invert the output of datetime.isoformat, so it rejects valid ISO 8601 forms such as the 'Z' suffix for UTC. (Naming is the hardest problem in SE, after all.)

The new Scheduling features in Temporal handle timezone-aware schedules by default. This caused issues when communicating with the Python SDK, as it could not decode the timezone strings (typically suffixed with 'Z' for UTC). dateutil.parser.isoparse is known to have some problems with greedy evaluation of date strings, but this typically occurs in the presence of user input. It should not be a problem for machine-formatted strings.

While the release of 3.11 means this won't be an issue in future, much of the Python and cloud ecosystem relies on older versions of Python. This change means most current versions of Python can leverage the scheduling functionality.
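
The behaviour described above can be checked with the standard library alone. The following is a minimal sketch; `stdlib_can_parse` is a hypothetical helper name used only for illustration and is not part of the SDK.

```python
import sys
from datetime import datetime


def stdlib_can_parse(value: str) -> bool:
    """Return True if datetime.fromisoformat accepts the string."""
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


# An explicit numeric offset is accepted wherever fromisoformat exists (3.7+).
offset_ok = stdlib_can_parse("2020-01-01T00:00:00+00:00")

# The "Z" suffix is only accepted by the stdlib from Python 3.11 onwards.
zulu_ok = stdlib_can_parse("2020-01-01T00:00:00Z")

print(sys.version_info[:2], offset_ok, zulu_ok)
```

On interpreters older than 3.11 the second check is expected to fail, which is the case this PR targets.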

Checklist

  1. Closes Search attribute Datetime need to support full ISO 8601 (and maybe more) #144

  2. How was this tested:

  • development of replication and fix (see linked issue for replication code)
  • wrote unit tests to verify old/new functionality
  • run against my local development project
  3. Any docs updates needed?
    No

@CLAassistant

CLAassistant commented Oct 30, 2022

CLA assistant check
All committers have signed the CLA.

@@ -28,6 +28,7 @@
import google.protobuf.json_format
import google.protobuf.message
import google.protobuf.symbol_database
from dateutil import parser
Member

This is a new dependency just for this feature. Is there any way we can avoid it? Is there any way with the standard library to do this, and/or is the parser easy to vendor/copy into here?

(also, a new dependency would need to be added to pyproject.toml, which you can do via poetry add python-dateutil and then re-alphabetizing)

Member

@cretz cretz Oct 31, 2022

Looks like it may be unavoidable to add this dependency, unfortunately. Can't easily vendor the stdlib impl as it's in C (ref python/cpython#92177). But can we put an if sys.version_info >= (3, 11): check here to use the stdlib version? (Also, don't forget to add that dependency and to run poe format.)

Also, this will need to land after (and merge main after) #172 is merged which will add Python 3.11 support.
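
A minimal sketch of what such a version gate could look like. `parse_iso_datetime` is a hypothetical helper name, not the SDK's actual function, and the sketch assumes python-dateutil is installed on interpreters older than 3.11.

```python
import sys
from datetime import datetime

if sys.version_info >= (3, 11):

    def parse_iso_datetime(value: str) -> datetime:
        # The 3.11+ stdlib parser accepts most ISO 8601 forms, including "Z".
        return datetime.fromisoformat(value)

else:
    # Third-party fallback (python-dateutil), imported only when needed.
    from dateutil import parser as _dateutil_parser

    def parse_iso_datetime(value: str) -> datetime:
        return _dateutil_parser.isoparse(value)


parsed = parse_iso_datetime("2020-01-01T00:00:00Z")
print(parsed.isoformat())
```

Importing dateutil inside the else branch keeps it an optional dependency on 3.11 and later.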

Contributor Author

@Oracen Oracen Oct 31, 2022

Not an issue on my side. I must have been working on a tainted env, because I thought dateutil was in the standard library. Will rebuild and double-check.

For those wondering about why this is an issue:
https://stackoverflow.com/questions/127803/how-do-i-parse-an-iso-8601-formatted-date
https://stackoverflow.com/questions/969285/how-do-i-translate-an-iso-8601-datetime-string-into-a-python-datetime-object

On doing it within the standard library, we could hack around with the following:

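# Example lookup for Z
map_datetime = {
    "Z": "+00:00"
}

# ...
# v = "2020-01-01T00:00:00Z"

# Remap only suffix letters present in the lookup; checking membership
# avoids a KeyError on letters that have no mapping yet
if len(v) == 20 and v[-1].upper() in map_datetime:
    v = v[:-1] + map_datetime[v[-1].upper()]

# Code as normal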

There's a slight catch here that "J" = local time, so we'd have to decide if we treat that string as timezone unaware or pull the timezone from the local machine.

The only reason I avoided this was to rely on a battle-tested date parser that could potentially catch a wider range of formats coming in from other languages. If you're confident we'll only see letters coming in (outside of the +/- hours format), then I'm happy to switch over to an explicit remap of the trailing character for the sake of avoiding the external dependency.
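
For reference, the suffix remap can be wrapped into a runnable, stdlib-only function. This is a sketch under assumptions: `SUFFIX_TO_OFFSET` and `parse_with_suffix_remap` are hypothetical names, only "Z" is mapped, and the "J" (local time) question is left open.

```python
from datetime import datetime

# Hypothetical lookup table; only the UTC designator is mapped here.
SUFFIX_TO_OFFSET = {"Z": "+00:00"}


def parse_with_suffix_remap(value: str) -> datetime:
    """Rewrite a known trailing zone letter to a numeric offset, then parse."""
    suffix = value[-1:].upper()
    if suffix in SUFFIX_TO_OFFSET:
        value = value[:-1] + SUFFIX_TO_OFFSET[suffix]
    # Strings without a mapped suffix are passed through unchanged.
    return datetime.fromisoformat(value)


parsed = parse_with_suffix_remap("2020-01-01T00:00:00Z")
print(parsed.isoformat())
```

Unlike the length check in the snippet, this version keys only on the trailing character, so it still depends on fromisoformat accepting the rest of the string.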

Member

I think dateutil is probably an acceptable solution since it doesn't seem to carry dependencies on its own or move much.

Contributor Author

Roger that, will add in the version check and wait for the 3.11 updates you guys have planned

Member

@cretz cretz left a comment

Added a couple of suggestions and need to merge main, otherwise all good. If the poetry.lock conflict is too annoying to fix, feel free to close this PR and open a new one on a fresh branch off main. Alternatively, you can wait a bit until I get around to this issue where I'll basically just copy what you did here. Thanks!

EDIT: Now that I think about it, we can just use this dependency on < 3.11. See https://python-poetry.org/docs/dependency-specification/#python-restricted-dependencies
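
As a rough illustration of the Python-restricted dependency syntax from the linked Poetry docs, the entry might look like the fragment below. The version constraint is an assumption chosen for illustration, not taken from this PR.

```toml
[tool.poetry.dependencies]
# Installed only on interpreters older than 3.11; version is illustrative.
python-dateutil = { version = "^2.8.2", python = "<3.11" }
```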

@@ -1,20 +1,20 @@
[tool.poetry]
Member

Why did so much of this file change? Not that it seems wrong, just doesn't seem like the issue to do this in.

Contributor Author

Auto-sorting...I noticed you asked to re-alphabetise, I didn't realise the autoformatter did so much! Will revert

Member

Sorry, I meant just the dependency, heh

@@ -164,6 +165,37 @@ def test_encode_search_attribute_values():
temporalio.converter.encode_search_attribute_values(["foo", 123])


def test_decode_search_attributes():
Member

Is it possible to also add an integration test that sets a datetime attribute in a workflow? Or maybe alter test_workflow_search_attributes so that it will break without your fix?

Contributor Author

Leave it with me

@Oracen Oracen mentioned this pull request Nov 1, 2022
1 task
@Oracen
Contributor Author

Oracen commented Nov 1, 2022

Closing PR to bypass merge conflicts, new issue on #179

@Oracen Oracen closed this Nov 1, 2022
Development

Successfully merging this pull request may close these issues.

Search attribute Datetime need to support full ISO 8601 (and maybe more)
3 participants