Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing implementations of the Django interface for time metadata due to missing timezone information #1233

Open
TimoGuenther opened this issue Apr 13, 2023 · 5 comments · May be fixed by #1240

Comments

@TimoGuenther
Copy link
Contributor

The various backends feature two sets of methods for accessing time metadata:

  • get_(accessed|created|modified)_time: Official Django interface. The returned datetime must be timezone aware if Django setting USE_TZ is True, else naive in the local timezone. This means the remote timezone must be known in order to be able to either (if USE_TZ=True) attach timezone information to the datetime or else (if USE_TZ=False) first shift the datetime from remote timezone to local timezone (and then remove the timezone information). Since not all backends know the remote timezone, this is not always implemented. Instead, there is also the following.
  • (accessed|created|modified)_time: django-storages methods. The returned datetime is always naive. However, it is unclear in which timezone that datetime is. Those backends that know the remote timezone first shift this datetime to the local timezone (and then remove the timezone information), whereas the others leave it without timezone information in the unknown remote timezone.

Availability (✅ = implemented, ❌ = not implemented):

backend accessed_time created_time modified_time get_accessed_time get_created_time get_modified_time
Apache Libcloud
Azure
Dropbox
FTP
Google Cloud
S3 Boto3
SFTP

Options:

  1. Fully implement the official Django interface. I suppose this is not possible due to missing timezone information for some backends.
  2. Partially implement the official Django interface. For example, only raise NotImplementedError if USE_TZ is True but timezone information is missing. However, I doubt that conforms to the interface considering the Django documentation explicitly states that the naive datetime must be in the local timezone, which means the remote timezone must be known to be able to shift the naive datetime to the local timezone.
  3. Warn about this peculiarity by explaining why the method is not implemented. For example:
    def get_modified_time(name):
        raise NotImplementedError(
            "SFTP does not provide timezone information. "
            "Use modified_time for a naive datetime instead."
    )
    This would not only inform users what is going on but document for developers that this is intended behavior.
  4. Allow providing the (somehow known) remote timezone information (say, through the constructor) and only raise NotImplementedError if timezone information is missing.
@jschneier
Copy link
Owner

The second methods were originally in Django but were deprecated and removed in favor of the first a few versions ago.

The correct thing to do is option 1 and remove the others since they are dead but clearly the demand doesn’t exist because no one has opened a PR.

Would very happily review and accept a PR that implemented 1 and removed the dead methods.

@TimoGuenther
Copy link
Contributor Author

That explains the discrepancy. get_*_time was introduced in Django version 1.10 (released 2016-08-01) for the explicit purpose of timezone handling. The superseded *_time was then removed in Django version 2.0 (released 2017-12-02). The deprecation notice details:

The old, non-timezone-aware methods accessed_time(), created_time(), and modified_time() are deprecated in favor of the new get_*_time() methods.

Third-party storage backends should implement the new methods and mark the old ones as deprecated. Until then, the new get_*_time() methods on the base Storage class convert datetimes from the old methods as required and emit a deprecation warning as they do so.

Third-party storage backends may retain the old methods as long as they wish to support earlier versions of Django.

As you say, we should definitely implement the new get_*_time methods where possible. However, what about storage backends that do not have access to timezone information (e.g., SFTP)? The old *_time methods make sense in such cases. Implementing the new get_*_time methods on such storage backends requires option 4.

Also, how about keeping the old *_time methods around for backends that do have access to timezone information? On the one hand, that would support ancient versions of Django. On the other, people might accidentally end up using the old *_time methods if both are around.

@jschneier
Copy link
Owner

jschneier commented Apr 14, 2023

Working with legacy versions isn't a concern of mine; we should remove the removed methods. Libraries that depend on them are broken.

@TimoGuenther
Copy link
Contributor Author

Okay. Still, what to do about storage backends that do not have access to timezone information? How could the new (necessarily timezone-aware) get_*_time methods be implemented for them?

@TimoGuenther
Copy link
Contributor Author

I opened #1240 to fix this issue. I chose to keep the old interface around because some storage backends, i.e., FTP and SFTP, cannot be timezone-aware without outside help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants