Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate legacy API in favor of traversable API (files) #80

Closed
8 tasks done
jaraco opened this issue Oct 21, 2020 · 17 comments · Fixed by #282
Closed
8 tasks done

Deprecate legacy API in favor of traversable API (files) #80

jaraco opened this issue Oct 21, 2020 · 17 comments · Fixed by #282

Comments

@jaraco
Copy link
Member

jaraco commented Oct 21, 2020

The files() API completely supersedes the earlier APIs through a simpler and more intuitive interface. In order to achieve "one and preferably only one" interface for the same behavior, it's desirable to remove the legacy API.

In GitLab by @jaraco on Feb 22, 2020, 09:33

Following from !76 and this comment, I'd like to explore what the implementation would look like with only the traversable API. In that comment, I wrote:

I've started stripping out [the legacy] functionality in the feature/drop-legacy-api branch. Progress was good, but I got stumped on Traversable.open() for zip files, which doesn't support text-decoding. This leads me to believe that's a feature zipp should support. This also suggests the Traversable API should include .open().

Plan

@jaraco
Copy link
Member Author

jaraco commented Oct 21, 2020

In GitLab by @jaraco on Feb 29, 2020, 20:37

In #81, we're adding .open() to the traversable API.

@jaraco
Copy link
Member Author

jaraco commented Oct 21, 2020

In GitLab by @jaraco on Mar 1, 2020, 11:15

mentioned in commit a8a471c

@FFY00
Copy link
Member

FFY00 commented Oct 31, 2020

What is required for this? I assume writing some quick snippets on how to migrate from the old API to the new one, and while doing that make sure all features from the old API have an equivalent in the new one.

This could also potentially be done at the same time as #108, what do you think?

@jaraco
Copy link
Member Author

jaraco commented Oct 31, 2020

Thanks for asking.

I assume writing some quick snippets on how to migrate from the old API to the new one, and while doing that make sure all features from the old API have an equivalent in the new one.

Good plan. Note that most of the test coverage is against the legacy API. I was thinking something along these lines:

(moved to description)

Some of these steps are optional if superseded by subsequent steps (such as freezing legacy implementation).

I've already updated the documentation for porting from pkg_resources to use the files/traversable API, so that may also provide some hints as to how to migrate from the importlib_resources legacy API.

I notice that the docs are not synced at all with CPython, so there will be a separate effort (but coordinated) of deprecating the API in importlib.resources. That process will have a longer timeline, as it will need to be deprecated in one Python version (no earlier than 3.10) and removed in one or two versions later (probably one, as users have a fallback in importlib_resources).

This could also potentially be done at the same time as #108,

It may prove easier if #108 is done first, though I wouldn't have one block on the other. I've got some thoughts I'll add on that ticket.

@FFY00
Copy link
Member

FFY00 commented May 21, 2021

I have opened #221 which reimplements the legacy API using files(). Overall it seems to be able to replace most of it, falling short in just one feature, importlib.resources.path. Traversables are designed as purely abstract, with no ties to the file system, which makes it impossible to replace path. I have opened bpo-44200, which proposes a fix for that.

@jaraco
Copy link
Member Author

jaraco commented May 23, 2021

makes it impossible to replace path

What about as_file(files(...).joinpath(...))? as_file takes a Traversable file and creates a path for it on the file system. It's meant to be the replacement for path.

@FFY00
Copy link
Member

FFY00 commented May 23, 2021

as_file will always create a temporary file, path documents that it will only do that if the resource isn't on the file system. as_path replaces it in essence, but not as path is documented. I probably shouldn't have used "impossible" to describe it, but as_file does go against what path documents, and has an undesirable behavior IMO.

I think there are valid use-cases for wanting this path behavior. One example would be big resources, let's say I have a 10gb dataset, I don't want to copy that to a temporary file.

Something to consider: On most Linux distributions, /tmp is a tmpfs, which lives on the RAM. If I create a 10gb file there, I will be putting 10gb on the RAM. You can see how this could go wrong... even for smaller files...

@jaraco
Copy link
Member Author

jaraco commented May 23, 2021

as_file uses singledispatch to avoid creating temporary objects when the traversable is a pathlib.Path. So in the most common case, temporary files should not be created.

@FFY00
Copy link
Member

FFY00 commented May 23, 2021

Right, but that is simply mitigating the issue on the most common case, not actually fixing the issue in the interface.

One example where that will not work is CompatibilityFiles, if we choose to go with that. Well, I guess we could add a dispatch for that too, but then we have to also handle the case of the class coming from the stdlib or this module, it's definitely doable but messy IMO.

I would really prefer to fix the interface, but it's your call.

@jaraco
Copy link
Member Author

jaraco commented May 24, 2021

I'd like to see what it means to fix the interface, but for the purposes of this effort, I suspect as_file is sufficient to satisfy the needs of path. Am I mistaken?

@FFY00
Copy link
Member

FFY00 commented May 24, 2021

I would say yes, but it won't necessarily stay true to its docstring. What you get is pretty much the same, the documented details on how you get it will not stay true in some cases.

@FFY00
Copy link
Member

FFY00 commented Oct 20, 2021

Per the fall down in python/steering-council#68, I propose that we mark the legacy API as PendingDeprecationWarning until Python 3.9 is EOL, which should be October 2025, and deprecate it with the mandatory period of 2 Python releases after that. This could be revisited if there are any changes to the importlib loaders that make this more difficult, but since we would have to provide the files() compatibility anyway, I don't see expect that being a problem.
This decision would probably be temporary as I expect the SC to make a call on this, but we can move to PendingDeprecationWarning right now to alleviate tensions.

@jaraco
Copy link
Member Author

jaraco commented Oct 21, 2021

Create/adapt test suite such that files API is covered entirely by tests not using legacy API.

I see someone checked this option, but I don't think this work is done. What I mean to do is remove all the tests that call the legacy functions, but port the tests to exercise the same behavior using files-native usage. This work can also serve to demonstrate what a migration looks like and possibly reveal some edge cases with migration issues.

Note: this would leave the legacy functions untested. I believe that's okay, even preferable, as long as that implementation remains frozen.

Per the fall down in python/steering-council#68, I propose that we mark the legacy API as PendingDeprecationWarning until Python 3.9 is EOL.

I'd like to review that issue more closely. My initial instinct is that a PendingDeprecationWarning only adds unnecessary delay to the deprecation (leaving users to delay even considering porting to the preferred API for several years). Because this project actively maintains a backport, it should be able to move faster than other stdlib modules because end-users have full access to the files API even on Python 2.7.

I'll take the time to read the argument closely and review the policy again to ensure the current approach is in compliance and not causing undue harm.

@FFY00
Copy link
Member

FFY00 commented Oct 21, 2021

I see someone checked this option, but I don't think this work is done. What I mean to do is remove all the tests that call the legacy functions, but port the tests to exercise the same behavior using files-native usage. This work can also serve to demonstrate what a migration looks like and possibly reveal some edge cases with migration issues.

Note: this would leave the legacy functions untested. I believe that's okay, even preferable, as long as that implementation remains frozen.

Oh, my bad! I checked it by mistake.

@jaraco
Copy link
Member Author

jaraco commented Oct 21, 2021

I reviewed the policy. I do see that the policy does require some discussion, which I believe was met by this issue, but I believe deserves a more visible forum for the concerns raised.

I've created this post to capture in detail my thinking on the matter and to get comments from the community.

@jaraco
Copy link
Member Author

jaraco commented Jul 7, 2023

It's time! Let's remove the legacy implementation.

@jaraco
Copy link
Member Author

jaraco commented Jul 7, 2023

There once was a branch (feature/drop-legacy-api), but it's now stale, so I've pushed it to refs/archive/drop-legacy-api-naive and deleted it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants