Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-43012: remove pathlib._Accessor #25701

Merged

Conversation

barneygale
Copy link
Contributor

@barneygale barneygale commented Apr 28, 2021

Per Pitrou:

The original intent for the “accessor” thing was to have a variant that did all accesses under a filesystem tree in a race condition-free way using openat and friends. It turned out to be much too hairy to actually implement, so was entirely abandoned, but the accessor abstraction was left there.

https://discuss.python.org/t/make-pathlib-extensible/3428/2

Accessors are:

  • Lacking any internal purpose - '_NormalAccessor' is the only implementation
  • Lacking any firm conceptual difference to Path objects themselves (inc. subclasses)
  • Non-public, i.e. underscore prefixed - '_Accessor' and '_NormalAccessor'
  • Unofficially used to implement customized Path objects, but once once bpo-24132 is addressed there will be a supported route for that.

This patch preserves all existing behaviour.

https://bugs.python.org/issue43012

Automerge-Triggered-By: GH:encukou

@barneygale
Copy link
Contributor Author

Note that a lot of my previous pathlib bugfixes were partly in preparation for making this diff look OK! I was hoping to highlight that Path and _Accessor are conceptually the same at present, and therefore shouldn't be split across two classes. I hope this diff makes that clear! Browsing by commit might be useful also.

I'm pretty sure the test_glob_permissions test wasn't working properly. I had to add a key= argument to sorted() to make things play nice, which makes me think its patching of os.scandir() wasn't visible to pathlib previously.

@kfollstad
Copy link
Contributor

kfollstad commented Apr 29, 2021

Bravo! I didn't realize yesterday with #25699 that it was temp code. I had seen your commits and misunderstood. I thought somehow your intent was to add a class in between Path and PurePath which leveraged moving all cwd, resolve, open, touch, into the accessor.

I had done some performance testing that move of cwd, resolve, open, touch into the accessor and noted that it resulted in a significant slow down doing so.

cwd()
python3.9 -m timeit -r 5 -n 100000 -s 'from pathlib import Path' 'Path.cwd()'
100000 loops, best of 5: 4.03 usec per loop
python3.10-with-accessor -m timeit -r 5 -n 100000 -s 'from pathlib import Path' 'Path.cwd()'
100000 loops, best of 5: 206 usec per loop

resolve()
python3.9 -m timeit -r 5 -n 50000 -s 'from pathlib import PosixPath' -s 'p = PosixPath("/usr/bin/python")' 'p.resolve()'
50000 loops, best of 5: 15.4 usec per loop
python3.10-with-accessor -m timeit -r 5 -n 50000 -s 'from pathlib import PosixPath' -s 'p = PosixPath("/usr/bin/python")' 'p.resolve()'
50000 loops, best of 5: 429 usec per loop

open()
python3.9 -m timeit -r 5 -n 100000 -s 'from pathlib import PosixPath' -s 'p = PosixPath("/var/tmp/tmpfile.txt")' 'with open(p, "w"): pass'
100000 loops, best of 5: 14 usec per loop
python3.10-with-accessor -m timeit -r 5 -n 100000 -s 'from pathlib import PosixPath' -s 'p = PosixPath("/var/tmp/tmpfile.txt")' 'with open(p, "w"): pass'
100000 loops, best of 5: 163 usec per loop

touch()
python3.9 -m timeit -r 5 -n 100000 -s 'from pathlib import PosixPath' -s 'p = PosixPath("/var/tmp/tmpfile.txt")' 'p.touch()'
100000 loops, best of 5: 1.89 usec per loop
python3.10-with-accessor -m timeit -r 5 -n 100000 -s 'from pathlib import PosixPath' -s 'p = PosixPath("/var/tmp/tmpfile.txt")' 'p.touch()'
100000 loops, best of 5: 26.1 usec per loop

I have to imagine that moving everything out will do the exact opposite and make all of these slow down go away and speed up all of the other functions that had used the accessor. That said, I'm just a guy out there who happens to be using pathlib, but I think this is brilliant.

Myself, I had been wondering about changing the name of accessor to common (or something similar) having also seen Pitrou's comment to you, since it seemed like a vestigial abstraction. But I like taking it out entirely even better. Remove the cruft and make it faster - this seems like a win!

@barneygale
Copy link
Contributor Author

Yep! And you'll note that the os functions in the resulting code are called from only one Path method. That's to facilitate us adding a new AbstractPath class that sits between PurePath and Path. Certain methods like stat() would be abstract, whereas derived methods like exists() would not.

@barneygale
Copy link
Contributor Author

I'm also planning to perform an optimization pass, as I think there's a few _from_parts() tricks missed here and there that might have slowed things down slightly. Overall I'm expecting the changes to be performance-neutral.

@barneygale
Copy link
Contributor Author

barneygale commented May 13, 2021

@pitrou I think you wrote this code originally - do you have the time/inclination to review? Thanks so much!

@serhiy-storchaka As you improved the performance of globbing in bpo-26032, do you have any feedback on the Path.scandir() addition? Would you prefer Path._scandir() for example? Thank you.

@serhiy-storchaka
Copy link
Member

As for Path.scandir(), if you do not want to extend an API, do not add a public method. And if you want to add a new public method, it is a different issue which should be discussed separately (in particularly it should be shown an advantage over calling os.scandir()).

@barneygale barneygale force-pushed the bpo-43012-remove-pathlib-accessor branch from 0124153 to 599e253 Compare May 13, 2021 22:07
@barneygale
Copy link
Contributor Author

As for Path.scandir(), if you do not want to extend an API, do not add a public method. And if you want to add a new public method, it is a different issue which should be discussed separately (in particularly it should be shown an advantage over calling os.scandir()).

Thanks very much. I've renamed it to _scandir() as it can stay private for now.

Lib/pathlib.py Outdated Show resolved Hide resolved
Lib/pathlib.py Outdated Show resolved Hide resolved
@barneygale
Copy link
Contributor Author

I'll wait for #26153 to land before rebasing.

@barneygale barneygale force-pushed the bpo-43012-remove-pathlib-accessor branch from b0dcbd2 to 7a0882b Compare June 12, 2021 15:27
@barneygale
Copy link
Contributor Author

I've rebased this PR. I believe it's ready for core review!

@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Jul 13, 2021
@barneygale
Copy link
Contributor Author

I'm still keen to land this if possible. Thanks!

@barneygale barneygale force-pushed the bpo-43012-remove-pathlib-accessor branch from b55c5e7 to 013ddc6 Compare January 5, 2022 02:25
@merwok
Copy link
Member

merwok commented Jan 5, 2022

Please avoid force push for future PRs, they make reviewing less pleasant: https://devguide.python.org/pullrequest/

@merwok
Copy link
Member

merwok commented Jan 5, 2022

Can you add a news entry? Can be just one line or two about removing an obsolete layer and preparing the terrain for upcoming enhancements.

@barneygale
Copy link
Contributor Author

Can you add a news entry? Can be just one line or two about removing an obsolete layer and preparing the terrain for upcoming enhancements.

I've added a news entry now. I used your wording as it seemed spot on to me!

These aren't necessary, and may slow things down.
Copy link
Member

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me this looks ready to merge. Is anyone else reviewing it?

@miss-islington miss-islington merged commit 08f8301 into python:main Feb 2, 2022
@barneygale
Copy link
Contributor Author

Thank you for the reviews, everyone. If you can spare the time to continue reviewing this PR series, the next one is at #31085. Thanks so much!

@barneygale barneygale mannequin mentioned this pull request May 28, 2022
@projetmbc projetmbc mannequin mentioned this pull request Jun 21, 2022
jstvz added a commit to SFDO-Tooling/CumulusCI that referenced this pull request Dec 1, 2022
Python 3.11 removed pathlib._Accessor in python/cpython#25701. This
broke our patch when it replaced a call to `os.getcwd()` with
`Path.cwd()` in `Path.absolute()`. As a classmethod `Path.cwd()`
receives the class as its implicit first argument meaning we can have 0
or 1 arguments.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale Stale PR or inactive for long period of time.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants