Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encoding updates #703

Merged
merged 9 commits into from Nov 23, 2020
Merged

Encoding updates #703

merged 9 commits into from Nov 23, 2020

Conversation

samschott
Copy link
Contributor

This PR changes how paths are encoded and decoded on Unixes as follows:

  1. Drops support for Python 2.7 to unify handling of paths (this can of course be changed if Python 2.7 support is important).
  2. Use Python's builtin os.fsencode and os.fsdecode functions available from 3.2.

Those changes have significant advantages when creating the watches with a str path. Previously, events from paths which contain bytes that cannot be decoded in the file system encoding (sys.getfilesystemencoding()) would raise an unhandled UnicodeEncodingError in the inotify emitter thread. Now, the bytes are converted to Python's "surrogate escapes" and remain in the string. All os functions accept strings with such "escaped bytes" and will properly convert them back with os.fsencode.

@BoboTiG
Copy link
Collaborator

BoboTiG commented Nov 22, 2020

Great! I am completely for dropping Python 2.7. Even 3.4 and 3.5 if needed, by it seems not for now :)

Could you add lines in the changelog + your GH nickname? Also if you mind updating the README and/or just remove Python 2.7 everywhere (I could do it before the release though).

I will merge tomorrow, thanks for the patch 💪

@samschott
Copy link
Contributor Author

samschott commented Nov 22, 2020

Will do.

The Python 2.7 tests are predictably failing but also some Python 3.10 and Windows tests. From a brief glance, it doesn't look like anything introduced in this PR.

Before you merge, I'll also add an emitter test with invalid bytes in the filename. Just in case people are tempted to use bytes.decode instead of os.fsdecode.

@samschott
Copy link
Contributor Author

samschott commented Nov 23, 2020

Ok, this should be ready now. The remaining test failures are:

  1. Failure of test_eventlet_monkey_patching on Python 3.10: TypeError when importing eventlet.
  2. Failure of test_event_dispatcher on some macOS runs: A thread leak. Where is this coming from?
  3. Failure to install Python 3.7 on Windows.

I am mostly concerned about the second one. Could this have anything to do with the changes from this PR?

@BoboTiG
Copy link
Collaborator

BoboTiG commented Nov 23, 2020

1 and 3 are OK.
I relaunched the job for 2, it happens from time to time ;)

@BoboTiG
Copy link
Collaborator

BoboTiG commented Nov 23, 2020

Passed, I wil merge after the nitpick in the test.

@BoboTiG BoboTiG merged commit 432c31f into gorakhargosh:master Nov 23, 2020
@BoboTiG
Copy link
Collaborator

BoboTiG commented Nov 23, 2020

Thanks a lot @samschott :)

@samschott
Copy link
Contributor Author

Thanks for merging -- and for the very quick response times!

@samschott samschott deleted the encoding-updates branch November 23, 2020 22:08
CCP-Aporia added a commit to CCP-Aporia/watchdog that referenced this pull request Jan 26, 2021
Unfortunately gorakhargosh#703 cannot easily be backported, so we won't have this
functionality in the Python 2.7 / 3.5 branch unless someone reports
this as an issue.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants