Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify encoding in open #296

Merged
merged 8 commits into from Oct 11, 2022
Merged

Specify encoding in open #296

merged 8 commits into from Oct 11, 2022

Conversation

Avasam
Copy link
Contributor

@Avasam Avasam commented Oct 11, 2022

See https://peps.python.org/pep-0597/
This also happened to me on Windows in python/typeshed#8879 because of mathematical unicode characters as part of the comments:

(.venv) PS C:\Users\Avasam\Documents\Git\typeshed> flake8 .                    
Traceback (most recent call last):
  File "C:\Program Files\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Avasam\Documents\Git\typeshed\.venv\Scripts\flake8.exe\__main__.py", line 7, in <module>
  File "C:\Users\Avasam\Documents\Git\typeshed\.venv\lib\site-packages\flake8\main\cli.py", line 22, in main
    app.run(argv)
  File "C:\Users\Avasam\Documents\Git\typeshed\.venv\lib\site-packages\flake8\main\application.py", line 363, in run
    self._run(argv)
  File "C:\Users\Avasam\Documents\Git\typeshed\.venv\lib\site-packages\flake8\main\application.py", line 351, in _run
    self.run_checks()
  File "C:\Users\Avasam\Documents\Git\typeshed\.venv\lib\site-packages\flake8\main\application.py", line 264, in run_checks
    self.file_checker_manager.run()
  File "C:\Users\Avasam\Documents\Git\typeshed\.venv\lib\site-packages\flake8\checker.py", line 323, in run
    self.run_serial()
  File "C:\Users\Avasam\Documents\Git\typeshed\.venv\lib\site-packages\flake8\checker.py", line 307, in run_serial
    checker.run_checks()
  File "C:\Users\Avasam\Documents\Git\typeshed\.venv\lib\site-packages\flake8\checker.py", line 589, in run_checks
    self.run_ast_checks()
  File "C:\Users\Avasam\Documents\Git\typeshed\.venv\lib\site-packages\flake8\checker.py", line 494, in run_ast_checks
    for (line_number, offset, text, _) in runner:
  File "C:\Users\Avasam\Documents\Git\typeshed\.venv\lib\site-packages\pyi.py", line 1811, in run
    yield from _check_for_type_comments(path)
  File "C:\Users\Avasam\Documents\Git\typeshed\.venv\lib\site-packages\pyi.py", line 1777, in _check_for_type_comments
    stublines = path.read_text().splitlines()
  File "C:\Program Files\Python39\lib\pathlib.py", line 1267, in read_text     
    return f.read()
  File "C:\Program Files\Python39\lib\encodings\cp1252.py", line 23, in decode 
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 176166: 
character maps to <undefined>

There are linters to catch this, namely pylint (unspecified-encoding), flake8-encodings and flake8-file-encoding. I have tried both flake8 plugins mentioned and they gave me the same results, other than slightly different error messages:
image

Copy link
Collaborator

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with these changes, but they won't fix the error you're seeing locally when you run flake8 on python/typeshed#8879! To fix that error, you need to add encoding="utf-8" to the call to read_text here:

flake8-pyi/pyi.py

Line 1777 in c55a904

stublines = path.read_text().splitlines()

Wanna do that in this PR as well?

Copy link
Collaborator

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Want to add a changelog entry? This counts as a bugfix. (Feel free to give yourself credit in the entry.)

@Avasam
Copy link
Contributor Author

Avasam commented Oct 11, 2022

Thanks! Want to add a changelog entry? This counts as a bugfix. (Feel free to give yourself credit in the entry.)

Are there guidelines on how to do so? CONTRIBUTING.md mentions doing a release, which I don't think is quite what I want.

@AlexWaygood
Copy link
Collaborator

AlexWaygood commented Oct 11, 2022

Are there guidelines on how to do so?

No, we don't have guidelines for that yet -- I can write something up :) But basically the process is "add a short sentence to the top of the changelog describing the user-visible impact of the change".

For now, this is is a pretty good example of a bugfix changelog entry: https://github.com/PyCQA/flake8-pyi/pull/193/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR3-R6

And this is a pretty good example of a changelog entry from a non-maintainer contributor: https://github.com/PyCQA/flake8-pyi/pull/242/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR20-R21

Copy link
Collaborator

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gonna pretend I know what a non-CP1252 character is

@Avasam
Copy link
Contributor Author

Avasam commented Oct 11, 2022

Gonna pretend I know what a non-CP1252 character is

CP1252 character set, aka Windows-1252 character set. Is the default windows one.

Python lists it as cp1252
image

@AlexWaygood AlexWaygood merged commit d8e0357 into PyCQA:master Oct 11, 2022
@AlexWaygood
Copy link
Collaborator

Thanks! :D

@Avasam Avasam deleted the UnicodeDecodeError branch October 11, 2022 22:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants