Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: use lowercase letters in hex literals #1692

Closed
besfahbod opened this issue Sep 10, 2020 · 13 comments
Closed

Suggestion: use lowercase letters in hex literals #1692

besfahbod opened this issue Sep 10, 2020 · 13 comments
Labels
T: style What do we want Blackened code to look like?

Comments

@besfahbod
Copy link

besfahbod commented Sep 10, 2020

Reading the code, when I saw the commend saying All letters used in the representation are normalized to lowercase, I thought it also refers to hex letters. However, a few lines later, I was disappointed to see Change hex literals to upper case..

black/src/black/__init__.py

Lines 5158 to 5168 in 6284953

All letters used in the representation are normalized to lowercase (except
in Python 2 long literals).
"""
text = leaf.value.lower()
if text.startswith(("0o", "0b")):
# Leave octal and binary literals alone.
pass
elif text.startswith("0x"):
# Change hex literals to upper case.
before, after = text[:2], text[2:]
text = f"{before}{after.upper()}"

There's one practical issue with use of upper-case hex letters, though, and that's similarity between letter B and digit 8.

I have seen bugs caused by this similarity in security-sensitive code, such as UTF-8/-16 decoders. Personally, I always used uppercase letters in hex numerals prior to that incident, because of the aesthetics of it; but, since then, have been switching to lowercase anywhere possible.

Maybe Black would consider switching to lowercase hex letters, as well?

@JelleZijlstra
Copy link
Collaborator

I initially implemented it that way but it was changed in #530 for a reason I no longer recall. @zsol do you remember why we made that change?

@hugovk
Copy link
Contributor

hugovk commented Sep 10, 2020

See also recent #1514, asking for lowercase hexadecimal due to the similarity of A and 4.

@hugovk
Copy link
Contributor

hugovk commented Sep 10, 2020

I initially implemented it that way but it was changed in #530 for a reason I no longer recall. @zsol do you remember why we made that change?

#467 was a month before #530, where lowercase hex was initially decided upon. No reasoning given, but perhaps it'll help jog the memory.

@ichard26 ichard26 added the T: style What do we want Blackened code to look like? label Sep 11, 2020
@zsol
Copy link
Collaborator

zsol commented Sep 18, 2020

I really can't recall anymore, and couldn't find a mention of this in our discussions either, but I'm sure there was discussion.

@fametrano
Copy link

fametrano commented Sep 21, 2020

ABB4AB8A
vs
abb4ab8a

isn't this enough to settle the issue?
:-)

@C0DK
Copy link
Contributor

C0DK commented Oct 5, 2020

I'll gladly fix this if everyone agrees on it? :)

@fametrano
Copy link

I've heard no contrarian opinions so far

@JelleZijlstra
Copy link
Collaborator

I agree that lowercase is better in a vacuum. I'm not as sure if this change is worth the pain of changing Black's existing style. Every time we change Black's current style, that means people who use Black to format an existing codebase need to reformat their code, which pollutes their commit histories and might break open branches. So to decide whether to make a formatting change, we need to see if the gains from the formatting change outweigh the pains of changing existing formatting. We don't have a good framework for making that tradeoff, but it's worth thinking about.

@fametrano
Copy link

fametrano commented Oct 14, 2020

Sure, let's think about this.
Anyway, IMO one single commit to catch up with this change is not dramatic for projects adopting black

@C0DK
Copy link
Contributor

C0DK commented Oct 21, 2020

There we go - created a PR. if you do appreciate it, please label it with hacktoberfest-accepted so i can get a sticker on my cyberbike ( 😛 )

I would also argue, as a user, that the change is minor, and as we are using black to not really think about formatting, it will simply happen in the background i guess. Also if you want to be explicit in your formatting rules, then you should probably freeze the version of your formatter. This seems like a minor change, and not everyone has hex numbers in their source code.

Your call :)

Just happy that I could, hopefully, help.

EDIT: I am not sure what primer is supposed to do, but the CI step fails. If you can guide me towards fixing it, then that'd be nice.

ambv added a commit that referenced this issue Apr 25, 2021
@ambv
Copy link
Collaborator

ambv commented Apr 25, 2021

I was the original advocate for upper-case numerals and based on my whining we decided to revert this change. The responsibility for the disruption is mine alone, I'm sorry to have wasted your time.

Line of reasoning:

  • numeral formatting introduced in 2018 has been used with close to zero feedback that it's wrong;
  • there's hundreds of millions of lines of code (in a very literal sense) formatted with Black now, many of which depend on the current formatting;
  • apart from aesthetic arguments, the readability argument is somewhat dubious, if your programming font lets people mistake 4 for A, 8 for B, or 0 for C, you should be changing it as there's a much wider area of confusion than hex numerals;
  • the "security issues" argument in the docstring is not based on any evidence AFAICT.

@cooperlees
Copy link
Collaborator

Maybe we can revisit this before we go "stable" one last time and if (and only if) we get a lot of the community behind this (maybe can do a survey or something) we could do a dedicated release for this case change.

cooperlees added a commit that referenced this issue Apr 26, 2021
- It was reverted to not cause so much diff churn on millions of lines of code
cooperlees added a commit that referenced this issue Apr 26, 2021
- It was reverted to not cause so much diff churn on millions of lines of code
- Fix primer config for projects that should now pass
@C0DK
Copy link
Contributor

C0DK commented Apr 27, 2021

Man, I hoped I would be a dedicated contributor. but that's all fine :) I do see your point @ambv and agree.

noxan pushed a commit to noxan/black that referenced this issue Jun 6, 2021
* Made hex lower case

* Refactored numeric formatting section

* Redid some refactoring and removed bloat

* Removed additions from test_requirements.txt

* Primer now expects expected changes

* Undid some refactoring

* added to changelog

* Update src/black/__init__.py

Co-authored-by: Zsolt Dollenstein <zsol.zsol@gmail.com>

Co-authored-by: Zsolt Dollenstein <zsol.zsol@gmail.com>
Co-authored-by: Cooper Lees <me@cooperlees.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T: style What do we want Blackened code to look like?
Projects
None yet
Development

No branches or pull requests

9 participants