New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed #34140 -- Format python code blocks in documentation files #16261
Conversation
I'm inspecting further the result (by looking at the pages locally). I find it confusing to have And they are in the coding style section (!!). I think it's better to manually change these before moving further. Which implies more changes in this PR. Not sure what the view on this is, i.e. do you agree? |
Thanks for all this work you've already done. I think maybe we can ask for a review from @adamchainz who wrote the blacken docs package and also from @benjaoming who have already experimented it in readthedocs, ad I said in the issue. It would help also to have a review from people involved in the black PR for the code. |
@pauloxnet Makes sense, yeah I need more input on this as described above. As an "end result" for this PR, ideally we do not just update the existing snippets. But also set the system up so that Thanks again @pauloxnet !! |
I was looking at the Workflow for linters and wondering why it's entirely skipping the If there can be a way to invoke A good follow-up issue would be to discuss how the separate configurations for |
|
||
|
||
class Example: | ||
# ... | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pass | |
... |
Blacken docs accept the three points notation (without #
) as replacement for pass
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right @pauloxnet , it's ..
that it does not accept for code-block:: python
🤦♂️ ... thanks!!
Also, snippets having lines starting with >>>
can be "blackened" using code-block:: pycon
. But right now a lot of these are just unmarked. I'm not sure there is a way to detect these automatically. For example this:
https://github.com/django/django/blob/main/docs/ref/models/expressions.txt#L635-L643
Do you know of any way to detect these automatically? @pauloxnet @benjaoming @adamchainz
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now I can think only to something like a search rules that can detect ::\n\n >>>
and replace with :\n\n.. code-block:: pycon\n\n >>>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pauloxnet that's helpful, I used that pattern. Doesn't pick up some edge cases, but picks up a lot of cases.
However I now have some new Cannot parse [..]
errors:
/docs/ref/forms/fields.txt:262: code block parse error Cannot parse: 1:16: print(f.as_ul()))
/docs/ref/models/expressions.txt:816: code block parse error Cannot parse: 2:0: EOF in multi-line statement
/docs/ref/models/expressions.txt:816: code block parse error Cannot parse: 3:0: EOF in multi-line statement
/docs/ref/models/expressions.txt:816: code block parse error Cannot parse: 1:0: ),
/docs/ref/models/expressions.txt:816: code block parse error Cannot parse: 1:0: )
/docs/ref/models/expressions.txt:838: code block parse error Cannot parse: 2:0: EOF in multi-line statement
/docs/ref/models/expressions.txt:838: code block parse error Cannot parse: 1:41: 'partition_by': [F('studio'), F('genre')],
/docs/ref/models/expressions.txt:838: code block parse error Cannot parse: 1:28: 'order_by': 'released__year',
/docs/ref/models/expressions.txt:838: code block parse error Cannot parse: 1:0: }
/docs/ref/models/expressions.txt:838: code block parse error Cannot parse: 2:0: EOF in multi-line statement
/docs/ref/models/expressions.txt:838: code block parse error Cannot parse: 3:0: EOF in multi-line statement
/docs/ref/models/expressions.txt:838: code block parse error Cannot parse: 1:26: expression=Avg('rating'), **window,
/docs/ref/models/expressions.txt:838: code block parse error Cannot parse: 1:0: ),
/docs/ref/models/expressions.txt:838: code block parse error Cannot parse: 3:0: EOF in multi-line statement
/docs/ref/models/expressions.txt:838: code block parse error Cannot parse: 1:26: expression=Max('rating'), **window,
/docs/ref/models/expressions.txt:838: code block parse error Cannot parse: 1:0: ),
/docs/ref/models/expressions.txt:838: code block parse error Cannot parse: 3:0: EOF in multi-line statement
/docs/ref/models/expressions.txt:838: code block parse error Cannot parse: 1:26: expression=Min('rating'), **window,
/docs/ref/models/expressions.txt:838: code block parse error Cannot parse: 1:0: ),
/docs/ref/models/expressions.txt:838: code block parse error Cannot parse: 1:0: )
/docs/ref/models/expressions.txt:877: code block parse error Cannot parse: 2:0: EOF in multi-line statement
/docs/ref/models/expressions.txt:877: code block parse error Cannot parse: 3:0: EOF in multi-line statement
/docs/ref/models/expressions.txt:877: code block parse error Cannot parse: 1:0: ),
/docs/ref/models/expressions.txt:877: code block parse error Cannot parse: 1:0: ).filter(
/docs/ref/models/expressions.txt:877: code block parse error Cannot parse: 1:0: )
/docs/ref/models/expressions.txt:961: code block parse error Cannot parse: 2:0: EOF in multi-line statement
/docs/ref/models/expressions.txt:961: code block parse error Cannot parse: 3:0: EOF in multi-line statement
/docs/ref/models/expressions.txt:961: code block parse error Cannot parse: 1:0: ),
/docs/ref/models/expressions.txt:961: code block parse error Cannot parse: 1:0: )
/docs/ref/models/expressions.txt:981: code block parse error Cannot parse: 2:0: EOF in multi-line statement
/docs/ref/models/expressions.txt:981: code block parse error Cannot parse: 3:0: EOF in multi-line statement
/docs/ref/models/expressions.txt:981: code block parse error Cannot parse: 1:0: ),
/docs/ref/models/expressions.txt:981: code block parse error Cannot parse: 1:0: )
/docs/ref/models/expressions.txt: Rewriting...
/docs/ref/contrib/gis/layermapping.txt:69: code block parse error Cannot parse: 2:0: EOF in multi-line statement
/docs/ref/contrib/gis/db-api.txt:160: code block parse error Cannot parse: 1:28: qs = Zipcode.objects.filter(<field>__<lookup_type>=<parameter>)
/docs/ref/contrib/gis/db-api.txt:196: code block parse error Cannot parse: 1:30: qs = Elevation.objects.filter(<field>__<lookup_type>=<parameter>)
/docs/ref/contrib/gis/db-api.txt:196: code block parse error Cannot parse: 1:30: qs = Elevation.objects.filter(<field>__<band_index>__<lookup_type>=<parameter>)
/docs/ref/contrib/gis/db-api.txt:196: code block parse error Cannot parse: 1:30: qs = Elevation.objects.filter(<field>__<lookup_type>=(<raster_input, <band_index>)
/docs/topics/forms/media.txt:285: code block parse error Cannot parse: 2:0: <line number missing in source>
/docs/topics/db/examples/one_to_one.txt:75: code block parse error Cannot parse: 2:0: <line number missing in source>
/docs/topics/db/examples/one_to_one.txt:75: code block parse error Cannot parse: 1:0: except ObjectDoesNotExist:
Best to devote a fresh mind to this. As I'm too tired today. Will keep you in the loop.
Thanks again @pauloxnet !!
docs/howto/logging.txt
Outdated
'file': { | ||
'class': 'logging.FileHandler', | ||
'filename': 'general.log', | ||
# [...] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I'm aware we want to keep the ...
and …
unchanged in docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I try to change this, since it's code-block:: python
it gives these errors:
../docs/howto/logging.txt:116: code block parse error Cannot parse: 3:4: "handlers": {
../docs/howto/logging.txt:155: code block parse error Cannot parse: 3:4: "loggers": {
../docs/howto/logging.txt:194: code block parse error Cannot parse: 3:4: "formatters": {
../docs/howto/logging.txt:220: code block parse error Cannot parse: 3:4: "handlers": {
../docs/howto/logging.txt:255: code block parse error Cannot parse: 3:4: "loggers": {
../docs/howto/logging.txt:269: code block parse error Cannot parse: 3:4: "loggers": {
I.e. it fails to interpret the lines containing ...
. Unlike code-block:: pycon
, the PythonConsoleLexer, which does. But that's not meant for Python code. Not sure what direction to take on this. Will change
# [...]
to
# ...
and see if that's acceptable. But it doesn't achieve:
keep the
...
and…
unchanged in docs
docs/intro/tutorial05.txt
Outdated
return Question.objects.filter(pub_date__lte=timezone.now()).order_by("-pub_date")[ | ||
:5 | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not nice 😞 Maybe:
return Question.objects.filter(pub_date__lte=timezone.now()).order_by("-pub_date")[ | |
:5 | |
] | |
now = timezone.now() | |
return Question.objects.filter(pub_date__lte=now).order_by("-pub_date")[:5] |
docs/intro/tutorial07.txt
Outdated
@@ -171,7 +175,7 @@ tabular way of displaying inline related objects. To use it, change the | |||
:caption: ``polls/admin.py`` | |||
|
|||
class ChoiceInline(admin.TabularInline): | |||
#... | |||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...
should remain.
docs/ref/models/expressions.txt
Outdated
50 | ||
>>> company.chairs_needed | ||
## >>> company.chairs_needed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's ..., unexpected.
fa997ca
to
a5f9781
Compare
@pauloxnet @felixxm @benjaoming There are unmarked code blocks or snippets that I found no way of adding/replacing using a reliable "search/replace" pattern. So I'm just going through the docs manually. With the manual process, so far, I only went through While I didn't manage to come up with a fully automated way to integrate |
a5f9781
to
76fc58a
Compare
@jvzammit thank you for the tremendous work you are doing on this issue. |
Ouch. How are they "unmarked"? Is it a result of Django documentation quirks or is it a missing feature of blacken-docs? |
You're welcome, thank you for the help!
I plan to split it up along these lines:
Will ignore I think reviewing will be quite an effort as well. But the changes are meant to be "changes that change nothing". So both the changes themselves, and the reviews, shouldn't involve much thinking. |
It's not a missing feature of blacken-docs IMHO. What I mean is this for example: https://github.com/django/django/blob/main/docs/ref/models/conditional-expressions.txt#L16 I.e. I need to add a |
@jvzammit In Django's Sphinx project, it seems that Python is the default language for syntax highlighting, so it could be argued that blacken-docs should be able to know about the default and lint blocks that are unmarked? |
@benjaoming In fact it appears that it's not picking these up: https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#literal-blocks But I don't think it's a bug. In fact blacken-docs' docs itself states that in RST it only formats I think it's still a good effort to mark the unmarked blocks with |
I think it would be great that all Python code-blocks are explicitly marked as Python code, 👍 The immediate argument would be that it makes linting easier. Secondarily, it makes it more intuitive to include literal texts that should not be highlighted. |
Fully agree @benjaoming 👍 |
efb459d
to
7338749
Compare
@pauloxnet @benjaoming @felixxm Ready for another round? Since this became very "not small", do you see a way you can "split" the review work? |
@pauloxnet Unfortunately there isn't a way to mark individual files as reviewed AFAIK. That would come really handy in this case.
Yeah can start with a small group. Keep in mind the changes are quite "dumb". I.e. replacement of |
I reviewed and it seems everything ok to me. Maybe we can wait for a review form the others you tagged here and also from @carltongibson (if he can). |
modifying the system library path: | ||
|
||
.. code-block:: python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way of skipping these changes?
(Old-as-time, an rst codeblock marked with ::
is Python.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened adamchainz/blacken-docs#195 to see what @adamchainz says.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@carltongibson I understand that it's @jvzammit who has taken the time to manually change all the ::
into code-block
statements in order to get them linted :)
We also agreed that it's explicit rather than implicit.. and we like that according to our Zen :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Grrr. Right I missed that.
...so it could be argued that blacken-docs should be able to know about the default and lint blocks that are unmarked?
That's more on the money to my mind. We're basically saying to change an ancient RST convention otherwise. I've adjusted the blacken-docs issue to see what Adam thinks.
Thanks for this @jvzammit — beyond #16261 (comment) I more or less like it.
|
This is a big question. Docs aren't code. Oftentimes the example is meant to point in the right direction rather than be runnable. One suspects the parser isn't going to allow us to blackenish (greyen?) in these cases. Note: I didn't look at all the files yet to see how often this comes up. |
I can help with reviewing this. If we can assign intervals of files (A-Z) and use a separate spreadsheet to mark when they have been reviewed, I think that could be a way forwards? We can also just declare which files that have been reviewed on the GitHub review comment. (but maybe reviewing it shouldn't happen before we've reached a conclusion on @carltongibson's inputs and any other general inputs)
Is it to accommodate that future documentation changes can easily be backported? And/or are you thinking about the benefits of also blackening old documentation code examples? |
It's more about this. We'd cope, but it's no fun rewriting a patch... However
I don't think the size is too bad. Let's resolve this Then it's picking out odd-cases: like the Serious question: If we just did |
@carltongibson Thanks for the great feedback! I opened this mainly because I think "we have to". But in my mind it's still an experimental PR. I.e. to answer the question "how would things look?" Still, if Django is going the I agree with you on some changes bringing no value. For example I don't like all the newlines added in code snippets. On the other hand the change to require Just my two cents. |
I agree with the principle of this sentence. But experience leads me to say that the code in the documentation is an authoritative source for anyone who reads it, and everyone expects it to be runnable and state-of-the-art Django code. This is especially true for newbies [1], who, however, are the readers of the documentation we care most about. [1] I held a Django Girls workshop last week |
The ability to elide detail in order to emphasise the point at hand is important. I don't agree that all examples should be runnable. But independent of whether it's true or not, it's a big shift in the scope of the ticket to go from "reformat the code examples to match Black style" to "ensure all code examples are runnable" where that latter involves rewriting existing examples, and in cases making them less clear.
This is what prompts this question. Can we do something Black-like, that doesn't commit us to the bigger change? That bigger change may have value, but my initial reaction to it is that it's false, patently so. I take the point about beginners. There are, though, often deeper explanatory goals that need clarity above all else. Take the logging examples. Having those be schematic is a massive aid. Having a And so on... Maybe it's a plus changing in this way. (I don't see it having spent this afternoon on it, but maybe) But it's not just a reformatting. Let me have a look at how many cases we're talking about. If it's only a few then a re-phrase is not a problem likely. |
'blog', | ||
..., | ||
... | ||
"blog", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there supposed to be a comma after the first ...
?
@carltongibson I haven't had time to go over all these many changed lines. But if we find just a few changes that touch upon other aspects of the Black format, I think that it have a potential high value from accepting the full package? I was scrolling down from the top and found other cases than quotation marks that seem valuable. Indentation errors: Also annoying to have to fix Line-wrapping stuff that's just line-wrapping: Line-wrapping stuff that improves readability: Unclear code snippets now have a real contextualization, in part because they aren't allowed to do otherwise: Maybe if you look through the changes from the bottom, you can find other aspects of why blacken-docs gives an over-all improvement. An additional point: We do this once, we keep using blacken-docs. If we do "grey-ish", we might end up doing this several times. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An additional point: We do this once, we keep using blacken-docs. If we do "grey-ish", we might end up doing this several times.
Absolutely. Whatever we use needs to be scripted. (That's the issue with skips...)
- Generally it's OK. Black is good, and I agree with the motivation here, which is why I accepted the ticket initially.
- There are issues still. I quickly found a few just scrolling... (see inline comments) we need to look closely.
- I'd really like to get a resolution for the forced
::
rewrite — it feels like half the diff is just for that.
f'hello {user}' | ||
f'hello {user.name}' | ||
f'hello {self.user.name}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why aren't these "
? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's related to this block being marked as pycon
? That doesn't seem right as this block doesn't look like it's in interactive style?
strings should not be used for any string that may require translation, | ||
cluding error and logging messages. In general ``format()`` is more | ||
rbose, so the other formatting methods are preferred. | ||
|
||
n't waste time doing unrelated refactoring of existing code to adjust the | ||
rmatting method. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is malformed.
# invoke deprecated behavior | ||
... # invoke deprecated behavior |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔
ImportError: cannot import name 'make_toast' from 'django.shortcuts' | ||
importError: cannot import name 'make_toast' from 'django.shortcuts' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Malformed. Why not "
? 🤔
.. parsed-literal:: | ||
.. parsed-literal: | ||
|
||
.. code-block:: pycon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this correct? What's with the parsed-literal
directive? 🤔
If we do this, can some docs be added on how contributors should run this locally? (I couldn't see anything when looking at this diff, but maybe I just missed it). I'd also be interested to know more about the experience for contributors? Getting Sphinx to build in the first place can be hard enough, then there's getting links correct (e.g. ':meth:Django.forms.module.class.method'), and spelling, and now you want me to format the code blocks a certain style as well. If this isn't automated somehow, it's starting to feel a bit much (personally). |
It's actually quite the opposite. We want that, as for the Django code, also the formatting of the documentation examples are automatically formatted by blacken-docs, just to remove this task from the developer. |
OK, I've sketched out adamchainz/blacken-docs#196 to handle I think to reduce the scope here it would be good to pull out any preliminary changes and get those merged if we can. For example, applying the Then were there other manual fixes needed? I can't quite see at a glance. Those can be in a separate commit to running blacken-docs, so we can easily see the output. Thanks |
@carltongibson You inspired me: adamchainz/blacken-docs#198 Motivation: adamchainz/blacken-docs#197 I.e. It would be really useful to separate manual changes to existing docs files, from the automated ones. As you suggested. |
Hey @jvzammit — did you manage to make any progress here? The Then (thinking about CI changes, which aren't here yet? 🤔) can we proceed using a branch of blacken docs straight from GitHub (e.g. pip have VCS support..., maybe you intend pre-commit — that does too no?), or do we need to push something to PyPI? We're looking at two commits for both suggestions so that should be easy to keep updated so we can get this moving, rather than being blocked waiting for resolution upstream. |
Hey @carltongibson thanks for pinging! Hope you had a great Christmas! I have this on my mind, but everyone needs a Christmas break. So I didn't ping anyone about it. Was planning to that in the new year. I am waiting for these to be merged:
Why? I need a version of blacken-docs that contains both of these changes. I.e. the Have both merged would allow this process to be replicable on everyone's local and also on CI. After this initial PR is merged. As this PR would include the "manual changes" as first commit. And the changes done by blacken-docs itself as the second commit. (I'll probably force-push and erase all previous commits. Or better, reopen the PR separately so we don't lose context in this one.)
I do not understand your suggestion -- should I create a fork that merges blacken-docs PR 196 & PR 198 above? |
@jvzammit Happy Christmas too! 🎄 My question was just whether we can use a branch of blacken-docs to progress here rather than waiting. I know I often encourage folks to do the same waiting on a change in any package I help on. Worst case is we have an extra commit later changing the Django docs style to not use the Sphinx default, but that change would always be a separate commit anyway. (Aside: The small edits to the examples needed probably are OK — the benefit of having black in play justifies the change, likely — let's see that. Then the only area of disagreement is this change to requiring the explicit language vs the Sphinx convention. If we can isolate that, then we can compare it to the suggested change to blacken-docs and come to a consensus on the way forward. Getting the different parts separated helps to minimise and clarify what we actually have left to come to agreement on. I hope that makes sense.) No urgency! I was just passing by 🥳 |
@carltongibson @jvzammit I also think I will review this PR after the Epiphany. Happy New Year. |
So the latest |
Closing in favour of #16496 |
Ticket #34140
What have I done?
Right now
blacken-docs
receives files as parameters. For example:In order to run it against all
txt
files under docs I quickly scripted this for local use:In order to have it run without errors, and change all applicable files, I had to manually change some files. Why?
black
needs syntactially-correct Python in order to work. For example..
will not work; so I replaced that with# ..
.In addition some existing code snippets mix a style of both
code-block:: python
andcode-block:: pycon
. I changed these topython
by changing lines containing>>>
to contain# >>>
for example. I am not sure this is the best approach.All these kinds of changes are in the first commit: f073649
Once all files were changed to have no errors, I ran the script above. The file changes after command ran are in the second commit: 1b1915c
My work so far raised more questions than answers:
Formatting marked python code blocks
In some cases, I feel the previous "syntactically-incorrect" Python code was more helpful. Example, I changed:
to:
Is the user reading the docs now better off? Meh.
Formatting unmarked python code blocks
blacken-docs
does not pick any code snippets that are not marked with.. code-block:: [..]
.So code blocks as the below are not changed:
This is found here: https://github.com/django/django/blob/main/docs/ref/contrib/postgres/search.txt#L27-L28
Should all these be changed manually to be within a code block?
I think we should try to avoid having different styles (e.g. single-quotes and double-quotes) in the docs as much as possible.
So I would appreciate guidance on this.
Integrating blacken-docks
For this "blacken-docs" process to be integrated in future "docs" contributions it must be included:
However, either I'm missing something, or this is not going to be straightforward. My approach so far, i.e. the custom script above, feels kinda "hacky".
So I also need guidance on this please.