Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jupyter notebook support #2345

Closed
MarcoGorelli opened this issue Jun 22, 2021 · 24 comments · Fixed by #2357
Closed

Jupyter notebook support #2345

MarcoGorelli opened this issue Jun 22, 2021 · 24 comments · Fixed by #2357
Labels
C: dependencies C: jupyter Jupyter Notebooks, any color you like S: accepted The changes in this design / enhancement issue have been accepted and can be implemented T: enhancement New feature or request T: style What do we want Blackened code to look like?

Comments

@MarcoGorelli
Copy link
Contributor

MarcoGorelli commented Jun 22, 2021

#1814 was closed, with the message:

If someone would like to push this PR to completion, I would be happy to reopen.

I'd be quite interested in seeing black support Jupyter Notebooks natively. Indeed, I wrote nbQA for the purpose of running this and other tools on notebooks, but if notebooks could be handled by the tools themselves, that'd be even better.

A few questions / discussion points before I begin:

  • what would you like to do about cells containing valid IPython code on which ast.parse would throw a syntax error (e.g. any cell containing IPython magics) - should they be handled (as is done in nbQA, via IPython's TransformerManager().transform_cell()) or ignored? Note that the former would require IPython as a dependency
  • trailing semicolons are often used in Jupyter notebooks to suppress extra output before plots, e.g.
    fig, ax = plt.subplots()
    ax.plot(x, y); 
    However, black would remove them. In nbQA, I preserve these trailing semicolons by adding them back after black has removed them. Would it be OK to do that here too?
  • automagics. Writing pwd is fine in a Jupyter Notebook cell, but it's not valid Python. Automagics can only be detected with a running IPython kernel. In nbQA I deal with this by skipping over cells whose content can't be parsed neither ast.parse nor TransformerManager().transform_cell()
@felix-hilden felix-hilden added C: dependencies T: style What do we want Blackened code to look like? T: enhancement New feature or request labels Jun 22, 2021
@JelleZijlstra
Copy link
Collaborator

Thanks for offering to work on this! Here are my takes:

what would you like to do about cells containing valid IPython code on which ast.parse would throw a syntax error

It would be better if we could handle those cells. Perhaps we can put support behind an optional extra (like we do for uvloop), so users who don't need it don't have to deal with the extra dependency.

trailing semicolons are often used in Jupyter notebooks to suppress extra output before plots

We could add a "Jupyter mode" that leaves trailing semicolons alone, similar in spirit to the existing pyi mode.

automagics

What you write sounds like the best solution. It's unfortunate though, since it means the failure will pass silently if there is a bug in the Black parser that makes us unable to parse some valid code.

@MarcoGorelli
Copy link
Contributor Author

MarcoGorelli commented Jun 22, 2021

It's unfortunate though, since it means the failure will pass silently if there is a bug in the Black parser that makes us unable to parse some valid code.

Yes, exactly - in nbQA I have the --nbqa-dont-skip-bad-cells flag so that people not using automagics can still be warned if they have invalid syntax for this, which could be put here too?


Thanks for your feedback anyway, I'll get to work next week!

EDIT

I might be possible to detect automagics properly, but it'd require running ipython rather than python - I'll look into this anyway

@MarcoGorelli MarcoGorelli changed the title "If someone would like to push this [Jupyter support] PR to completion, I would be happy to reopen. Jupyter notebook support Jun 22, 2021
@MarcoGorelli
Copy link
Contributor Author

MarcoGorelli commented Jun 22, 2021

One final thing to bring up is cells with multi-line magics, e.g.

%timeit f(1, \
          2, \
          3)

TransformerManager would tranform it as follows:

"get_ipython().run_line_magic('timeit', 'f(1,           2,           3)')\n"

, thus losing the \ at the end of each line and putting it all on the same line.

Because this can't be reconstructed exactly, in nbQA I skip processing cells with multi-line magics. Would that be OK here too?

EDIT

Alternatively, the above could simply be transformed to

%timeit f(1,           2,           3)

@MarcoGorelli
Copy link
Contributor Author

Hi @JelleZijlstra

Having looked further into this, it is possible to detect automagics, but they require a running ipython kernel, i.e. if black is invoked as ipython -m black notebook.ipynb rather than just black notebook.ipynb.

My outstanding question, then, is around multiline magics. Any thoughts on what black should do to a cell containing:

%timeit f(1, \
          2, \
          3)

?

Thanks

@JelleZijlstra
Copy link
Collaborator

In that particular case, ideally Black should turn it into %timeit f(1, 2, 3), but I'm guessing it's not safe for magics in general to treat the "argument" as Python code and format it.

Otherwise, we should either just leave multi-line magics alone (as you suggested above) or come up with some safe way to put the backslashes back.

@MarcoGorelli
Copy link
Contributor Author

it's not safe for magics in general to treat the "argument" as Python code and format it.

yeah, depending on the magic, the argument might not be Python code. e.g.

%matplotlib inline

Cool, thanks for your reply, working on a PR

@mlucool
Copy link

mlucool commented Jun 30, 2021

CC @ryantam626 who maintains https://github.com/ryantam626/jupyterlab_code_formatter (which handles many of these issues already - but only in jupyterlab).

@MarcoGorelli
Copy link
Contributor Author

Thanks @mlucool

I've had a look at the source code, but it looks like magics are detected with regular expressions, which feels brittle.
Indeed, I tried out

if True:
    %time 2+2

, ran the jupyter code formatter, and got

Jupyterlab Code Formatter Error
Cannot parse: 2:4: %time 2+2

I've reported this bug to jupyterlab_code_formatter - but anyway, I think the wider point remains that regular expressions are too brittle for this

@MarcoGorelli
Copy link
Contributor Author

MarcoGorelli commented Jul 4, 2021

it is possible to detect automagics, but they require a running ipython kernel, i.e. if black is invoked as ipython -m black notebook.ipynb rather than just black notebook.ipynb.

Unfortunately, there doesn't seem to be a non-hacky way of capturing the return code: https://stackoverflow.com/a/35168666/4451315

So for now I'll leave automagics out - I've made a first draft here #2357 , if anyone following along has any test cases I hadn't thought of to suggest, I'd be extremely happy to see them

@ichard26 ichard26 added the S: accepted The changes in this design / enhancement issue have been accepted and can be implemented label Jul 4, 2021
@MarcoGorelli
Copy link
Contributor Author

MarcoGorelli commented Jul 5, 2021

Hi @JelleZijlstra - question on the design: if you have the jupyter dependencies installed, then if you run

black src

then should black run on all Python, .pyi, and .ipynb files by default?
Or should it only run on ipynb files if you run

black src --ipynb

?

At the moment, the PR I opened does the former, but I think the latter might be more desirable, as then an error message can be thrown if the additional dependencies aren't found


UPDATE

In the end I've gone with not running on notebooks by default. To run on notebooks, users should pass --include \.ipynb$. Any thoughts on this? IMO it's probably better for notebooks to be opt-in

@JelleZijlstra
Copy link
Collaborator

I feel like we should format ipynb files by default once we support them, so the default behavior of black . is to format anything black can format. We should make this clear in the release announcement, though.

I'm curious if other maintainers have a different opinion.

@MarcoGorelli
Copy link
Contributor Author

Makes sense, thanks - I've reverted to that behaviour

@cooperlees
Copy link
Collaborator

I also prefer the default if the support is there. Less CLI args the better imo (+ black's original goals of not being configurable) ...

@MarcoGorelli
Copy link
Contributor Author

OK, done. I'm also raising a warning if black would find a Jupyter Notebook but the user doesn't have the extra dependency (IPython) installed

@jucor
Copy link

jucor commented Oct 9, 2021

@MarcoGorelli Naive question: is a cell's last line of code supposed to end with a newline? (hence with an extra empty line at the end of the cell)
black . doesn't add that newline, but VSCode "Format cell" does add that newline -- and I'm trying to figure out on which of these two I should file a bug report :)
Thanks for the info! And for the work supporting notebooks :)

@MarcoGorelli
Copy link
Contributor Author

Thanks @jucor , that's kind of you!

I wouldn't expect there to be a trailing newline in every single cell - if you agree, I'd suggest reporting this to vscode if they do something different.

I haven't had a look at what they do, but I'd have thought they could just use black.format_cell directly, perhaps that's worth bringing up as a suggestion if they're using something else

@jucor
Copy link

jucor commented Oct 9, 2021

Thanks for the fast and precise answer @MarcoGorelli ! I completely agree with you, the trailing newline in the cell surprised me :) It seems that VSCode dumps the content of the cell to a tempfile then applies black to that tempfile then reads it into the cell. Therefore black formats this cell like a single file, with (rightly for a file!) a newline at the end :/
Have you had good experiences filing bug reports on VSCode? I don't know the culture of that repo enough, and there seems to be a lot of traffic there -- hence a lot of "yeah, we'll put that on the backburner". Any idea how to make a bug report there more salient?

@MarcoGorelli
Copy link
Contributor Author

Hmm, so I presume they don't handle "silent mode" or magics?

No idea how to make a report more salient, sorry - perhaps opening a pull request?

@jucor
Copy link

jucor commented Oct 9, 2021

I've spent some time digging in the internals of vscode, vscode-python, and vscode-jupyter, see if I could make a PR. I can confirm that for all formatters it supports (autopep8, yapf, black), vscode always format notebook cells by simply dumping the content of the cell (not the JSON, mind you!) to a tempfile then calling the formatter on the command line on that file. Hence the trailing "\n".

Any request or PR to do otherwise for black (e.g. properly calling black.format_cell) is likely to be rejected as too cumbersome :/

However, it should be easy to modify so it calls black --ipynb instead of black when there's a cell. Do you think it is possible to enrich black --ipynb to detect when it is passed only the python content of a single cell, as opposed to the JSON of a ipynb file, and reformat that with your proper notebook-cell-rules, instead of currently rejecting the file?

@MarcoGorelli
Copy link
Contributor Author

Possibly, but I really think this should be fixed in vscode.

I'm not a black dev though, so don't take my opinion as final or anything

@jucor
Copy link

jucor commented Oct 9, 2021

Actually you're right. The exact same problem happens when vscode calls autopep8 and yapf.
So one way to handle it vscode-side is to keep the existing logic but remove the final \n after calling whichever formatter. Should be relatively feasible. I'll see what I can do on it and report here for completeness :)

@jucor
Copy link

jucor commented Oct 12, 2021

Hi @MarcoGorelli
VSCode-python team answered in microsoft/vscode-python#17681 (comment) that they welcome a PR.
What do you think is the best way from the below?

  • A. just fixing VScode-Python so it removes the trailing "\n". Pros: Easy, and works for all formatters. Cons: ignores all the other notebook-specific formatting you've made (silent mode and magic).
  • B. Updating black's --ipynb argument as discussed above (which would probably need you to do), and modifying VSCode to call it. Pros: Makes use of all the goodies you've coded. Cons: more work on black side, and, minorly but to be noted, only works for black.

@MarcoGorelli
Copy link
Contributor Author

I'd suggest

C. use black.format_cell directly

In my opinion, black need not add extra complexity just for the sake of a single IDE

Just my opinion though

@jucor
Copy link

jucor commented Oct 12, 2021 via email

@JelleZijlstra JelleZijlstra added the C: jupyter Jupyter Notebooks, any color you like label Oct 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: dependencies C: jupyter Jupyter Notebooks, any color you like S: accepted The changes in this design / enhancement issue have been accepted and can be implemented T: enhancement New feature or request T: style What do we want Blackened code to look like?
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants