use format parsing libraries instead of custom line based parsing #32

keewis · 2020-06-25T12:31:21Z

Right now blackdoc goes through the lines one by one, tries to extract code, format the code using black, then put everything back together. While this does work, extending to different file formats becomes difficult, especially when trying to extend to jupyter notebooks (json files).

There are a lot of libraries for parsing the currently supported formats (or the planned formats):

docutils for parsing restructured text (rst)
ipython sphinx extension / jupyter sphinx extension for reading ipython / jupyter styled code blocks
markdown-it-py for markdown
doctest for extracting doctest lines
json for parsing jupyter notebooks

This might come at the cost of slower reading since, e.g., docutils does more than just read code blocks, but using these libraries might make the tool a bit more robust and easier to extend to new formats. Another disadvantage is that right now we read each file only once, whereas we might have to read at least docstrings more than once (unless we restrict python files to doctest, which should be possible since rst in docstrings is rare).
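To make the library-based idea concrete, here is a minimal sketch using only the two standard-library entries from the list (doctest and json). The function names `extract_doctest_sources` and `extract_notebook_code` are hypothetical and are not part of blackdoc; the notebook layout assumed here is the usual top-level "cells" list with "cell_type" and "source" keys.

```python
import doctest
import json


def extract_doctest_sources(text):
    """Return (lineno, source) pairs for every doctest example in text.

    lineno is the zero-based line on which the example starts.
    """
    parser = doctest.DocTestParser()
    return [(example.lineno, example.source) for example in parser.get_examples(text)]


def extract_notebook_code(notebook_json):
    """Return the source of every code cell of a notebook given as a JSON string."""
    notebook = json.loads(notebook_json)
    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # the source may be stored as a single string or as a list of lines
        sources.append("".join(source) if isinstance(source, list) else source)
    return sources
```

Note that the extracted sources lose the prompts and the original indentation, so putting the formatted code back in place would still need the line numbers and some custom logic.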


keewis · 2020-11-04T00:54:13Z

Thinking about this again, maybe we don't have to do that? There are lots of utilities that apply black to notebooks, so we don't have to add that. Markdown support would be nice, but that should be somewhat similar to rst support. We're also reading files only once: every line is classified and the appropriate extraction / reformatting functions are called in sequence.
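A rough sketch of that single-pass approach, for doctest lines only. This is not blackdoc's actual code: `split_blocks`, `reformat`, and the `format_code` callback (which stands in for the call to black) are hypothetical names chosen for illustration.

```python
def split_blocks(lines):
    """Classify each line and collect (label, lines) blocks in order.

    A new "doctest" block starts at every ">>> " prompt; "... " lines
    continue the current doctest block; everything else is "text".
    """
    blocks = []
    for line in lines:
        stripped = line.lstrip()
        if stripped.startswith(">>> "):
            blocks.append(("doctest", [line]))
        elif stripped.startswith("... ") and blocks and blocks[-1][0] == "doctest":
            blocks[-1][1].append(line)
        elif blocks and blocks[-1][0] == "text":
            blocks[-1][1].append(line)
        else:
            blocks.append(("text", [line]))
    return blocks


def reformat(lines, format_code):
    """Reformat the code blocks with format_code and put everything back together."""
    result = []
    for label, block in split_blocks(lines):
        if label == "text":
            result.extend(block)
            continue
        first = block[0]
        indent = first[: len(first) - len(first.lstrip())]
        # drop the indentation and the 4-character prompt to recover the code
        code = "\n".join(line.lstrip()[4:] for line in block)
        # the first formatted line gets ">>> ", the following ones "... "
        for index, code_line in enumerate(format_code(code).splitlines()):
            prompt = ">>> " if index == 0 else "... "
            result.append(indent + prompt + code_line)
    return result
```

Every line is visited exactly once, and supporting another format would mean adding another label plus its extraction / reformatting pair.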

keewis · 2021-08-08T15:38:41Z

Since those libraries work fundamentally differently from how the tool currently works (especially docutils), using them would require a rewrite, which I'm not prepared to do (and the main motivation, notebook support, can be covered by other tools, e.g. nbQA).

Instead, markdown support will have to be implemented in a new blackdoc.formats module.

MarcoGorelli · 2021-08-08T16:29:25Z

"the main motivation, notebook support, can be done using other tools, e.g. nbQA"

Just FYI - hopefully, soon, this won't even require extra tools (see psf/black#2357).

The black.format_cell function may even help here in blackdoc

keewis closed this as completed Aug 8, 2021
