Separate code examples from docs #1475
🗒️ TODOs for future work. These were intended to be done now, but the tasks are more relevant once we start testing the examples. As the changes here are substantial, I would prefer to merge sooner rather than later, to avoid repeated rebases that need editing intervention. Feel free to ignore this in review; these tasks would be handled in future.

Duplication/reuse

We can always switch to reusing specific code examples in future, if we feel it's an improvement.
Generating outputs
Further quality control
Miscellaneous
An idea: rather than keeping the code snippets separate from the docs, causing the friction that Iain describes, can we find a way to keep the snippets inline in the docs, and then extract them and test them as part of the build process? This might be a bit fiddly, but I think it would address Iain's concerns. It's also something that we could add incrementally. In terms of implementation, I wonder whether we could support adding some optional metadata to each code snippet, to indicate how the snippet should be tested. We could then write a mkdocs plugin that implements the extraction and testing. What do you think?
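For illustration only — the `test` attribute here is a hypothetical convention, not an existing pymdown-extensions or SuperFences feature — a marked-up snippet might look something like:

````markdown
```python test="complete-dataset-definition"
from ehrql import Dataset  # import path assumed; varies by ehrQL version

dataset = Dataset()
```
````

The plugin (or a plain test script) would then read the `test` value to decide how to check the block.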
A summary of the available packages for running or testing inline code:

- markdown-exec
- codechecker-mkdocs
- entangled/mkdocs-plugin
- mkdocs-code-validator
- mktestdocs
- pytest-examples (this issue is now closed, but I found this package after closing the issue, so adding it for completeness)
- pytest-codeblocks (likewise, found after closing the issue; added for completeness)
What isn't covered by these plugins?

Let's, for now, assume that we only want to test working dataset definitions and ignore everything else that we could check — failing dataset definitions, full OpenSAFELY projects, sandbox sessions, errors.¹ After reviewing again what these plugins do, I struggle to see how they handle the following (suggestions are welcome):
Some of these considerations may apply to other approaches that we devise ourselves to include code inline.
I'll respond to some of the other comments below.
Checking dataset definition fragments
What's meant by "valid" here? This import statement is valid according to Python's syntax rules and, if you have the package installed, will even import without error.
We can only check that line in the full ehrQL context by making it part of a minimally complete dataset definition. There are also other quoted parts of dataset definitions (examples on this page) that, presented in isolation, are syntactically valid Python but not runnable Python code — the prior context is missing. Despite that, those lines could still form part of a perfectly valid dataset definition. For these fragments, the options are then:
If we make each fragment complete, then doing so inline entails repetition. The repetition will also make the documentation less readable, as we would include the entire dataset definition every time we refer to it. To reuse the examples, we need something like the snippet approach in this PR, which allows repeated inclusion and selective quoting — or some other mechanism we devise where the reused code can be included inline in the source file.
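To make "minimally complete" concrete, here is a sketch; the import paths and table methods are assumptions that vary between ehrQL versions, so treat it as illustrative only:

```python
# Sketch of a minimally complete dataset definition into which a quoted
# fragment could be embedded. Import paths and table/method names are
# assumptions here; they vary between ehrQL versions.
from ehrql import Dataset
from ehrql.tables.beta.core import patients

dataset = Dataset()

# Suppose this is the single line actually quoted in the documentation;
# everything around it is the "prior context" that makes it runnable.
age = patients.age_on("2023-01-01")

dataset.define_population(age >= 18)
```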
Formatting
I'm really not tied to using Ruff and Black. And there are other issues with Ruff and Black too:
On the other hand, one reason we might want to use these tools is that different authors may write dataset definitions in different ways, which can lead to inconsistencies in how they are written across the documentation.
Managing snippets and reuse
At the moment, possibly — if we want to:
If we can devise a better way, then no.
No, they don't need to be separated. They could be included directly in the same directory as the relevant page. Perhaps even without any directory structure at all, though throwing everything into one directory feels like it might create problems in organising the code for testing.
MkDocs will fail to build the site if a snippet link is invalid. There is, granted, the possibility of accidentally referring to an incorrect file. (That could also happen with an inline solution that avoids repetition of code by using references to the code to include.)
My intent was to have a one-to-one mapping, although I think the actual PR as it stands does have a little reuse, because one of the examples was identical. There are tradeoffs with both:
Anecdotally, having named tens of these, I didn't find it that difficult, even while artificially constraining myself to two-word names — which isn't really required and was more about establishing a convention. The only naming headscratchers were for the examples page, where there are a few similar examples.
Inlining code somehow
In some ways, I think this is possibly preferable from the documentation author/editor perspective: everything is in one file. That said, if you implement some means of quoting extracts from a snippet, you're still jumping around in a text editor — within the same file rather than a separate one. You'd likely still end up editing the documentation with two tabs/windows open, one to refer to the code and one where you're writing; the difference is just that both would be showing the same file. And from some of the other discussion above, I'd add the features that warrant consideration:
I have some ideas of how the above would work using separate files, wherever they ended up being stored. If we went with an inline approach that we develop ourselves, we'd have to devise solutions for the features we actually want. Some of this feels a little like writing our own version of the existing snippet plugin. On that note, I'll add that I just tried it, and you can write a file where the snippet code to be used is in the same file. The problem is that the snippet itself, including any markers, also gets included in the output. (These could be stripped out.)
(Thanks for the comments; I've had to think a lot more about this.)
A couple more thoughts here:
How do other projects manage testable code?

MkDocs users

I'm looking at the list of known Material for MkDocs users. Surprisingly, a lot of even these technical organisations using MkDocs don't, at least not obviously, test the code in their documentation. (This is from quickly looking around at any GitHub workflows and build configurations.) The ones I can find testing are listed below.

FastAPI

https://github.com/tiangolo/fastapi
…and that's all I can find

I checked through all of the listed Material for MkDocs users, and only FastAPI was testing their examples. (That is, when the examples are structured much like we are doing in the ehrQL documentation. There were a few cases where documentation authors had used mkdocstrings to pull in code-related content, and those docstrings might well be tested.)

Sphinx

There's a doctest plugin, as used in Fabric. This allows writing of inline code. This is still a little restrictive though:
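For comparison, this is roughly what the doctest style of inline testable code looks like — a plain standard-library doctest example, not taken from Fabric's actual docs:

```python
def add(a, b):
    """Add two numbers.

    The interpreter-session lines below are both documentation and test:

    >>> add(1, 2)
    3
    """
    return a + b


if __name__ == "__main__":
    import doctest

    doctest.testmod()  # checks every >>> example in this module's docstrings
```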
Talks on documentation example testing

Again, there are surprisingly few resources on this. I skimmed through these to see what they advocated.

"Unit Test the Docs: Why You Should Test Your Code Examples", 2022

https://www.youtube.com/watch?v=yCqteMY3L-g

Goes for storing the code examples elsewhere and pulling those in. This uses a snippet approach with a Node.js tool called Bluehawk, which looks like the snippet support in pymdown-extensions but has a few more features.

"Unit Testing Your Docs", 2020

https://www.youtube.com/watch?v=E9zod8-I-fs

Also goes for separating the code examples from the documentation.
I tried using the pymdown-extensions snippet extension as a basis for an additional Markdown extension that removes inline snippets. Rather than figure out how to set up an entirely new project that lets me install an extension, I forked the pymdown-extensions repository and modified that for now. This could be separated out if it was useful.

Why an extension?
When trying to modify the Markdown, I found that someone in a MkDocs issue has requested a hook that runs after the pymdown-extensions snippets are included. Such a hook currently doesn't exist. The suggested approach in that issue is to create an extension that runs relative to the snippet extension; this is controlled by a priority level, which is 32 in the standard snippets extension. There are probably cases that this doesn't catch, but it works in this simple one. This doesn't do any testing or anything like that; it just allows inclusion of an "inline snippet" without the snippet markup appearing in the rendered output.

How it works

It checks for the snippet start and end section lines. This means that every inline snippet needs a start and end marker.

What might be better?

I'm still not convinced that separate files for code inclusion is a worse solution than this inline handling of code. Otherwise, it might be preferable to make a feature request in the pymdown-extensions repository for this feature. Snippets are included when the snippet source and destination are the same file; the only problem is the repeated inclusion of the snippet and the snippet syntax.

Code
"""
Snippet ---8<---.
pymdownx.remove_snippet
Inject snippets
MIT license.
Copyright (c) 2023 Steven Maude
Copyright (c) 2017 Isaac Muse <isaacmuse@gmail.com>
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
documentation files (the "Software"), to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions
of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
"""
from markdown import Extension
from markdown.preprocessors import Preprocessor
import re
class SnippetMissingError(Exception):
"""Snippet missing exception."""
class RemoveInlineSnippetPreprocessor(Preprocessor):
"""Remove inline snippets from Markdown content."""
RE_SNIPPET_SECTION = re.compile(
r"""(?xi)
^(?P<pre>.*?)
(?P<escape>;*)
(?P<inline_marker>-{1,}8<-{1,}[ \t]+)
(?P<section>\[[ \t]*(?P<type>start|end)[ \t]*:[ \t]*(?P<name>[a-z][-_0-9a-z]*)[ \t]*\])
(?P<post>.*?)$
"""
)
def __init__(self, config, md):
"""Initialize."""
self.encoding = config.get("encoding")
self.tab_length = md.tab_length
super(RemoveInlineSnippetPreprocessor, self).__init__()
def remove_inline_snippets(self, lines):
"""Remove inline snippets from the lines."""
open_sections = set()
new_lines = []
for l in lines:
# Found a snippet section marker
m = self.RE_SNIPPET_SECTION.match(l)
if m is not None:
section_name = m.group("name")
# We found the start
if m.group("type") == "start":
open_sections.add(section_name)
continue
# We found the end
elif m.group("type") == "end":
try:
open_sections.remove(section_name)
except KeyError:
pass
continue
# We are currently not in a snippet, so append the line
if len(open_sections) == 0:
new_lines.append(l)
# It's possible that someone starts a section without closing it.
# Fail loudly for now.
# (It's also possible that someone ends a section without starting it.
# This isn't explicitly handled.)
assert len(open_sections) == 0
return new_lines
def run(self, lines):
"""Remove snippets."""
return self.remove_inline_snippets(lines)
class RemoveInlineSnippetExtension(Extension):
"""Remove inline snippet extension."""
def __init__(self, *args, **kwargs):
"""Initialize."""
self.config = {
"encoding": ["utf-8", 'Encoding of snippets - Default: "utf-8"'],
}
super(RemoveInlineSnippetExtension, self).__init__(*args, **kwargs)
def extendMarkdown(self, md):
"""Register the extension."""
self.md = md
md.registerExtension(self)
config = self.getConfigs()
remove_snippet = RemoveInlineSnippetPreprocessor(config, md)
md.preprocessors.register(remove_snippet, "remove_snippet", 33)
def makeExtension(*args, **kwargs):
"""Return extension."""
return RemoveInlineSnippetExtension(*args, **kwargs)
```yaml
markdown_extensions:
  # …
  - pymdownx.removeinlinesnippets
```

Example:

```markdown
…
--8<-- "tutorial/writing-a-dataset-definition/index.md:func"
…
--8<-- [start:func]
def my_function(var):
    pass
--8<-- [end:func]
```
I wonder if this needs to be a mkdocs plugin at all. We could have a test which extracts all the snippets (together with their metadata) from all the Markdown files and checks them. That would just get run as part of our normal CI workflow. I think this may have been what @iaindillingham was suggesting yesterday, and I just hadn't fully understood at the time.
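A minimal sketch of such a test, naively assuming snippets are plain triple-backtick `python` fences with no metadata (a real version would reuse the Markdown parser's fence handling):

```python
# Sketch: collect fenced Python blocks from the docs and check they run.
import pathlib
import re

import pytest

FENCE_RE = re.compile(r"```python\n(.*?)```", re.DOTALL)


def python_blocks():
    for md_file in sorted(pathlib.Path("docs").rglob("*.md")):
        for i, match in enumerate(FENCE_RE.finditer(md_file.read_text())):
            yield pytest.param(match.group(1), id=f"{md_file}-{i}")


@pytest.mark.parametrize("code", python_blocks())
def test_docs_code_block_runs(code):
    # An empty globals dict: each block must be self-contained.
    exec(compile(code, "<docs code block>", "exec"), {})
```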
Unless I'm missing something, you'd presumably still need some mechanism — MkDocs hook/plugin, Markdown extension or whatever else — somewhere to remove any embedded content from the Markdown that you don't want appearing in the rendered HTML.
HTML comments?
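If I follow, something like this — where the `test-setup` marker is a made-up convention, not an existing feature:

```markdown
Visible prose continues here.

<!-- test-setup
# Hidden from the rendered page, but available to a test harness
# that scans the raw Markdown for these comments.
from ehrql import Dataset  # import path assumed
dataset = Dataset()
-->
```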
Something like that mechanism would also allow us to check code examples drawn from docstrings that get included in the reference documentation.
I haven't had an opportunity to do more than skim the comments, so sorry if this has been discussed. However, what about:
If the Python-in-Python were on the same line as the Python-in-Markdown, then an error would be easy to identify from the stack trace. (We may even want to catch and re-raise an error, modifying the filename from the generated Python file to the source Markdown file.)

The above would give Python code in each Markdown file the same scope (a Markdown file is like a notebook). We could modify that slightly by adding the …

Indeed, rather than passing each Python file to the interpreter, why not pass it to …

At this stage, I don't think we need a MkDocs plugin.
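One way to get that same-line property when extracting to files is to pad the generated module with blank lines, so each block starts at its original Markdown line number. A sketch, where `blocks` is assumed to be a list of `(start_line, code)` pairs produced by whatever extraction step we settle on:

```python
import pathlib


def write_aligned_module(blocks, out_path):
    """Write extracted code blocks to a .py file, padded with blank lines
    so that line N of the generated file is line N of the Markdown source,
    making tracebacks point at the right place.

    Assumes blocks don't overlap once padded.
    """
    lines = []
    for start_line, code in sorted(blocks):
        # Pad with blank lines up to where this block begins in the Markdown.
        lines.extend([""] * (start_line - 1 - len(lines)))
        lines.extend(code.splitlines())
    pathlib.Path(out_path).write_text("\n".join(lines) + "\n")
```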
Sorry, just seen this @evansd. I agree. Hopefully my previous comment adds some necessary detail.
Thanks Iain. Broadly that sounds like the right approach to me, but I don't think there's any need to extract the code into files. Building the code up as a string and passing it to … Using a custom …
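If the elided suggestion here was Python's built-in `compile()`/`exec()` (my assumption), then the filename and line numbers in tracebacks can be fixed without any intermediate files:

```python
def run_block(code, markdown_path, start_line=0):
    """Execute a code block built up as a string, with tracebacks that
    report the Markdown file and original line numbers rather than
    "<string>"."""
    # Prepend newlines so reported line numbers match the source file.
    padded = "\n" * start_line + code
    exec(compile(padded, str(markdown_path), "exec"), {})
```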
Let's say that we do restrict ourselves to only wanting to test dataset definitions¹. At the risk of repeating myself², I'll summarise the main features of our current documentation code collection that I think make some of them trickier to handle. It might narrow down what, specifically, the requirements of a good solution are.

Quoting of dataset definitions

In some examples, we use short quotations from a dataset definition when discussing a specific part. In a few of these cases — the import statements — the quotation would at least be valid standalone Python, but not a valid dataset definition. In other cases, the lines as quoted are not even valid standalone Python. It would be best for these lines to be taken from one complete dataset definition, which is tested. The way this kind of line selection is usually done is via some kind of extended markup feature like the pymdown-extensions snippet syntax. As is, the snippet function does not work correctly with inlined snippets³. Less good alternatives:
Incomplete dataset definitions

Relatedly, some dataset definitions in the examples are entirely incomplete. If we wanted to test these, we would need to make them complete and selectively quote the parts we want to include. The less good alternative: just ignore these.

Failing dataset definitions

Some dataset definitions deliberately fail, and we include their tracebacks. It would be good to validate that these fail in the way that we've indicated in the text, with the correct error text. It's more than possible that we might change these errors in future. The less good alternative: just ignore these, and update the failure errors manually.

Provision of supplementary files

Some of our included dataset definitions are not self-contained. They require supplementary files, such as codelists. If we don't provide this additional data somewhere, then the options look like:
Inclusion and validation of outputs

Do we ever want to include the generated outputs from dataset definitions, such as output datasets or logs, in the documentation? Or validate that those outputs don't change inadvertently when someone edits an example? Less good alternative: we have to generate, copy-paste, and keep them in sync manually.

Making a decision

@evansd @iaindillingham: Let's say that we go with the "mark up inline dataset definition code blocks as `ehrql`" approach. I can imagine how pulling out the examples into separate files does address all of the above:
How does the inline `ehrql` approach address these?
- Quoting of dataset definitions
- Incomplete dataset definitions
- Failing dataset definitions
- Provision of supplementary files
- Inclusion and validation of outputs

If I could emphasize one thing, it would be that the ehrQL docs will change, and will change dramatically. Let's aim for minimally better: Python code that doesn't error when passed to the interpreter.
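"Doesn't error when passed to the interpreter" really is minimal — something like:

```python
# Sketch: the minimal check, run over each extracted example file.
# The path here is a placeholder, not a real file in the repository.
import subprocess
import sys

subprocess.run([sys.executable, "path/to/extracted_example.py"], check=True)
```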
I'm working on the alternative approach proposed here in #1648.
Closing in favour of #1648 which tests dataset definitions only, but without restructuring the examples. |
This code has been subject to considerable work to get it into this form. However, it did not seem useful to retain the various approaches and versions of the code before this state.

A quick guide to this code:

* It finds any Markdown files in `docs/`.
* It uses the SuperFences extension, as we do in the MkDocs configuration, to extract Markdown code blocks labelled with `ehrql` syntax. These are assumed to be self-contained dataset definitions.
* The code blocks that will be tested should appear as code blocks in the documentation by default (provided the CSS isn't changed to modify the appearance of code blocks somehow, which shouldn't be the case, because why would you?). They are identified in the parametrized tests by their ordinal fence number in the source file.
* It finds any Python modules indicated by a `.py` extension. Python modules are assumed to be self-contained dataset definitions.
* The found dataset definitions are run to generate a dataset, and the output is checked to see if it's a CSV.

There is some monkeypatching necessary to make this work:

* `codelist_from_csv()` relies on having CSV data available, and the checks on valid codelist codes are patched out. Without further work, we don't have any direct way of including data for inline dataset definitions in Markdown source, or of specifying which mock CSV data to use, without an established convention for examples to follow. #1697 proposes ideas for removing this monkeypatching.
* The sandboxing code is monkeypatched out to use "unsafe" loading of dataset definitions. Without doing so, it is not possible to monkeypatch any other ehrQL code: the ehrQL is otherwise run in a subprocess.

For more details and discussion, see the related PR for this code (#1648) and the previous PR (#1475) which this approach replaces.
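For concreteness, an `ehrql`-labelled block in the Markdown source looks something like this (the dataset definition body is illustrative only; import paths vary by ehrQL version):

````markdown
```ehrql
from ehrql import Dataset
from ehrql.tables.beta.core import patients

dataset = Dataset()
dataset.define_population(patients.age_on("2023-01-01") >= 18)
```
````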
Fixes #1342.
Reasons for this PR
The goal of this PR is to make the code examples easier to test (#1325).
The proposal here is to move the existing code examples in the MkDocs documentation from being inline out to separate files.
While there are tools for checking inline source code in documentation, these are probably not flexible enough for what we are working with.¹
First, a lot of our examples are dataset definitions, rather than pure source code. Beyond just checking "does it work?", we may want to validate the output of dataset definitions to ensure they are unchanged — or at least be warned that the outputs have changed, and rebuild the outputs accordingly.
Second, we actually have more than just dataset definitions in our ehrQL documentation: a mixture of reproducible code-like includes. These are:
Each of these categories will need handling in different ways to verify they behave as expected.
Third, separating out the code means it is easier, should we wish, to run other source code tooling, such as Black or Ruff, on the examples. Otherwise, workarounds or further dependencies are needed to handle the inlined code. For now, no additional tooling has been run, and the examples are essentially a direct copy-paste.²
What has been separated out
This separates out the following code examples that were inlined in the documentation source:

- `project.yaml`
This does not separate out:
Overall, almost every code-like piece of content has been moved. There are a few other odds and ends that have not yet been moved out.
Example code in ehrQL source docstrings has also been left untouched.
Summary of the changes
Many files have been added and changed, which makes the overall diff very large. However, the overall changes are relatively straightforward to summarise. The changes are of three kinds:

- `.gitignore` modifies a now-unused path to point to the paths of the new code examples. This is to ignore metadata files generated when running dataset definitions.
- `pyproject.toml` and `.pre-commit-config.yaml` are modified to exclude these separated examples from the existing Python checks. We are not running the usual Black and Ruff tooling on these files, so they fail our automatic checks. We may in future wish to remove these exclusions.

How the code examples are structured

The changes to `DEVELOPERS.md` document how the examples are currently structured. This is subject to change, if we agree in this PR's review that it should change.
Questions for review
- Do `success` or `failure` make sense for interactive Python sessions? The appearance of a traceback in these does not prevent continuation. `success` possibly does make sense if you consider it as "we expect that no exception should appear in this session".
- … `output/` or `log/`?

Footnotes
1. Examples: markdown-exec and mktestdocs both work based on matching the syntax label, for example, `python`.
2. At least one example was fixed because pre-commit spotted that the Python syntax was invalid due to a missing parenthesis.