MD051: Enhance with optional ignore prefix or regex #547
Comments
What about linting after the Markdown files are fully generated by the build? Otherwise you may have broken "fig-" links and won't know it.
That is true and certainly a risk. In my case they are built into HTML and we do lint those. But apparently the HTML linter we use (HTMLHint) is not as good as markdownlint 😄 since we have several broken links that this check has only just surfaced. I could look at extending that project with a rule similar to MD051, as an alternative to adding the exception here, if you’d prefer not to complicate this code base with an exception option.
How many links do there tend to be in a document? Would supporting a list of strings be enough because you could provide a project level configuration that listed "fig-1" to "fig-10"? Or are there hundreds of these in a document?
There can be a lot. Here’s an example one if you’re curious: https://github.com/HTTPArchive/almanac.httparchive.org/blob/main/src/content/en/2021/css.md

We also have translations, and one thing that’s particularly apparent with this new rule is that the translators often leave the original links but translate the headings (which then obviously changes the heading anchor and so breaks the link). MD053 has surfaced a lot of those, which is why I’m keen to be able to use this check to prevent that in future. Those, and typos in links (or edits where we create links but then change the heading names), are the main use case for me.

Figure links are less of an issue for us, and a more complicated one to solve anyway, since the figure link likely exists but could be the wrong one if they all shift along when a new figure is inserted. But we also tend to link those less anyway, as we usually talk about the figures just after them and so don’t need a link.
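To illustrate why a translated heading breaks an untranslated link, here is a minimal sketch of how a heading anchor is typically derived from heading text. It is a simplified approximation of GitHub-style anchors, not markdownlint's actual code, and the function names and example headings are made up for illustration.

```typescript
// Simplified approximation of a GitHub-style heading anchor:
// lowercase the text, drop common ASCII punctuation, turn spaces into hyphens.
// Assumption: real implementations handle more cases (duplicates, Unicode punctuation).
function headingToFragment(heading: string): string {
  const slug = heading
    .trim()
    .toLowerCase()
    .replace(/[!"#$%&'()*+,.\/:;<=>?@\[\\\]^`{|}~]/g, "")
    .replace(/ /g, "-");
  return "#" + slug;
}

// A link fragment is broken when no heading in the document produces it.
function isBrokenLink(linkFragment: string, headings: string[]): boolean {
  return headings.map(headingToFragment).indexOf(linkFragment) === -1;
}

// Hypothetical example: the link is kept from the English source,
// but the heading has been translated.
const originalHeadings = ["Browser support"];
const translatedHeadings = ["Soporte de navegadores"];
const untranslatedLink = "#browser-support";

const brokenInOriginal = isBrokenLink(untranslatedLink, originalHeadings);
const brokenInTranslation = isBrokenLink(untranslatedLink, translatedHeadings);
```

In this sketch `brokenInOriginal` is `false` and `brokenInTranslation` is `true`, because the translated heading now yields `#soporte-de-navegadores`.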
I may have been unclear. I was asking how many instances of figure links might be in a document. If it is very few, then it would be possible to provide a fixed size list of the first 10 and that would cover you. If there are very many, maybe a regular expression is more relevant. I don't see any matches for "fig-" in that document, so maybe I'm looking for the wrong thing?
So there are anything from a few to many figure links. That particular document has 67 figures. You can see the final published version here. In this example none of the figures are referenced by links. Here’s one where there is a link: https://raw.githubusercontent.com/HTTPArchive/almanac.httparchive.org/main/src/content/en/2021/pwa.md (search for fig-4). The nature of the publication is that most figures are talked about directly after the figure, so they don’t need a reference, but occasionally another figure in the text is referenced and linked. So I was thinking of an option that lets me configure a prefix allow list (
I'm worried that a prefix is not general enough for other scenarios (if there are any?) and that a regular expression is harder to work with for many folks. I also feel that the approach you describe now is quite fragile and may have a bunch of broken figure links already. So I don't have an approach I like yet.
FYI I managed to work around this using inline ignores on the affected lines since, luckily, we don't reference figures that much internally, so this is feasible.

I still think it would be useful to have some sort of more generic override for dynamically inserted content like this, where the markdown is, in effect, a source file rather than the final output. I know I'm not the only one that does this (though I'm not aware of any others that explicitly link to generated content). I take on board your above point that it might be better to lint the final output HTML in these cases, though it's also nice to be able to flag items at source (so we get the correct line number), but without the noise of items markdownlint can't be expected to deal with.

Maybe inline ignores are the best way of dealing with this, but that feels a little verbose for regular use, and listing all the possible figures in any MD051 config is almost as verbose. Maybe some kind of regex-lite like

Anyway, I understand if you'd prefer not to handle this in markdownlint and so wish to close this issue. As I say, I've managed to work around it with existing functionality and still benefit from MD051, which has helped identify a lot of real issues and typos. Thanks again for creating and supporting this tool!
Great news! I'll leave this open as a possible enhancement and see if/what other scenarios come up.
We hit a similar issue. Our project, which is hosted in Bitbucket, generates its README for Terraform with DocToc (https://github.com/thlorenz/doctoc). It generates a table of contents with links in the format
and markdownlint flags those links. For now I have to disable the rule.
MD051 is a very nice rule addition! However, some of the markdown files in our project have dynamically inserted figures with anchors called "fig-1", "fig-2", and so on. These anchors don't exist in the markdown, but are added as part of our build process.

What do you think about adding an optional config with a prefix or regex of links to ignore?
Something like:
Or
Or
I'd be willing to have a go at a PR for this if this sounds reasonable. Let me know if you have a preference for any of the above, or a preferred name/syntax.
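For discussion, the prefix-or-regex idea could behave roughly like the sketch below. This is only an illustration of the proposed behaviour: the option names `ignoredPrefixes` and `ignoredPattern` and both functions are hypothetical and are not taken from markdownlint.

```typescript
// Hypothetical option shape for illustration only; not an existing markdownlint API.
interface IgnoreOptions {
  ignoredPrefixes?: string[]; // e.g. ["fig-"]
  ignoredPattern?: RegExp;    // e.g. /^fig-\d+$/ (avoid the "g" flag, which makes test() stateful)
}

// Returns true when a link fragment such as "#fig-1" should be skipped by the check.
function isIgnoredFragment(fragment: string, options: IgnoreOptions): boolean {
  const name = fragment.charAt(0) === "#" ? fragment.slice(1) : fragment;
  const prefixes = options.ignoredPrefixes || [];
  const matchesPrefix = prefixes.some(function (prefix) {
    return name.indexOf(prefix) === 0;
  });
  const matchesPattern = options.ignoredPattern
    ? options.ignoredPattern.test(name)
    : false;
  return matchesPrefix || matchesPattern;
}

// Keeps only the fragments that still need to be validated against the headings.
function fragmentsToCheck(fragments: string[], options: IgnoreOptions): string[] {
  return fragments.filter(function (fragment) {
    return !isIgnoredFragment(fragment, options);
  });
}
```

With a prefix of `fig-`, a link to `#fig-4` would be skipped while `#figure-notes` would still be checked. The regex form is stricter: `/^fig-\d+$/` skips `#fig-12` but not `#fig-x`.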