Removed hundreds of formatting warnings for nltk.org #2859
Merged
Hello!
Pull request overview
The warnings/errors
When generating the website (with
sphinx-build -E ./web/ ./build
), Sphinx reports well over 520 warnings and errors. Most of these involve a formatting issue in a method docstring. For those unaware, Sphinx by default expects the Sphinx docstring format, which is essentially reStructuredText. ReST can be very strict at times. There were many issues regarding unexpected indents and missing blank lines, all of which have been resolved with this PR.
The full list can be seen by clicking on this.
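To illustrate the kind of formatting ReST expects, here is a minimal sketch of a docstring that keeps blank lines around its list and doctest and indents continuation lines consistently. The function `count_tokens` and its parameters are hypothetical and are not taken from NLTK; they only serve to show the layout.

```python
def count_tokens(text, lowercase=False):
    """Count the whitespace-separated tokens in ``text``.

    A blank line separates the summary from the body. The list and the
    doctest below are each surrounded by blank lines, which is what
    ReST needs to parse them as separate blocks.

    Two cases are handled:

    - plain text, which is split on whitespace
    - lowercased text, when ``lowercase`` is set

    >>> count_tokens("NLTK is a toolkit")
    4

    :param text: The text to split into tokens.
    :type text: str
    :param lowercase: Whether to lowercase ``text`` first. This
        continuation line is indented relative to the field name.
    :type lowercase: bool
    :return: The number of tokens.
    :rtype: int
    """
    # Hypothetical helper, used only to demonstrate docstring layout.
    if lowercase:
        text = text.lower()
    return len(text.split())
```

Removing the blank line before the list, or indenting a paragraph without a preceding blank line, is the sort of change that typically triggers the indentation and blank-line warnings described above.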
The issue
Some of these warnings didn't mean much: Sphinx could still understand everything, and the only effect on the website was slightly more or less whitespace somewhere. However, in some situations the warnings did indicate serious problems with the generated site. These problems have existed since the documentation was written, simply because the in-code docstrings were not proper ReST.
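To see which kinds of warnings dominate a build, the captured Sphinx output can be tallied by message. The following is a rough sketch, assuming the output was saved as text and that each report line contains `WARNING:` or `ERROR:` followed by the message. The function name `tally_warnings` and the sample log lines are made up for illustration and are not real NLTK build output.

```python
import re
from collections import Counter

# Matches the severity marker and captures the message that follows it.
REPORT_RE = re.compile(r"(?:WARNING|ERROR): (?P<message>.+)$")


def tally_warnings(log_text):
    """Count how often each warning or error message occurs in a build log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = REPORT_RE.search(line)
        if match:
            counts[match.group("message").strip()] += 1
    return counts


# Illustrative log lines only; paths and line numbers are invented.
sample_log = """\
nltk/example.py:docstring of nltk.example.f:3: ERROR: Unexpected indentation.
nltk/example.py:docstring of nltk.example.g:5: WARNING: Block quote ends without a blank line; unexpected unindent.
nltk/other.py:docstring of nltk.other.h:2: ERROR: Unexpected indentation.
"""

for message, count in tally_warnings(sample_log).most_common():
    print(count, message)
```

Grouping by message makes it easier to separate the cosmetic whitespace warnings from the ones that actually break the rendered page.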
Some examples
The differences between old and new range from small to substantial. Here are some examples:
Remaining warnings
This PR has reduced the number of warnings from ~516 to 11. The following warnings still exist:
These warnings are all related to how Sphinx finds multiple sources for some terms. These can cause issues, as
Expression
might become a clickable link to the wrong class called
Expression
on the website. This is something that ought to be handled at some point.
Formatting issue with news.html
One tiny error I found while going through the website was that there's no space between
released
and the date in entries from roughly 2015 and earlier on NLTK's news page: https://www.nltk.org/news.html#id7
Also see this screenshot:
This was due to the separated colon here:
NLTK 3.2 released : March 2016
. I've now replaced this with
NLTK 3.2 released: March 2016
in commit 1b57208. Feel free to compare the results by going to https://tomaarsen.com/nltk/news.html#id7. The result can also be seen in this screenshot.
Note
The website formatting is still far from perfect. I've only fixed the cases that were so wrong that they threw warnings or errors. There are probably many more places where the formatting simply isn't as intended or desired.
That said, I believe this PR takes some further steps toward making our documentation a useful resource once more.