PDF builds should use Sphinx provided LaTeX Makefile which uses Latexmk (and possibly xindy) #4454

Closed
jfbu opened this issue Jul 31, 2018 · 12 comments · Fixed by #5437
Labels
Needed: design decision A core team decision is required

Comments

@jfbu

jfbu commented Jul 31, 2018

It seems PDF builds at RTD use something like this by default:

pdflatex -interaction=nonstopmode /home/docs/checkouts/readthedocs.org/user_builds/sphinx/checkouts/master/doc/_build/latex/sphinx.tex
makeindex -s python.ist sphinx.idx
pdflatex -interaction=nonstopmode /home/docs/checkouts/readthedocs.org/user_builds/sphinx/checkouts

i.e. they don't simply proceed via make -C <builddir> all-pdf, which would use the Sphinx-provided Makefile (there are ways to pass options to pdflatex if needed). The latter uses Latexmk, which automatically performs the correct number of pdflatex runs.

This causes issue #2857, but may cause more issues starting with Sphinx 1.8.

Indeed, Sphinx 1.8 provides a new user configuration value, latex_use_xindy, to support indexing of terms beyond English ASCII letters. Only the Sphinx-provided Makefile + Latexmk configuration file latexmkrc has the correct invocation of xindy, which depends on the document language.

If RTD used the Sphinx-provided LaTeX Makefile by default, already for Sphinx 1.7.x and certainly for Sphinx >= 1.8, this would solve #2857 and support the new latex_use_xindy Sphinx option painlessly.

Notice that with latex_engine set by the user to xelatex, xindy usage will be the default. Indeed, makeindex is broken with UTF-8 index files. With lualatex such broken files cause a PDF build abort; with xelatex they appear to work, but only by luck, and the index in the PDF is ill-formed.

@agjohnson
Contributor

We aren't going to execute any PDF builds through the Makefile; we avoid the Makefile completely. Is there a solution that doesn't require using the Makefile?

@agjohnson agjohnson added Needed: design decision A core team decision is required and removed Improvement Minor improvement to code labels Aug 3, 2018
@jfbu
Author

jfbu commented Aug 21, 2018

For the majority of projects I expect the sequence

pdflatex ...
makeindex ...
pdflatex ...
pdflatex ...

i.e. one additional pdflatex call is enough. But:

  • there is a reason why Sphinx decided to replace the old Makefile with Latexmk, a Perl script containing the logic needed to automatically adjust the number of pdflatex runs,

  • and there are also (increasingly many) reasons why Sphinx uses a Makefile in the latex build directory.

The Sphinx LaTeX builder converts templates into a project-adjusted Makefile and Latexmk config file. They encode:

  • whether to use pdflatex, xelatex, lualatex or platex + dvipdfmx (Japanese)
  • (since Sphinx 1.8) whether to use makeindex or xindy for the general index

Regarding the Latexmk config file (for non-Japanese) only the $pdflatex Perl variable counts (for PDF output) and it incorporates whether to run pdflatex, lualatex, or xelatex. For the latter it uses xdv not pdf format in intermediate invocations.

And for Japanese documents (which use platex+dvipdfmx) there is also an extra step of running extractbb on all image files. This step is incorporated in the Makefile. Besides, the indexing engine is mendex rather than makeindex, because makeindex is not compatible with Japanese.

In older times, Sphinx ran pdflatex thrice (!), then makeindex, then pdflatex twice, for a total of 5 (!) pdflatex runs. One can see this in the old latex Makefile.

Thus there is quite some logic incorporated

  • in the Sphinx produced Makefile and Latexmk configuration file,
  • and in the Latexmk Perl script itself.

This logic means that when a Sphinx user builds the PDF locally, the workflow is substantially faster than formerly, because Latexmk knows to execute perhaps only one pdflatex run when the latex sources have changed but the PDF was already built earlier. (The user runs make latex in the source directory, then make -C <latexbuilddir> all-pdf.)

PDF builds may also execute makeindex on glossary .glo files produced on pdflatex first run. The logic is already in the latexmkrc Latexmk config file. I have not really tested it myself.

All this to say that trying to extract a fixed number of pdflatex/makeindex calls isn't really feasible, although for the vast majority of projects I expect that simply pdflatex (once) + makeindex + pdflatex (twice) is adequate.

What is certain is that pdflatex + makeindex + pdflatex is inadequate: the makeindex call creates new tex code which is executed on the second pdflatex call, and this tex code wants to add a new entry to the table of contents, which always needs 2 passes in latex.
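
The extra pass can be illustrated with a toy model. This is only a sketch and not how TeX works internally: it assumes that each run prints the table of contents from what the previous run wrote to disk, and that the index, once typeset, announces itself with a new table of contents entry. The function names and the `state` keys are made up for illustration.

```python
def pdflatex(state):
    """Simulate one run: typeset with the auxiliary data left by the
    previous run, then write fresh auxiliary data for the next one."""
    printed_toc = list(state["toc"])   # TOC as it appears in this run's PDF
    entries = ["Chapter 1"]
    if state["ind"]:                   # a sorted index exists, so it is typeset
        entries.append("Index")        # and it registers itself in the TOC
    state["toc"] = entries             # only visible to the *next* run
    state["idx"] = True                # raw index entries were collected
    return printed_toc


def makeindex(state):
    """Simulate makeindex: turn the raw .idx data into a typesettable .ind."""
    state["ind"] = state["idx"]


state = {"toc": [], "idx": False, "ind": False}
first = pdflatex(state)
makeindex(state)
second = pdflatex(state)   # the index is typeset now, but the TOC is stale
third = pdflatex(state)    # the TOC finally lists the index

print(first, second, third)
```

In this model the PDF from the second run still lacks the "Index" line in its table of contents; only the third run prints it.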

And as already indicated, with Sphinx 1.8 the makeindex way will be replaced by the xindy way for non-English projects. Thus any hard-coded, project-independent code is bound to raise problems.

@jfbu
Author

jfbu commented Aug 21, 2018

And as already indicated, with Sphinx 1.8 the makeindex way will be replaced by the xindy way for non-English projects. Thus any hard-coded, project-independent code is bound to raise problems.

For the pdflatex engine, Sphinx 1.8 still uses makeindex by default, even for non-English (non-Japanese) documents. But for xelatex/lualatex it will use xindy, because makeindex is incompatible with UTF-8.
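
The defaults described in this thread can be summarised as a small decision function. This is a sketch of the rules as stated here, not Sphinx's actual code; the function name `default_index_tool` is hypothetical, and only `latex_engine` and `latex_use_xindy` are real configuration values.

```python
def default_index_tool(latex_engine, latex_use_xindy=None):
    """Return the indexing program implied by the engine.

    latex_use_xindy=None means "the user did not set it", in which case
    the default depends on the engine, as described in this thread.
    """
    if latex_engine == "platex":
        # Japanese documents: makeindex is not compatible with Japanese.
        return "mendex"
    if latex_use_xindy is None:
        # makeindex cannot handle UTF-8 index files, so the Unicode
        # engines default to xindy; pdflatex keeps makeindex.
        latex_use_xindy = latex_engine in ("xelatex", "lualatex")
    return "xindy" if latex_use_xindy else "makeindex"


for engine in ("pdflatex", "xelatex", "lualatex", "platex"):
    print(engine, "->", default_index_tool(engine))
```

A hard-coded `makeindex -s python.ist` call matches only the first of these cases.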

@agjohnson
Contributor

Thanks for the detailed information! We might be calling on you more to help because I don't think any of us are well experienced with latex, and the need to perform multiple passes on PDF build seems like a silly latex design from the outside.

A couple things:

  • We definitely won't use the Makefile for building PDFs. It's too error prone for a general use case like ours -- Sphinx makefiles are optional even.
  • 3 or 5 passes means 3 or 5 times the build resources, so we are extremely conservative making changes here. It's silly the latex engines require this, but alas.
  • The point to use xindy is helpful. There have been discussions of supporting xelatex instead of pdflatex, so perhaps a RTD config option to use xelatex/xindy makes a bit more sense now.
  • We could try to use latexmk as a wrapper directly, instead of pdflatex; this sounds promising. We could either require a special setting in our config to force this, or we could check for a latexmk config in the repository (I prefer explicit configuration over magic, though).

@agjohnson agjohnson added this to the New build features milestone Sep 19, 2018
@jfbu
Author

jfbu commented Sep 20, 2018

At the core level, TeX's page-breaking algorithm never gets to see the document as a whole. It accumulates vertical material and tests legitimate page-breaking points once enough material has accumulated. It chooses the locally best one, ships out the built page, forgets about it, and iterates.

To achieve internal references such as a table of contents, the LaTeX developers (circa thirty years ago) thus developed a multi-pass system. Fact of life ... ;-). If the table of contents is at the start of the document and is long, it will offset all page numbers of indexed terms. Luckily enough, people usually print the index at the end of the document. But if you have an index per chapter, for example, you can see it may need even one more pass, because only after inclusion of the index will page numbers stabilize. Hyperlinks came later and they add their own level of complexity.

I am not sure I understand what you mean about the Makefile being error prone. The Makefile is constructed by Sphinx as part of the process of building LaTeX output. It is part of the LaTeX builder output, in a way. It takes into account the latex engine specified by the user.

Using xelatex to build PDFs works only if the user project has specified the latex engine to be xelatex. If not, and if for example the language is German, then xelatex will fail to produce correct PDF output: some letters used in German will be missing or words will not be correctly hyphenated, because xelatex does not work well with traditional TeX fonts, since the LaTeX format for xelatex is not constructed by the LaTeX team the same way as for pdflatex.

Xelatex is designed for using system (OpenType/TrueType) fonts, and the font configuration is not an external parameter but something which must be hard-coded in the LaTeX document itself, so it is a user configuration. It is impossible to say I will compile all those LaTeX documents with xelatex: the documents have to know they will get compiled with xelatex.

@jfbu
Author

jfbu commented Sep 20, 2018

If the table of contents is at the start of the document and is long, it will offset all page numbers of indexed terms

That is not such a problem for 'manual' type projects, because the numbering is reset after the table of contents, but 'howto' documents will have that problem.
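
To make the offset problem concrete, here is a toy fixed-point iteration for a 'howto'-style document, i.e. one without a page-number reset after the table of contents. It is a sketch under invented assumptions: the section lengths, the 40 lines per page, and the function names are all hypothetical, and real TeX page breaking is far more involved.

```python
import math


def run_pass(toc_from_previous_run, sections, lines_per_page=40):
    """Simulate one pass: the TOC is typeset from the previous run's data
    and pushes every section down by the number of pages it occupies."""
    toc_pages = math.ceil(len(toc_from_previous_run) / lines_per_page)
    page = toc_pages + 1
    new_toc = []
    for title, length_in_pages in sections:
        new_toc.append((title, page))   # page number recorded for the next run
        page += length_in_pages
    return new_toc


def passes_until_stable(sections, max_passes=10):
    """Count passes until the TOC read at the start of a pass equals the
    TOC that the same pass writes out again."""
    toc = []
    for n in range(1, max_passes + 1):
        new_toc = run_pass(toc, sections)
        if new_toc == toc:
            return n
        toc = new_toc
    raise RuntimeError("page numbers did not stabilize")


sections = [("Intro", 2), ("Usage", 3)]
print(passes_until_stable(sections))
```

In this model the first pass records page numbers without any table of contents, the second pass prints those stale numbers while shifting everything by one page, and only the third pass prints numbers that match.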

@stsewd
Member

stsewd commented Sep 20, 2018

I am not sure I understand what you mean about the Makefile being error prone.

As far as I can remember, Makefile generation is optional; also, for Windows I think they generate a bat file, so some users may update one file and not the other. Users may not have a Makefile at all, etc.

@jfbu
Author

jfbu commented Sep 20, 2018

A user may indeed have a possibly broken custom Makefile related to using sphinx-build. But I am talking here about the other Makefile, which is created by the latex builder together with quite a few other files and ends up in the latex build directory. It is entirely of Sphinx's devising; it is not a user Makefile. I am not thinking about local builds by an RTD user, but only about the builds offered by RTD when hosting a user project's documentation. If these builds are hosted on Windows platforms, then yes, there is a problem, because Sphinx does not currently provide this for Windows. If the builds are hosted on Unixen, then we are talking here about a Makefile which is customized only via the user project options (like the choice of using xelatex or pdflatex). The Makefile is recreated by Sphinx on each clean "make latex". It has nothing to do with the "top" Makefile which dispatches to the various builders "html", "latex", "man", "epub", etc... Those Makefiles are often heavily customized by Sphinx user projects, but this is not the case for the LaTeX Makefile.

But: via latex_additional_files, a user can overwrite the produced Makefile with any Makefile of their choice/devising. You probably do not want to blindly execute such an unknown Makefile via make, as this may be a security risk? (I don't know how sandboxed the runs are.)

The idea here could rather be to force use of the Sphinx-generated Makefile. The issue is that it is a template with variables depending on how the user has configured LaTeX in conf.py.

However, this will use Latexmk, which itself depends on a configuration file (latexmkrc). This configuration file can also be overridden by the user via latex_additional_files in conf.py. AFAIK this can be used to replace pdflatex by anything. So again you probably want to only use a latexmkrc of your own.

Basically, you of course don't need this latex Makefile, since in the case of pdflatex builds it only issues latexmk -pdf -dvi- -ps- foo.tex, but inside the latexmkrc you have the configuration for using either makeindex or xindy. The way xindy is called depends on the language of the project, and for Cyrillic it has special extras. As of 1.8.0 the xindy rule has been converted to Perl, so this all becomes a bit complicated to unravel. Basically, for an English project, it would look like this if written manually on a Unix-like system:

$pdflatex = 'pdflatex %O %S';
$makeindex = '[ ! -s %S ] && : > %D || xindy -L english -C utf8  -M sphinx.xdy -M LICRlatin2utf8.xdy -I xelatex %O -o %D %S';

wow! (The initial bit is in case there is no .idx file, as xindy does not behave like makeindex in the case of an absent or empty .idx file.) Now, using xindy with English+pdflatex is not needed; I only wanted to illustrate how that could go. You can see here that additional xindy modules are needed; these are created by Sphinx 1.8, not by earlier versions, which did not support xindy.
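
For readers less used to shell one-liners, the guard in the `$makeindex` rule above can be spelled out as follows. This is a sketch only: `plan_index_step` is a hypothetical helper, it ignores the extra options that Latexmk substitutes for `%O`, and it returns the xindy command instead of running it.

```python
from pathlib import Path

# Options copied from the English example above.
XINDY = ["xindy", "-L", "english", "-C", "utf8", "-M", "sphinx.xdy",
         "-M", "LICRlatin2utf8.xdy", "-I", "xelatex"]


def plan_index_step(idx, ind):
    """Mirror `[ ! -s %S ] && : > %D || xindy ... -o %D %S`.

    idx plays the role of %S (the .idx source), ind the role of %D
    (the .ind destination).
    """
    idx, ind = Path(idx), Path(ind)
    if not idx.exists() or idx.stat().st_size == 0:
        # `[ ! -s %S ]` is true: create an empty destination (`: > %D`)
        # and do not call xindy at all.
        ind.write_text("")
        return None
    # Otherwise the command that would be executed.
    return XINDY + ["-o", str(ind), str(idx)]
```

Calling `plan_index_step("foo.idx", "foo.ind")` in a directory without `foo.idx` creates an empty `foo.ind` and returns `None`; with a non-empty `foo.idx` it returns the full xindy command line.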

I may now start looking very scary.

To recap:

  • currently pdflatex, makeindex runs are hard-coded in a certain way,

  • you could use rather Latexmk, by

    1. executing latexmk -pdf -dvi- -ps- foo.tex

    2. having set-up a latexmkrc file like this

      $pdflatex = 'pdflatex %O %S';
      $makeindex = 'makeindex -s python.ist %O -o %D %S';
      

This will correctly handle Sphinx projects using pdflatex and not using xindy (and not having extensions managing glossaries). It will not correctly handle projects that want xelatex and/or xindy... (or Japanese documents, which use platex + dvipdfmx). But it is no worse than the current hard-coded pdflatex runs.
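
From a Python build runner, the two recap steps might look like the sketch below. It is an illustration under assumptions, not RTD's implementation: `prepare_latexmk` is a hypothetical helper, and it deliberately replaces any latexmkrc already present in the build directory with the minimal pdflatex + makeindex one from the recap.

```python
from pathlib import Path

# The two-line configuration from the recap above.
LATEXMKRC = (
    "$pdflatex = 'pdflatex %O %S';\n"
    "$makeindex = 'makeindex -s python.ist %O -o %D %S';\n"
)


def prepare_latexmk(build_dir, tex_name):
    """Write the minimal latexmkrc into the latex build directory and
    return the command plus the directory it has to be run from."""
    build_dir = Path(build_dir)
    (build_dir / "latexmkrc").write_text(LATEXMKRC)
    command = ["latexmk", "-pdf", "-dvi-", "-ps-", tex_name]
    return command, build_dir


# Running it needs latexmk and a TeX installation, for example:
#   import subprocess
#   command, cwd = prepare_latexmk("_build/latex", "foo.tex")
#   subprocess.run(command, cwd=cwd, check=True)
```

The command has to run inside the latex build directory, since Latexmk picks up the latexmkrc from the current directory and python.ist lives there too.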

I hope I did not get too scary before that conclusion...

@skirpichev
Contributor

As far as I can remember, Makefile generation is optional

@stsewd, it's not. Using the Makefile is the "official" way of post-processing latex output (to PDF or something else). There is no Sphinx option to turn this off.

Either you have to re-implement all this functionality, or PDF output on rtfd will be a mostly useless feature, as it is now.

Btw, using Makefile will also solve #1556.

@humitos
Member

humitos commented Jan 28, 2019

@jfbu thanks for your detailed explanation!

I'd like to know if there is a way to not use the Makefile generated by Sphinx and support all the PDF builders (pdflatex, xindy, xelatex, etc) as if we were calling make with the Makefile generated by Sphinx.

Is it possible using latexmk and changing the options from latexmkrc?

@jfbu
Author

jfbu commented Jan 28, 2019

Is it possible using latexmk and changing the options from latexmkrc?

@humitos Leaving aside the case of Japanese documents (platex + dvipdfmx and extractbb on image files) the answer is almost yes.

  • Basically you want to issue latexmk -pdf -dvi- -ps- <filename>.tex

  • the latexmkrc uses environment variable LATEXOPTS, but it is configured in Makefile to default to empty; just do the same. Or set it to -interaction=batchmode for example.

  • if the Sphinx project was configured to use Xindy, then the Makefile configures XINDYOPTS and the latter is needed for the latexmkrc. You could grep the Makefile for the XINDYOPTS lines (which are consecutive), and set it up yourself accordingly in the environment before calling latexmk -pdf -dvi- -ps-. This is crucial for correct functioning of Xindy.

Notice that Xindy is default for xelatex/lualatex. They are broken without it for indices.

Don't fiddle with the latexmkrc itself as configured by the Sphinx LaTeX builder. And of course you must do this in the latex build directory with all its auxiliary files as set up by Sphinx. You cannot do it with the tex file alone copied to some other directory.

If using a Latexmk from January 2017 or newer (4.52b), you can add the -xelatex option to the latexmk call; this will speed up xelatex builds with many graphics inclusions. (Not tested recently; I hope it is still true.)

You may also consider -silent option of Latexmk for minimal console output.
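
The "grep the Makefile for the XINDYOPTS lines" step could be sketched like this. The exact shape of those lines is an assumption here (`export XINDYOPTS = ...` followed by `XINDYOPTS += ...`), the sample Makefile text is invented for illustration, and both function names are hypothetical; check the Makefile your Sphinx version generates before relying on it.

```python
import os
import re

# Assumed line shapes: "export XINDYOPTS = ..." and "XINDYOPTS += ...".
_XINDYOPTS_LINE = re.compile(r"^(?:export\s+)?XINDYOPTS\s*\+?=\s*(.*)$")

# Invented sample, not copied from a real Sphinx-generated Makefile.
SAMPLE_MAKEFILE = """\
export LATEXOPTS ?=
export XINDYOPTS = -L english -C utf8 -M sphinx.xdy
XINDYOPTS += -M LICRlatin2utf8.xdy
XINDYOPTS += -I xelatex
"""


def xindyopts_from_makefile(makefile_text):
    """Join the values of all consecutive XINDYOPTS assignments."""
    parts = []
    for line in makefile_text.splitlines():
        match = _XINDYOPTS_LINE.match(line)
        if match and match.group(1).strip():
            parts.append(match.group(1).strip())
    return " ".join(parts)


def latexmk_env(makefile_text, latexopts="", base_env=None):
    """Build the environment for `latexmk -pdf -dvi- -ps- <filename>.tex`."""
    env = dict(os.environ if base_env is None else base_env)
    env["LATEXOPTS"] = latexopts          # the Makefile defaults it to empty
    xindyopts = xindyopts_from_makefile(makefile_text)
    if xindyopts:                         # only set when the project uses xindy
        env["XINDYOPTS"] = xindyopts
    return env


print(latexmk_env(SAMPLE_MAKEFILE, "-interaction=batchmode", base_env={}))
```

The resulting dictionary would be passed as `env=` to the process that runs latexmk inside the latex build directory, leaving the Sphinx-generated latexmkrc untouched.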

@humitos
Member

humitos commented Mar 7, 2019

I tried a little more today to build the Chinese version of requests with latexmk inside our Docker image locally (outside all of the Read the Docs steps) and I wasn't able to make it work.

I was trying to do this, to see if we can start allowing some experimental testing with this under a Feature flag in production. Although, I don't want to enable anything like this unless we have a few projects tested and we know that this is working.

@jfbu do you have in mind any project that I can use for testing this that "just builds" without doing anything strange/weird or any kind of customized modification? It would be good to have a couple of them for these different cases that we were mentioning: German, Chinese, Japanese, and other non-ASCII languages.

humitos added a commit that referenced this issue Mar 12, 2019
New versions of Sphinx use `latexmk` to build the PDF files. This
command uses a file called `latexmkrc` (or `latexmkjarc` for Japanese)
which contains all the proper commands that needs to be ran depending
on different Sphinx configurations.

`latexmk` will take care by itself on the amount of phases that need
to be ran without us worrying about it.

Currently, this is not considering LATEXMKOPTS and XINDYOPTS
environment variables configured by Sphinx.

This feature is implemented under a Feature flag so we can test it
easily without breaking other working projects.

References:

- #1556
- #4454
- #5405
- toppers/tecs-docs#7
- https://github.com/sphinx-doc/sphinx/blob/master/sphinx/texinputs/Makefile_t
- https://www.sphinx-doc.org/en/master/usage/builders/index.html#sphinx.builders.latex.LaTeXBuilder
humitos added a commit that referenced this issue Mar 13, 2019