Build PDF files using latexmk #5437

humitos · 2019-03-12T17:33:57Z

This PR makes a Japanese project build properly. It would be awesome to have more projects as examples to test here. That said, since this code is behind a feature flag, we can merge it without worrying too much for now about making it work perfectly.

I'm referencing these issues/PRs to be closed here since most of the requests are covered in this PR. Some things may still be missing, like supporting xindy or using environment variables like XINDYOPTS.

Closes #1556
Closes #4454
Closes #5405

New versions of Sphinx use `latexmk` to build the PDF files. This command uses a file called `latexmkrc` (or `latexmkjarc` for Japanese) which contains the commands that need to be run for the different Sphinx configurations. `latexmk` works out by itself how many passes need to be run, so we don't have to worry about it.

Currently, this does not take into account the LATEXMKOPTS and XINDYOPTS environment variables configured by Sphinx.

This feature is implemented under a feature flag so we can test it easily without breaking other working projects.

References:

- #1556
- #4454
- #5405
- toppers/tecs-docs#7
- https://github.com/sphinx-doc/sphinx/blob/master/sphinx/texinputs/Makefile_t
- https://www.sphinx-doc.org/en/master/usage/builders/index.html#sphinx.builders.latex.LaTeXBuilder
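As a rough illustration of the rc-file choice described above, the sketch below picks `latexmkjarc` for Japanese projects and `latexmkrc` for everything else. The helper name `select_rcfile` and its arguments are hypothetical and are not taken from the PR's code.

```python
from pathlib import Path


def select_rcfile(latex_dir: Path, language: str) -> Path:
    """Pick the latexmk rc file for a project (hypothetical helper)."""
    # Assumption taken from the description above: Japanese projects get
    # ``latexmkjarc``; every other language gets ``latexmkrc``.
    name = 'latexmkjarc' if language == 'ja' else 'latexmkrc'
    return latex_dir / name
```

The returned path is what would be passed to `latexmk` through its `-r` option.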

stsewd · 2019-03-13T17:35:02Z

readthedocs/doc_builder/backends/sphinx.py

+        # ``latex_engine`` is ``platex``
+        pdfs = []
+        if self.project.language == 'ja':
+            pdfs = latex_path.glob('*.pdf')


Why is this only required by ja lang? Don't we already run this whole function under a feature flag?

Japanese is the only language that requires this extra step. I don't know exactly why, but most of the documentation that I read treats this language differently from the others. I suppose it's because it mixes kanji (Chinese characters) with its own symbols.

I took this step from the Makefile generated by Sphinx when the language is Japanese.

The above github comment should be a code comment.

The Japanese language uses latex+dvipdfmx path. Until TeXLive 2015 release, extractbb step was needed. But it is not needed with newer dvipdfmx binary. Here is a quote of an exchange I had in 2017 on TeXLive mailing list:

Can someone confirm that since TL2014 there is no need
on Unix-like systems to prepare beforehand .xbb
files for graphics inclusion of .png, .jpeg, ...,
when doing latex+dvipdfmx or platex+dvipdfmx ?
(and that it was either needed before, or at least
shell-escape had to be enabled) ?

I remember xbb files were necessary for TL2014 or earlier.
Japanese people has been heavily using dvipdfmx, so they added
extractbb to their local texmf.cnf by hand

...

Thanks for the information ! This is for support
of some other software, now I know they can avoid
the explicit extractbb calls but under assumption
of a TL2015 or more recent install,

Thus Sphinx may soon drop the extractbb step and make the Japanese set-up in the Makefile less singular: indeed, the 2.0 release requires "LaTeX builder now depends on TeX Live 2015 or above." but we forgot to actually remove the extractbb dependency, as we also don't want to break too easily user projects with older TeX installations.

I have prepared sphinx-doc/sphinx#6189 for Sphinx to remove the extra handling of image files for Japanese language by extractbb as it is superfluous for TL2015 or later based TeX distribution. Not sure it will be merged during 2.x series.

stsewd

This feature flag should be documented so users that need it can request it from us

stsewd · 2019-03-13T17:35:37Z

readthedocs/doc_builder/backends/sphinx.py

+        if self.project.language == 'ja':
+            pdfs = latex_path.glob('*.pdf')
+
+        for image in itertools.chain(images, pdfs):


image and pdfs are not related I guess, but not sure about this step

images may be provided in pdf format: in fact this is the preferred format for image inclusion

From sphinx doc here is the list of file extensions for images in order of preference for LaTeX:

supported_image_types = ['application/pdf', 'image/png', 'image/jpeg']
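To make the point about chaining `images` and `pdfs` concrete, here is a minimal sketch of how the list of files handed to `extractbb` could be collected. The function name and the set of raster extensions are assumptions for illustration; the diff above only shows that PDFs are added for Japanese and that both collections are chained.

```python
import itertools
from pathlib import Path

# Assumption: these raster extensions are illustrative; the diff only
# shows that ``images`` and ``pdfs`` are chained together.
IMAGE_EXTENSIONS = ('png', 'jpg', 'jpeg')


def files_for_extractbb(latex_path: Path, language: str) -> list:
    """Return the file names that the ``extractbb`` step would visit."""
    images = []
    for extension in IMAGE_EXTENSIONS:
        images.extend(latex_path.glob(f'*.{extension}'))

    # PDFs can be images too, so they are included for Japanese projects.
    pdfs = []
    if language == 'ja':
        pdfs = latex_path.glob('*.pdf')

    return sorted(path.name for path in itertools.chain(images, pdfs))
```

`itertools.chain` is used only so that both collections can be walked in one loop; it does not imply that the two kinds of file are treated differently.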

stsewd · 2019-03-13T17:37:19Z

readthedocs/doc_builder/backends/sphinx.py

+            '-r',
+            rcfile,
+
+            # FIXME: check for platex here as well


I'm confused about platex and latexmk

latexmk internally uses pdflatex, xelatex, lualatex, or platex (as defined by `latex_engine`).
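The relationship between the engine and `latexmk` can be sketched as a small mapping. This is an illustration modeled on my understanding of the Makefile that Sphinx generates (`-pdfdvi` for the Japanese DVI route, `-pdf` otherwise), not the code in this PR; the function name is hypothetical.

```python
def latexmk_mode_flag(latex_engine: str) -> str:
    """Map a Sphinx ``latex_engine`` value to a latexmk output flag."""
    # platex produces a DVI file that is converted to PDF afterwards, so
    # latexmk is asked for the DVI-to-PDF route; the other engines write
    # the PDF directly.
    if latex_engine == 'platex':
        return '-pdfdvi'
    return '-pdf'
```

Which binary `latexmk` actually invokes is then decided by the rc file that Sphinx writes, not by this flag.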

stsewd · 2019-03-13T17:40:12Z

readthedocs/doc_builder/backends/sphinx.py

+            cwd=latex_cwd,
+        )
+
+        self.pdf_file_name = f'{self.project.slug}.pdf'


Where did this come from? From the -jobname option?
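If the file name does come from `-jobname`, as the question suggests, the link can be sketched like this: `latexmk` names its output files after the job name, so the expected PDF name follows from the project slug. The function and its argument list are hypothetical and deliberately shorter than a real invocation.

```python
def build_latexmk_command(rcfile: str, slug: str):
    """Return a latexmk argument list and the PDF name it should produce."""
    cmd = [
        'latexmk',
        '-r', rcfile,
        f'-jobname={slug}',
    ]
    # latexmk uses the job name as the base name of its output files.
    return cmd, f'{slug}.pdf'
```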

humitos · 2019-03-13T17:43:33Z

> This feature flag should be documented so users that need it can request it from us

Yes. I don't want to expose this yet since we don't have many working examples. However, I plan to write a guide in our docs explaining the steps users need to follow to build PDFs for these languages.

humitos · 2019-03-13T18:45:26Z

> It would be awesome to have more projects as examples to test here

I tested some projects today and was able to build most of them. Some did not build because of other problems unrelated to this PR (a missing PNG file, an invalid reST table in the docs, or similar).

humitos · 2019-03-18T10:58:31Z

> This feature flag should be documented so users that need it can request it from us

I added this since I feel more comfortable with this code after some testing, and also because I added a default configuration in our conf.py.tmpl that does not need user intervention.

ericholscher

Seems reasonable. Is the goal to turn this on for all Chinese & Japanese languages by default after we test it behind the feature flag?

Does this require additional ops or docker changes, to ensure we have the proper binaries installed?

ericholscher · 2019-03-18T12:59:54Z

readthedocs/doc_builder/backends/sphinx.py

+        # ``latex_engine`` is ``platex``
+        pdfs = []
+        if self.project.language == 'ja':
+            pdfs = latex_path.glob('*.pdf')


The above github comment should be a code comment.

ericholscher · 2019-03-18T13:00:41Z

readthedocs/doc_builder/backends/sphinx.py

+            'cat',
+            rcfile,
+            cwd=latex_cwd,
+        )


Is this to output to the user, or some other reason?

Yes. This is only to show the content of the latexmkrc file to the user.

I want to show this because that file is built by Sphinx depending on some configuration options. It is also the file that determines which command is executed to build the PDF in the end. So, without this output it would be very hard to debug a problem.
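A minimal sketch of the same idea, assuming the build log is just a list of text lines: it reads the rc file directly instead of shelling out to `cat`, and the helper name and log format are hypothetical.

```python
from pathlib import Path


def rcfile_log_lines(rcfile: Path) -> list:
    """Return the lines a build log would show for ``cat <rcfile>``."""
    header = f'$ cat {rcfile.name}'
    body = rcfile.read_text(encoding='utf-8').splitlines()
    return [header, *body]
```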

humitos · 2019-03-18T15:26:46Z

> Seems reasonable. Is the goal to turn this on for all Chinese & Japanese languages by default after we test it behind the feature flag?

I'd say yes. Not only for Chinese & Japanese, though, but for all languages. This is the default way of building PDFs in Sphinx >=1.6 and seems to be more robust than the one we are using. It also allows customization using common Sphinx configuration options.

> Does this require additional ops or docker changes, to ensure we have the proper binaries installed?

No. It shouldn't.

jfbu

I have provided some comments for background LaTeX info... it looks great that you merged this, but as I said I can not help much on Chinese related matters.

jfbu · 2019-03-18T19:40:54Z

readthedocs/doc_builder/templates/doc_builder/conf.py.tmpl

+    project_language in ('zh_CN', 'zh_TW'),
+])
+
+japanase = any([


is the spelling japanase intended here?

Yes. It's a typo!

jfbu · 2019-03-18T19:49:40Z

readthedocs/doc_builder/templates/doc_builder/conf.py.tmpl

+        'preamble': '\\usepackage[UTF8]{ctex}\n',
+    }
+    latex_elements = latex_elements_user or latex_elements_rtd
+elif japanase:


cf my prior comment about spelling

jfbu · 2019-03-18T19:49:41Z

readthedocs/doc_builder/templates/doc_builder/conf.py.tmpl

+    latex_use_xindy = False
+
+    latex_elements_rtd = {
+        'preamble': '\\usepackage[UTF8]{ctex}\n',


sadly, I am not at all competent in Chinese. I don't know how Chinese users of LaTeX go about producing indices... Xindy does not seem to have any special support for CJK languages, but at least it is Unicode aware, unlike makeindex. There is zhmakeindex, but its usage has not been incorporated into Sphinx, and I am not aware of PRs about this.

The Japanese language is handled separately by Sphinx because it uses a specific LaTeX binary, platex and a specific indexing binary mendex. But they are not Unicode aware, so mixing Japanese with European languages is not easy/feasible apart from English as it only needs ascii range. In future Sphinx will switch presumably to uplatex+upmendex for Japanese support (see e.g. sphinx-doc/sphinx#4187 (comment)).

jfbu · 2019-03-18T19:55:52Z

readthedocs/doc_builder/backends/sphinx.py

+        if self.project.language == 'ja':
+            pdfs = latex_path.glob('*.pdf')
+
+        for image in itertools.chain(images, pdfs):


images may be provided in pdf format: in fact this is the preferred format for image inclusion

From sphinx doc here is the list of file extensions for images in order of preference for LaTeX:

supported_image_types = ['application/pdf', 'image/png', 'image/jpeg']

jfbu · 2019-03-18T20:05:25Z

readthedocs/doc_builder/backends/sphinx.py

+        # ``latex_engine`` is ``platex``
+        pdfs = []
+        if self.project.language == 'ja':
+            pdfs = latex_path.glob('*.pdf')


The Japanese language uses latex+dvipdfmx path. Until TeXLive 2015 release, extractbb step was needed. But it is not needed with newer dvipdfmx binary. Here is a quote of an exchange I had in 2017 on TeXLive mailing list:

Can someone confirm that since TL2014 there is no need
on Unix-like systems to prepare beforehand .xbb
files for graphics inclusion of .png, .jpeg, ...,
when doing latex+dvipdfmx or platex+dvipdfmx ?
(and that it was either needed before, or at least
shell-escape had to be enabled) ?

I remember xbb files were necessary for TL2014 or earlier.
Japanese people has been heavily using dvipdfmx, so they added
extractbb to their local texmf.cnf by hand

...

Thanks for the information ! This is for support
of some other software, now I know they can avoid
the explicit extractbb calls but under assumption
of a TL2015 or more recent install,

Thus Sphinx may soon drop the extractbb step and make the Japanese set-up in the Makefile less singular: indeed, the 2.0 release requires "LaTeX builder now depends on TeX Live 2015 or above." but we forgot to actually remove the extractbb dependency, as we also don't want to break too easily user projects with older TeX installations.

humitos · 2019-03-19T09:11:32Z

@jfbu thanks for your feedback on this!

I don't have much experience with LaTeX or with building languages other than Spanish or English. So, I decided to follow what the current stable Sphinx version does instead of proposing "something better/polished", to avoid introducing new bugs on top of the ones that Sphinx already has.

Once Sphinx 2.x is released, we could refactor these steps and make them work better but for now I prefer to keep it stable.

From my local tests I can say that I was able to build PDFs in Chinese, Japanese and English without problems. So, this is way better than what we had.

Let's see how it goes in production with real projects and we can iterate from there.

ReadTheDocs switched from calling pdflatex directly to using latexmk. This change broke the previous workaround that only the last-built PDF file was made available on RTD. Use a more explicit approach instead to fix that. Changed upstream in readthedocs/readthedocs.org#5437

humitos force-pushed the humitos/build-pdf-latexmk branch from 02d2888 to 1c444e2 Compare March 12, 2019 17:35

humitos requested a review from a team March 12, 2019 17:35

stsewd reviewed Mar 13, 2019

Show latexrc file on build output

24ba762

humitos force-pushed the humitos/build-pdf-latexmk branch from 1c444e2 to 24ba762 Compare March 13, 2019 17:54

humitos mentioned this pull request Mar 13, 2019

Guide to build PDF for non-ASCII language #5453

Merged

Return the proper successful command

c2d68e5

humitos mentioned this pull request Mar 18, 2019

Feature flag to use other pdf latex binaries #5405

Closed

humitos added 2 commits March 18, 2019 11:53

Append default Sphinx configurations for Chinese and Japanaese

36ed646

Automatically select `xelatex` for Chinese and `platex` for Japanese. These defaults can be overridden by the user. Force `latex_use_xindy=False` for now until we support it in our Docker image.
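The defaults described in this commit message can be sketched as a plain function. This is a simplified illustration, not the contents of `conf.py.tmpl`: the function name is hypothetical, only the language-code check is modeled, and applying `latex_use_xindy = False` to every project is an assumption.

```python
CHINESE = ('zh_CN', 'zh_TW')


def default_latex_settings(language, user_engine=None, user_elements=None):
    """Return LaTeX-related Sphinx settings for a project language."""
    # Assumption: xindy is disabled for every project for now.
    settings = {'latex_use_xindy': False}
    if language in CHINESE:
        # User-provided values win over the defaults.
        settings['latex_engine'] = user_engine or 'xelatex'
        rtd_elements = {'preamble': '\\usepackage[UTF8]{ctex}\n'}
        settings['latex_elements'] = user_elements or rtd_elements
    elif language == 'ja':
        settings['latex_engine'] = user_engine or 'platex'
    return settings
```

For example, `default_latex_settings('zh_TW', user_engine='lualatex')` keeps the user's engine while still supplying the `ctex` preamble.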

Document USE_PDF_LATEXMK feature flag

98f0f55

humitos mentioned this pull request Mar 18, 2019

Support xindy and XINDYOPTS environment variable #5476

Open

humitos requested a review from a team March 18, 2019 11:13

ericholscher approved these changes Mar 18, 2019

Add comment explaining one of the build steps

d9ccf27

ericholscher merged commit 11a674f into master Mar 18, 2019

delete-merged-branch bot deleted the humitos/build-pdf-latexmk branch March 18, 2019 18:55

jfbu reviewed Mar 18, 2019

skirpichev mentioned this pull request Mar 18, 2019

Xelatex for PDF generation #1556

Closed

humitos mentioned this pull request Mar 19, 2019

Typo on conf.py.tmpl #5495

Merged

davidfischer mentioned this pull request Mar 29, 2019

Build failed without any reason readthedocs/sphinx_rtd_theme#740

Closed
