LaTeX: even with smartquotes off, PDF output transforms straight quotes and consecutive hyphens into curly quotes and dashes #6890

Closed
jfbu opened this issue Dec 5, 2019 · 3 comments

Comments

@jfbu
Contributor

jfbu commented Dec 5, 2019

With this input

'test' "test" test--test---test...

and smartquotes = False in conf.py, on doing make latexpdf, the PDF output will contain

’test’ "test" test–test—test...

i.e. the straight single quote is converted to a right curly quote, the double quote is left intact, the hyphens become an en-dash and an em-dash, and the three dots are fine: they do not turn into an ellipsis.

Using xelatex the PDF is slightly different

’test’ ”test” test–test—test...

i.e. the double quote is also curly (a right curly double quote). This is already reported in #6886.
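For reference, a minimal conf.py sketch of the setup described above. Only smartquotes = False comes from this report; the project name is a placeholder, and latex_engine is shown only as the switch between the pdflatex and xelatex cases.

```python
# conf.py: minimal sketch to reproduce the report.
# The project name is a placeholder, not taken from the report.
project = "quotes-demo"

# Turn off Smart Quotes, as in the report.
smartquotes = False

# Use "pdflatex" for the first output above, "xelatex" for the second.
latex_engine = "pdflatex"
```

With the one-line input from the report in the master document, make latexpdf should then show the transformations listed above.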

Screenshots

using pdflatex

Screenshot from 2019-12-05 14-54-56

using xelatex

Screenshot from 2019-12-05 14-54-03

using platex (Japanese)

Screenshot from 2019-12-05 15-10-37

Environment info

  • Python version: 3.7.5
  • Sphinx version: tested with 1.8.5 and 2.x, applies to all versions I believe (prior to 1.6.6 there was no smartquotes toggle)
  • Sphinx extensions: N/A
  • Extra tools: TL2019. Regarding xelatex, the issue perhaps did not show with older fontspec: the user manual does not document when applying "TeX ligatures" became the default. It may also depend on the LaTeX release date, since some versions of Sphinx prior to 2.x left it to LaTeX to decide the default fonts for xelatex.

Additional context

I believe this problem has always been there. For example, it already applied with sphinx.util.texescape as of Sphinx 1.5, which has evolved in a backwards-compatible way since. For standard text, Sphinx applies some TeX escaping to prevent TeX ligatures, e.g. the transformation of >>, but it never did any specific TeX escaping of the straight quote ' or the hyphen - .

It is thus a long-standing bug regarding the single straight quote '. For consecutive hyphens, the transformation into dashes mimicked what the Sphinx html builder did with smarty pants on (the default), so it could be considered a feature. But it became a bug when the smart quotes setting also applied to LaTeX (I believe around 1.6.6): if smart quotes are off, the PDF output should match the original sources. This matters especially because ' always becomes a right curly quote and never a left one, which is bad except for elision in some languages (such as French).

With smartquotes = False, the source input is found unchanged in the LaTeX file; the problem is caused by the conversion from the LaTeX file to PDF via pdflatex/xelatex/etc.

Code-blocks and inline literals do not have this problem.

For the ", the pdflatex PDF output looks ok; this is due to \usepackage[T1]{fontenc}. If this line is removed (which can make sense for purely English documents) by redefining latex_elements['fontenc'], then LaTeX uses the so-called OT1 encoding and the " will also become (right) curly in the PDF.
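A conf.py fragment sketching that redefinition. It assumes that setting the key to an empty string is how the T1 line gets dropped; the report does not give the exact value used.

```python
# conf.py fragment: sketch only, the empty value is an assumption.
smartquotes = False

latex_elements = {
    # An empty string leaves out the \usepackage[T1]{fontenc} line,
    # so pdflatex falls back to the OT1 encoding.
    "fontenc": "",
}
```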

The pdflatex case requires another approach, i.e. a better sphinx.util.texescape.tex_replacements.
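As an illustration of what such a table could do, here is a standalone sketch that escapes the three characters discussed in this issue. It does not import Sphinx, and the names ESCAPES and escape_text are hypothetical. \textquotesingle and \textasciigrave are standard LaTeX text commands; \sphinxhyphen is the macro named in the commit message further down. Whether the real table uses exactly these replacements is an assumption.

```python
# Standalone sketch; not Sphinx's actual tex_replacements table.
# ESCAPES and escape_text are hypothetical names for illustration.
ESCAPES = {
    "'": r"\textquotesingle{}",  # keep the straight single quote straight
    "`": r"\textasciigrave{}",   # keep the backtick from becoming a left curly quote
    "-": r"\sphinxhyphen{}",     # keep -- and --- from forming dash ligatures
}

# str.maketrans accepts a dict of single characters mapped to replacement strings.
_TABLE = str.maketrans(ESCAPES)


def escape_text(text: str) -> str:
    """Escape quote, backtick and hyphen in running text (not in URLs or ids)."""
    return text.translate(_TABLE)


print(escape_text("'test' test--test"))
```

The trailing {} after each macro ends the control word, so a following letter or space is not swallowed by TeX.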

@jfbu jfbu added this to the 2.3.0 milestone Dec 5, 2019
@jfbu
Contributor Author

jfbu commented Dec 5, 2019

The left tick (backtick) is also involved. Source such as

 \`test\` \`\`test\`\`

currently gives, in the LaTeX file produced by make latex,

{}`test{}` {}`{}`test{}`{}`

which in pdflatex PDF output with smartquotes = False produces

‘test‘ ‘‘test‘‘

i.e. single left curly quotes. The escaping only serves to avoid a double left curly quote via a ligature. And it does not even work for the lualatex engine, which outputs in the PDF:

‘test‘ “test“

(xelatex does inhibit the ligature)
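The escaping shown above can be mimicked in a few lines. This is only a sketch that reproduces the LaTeX output quoted in this comment, not necessarily how Sphinx implements it, and the function name is hypothetical.

```python
def break_backtick_ligature(text: str) -> str:
    """Prefix every backtick with an empty TeX group {}.

    The empty group separates two consecutive backticks, which is meant
    to stop them from forming the double left quote ligature. Each single
    backtick is still typeset as a left curly quote.
    """
    return text.replace("`", "{}`")


print(break_backtick_ligature("`test` ``test``"))
# prints: {}`test{}` {}`{}`test{}`{}`
```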

@jfbu
Contributor Author

jfbu commented Dec 5, 2019

> And this does not even work for lualatex engine which outputs in PDF:
>
> ‘test‘ “test“
>
> (xelatex does inhibit the ligature)

Ah sorry, no: with Sphinx, lualatex does not produce the ligature, thanks to #5790. I had tested a direct lualatex build on a tex file. Actually, the lualatex fix from #5790 is extended to xelatex in #6888, so for those two engines there is already a complete fix.

jfbu added a commit to jfbu/sphinx that referenced this issue Dec 5, 2019
Refs: sphinx-doc#6890

But I had to leave out the hyphen because replacing it would break URLs
containing it, e.g. the ones generated by the pep role.
jfbu added a commit to jfbu/sphinx that referenced this issue Dec 15, 2019
Refs: sphinx-doc#6890

The comma character is not TeX-escaped because it is frequent in general
text and escaping it would make the LaTeX output larger only to deal
with the LaTeX ligature of ,, into a single character.
There is also a problem with the commas in options to Verbatim from
PygmentsBridge.

The hyphen character is escaped (not in ids and URIs!) to
\sphinxhyphen{} for both Unicode and non-Unicode engines. This is needed
to work around hyperref transforming -- and --- from section titles into
EN DASH resp. EM DASH in PDF bookmarks.

latex3/hyperref#112

Note to expert LaTeX users: if a Sphinx LaTeX user with xelatex has

- turned off Smart Quotes for some reason,

- but does want TeX ligatures and thus overrode Sphinx
latex_elements['fontenc'] default (since sphinx-doc#6888) to this effect,

then this should be added to LaTeX preamble:

    \def\sphinxhyphen#1{-}% (\protected is now not needed)
    \let\sphinxhyphenforbookmarks\sphinxhyphen
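One way to carry these two lines from conf.py is the 'preamble' key of latex_elements, sketched below. The override of latex_elements['fontenc'] mentioned in the note is deliberately left out, because its value is not given here.

```python
# conf.py fragment: sketch of adding the two lines above to the LaTeX preamble.
# The 'fontenc' override mentioned in the note is not shown (value not given).
smartquotes = False
latex_engine = "xelatex"

latex_elements = {
    # Raw string so that the backslashes reach LaTeX unchanged.
    "preamble": r"""
\def\sphinxhyphen#1{-}% (\protected is now not needed)
\let\sphinxhyphenforbookmarks\sphinxhyphen
""",
}
```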
@tk0miya
Member

tk0miya commented Dec 15, 2019

#6891 is merged now. closing.

@tk0miya tk0miya closed this as completed Dec 15, 2019
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 30, 2021