Destination point not set for named links in toc #2088

Closed
blenzi opened this issue Nov 28, 2022 · 8 comments

blenzi commented Nov 28, 2022

Please provide all mandatory information!

Describe the bug (mandatory)

If a PDF outline (TOC) contains named links, the destination point ("to") is not set.

To Reproduce (mandatory)

Compile with pdflatex, or paste the URL below into your browser, to get the PDF (rename it to named_outline.pdf for the steps that follow):

latexonline.cc/compile?url=https://gist.githubusercontent.com/blenzi/5d4338d4b384fd0d1495a6341b534eb5/raw/0bb0da6deb3d74a1e67aabf8f7a61d5ab8e5b438/named_outline.tex

Open the pdf and get the toc:

>>> import fitz
>>> doc = fitz.open("named_outline.pdf")
>>> doc.get_toc(simple=False)[0][3]
{'kind': 4, 'xref': 4, 'name': 'page=1&zoom=nan,133.768,229.479', 'collapse': True, 'zoom': 0.0}

To fix it, it is probably enough (although I am not sure how safe it is) to get the last two items of "name" here:

pnt.x, pnt.y = map(float, ln.dest.named.split(",")[-2:])
nl["to"] = pnt
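
The idea above can be sketched as a standalone helper. This is only a minimal sketch, assuming the name string always has the shape shown in the output above (`page=N&zoom=Z,X,Y`). The function name `parse_named_dest` is hypothetical and not part of PyMuPDF, and mapping a `nan` zoom to 0.0 is an assumption.

```python
import math


def parse_named_dest(name):
    """Split a string like 'page=1&zoom=nan,133.768,229.479' into parts.

    Hypothetical helper, not part of PyMuPDF. Returns (page, zoom, x, y),
    or None if the string does not have the expected shape.
    """
    fields = dict(
        item.split("=", 1) for item in name.split("&") if "=" in item
    )
    if "page" not in fields or "zoom" not in fields:
        return None  # e.g. 'nameddest=section.1' carries no coordinates
    parts = fields["zoom"].split(",")
    if len(parts) != 3:
        return None
    try:
        page = int(fields["page"])
        zoom, x, y = (float(p) for p in parts)
    except ValueError:
        return None
    if math.isnan(zoom):
        zoom = 0.0  # assumption: treat an unset zoom as "no zoom"
    return page, zoom, x, y
```

Unlike taking the last two comma-separated items unconditionally, this returns None for strings that carry no coordinates, so the caller can decide what to do with them.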

Your configuration (mandatory)

  • Operating system, potentially version and bitness: MacOsX
  • Python version, bitness: 3.11.0
  • PyMuPDF version, installation method (wheel or generated from source): 1.21.0

For example, the output of print(sys.version, "\n", sys.platform, "\n", fitz.__doc__) would be sufficient (for the first two bullets).

@JorjMcKie
Collaborator

Thanks for reporting this.

The problem is an error in parsing the uri field in class linkDest. This class is responsible for providing all link properties, including the link kind.
The correct result should be LINK_GOTO, because internal symbolic names are always resolved to page number / point destinations.
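
To check whether a document is affected, one could scan the TOC for entries that are still reported as named links without a resolved point. This is a rough sketch, assuming rows in the shape returned by `doc.get_toc(simple=False)`. The helper name `unresolved_toc_entries` is hypothetical, and the two constants mirror `fitz.LINK_GOTO` and `fitz.LINK_NAMED` so the snippet runs without a PDF at hand.

```python
# Link kind values as defined by PyMuPDF (fitz.LINK_GOTO, fitz.LINK_NAMED);
# hard-coded here so the sketch does not need fitz installed.
LINK_GOTO = 1
LINK_NAMED = 4


def unresolved_toc_entries(toc):
    """Return TOC rows whose destination dict lacks a resolved point.

    `toc` is a list of rows [level, title, page, dest_dict].
    Hypothetical helper, not part of PyMuPDF.
    """
    bad = []
    for row in toc:
        dest = row[3]
        if dest.get("kind") == LINK_NAMED and "to" not in dest:
            bad.append(row)
    return bad
```

With a real file, this would be called as `unresolved_toc_entries(doc.get_toc(simple=False))`; an empty result suggests the names were resolved.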

@JorjMcKie
Collaborator

Have a fix. Problem will be corrected in next version.
Drop me a note if you would like to make a local patch.

@julian-smith-artifex-com
Collaborator

Fixed in PyMuPDF-1.21.1.

thomascerbelaud commented Dec 8, 2023

Hello,

I would like to reopen this issue, as I encountered the same problem as in the original post, this time in version 1.23.7. It is resolved when downgrading to 1.21.1, though, so I wondered whether a fix could be included in a future release of PyMuPDF.

Describe the bug (mandatory)

If a PDF outline (TOC) contains named links, the destination point ("to") is not set.

To Reproduce (mandatory)

Download the PDF file here, then run this code:

>>> import fitz
>>> doc = fitz.Document("2302.11382.pdf")  # name of the PDF file at download
>>> doc.get_toc(simple=False)[0][-1]
{'kind': 4, 'xref': 21, 'name': 'nameddest=section.1', 'zoom': 0.0}

Your configuration (mandatory)

  • Operating system, potentially version and bitness: WSL2 Ubuntu 22.04, under Windows 10
  • Python version, bitness: 3.9.18
  • PyMuPDF version, installation method (wheel or generated from source): 1.23.7, installed from pip

@JorjMcKie
Collaborator

@thomascerbelaud - you should find this issue resolved if you use the import statement import fitz_new as fitz.
This invokes the new PyMuPDF implementation, which contains full PDF name resolution. It is also used in TOC extraction.
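
For code that has to run against several PyMuPDF versions, the import suggested above could be wrapped in a fallback. This is a minimal sketch using only the standard library. The helper `import_first` is hypothetical, and whether `fitz_new` exists depends on the installed PyMuPDF version.

```python
import importlib


def import_first(candidates):
    """Import and return the first module in `candidates` that is installed."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"none of {candidates!r} could be imported")


# Assumed usage: prefer the rebased implementation when it is available.
# fitz = import_first(["fitz_new", "fitz"])
```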

@thomascerbelaud

It's working indeed, great thanks! A last question though: is it planned to release a version of fitz that does not include fitz_new? Should we always import fitz_new instead of fitz from now on?

@JorjMcKie
Collaborator

> It's working indeed, great thanks! A last question though: is it planned to release a version of fitz that does not include fitz_new? Should we always import fitz_new instead of fitz from now on?

No, this is a temporary situation. We are planning to swap versions classic <> rebased ("fitz_new") very soon. In that next phase, the current/classic version will still be available as import fitz_old as fitz.
In the final phase (several weeks out), the classic version will then be dropped.

@julian-smith-artifex-com
Collaborator

julian-smith-artifex-com commented Dec 8, 2023

Also see discussion #2680 "New 'rebased' implementation of PyMuPDF"
