Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade PDFJS #203

Open
wants to merge 8 commits into
base: master
Choose a base branch
from
Open

Conversation

carlinmack
Copy link
Contributor

@carlinmack carlinmack commented May 8, 2024

❤️ Thank you for your contribution!

Close #170
Close https://github.com/zenodo/ops/issues/438
Close https://github.com/zenodo/ops/issues/278

Description

Please describe briefly your pull request.

Upgrade PDF JS from 1.x to 3.x. This corrects the issue with displaying characters incorrectly and self printing PDFs

This is a large PR so here is how the files are structured:

  • the JavaScript now lives in /static/js/pdfjs/ and is a subset of the the files from https://www.npmjs.com/package/pdfjs-dist/v/3.11.174
  • the CSS lives in assets/semantic-ui/css/viewer.css
  • the HTML lives in templates/semantic-ui/invenio_previewer/pdfjs.html
  • images have been moved from:
    • /assets/semantic-ui/js/invenio_previewer/pdfjs/web/images
    • /static/js/pdfjs/web/images/
      to:
    • invenio-previewer/invenio_previewer/assets/semantic-ui/css/invenio_previewer/pdfjs/images
  • locale data has been moved from:
    • /assets/semantic-ui/js/invenio_previewer/pdfjs/web/locale
    • /static/js/pdfjs/web/locale/
      to:
    • /static/js/pdfjs/
  • cmaps/ and standard_fonts/ have been added to static/js/pdfjs

The changes I have made are:

  • the HTML needs to be changed by us to hide buttons, add translation strings etc
  • same with the CSS for retheming
  • I have changed the JavaScript minorly and added CHANGED comments where I have made changes (for some reason the PDF url is hardcoded by default)

The files to be reviewed are:

image

Before and after (demonstrating that the character rendering issues have been fixed)

image

To do:

  • They recommend re-theming, so we should make it blue like before
  • Disable self printing PDFs
    • seems to be disabled by default
  • Remove unwanted functionality like editing
  • Add fullscreen functionality
  • Translation strings need to be re-added
  • Remove JS and script calls from pdfjs.html

Shelve:

  • Would be a lot better if this used code from the npm module directly
    • I have tried but getting the same problems as before
  • Try upgrade to v4
    • I haven't tried but I think it's not worth delaying the PR over
  • Writing documentation or notes about how this can be upgraded
    • If someone can suggest where, I will write up the situation

Checklist

Ticks in all boxes and 🟢 on all GitHub actions status checks are required to merge:

Frontend

Reminder

By using GitHub, you have already agreed to the GitHub’s Terms of Service including that:

  1. You license your contribution under the same terms as the current repository’s license.
  2. You agree that you have the right to license your contribution under the current repository’s license.

@carlinmack carlinmack marked this pull request as ready for review May 10, 2024 14:15
@carlinmack carlinmack changed the title init: upgrade pdfjs to 3.11.174 Upgrade PDFJS May 10, 2024
@ntarocco
Copy link
Contributor

ntarocco commented May 16, 2024

Thanks a lot for the work, this is not an easy one :)

About your comments:

Would be a lot better if this used code from the npm module directly: I have tried but getting the same problems as before

I think we should really explore this option, to avoid adding all the built file in the repo. I guess you have already tried it, but what was the issue when doing the following?

  1. in the webpack.py, include all the JS/CSS that you need, example:
"pdfjs_worker_min_js": "./node_modules/pdfjs-dist/build/pdf.worker.min.mjs",
"pdfjs_min_js": "./node_modules/pdfjs-dist/build/pdf.min.mjs",
  1. include them when rendering:
def preview(file):
    ...
    return render_template(
        "invenio_previewer/pdfjs.html",
        file=file,
        js_bundles=current_previewer.js_bundles + ["pdfjs_min_js.js", "pdfjs_worker_min_js.js"],
        css_bundles=...,
    )

This should expose a pdfjsLib JS object globally, in window.
3. In your JS, use it:

const document = pdfjsLib.getDocument(pdfUrl);
...

This should avoid building PDF.js with webpack.

And about:

Try upgrade to v4: I haven't tried but I think it's not worth delaying the PR over

Why did you start with v3, and not directly with v4?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bump pdfjs-dist to >3
2 participants