Rendering crashes if PDF contains images which cannot be decoded #18042

Closed
plantago opened this issue May 2, 2024 · 2 comments · Fixed by #18047

@plantago

plantago commented May 2, 2024

Attach (recommended) or Link to PDF file here: I cannot provide the PDF file due to NDA.

Configuration:

If the PDF file contains an image which cannot be decoded, rendering fails trying to dereference a
null object.

Unfortunately, I cannot attach the PDF file due to an NDA, but I did some debugging and here are my
findings:

When PartialEvaluator#buildPaintImageXObject() fails to decode an image, it sends null as
imgData to the main thread via this._sendImgData(objId, /* imgData = */ null, cacheGlobally);.

In the "obj" handler, the WorkerTransport object receives null in the imageData parameter and,
in the case "Image" branch, calls this.objs.resolve(id, imageData); with imageData == null.

The PDFObjects#resolve() method replaces INITIAL_DATA in obj.data with null.

The PDFObjects#*[Symbol.iterator]() method filters out only data === INITIAL_DATA but not data == null.

The case "CopyLocalImage" branch in the "commonobj" handler iterates over pageProxy.objs:

for (const [, data] of pageProxy.objs) {
  if (data.ref !== imageRef) {               // `data` can be `null` here.
    continue;
  }
  ...
}

and, because data can be null, the code crashes when it tries to access the ref property, with this
error message in the console:

Warning: getOperatorList - ignoring XObject: "UnknownErrorException: Cannot read properties of null (reading 'ref')".
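
The chain of events described above can be reproduced with a small stand-alone model. This is only a sketch: SimplePDFObjects, findImageUnsafe, the object ids, and the "12R" reference are hypothetical stand-ins written for illustration, not the actual pdf.js code. It assumes only what the findings state, namely that resolve() stores null and that the iterator skips INITIAL_DATA entries but yields null ones.

```javascript
// Simplified stand-in for PDFObjects; NOT the real pdf.js implementation.
const INITIAL_DATA = Symbol("INITIAL_DATA");

class SimplePDFObjects {
  #objs = Object.create(null);

  // Reserves an entry whose data has not arrived from the worker yet.
  reserve(objId) {
    this.#objs[objId] = { data: INITIAL_DATA };
  }

  // Stores the received data; `null` means the image could not be decoded.
  resolve(objId, data = null) {
    this.#objs[objId] = { data };
  }

  // Skips unresolved entries only, so `null` data is still yielded.
  *[Symbol.iterator]() {
    for (const objId in this.#objs) {
      const { data } = this.#objs[objId];
      if (data === INITIAL_DATA) {
        continue;
      }
      yield [objId, data];
    }
  }
}

// Same shape as the loop quoted from the "CopyLocalImage" branch.
function findImageUnsafe(objs, imageRef) {
  for (const [, data] of objs) {
    if (data.ref !== imageRef) { // Throws when `data` is `null`.
      continue;
    }
    return data;
  }
  return null;
}

const objs = new SimplePDFObjects();
objs.reserve("img_p0_1");                 // Still pending, filtered out.
objs.resolve("img_p0_2", null);           // Undecodable image.
objs.resolve("img_p0_3", { ref: "12R" }); // Decoded image.

let crashed = false;
try {
  findImageUnsafe(objs, "12R");
} catch (e) {
  crashed = e instanceof TypeError;
}
```

In this model the null entry is visited before the matching one, so the lookup throws a TypeError before it can reach the decoded image.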

A simple fix, changing data.ref !== imageRef to data?.ref !== imageRef, makes the PDF renderable
again, but it probably only fixes the symptom, not the cause of the issue.
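
For illustration, the optional-chaining guard can be exercised in isolation. The helper name findImageByRef and the sample entries are hypothetical, and a plain array of [id, data] pairs stands in for pageProxy.objs; this is a sketch of the guard only, not the change that was eventually merged.

```javascript
// Hypothetical helper mirroring the quoted loop, with the `?.` guard applied.
function findImageByRef(entries, imageRef) {
  for (const [, data] of entries) {
    // `data?.ref` evaluates to `undefined` when `data` is `null`, so an
    // undecodable image is skipped instead of throwing.
    if (data?.ref !== imageRef) {
      continue;
    }
    return data;
  }
  return null;
}

// Sample data: one undecodable image followed by one decoded image.
const entries = [
  ["img_p0_1", null],
  ["img_p0_2", { ref: "12R", width: 10 }],
];

const found = findImageByRef(entries, "12R");   // The decoded entry.
const missing = findImageByRef(entries, "99R"); // No match, so `null`.
```

One caveat of this form: if imageRef were ever undefined, a null entry would compare equal and be returned, so an explicit data === null check may be the safer choice where that can happen.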

plantago changed the title from "Rendering crashes if PDF contains shared images which cannot be decoded" to "Rendering crashes if PDF contains images which cannot be decoded" on May 2, 2024
@Snuffleupagus
Collaborator

Snuffleupagus commented May 3, 2024

> Attach (recommended) or Link to PDF file here: I cannot provide the PDF file due to NDA.

That's unfortunately a problem, since we need to be able to debug the file and, importantly, add tests when making changes.

@plantago
Author

plantago commented May 4, 2024

Can the data property be null? Yes, it can. Should data be tested for null before its ref property is accessed? I think the answer is obvious.

For example, the PDFObjects#clear() method already guards against data being null:

clear() {
  for (const objId in this.#objs) {
    const { data } = this.#objs[objId];
    data?.bitmap?.close(); // Release any `ImageBitmap` data.
  }
  this.#objs = Object.create(null);
}

If pdf.js were written in TypeScript and the data stored in PDFObjects.#objs were properly declared as a nullable type, then this code wouldn't even compile:

for (const [, data] of pageProxy.objs) {
  if (data.ref !== imageRef) {             // TypeScript would show an error here: 'data' is possibly 'null'.ts(18047)
    continue;
  }
  ...
}
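
A similar check is available without leaving JavaScript, by using JSDoc annotations together with a `// @ts-check` comment. The sketch below assumes the TypeScript checker is run over the file with strictNullChecks enabled; the DecodedImage type and the findDecodedImage helper are hypothetical names used for illustration and do not exist in pdf.js.

```javascript
// @ts-check

/**
 * Hypothetical shape of a decoded image entry, for illustration only.
 * @typedef {{ ref: string }} DecodedImage
 */

/**
 * @param {Iterable<[string, DecodedImage | null]>} objs
 * @param {string} imageRef
 * @returns {DecodedImage | null}
 */
function findDecodedImage(objs, imageRef) {
  for (const [, data] of objs) {
    // A bare `data.ref` here is what the checker would report as
    // "'data' is possibly 'null'"; the explicit test narrows the type first.
    if (data === null || data.ref !== imageRef) {
      continue;
    }
    return data;
  }
  return null;
}

const result = findDecodedImage(
  [
    ["a", null],
    ["b", { ref: "5R" }],
  ],
  "5R"
);
```

At run time the annotations are plain comments, so the function behaves like any other JavaScript; the nullable type only matters to the checker.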
