Make stream parser resilient to text in streams #1215

Open
wants to merge 3 commits into
base: master

Trapfether
Contributor

@Trapfether Trapfether commented Apr 16, 2022

What?

Updates the streamParser to ignore the keyword 'stream' unless it is preceded by a special character. Fixes #1206.

Why?

Some font files include their license information when embedded. One such font included the word 'bitstream' several times, which confused the streamParser: it thought there were more open streams than there actually were.

How?

The word 'stream' should be preceded by a space, newline, carriage return, backslash, less-than, or greater-than character. I also included tabs for good measure. If the word 'stream' is not preceded by one of those characters, it is ignored.

There is one special case: just after a 'stream' is consumed, we set a flag that allows an immediately following 'stream' to be counted.
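
For illustration only, here is a minimal sketch of the rule described above. It is not the code from this PR: `countStreamKeywords`, `PRECEDING_CHARS`, and `justConsumed` are hypothetical names, and the choices to count a 'stream' at the very start of the input and to set the flag only after a counted keyword are assumptions of the sketch.

```typescript
// Hypothetical sketch of the rule in this PR description, not pdf-lib's parser.
// Characters that may directly precede a countable 'stream' keyword.
const PRECEDING_CHARS = new Set<string>([' ', '\n', '\r', '\t', '\\', '<', '>']);
const KEYWORD = 'stream';

function countStreamKeywords(text: string): number {
  let count = 0;
  // Set right after a keyword is counted, so that an immediately
  // following 'stream' is counted as well (the special case above).
  let justConsumed = false;
  let i = 0;

  while (i < text.length) {
    if (text.startsWith(KEYWORD, i)) {
      const prev: string | undefined = i > 0 ? text[i - 1] : undefined;
      // Assumption: a keyword at the very start of the input is counted.
      const allowed =
        justConsumed || prev === undefined || PRECEDING_CHARS.has(prev);
      if (allowed) count += 1;
      i += KEYWORD.length;
      // Assumption: only a counted keyword arms the flag.
      justConsumed = allowed;
      continue;
    }
    justConsumed = false;
    i += 1;
  }
  return count;
}
```

With this rule, license text such as "Bitstream Vera" contributes nothing to the count, because the 's' of 'stream' follows a 't'. The same check skips the 'stream' inside 'endstream', which a real parser would match separately as the closing keyword.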

Testing?

I added a test to the test suite that reproduces the failure with the current parser, then updated the parser so that the new test passes.

New Dependencies?

NA

Screenshots

NA

Suggested Reading?

NA

Anything Else?

NA

Checklist

  • I read CONTRIBUTING.md.
  • I read MAINTAINERSHIP.md#pull-requests.
  • I added/updated unit tests for my changes.
  • I added/updated integration tests for my changes.
  • I ran the integration tests.
  • I tested my changes in Node, Deno, and the browser.
  • I viewed documents produced with my changes in Adobe Acrobat, Foxit Reader, Firefox, and Chrome.
  • I added/updated doc comments for any new/modified public APIs.
  • My changes work for both new and existing PDF files.
  • I ran the linter on my changes.

Some embedded fonts include their license information, and one example font included the word 'bitstream' in the text, which confused the stream parser. This commit updates the parser to require 'stream' to be preceded by one of several special characters.
@ahaganDEV

I have tested this change in a forked version of the repo and can confirm that it works as intended. I look forward to this being reviewed and released so that my organisation can switch back to this repo.

Successfully merging this pull request may close these issues.

PowerPoint PDF data loaded into PDF-Lib does not open in Adobe Acrobat Pro DC
4 participants