
Improve performance of arrayBufferToBinaryString #3121

Conversation

@Pantura (Contributor) commented Mar 22, 2021

Thanks for the awesome library, which seems to be the only way we can reliably output a snapshot of a DOM node with extra headers and custom scaling options.

However, we ran into a problem with CSP rules regarding data URLs when embedding an image via an image data URL. I switched to using a blob URL for the image, but performance suffered very badly.

After some performance profiling, the worst culprit was identified; it is fixed in this PR.

This PR makes two improvements over the old functionality:

Skip expensive buffer overflows

As the blob file size grows, so does the time spent attempting the direct conversion of the buffer. For a 14 MB buffer, my 2019 Core i7 Mac takes 100 ms before throwing an exception; for 44 MB it takes 400+ ms. For massive buffers, skip the attempt altogether. The limit could be tuned, but this is roughly where processing times start to increase and overflows begin.
The image in question is about 3840x2160 px when added to the PDF.
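The gating idea described above can be sketched roughly as follows (the function name is illustrative, not the PR's actual code; the limit mirrors the heuristic mentioned here):

```javascript
// Sketch of the size gate described above. Beyond a heuristic limit,
// String.fromCharCode.apply is certain to throw (too many arguments),
// so we skip the attempt instead of paying hundreds of ms for the failure.
var ARRAY_BUFFER_LIMIT = 6000000; // the rough limit mentioned above

function tryFastConversion(bytes) {
  if (bytes.length >= ARRAY_BUFFER_LIMIT) {
    return null; // don't even try: the throw alone is expensive on large buffers
  }
  try {
    // Fast path: one call spreads every byte as an argument.
    return String.fromCharCode.apply(null, bytes);
  } catch (e) {
    return null; // engine-specific argument limit was hit anyway
  }
}
```

A caller would fall back to a slower, chunked conversion whenever this returns null.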

Use TextDecoder when possible to speed up conversion

Together with the fix above, using TextDecoder is the best option for binary string conversion. I also tested direct string concatenation (worst) and a predefined-size array with a final .join (depending on GC cycles, below or above the original solution).

The code in this PR runs arrayBufferToBinaryString about 18 times faster (4700 ms vs 251 ms and 3000 ms vs 153 ms for two of my test pages).
Old:
Screenshot 2021-03-22 at 15 01 08

Updated:
Screenshot 2021-03-22 at 15 01 28

The total time for adding the image is still suffering from decodePixels performance but I couldn't think of a solution to fix that.

* Skip expensive buffer overflows
* Use TextDecoder when possible to speed up conversion
@Pantura force-pushed the feature/arraybuffer-to-binary-string-improvement branch from 2f0b78c to 38638d7 on March 22, 2021 13:31
@Pantura (Contributor, Author) commented Mar 22, 2021

I'm unable to run the unit tests on my Mac (10.14), even before my changes.

@HackbrettXXX (Collaborator) commented:

Thanks for this PR. Is String.fromCharCode or TextDecoder.decode faster for small buffers? If TextDecoder is faster, I think we can use it for small buffers as well, if available.

@Pantura (Contributor, Author) commented Mar 23, 2021

I was thinking about that as well but haven't had time to check it yet. I can test that soon.

@Pantura (Contributor, Author) commented Mar 23, 2021

One thing to note, though: support for TextDecoder is fairly new, so it isn't possible to rely solely on it. And the old functionality processed a 100x20 px image in sub-millisecond time.

@101arrowz (Contributor) left a comment:

There are a few things I think should be worked out. I've worked with binary strings before, and they should be avoided at all costs; but since it's hard to remove all the uses of binary strings in jsPDF, it's best to maximize performance as much as possible.

src/modules/addimage.js (outdated review thread, resolved)
Comment on lines 747 to 748
var decoder = new TextDecoder("ascii");
return decoder.decode(buffer);
Contributor:
Actually, for binary strings, this is not going to work. ASCII only covers 128 code points, i.e. 0x00 through 0x7F. We need 0x00 through 0xFF (the entire byte range). For example, this fails:

const inputBuffer = new Uint8Array([252, 129, 187, 147]);
const binStr = new TextDecoder('ascii').decode(inputBuffer.buffer);
console.log(binStr.split('').map(str => str.charCodeAt(0)));
// [252, 129, 187, 8220]

As it turns out, TextDecoder doesn't work well for binary string encoding, and this case should probably be removed entirely.
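A minimal demonstration of why no TextDecoder label fits here: per the WHATWG Encoding Standard, the 'latin1' and 'iso-8859-1' labels both resolve to windows-1252, which remaps bytes 0x80 through 0x9F, so the round trip is never byte-identical:

```javascript
// 0x93 (147) is one of the bytes windows-1252 remaps to punctuation.
const bytes = new Uint8Array([147]);

// 'latin1' is only a label for windows-1252, so the byte comes back as U+201C.
const viaDecoder = new TextDecoder("latin1").decode(bytes);
console.log(viaDecoder.charCodeAt(0)); // 8220, not 147

// String.fromCharCode preserves the byte value, which binary strings require.
const viaFromCharCode = String.fromCharCode(147);
console.log(viaFromCharCode.charCodeAt(0)); // 147
```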

Contributor:

I'm not sure how this passed the test cases, actually... any ideas? Since that simple example yields malformed output, I would expect something to have broken.

Contributor (Author):

Not sure how it passed the library tests, but it passed my own visual validation because the color palette was so small. In the end I did find it produced malformed data: some grays were turning green.
For some reason I had it in my mind that ASCII was 8-bit as well. And after some research, TextDecoder still isn't an option with any available text encoding to replace fromCharCode...

Comment on lines 738 to 743
if (buffer.length < ARRAY_BUFFER_LIMIT) {
try {
return atob(btoa(String.fromCharCode.apply(null, buffer)));
} catch (e) {
// Buffer was overflown, fall back to other methods
}
@101arrowz (Contributor) commented Mar 24, 2021:

I don't think this works (or at least it shouldn't, if the second argument is an ArrayBuffer). An ArrayBuffer cannot be indexed and therefore cannot be used with .apply. I'm pretty sure you need to convert it to a Uint8Array first.
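The point can be shown in two lines: apply treats its second argument as array-like, and a raw ArrayBuffer has no length or indices, so the call silently spreads zero arguments:

```javascript
const buffer = new Uint8Array([72, 105]).buffer; // a raw ArrayBuffer ("Hi")

// An ArrayBuffer has no .length, so apply sees zero arguments: result is "".
console.log(String.fromCharCode.apply(null, buffer)); // ""

// Wrapping it in a Uint8Array view makes it indexable, as suggested above.
console.log(String.fromCharCode.apply(null, new Uint8Array(buffer))); // "Hi"
```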

Contributor:

Also, I know atob(btoa(str)) was in the previous code, but I don't think atob(btoa(str)) is any different from str. Is there an example where they differ?
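A quick check supports this: btoa followed by atob is a lossless round trip for any valid binary string, so the pair is the identity. The only observable difference is that btoa throws on code points above 0xFF, which would make the input an invalid binary string anyway:

```javascript
// Round trip over the full byte range: identity for valid binary strings.
const s = String.fromCharCode(0, 127, 128, 255);
console.log(atob(btoa(s)) === s); // true

// The one difference: btoa rejects code points > 0xFF with an
// "InvalidCharacterError", whereas plain `str` would pass through silently.
let threw = false;
try {
  btoa("\u0100");
} catch (e) {
  threw = true;
}
console.log(threw); // true
```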

Collaborator:

Was wondering about that as well.

Comment on lines 46 to 49
// Heuristic limit after which String.fromCharCode will start to overflow.
// Need to change to StringDecoder.
var ARRAY_BUFFER_LIMIT = 6000000;

@101arrowz (Contributor) commented Mar 24, 2021:

This figure doesn't seem right. The max frame size is 1 MB to begin with, so 6 MB is too big for sure. On my device the limit is around 100 kB, but I'm pretty sure around 16K is safe. Could you verify this in the console?

Collaborator:

Seems a bit large to me, too. 16k sounds good.

Contributor (Author):

Some benchmarks with different batch sizes: https://jsbench.me/x5kmnswnmz/1. The sweet spot, at least for Chromium 89, seems to be at 4096-8192, with slight degradation at 16k and rapid slowdown after that. Manual testing shows minor GCs turning into major GCs.
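Put together, the chunked fallback discussed in this thread might look like this (illustrative sketch, not the merged code; 8192 is the batch size benchmarked above):

```javascript
// Convert in fixed-size batches so fromCharCode.apply never receives more
// arguments than the engine's stack allows, then join once at the end.
function arrayBufferToBinaryStringChunked(buffer, chunkSize) {
  chunkSize = chunkSize || 8192; // sweet spot per the benchmarks above
  var bytes = new Uint8Array(buffer);
  var parts = [];
  for (var i = 0; i < bytes.length; i += chunkSize) {
    // subarray creates a view, not a copy, so this stays cheap.
    parts.push(String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize)));
  }
  return parts.join("");
}
```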

Collaborator:

Then let's take 8k? :)

@101arrowz (Contributor) commented Mar 24, 2021

By the way @Pantura, have you verified you're using the latest version of jsPDF? A function getBytes() is taking quite a bit of time in your profile, but the newest version AFAIK uses the unzlibSync function from fflate. Since fflate is faster, maybe adding the image will be faster once you update?

@Pantura (Contributor, Author) commented Mar 24, 2021

> By the way @Pantura, have you verified you're using the latest version of jsPDF? A function getBytes() is taking quite a bit in your image, but the newest version AFAIK is using a function unzlibSync from fflate. Since fflate is faster, maybe image adding will be faster once you update?

I did update, and it didn't really improve things much. I actually took yet another approach to the data format, since I noticed it was doing a PNG encode/decode for nothing. There will soon be a PR about the possibility of inputting an RGBA array. This format wouldn't allow passing through the image size (or the width alone would be enough). Would it be OK to add it as an optional parameter?
(This is with a statically defined size, but performance is good.)

Blob input to PNG to decode PNG:
Screenshot 2021-03-24 at 14 51 57

Direct RGBA input with alpha channel extraction:
Screenshot 2021-03-24 at 14 51 40
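The alpha-channel extraction mentioned above could be sketched like this (a hypothetical helper, not part of this PR or jsPDF's API): split interleaved RGBA pixels into an RGB plane and a separate alpha plane, roughly the shape a PDF image with a soft mask needs.

```javascript
// Hypothetical sketch: split interleaved RGBA data into RGB + alpha planes.
function splitRGBA(rgba) {
  var pixels = rgba.length / 4;
  var rgb = new Uint8Array(pixels * 3);   // color plane for the image XObject
  var alpha = new Uint8Array(pixels);     // alpha plane for the soft mask
  for (var i = 0; i < pixels; i++) {
    rgb[i * 3] = rgba[i * 4];         // R
    rgb[i * 3 + 1] = rgba[i * 4 + 1]; // G
    rgb[i * 3 + 2] = rgba[i * 4 + 2]; // B
    alpha[i] = rgba[i * 4 + 3];       // A
  }
  return { rgb: rgb, alpha: alpha };
}
```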

@HackbrettXXX (Collaborator) commented:

@Pantura a PR with an RGBA array/buffer as input would be appreciated.

src/modules/addimage.js (two outdated review threads, resolved)
@HackbrettXXX (Collaborator) left a comment:

Alright. Thanks for the PR, I will merge it now.

@HackbrettXXX HackbrettXXX merged commit f41a18f into parallax:master May 4, 2021