
Use native indexOf whenever possible #61

Merged (4 commits) on Oct 8, 2018

Conversation

@reconbot (Contributor) commented Oct 5, 2018

This uses a windowing approach to determine how much of a buffer can be searched with the native indexOf. It falls back to a JS scan when it hits a buffer boundary or when no native search window is available. I'm pretty sure a BufferList can't be searched drastically faster than this.

  • Actual performance needs to be measured.
  • Moved tests into their own file.
  • The search is now always converted to a buffer. This seems like an OK tradeoff; we could delay copying it and first check whether the search is larger than the BufferList, but the usual case is the search is larger than the buffer list. 🤷‍♂️
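The windowing idea can be sketched like this (a hypothetical illustration only, not the actual bl implementation; `windowedIndexOf` is an invented name, and the boundary handling here is a simple joined-slice check rather than the byte-by-byte JS fallback described above):

```javascript
// Each buffer is a "window" that V8's native Buffer#indexOf can scan;
// matches that straddle a boundary are caught by a small joined slice.
function windowedIndexOf (buffers, search) {
  const needle = Buffer.from(search)
  if (needle.length === 0) return 0
  let base = 0
  for (let i = 0; i < buffers.length; i++) {
    const buf = buffers[i]
    // fast path: native search within the current window
    const hit = buf.indexOf(needle)
    if (hit !== -1) return base + hit
    // slow path: a match may start near the end of this buffer and
    // continue into the following ones, so check a joined boundary slice
    if (i + 1 < buffers.length && needle.length > 1) {
      const tailStart = Math.max(0, buf.length - needle.length + 1)
      const boundary = Buffer.concat(
        [buf.slice(tailStart), ...buffers.slice(i + 1)]
      ).slice(0, 2 * (needle.length - 1))
      const bHit = boundary.indexOf(needle)
      // only count matches that actually start inside this buffer;
      // later starts are found by the following loop iterations
      if (bHit !== -1 && bHit < buf.length - tailStart) {
        return base + tailStart + bHit
      }
    }
    base += buf.length
  }
  return -1
}
```

For example, `windowedIndexOf([Buffer.from('hello wo'), Buffer.from('rld!')], 'world')` finds the straddling match at index 6. Concatenating every remaining buffer for the boundary check is wasteful for long lists; it keeps the sketch short.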

@reconbot (Contributor, Author) commented Oct 5, 2018

I can't believe it's not obviously faster 🤔 https://gist.github.com/reconbot/abde2da91a76b6fa94f028db9fd27eab

@mcollina (Collaborator) commented Oct 6, 2018

I think the bottleneck is something else. As you can see, the length of the buffers is not impacting the measurements at all. Have you tried generating a flamegraph?

You might also try varying the number of buffers in the list and see whether that impacts the benchmarks at all.
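A minimal way to run that experiment (a plain-Node timing sketch, not the benchmark.js suite from the gist; `bench` and `listSearch` are invented stand-ins, and `listSearch` does a naive per-buffer native search with no boundary handling):

```javascript
// tiny timing harness: run fn repeatedly and report ops/sec
function bench (label, fn, iterations = 200) {
  const start = process.hrtime.bigint()
  for (let i = 0; i < iterations; i++) fn()
  const seconds = Number(process.hrtime.bigint() - start) / 1e9
  const opsPerSec = iterations / seconds
  console.log(`${label} x ${opsPerSec.toFixed(0)} ops/sec`)
  return opsPerSec
}

// placeholder for BufferList#indexOf: native search over each buffer
function listSearch (buffers, needle) {
  let base = 0
  for (const buf of buffers) {
    const hit = buf.indexOf(needle)
    if (hit !== -1) return base + hit
    base += buf.length
  }
  return -1
}

// vary the buffer count while keeping total bytes fixed at 64 KiB
for (const count of [1, 16, 256]) {
  const buffers = Array.from(
    { length: count },
    () => Buffer.alloc((64 * 1024) / count, 'a')
  )
  bench(`${count} buffers`, () => listSearch(buffers, 'zzz'))
}
```

If ops/sec barely moves as the count grows, per-call overhead dominates; if it drops, the list walk itself is the cost.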

@reconbot (Contributor, Author) commented Oct 6, 2018

I think you're right.

Running the new approach today:

smallBuffers small search x 2,490,521 ops/sec ±3.04% (84 runs sampled)
smallBuffers medium search x 2,422,262 ops/sec ±2.88% (89 runs sampled)
smallBuffers large search x 2,600,668 ops/sec ±0.56% (94 runs sampled)
smallBuffersLong small search x 2,607,599 ops/sec ±0.59% (90 runs sampled)
smallBuffersLong medium search x 2,551,642 ops/sec ±2.46% (89 runs sampled)
smallBuffersLong large search x 2,588,665 ops/sec ±0.69% (91 runs sampled)
largeBuffers small search x 2,549,903 ops/sec ±2.64% (90 runs sampled)
largeBuffers medium search x 2,557,784 ops/sec ±1.33% (88 runs sampled)
largeBuffers large search x 2,581,822 ops/sec ±0.74% (88 runs sampled)
mixedBuffers small search x 2,470,664 ops/sec ±2.31% (91 runs sampled)
mixedBuffers medium search x 2,565,537 ops/sec ±0.62% (91 runs sampled)
mixedBuffers large search x 2,525,416 ops/sec ±2.39% (92 runs sampled)
veryLargeBuffers small search x 2,568,693 ops/sec ±0.78% (88 runs sampled)
veryLargeBuffers medium search x 2,557,850 ops/sec ±0.61% (89 runs sampled)
veryLargeBuffers large search x 2,487,267 ops/sec ±2.49% (90 runs sampled)

And now with the new search-init approach, which creates fewer BL objects:

smallBuffers small search x 28,823,176 ops/sec ±0.82% (94 runs sampled)
smallBuffers medium search x 28,685,310 ops/sec ±0.93% (95 runs sampled)
smallBuffers large search x 28,619,441 ops/sec ±0.85% (92 runs sampled)
smallBuffersLong small search x 28,845,733 ops/sec ±0.75% (94 runs sampled)
smallBuffersLong medium search x 28,998,154 ops/sec ±0.59% (92 runs sampled)
smallBuffersLong large search x 28,860,999 ops/sec ±0.79% (94 runs sampled)
largeBuffers small search x 28,934,405 ops/sec ±0.76% (90 runs sampled)
largeBuffers medium search x 28,965,175 ops/sec ±0.78% (93 runs sampled)
largeBuffers large search x 25,850,403 ops/sec ±1.59% (85 runs sampled)
mixedBuffers small search x 24,613,339 ops/sec ±0.92% (94 runs sampled)
mixedBuffers medium search x 28,834,310 ops/sec ±0.92% (91 runs sampled)
mixedBuffers large search x 28,900,673 ops/sec ±0.58% (92 runs sampled)
veryLargeBuffers small search x 28,583,304 ops/sec ±0.78% (91 runs sampled)
veryLargeBuffers medium search x 29,038,767 ops/sec ±0.72% (94 runs sampled)
veryLargeBuffers large search x 28,374,545 ops/sec ±1.06% (91 runs sampled)

So that's a lot better, but there's still not a great difference in time due to length.

The flame graph shows nothing obvious yet...

@reconbot (Contributor, Author) commented Oct 6, 2018

I screwed up: `new Array(n).map(() => 1)` returns an array of undefined no matter what, because `map` skips the holes in a sparse array.
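The pitfall in question, for the record (variable names are mine):

```javascript
// new Array(n) creates n holes, not n elements; Array.prototype.map
// skips holes, so the callback never runs and the holes survive
const sparse = new Array(3).map(() => 1)
console.log(sparse)     // [ <3 empty items> ]
console.log(sparse[0])  // undefined

// Array.from (or fill) visits every index, so the data is real
const filled = Array.from({ length: 3 }, () => 1)
console.log(filled)     // [ 1, 1, 1 ]
```

So the earlier benchmark fixtures were empty, which is why every case ran at the same speed.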

These results make a lot more sense; this change is objectively better.

// windowing approach

smallBuffers small search x 6,868 ops/sec ±1.12% (90 runs sampled)
smallBuffers medium search x 11,344 ops/sec ±1.09% (90 runs sampled)
smallBuffers large search x 18,718 ops/sec ±1.17% (89 runs sampled)
smallBuffersLong small search x 115 ops/sec ±1.32% (74 runs sampled)
smallBuffersLong medium search x 101 ops/sec ±0.80% (74 runs sampled)
smallBuffersLong large search x 108 ops/sec ±0.92% (79 runs sampled)
largeBuffers small search x 6,679 ops/sec ±1.11% (92 runs sampled)
largeBuffers medium search x 93.14 ops/sec ±0.80% (75 runs sampled)
largeBuffers large search x 46.27 ops/sec ±0.74% (61 runs sampled)
mixedBuffers small search x 6,891 ops/sec ±0.68% (89 runs sampled)
mixedBuffers medium search x 129 ops/sec ±1.10% (73 runs sampled)
mixedBuffers large search x 101 ops/sec ±0.87% (74 runs sampled)
veryLargeBuffers small search x 187 ops/sec ±2.08% (79 runs sampled)
veryLargeBuffers medium search x 59.67 ops/sec ±1.18% (63 runs sampled)
veryLargeBuffers large search x 35.32 ops/sec ±1.23% (62 runs sampled)

// old approach

smallBuffers small search x 5,248 ops/sec ±1.43% (89 runs sampled)
smallBuffers medium search x 5,210 ops/sec ±1.07% (89 runs sampled)
smallBuffers large search x 5,400 ops/sec ±1.08% (91 runs sampled)
smallBuffersLong small search x 159 ops/sec ±1.10% (81 runs sampled)
smallBuffersLong medium search x 158 ops/sec ±1.20% (80 runs sampled)
smallBuffersLong large search x 157 ops/sec ±1.03% (81 runs sampled)
largeBuffers small search x 42.98 ops/sec ±1.42% (57 runs sampled)
largeBuffers medium search x 43.01 ops/sec ±1.79% (57 runs sampled)
largeBuffers large search x 42.33 ops/sec ±1.03% (56 runs sampled)
mixedBuffers small search x 96.89 ops/sec ±1.20% (73 runs sampled)
mixedBuffers medium search x 98.41 ops/sec ±1.19% (72 runs sampled)
mixedBuffers large search x 97.30 ops/sec ±1.06% (72 runs sampled)
veryLargeBuffers small search x 0.04 ops/sec ±2.17% (5 runs sampled)

// the very large buffers took over 4 minutes and I got bored and killed it

The gist has the new perftest code.

@mcollina (Collaborator) commented Oct 6, 2018

I think there is a huge performance cliff somewhere. Consider these two:

largeBuffers small search x 6,679 ops/sec ±1.11% (92 runs sampled)
largeBuffers medium search x 93.14 ops/sec ±0.80% (75 runs sampled)

That's roughly a 70x drop. I think it's overall better than the previous approach anyway, but maybe this should be investigated.

Do you plan to work more on this, or can we land it?

@reconbot (Contributor, Author) commented Oct 6, 2018 via email

@reconbot (Contributor, Author) commented Oct 7, 2018

OK, one more patch: `.get()` can easily be made faster, and it's used when checking bytes across buffer boundaries.
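The shape of that speedup (a hypothetical sketch, not bl's actual `get` implementation): instead of slicing out a one-byte buffer per lookup, walk the list and read the byte in place.

```javascript
// read the byte at a logical index across a list of buffers,
// without allocating an intermediate slice
function get (buffers, index) {
  for (const buf of buffers) {
    if (index < buf.length) return buf[index]
    index -= buf.length
  }
  return undefined // out of range
}
```

For example, `get([Buffer.from('ab'), Buffer.from('cd')], 2)` reads the `c` (byte 99) directly from the second buffer. Skipping the per-call allocation is what the ~10x `get performance` jump below comes from.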

before

get performance x 12,057,870 ops/sec ±0.95% (90 runs sampled)
smallBuffers small search x 6,707 ops/sec ±0.70% (91 runs sampled)
smallBuffers medium search x 11,849 ops/sec ±1.01% (90 runs sampled)
smallBuffers large search x 18,889 ops/sec ±0.92% (88 runs sampled)
smallBuffersLong small search x 124 ops/sec ±0.93% (79 runs sampled)
smallBuffersLong medium search x 100 ops/sec ±0.98% (73 runs sampled)
smallBuffersLong large search x 108 ops/sec ±0.96% (79 runs sampled)
largeBuffers small search x 7,065 ops/sec ±0.83% (94 runs sampled)
largeBuffers medium search x 98.31 ops/sec ±0.86% (73 runs sampled)
largeBuffers large search x 49.31 ops/sec ±0.99% (64 runs sampled)
mixedBuffers small search x 7,373 ops/sec ±0.76% (94 runs sampled)
mixedBuffers medium search x 139 ops/sec ±0.82% (79 runs sampled)
mixedBuffers large search x 109 ops/sec ±1.29% (80 runs sampled)
veryLargeBuffers small search x 203 ops/sec ±1.48% (79 runs sampled)
veryLargeBuffers medium search x 64.49 ops/sec ±1.00% (67 runs sampled)
veryLargeBuffers large search x 38.00 ops/sec ±1.28% (51 runs sampled)

after

get performance x 116,746,182 ops/sec ±0.67% (92 runs sampled)
smallBuffers small search x 11,821 ops/sec ±0.96% (90 runs sampled)
smallBuffers medium search x 16,909 ops/sec ±0.99% (89 runs sampled)
smallBuffers large search x 19,734 ops/sec ±0.73% (93 runs sampled)
smallBuffersLong small search x 142 ops/sec ±0.91% (81 runs sampled)
smallBuffersLong medium search x 116 ops/sec ±0.96% (75 runs sampled)
smallBuffersLong large search x 124 ops/sec ±0.96% (79 runs sampled)
largeBuffers small search x 11,877 ops/sec ±0.99% (92 runs sampled)
largeBuffers medium search x 175 ops/sec ±0.98% (81 runs sampled)
largeBuffers large search x 87.49 ops/sec ±1.14% (75 runs sampled)
mixedBuffers small search x 12,191 ops/sec ±0.81% (94 runs sampled)
mixedBuffers medium search x 236 ops/sec ±0.88% (86 runs sampled)
mixedBuffers large search x 181 ops/sec ±1.61% (84 runs sampled)
veryLargeBuffers small search x 200 ops/sec ±1.27% (78 runs sampled)
veryLargeBuffers medium search x 93.03 ops/sec ±0.96% (79 runs sampled)
veryLargeBuffers large search x 58.33 ops/sec ±1.01% (61 runs sampled)
