Preallocate buffer size when reading data #239

Merged

Conversation


@pfreixes pfreixes commented Jan 17, 2022

When small chunks are received from a large payload, i.e. > 10MB, performance
suffers because of repeated buffer allocation: a 10MB payload delivered in
2KB chunks can take ~20 seconds to assemble.

We grow the buffer exponentially, providing enough room for subsequent
chunks, until a maximum threshold of 1MB is reached.

With that change we see a ~50x improvement, with the trade-off of having to
allocate a little extra space (worst case 1MB).

Following is an example of the time needed to assemble a 10MB payload from 2KB chunks with the current implementation:

var buf
var chunk = Buffer.from('x'.repeat(2048))
var start = Date.now();
// Re-concatenate on every 2KB chunk until ~10MB has been accumulated
for (var i = 0; i < 10_000_000 / 2048; i++) {
  buf = buf ? Buffer.concat([buf, chunk]) : chunk
}
var delta = Date.now() - start;
console.log(`Total time ${delta}ms`)
node /tmp/test_concat_buffer.js
Total time 17214ms
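
For comparison, the following is a minimal sketch of the exponential-growth-with-a-cap approach this PR describes. The names (append, GROWTH_CAP) and the exact doubling logic are illustrative assumptions, not the code that was merged.

// Grow a preallocated buffer exponentially, capped per step, instead of
// calling Buffer.concat on every incoming chunk.
// Hypothetical cap: the PR initially used 1MB and later settled on 256KB
// (see the review thread below).
var GROWTH_CAP = 256 * 1024

var buf = Buffer.alloc(0) // allocated storage
var length = 0            // bytes actually written so far

function append (chunk) {
  var needed = length + chunk.length
  if (needed > buf.length) {
    // Roughly double the allocation, but never extend by more than the cap,
    // and always leave enough room for the incoming chunk.
    var growth = Math.min(Math.max(buf.length, chunk.length), GROWTH_CAP)
    var bigger = Buffer.alloc(Math.max(buf.length + growth, needed))
    buf.copy(bigger, 0, 0, length)
    buf = bigger
  }
  chunk.copy(buf, length)
  length += chunk.length
}

With this shape, most chunks are copied into already-allocated space instead of triggering a fresh allocation, which is where the ~50x improvement described above comes from.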

@@ -15,6 +15,8 @@ var colon = 58
var space = 32
var lineFeed = 10
var carriageReturn = 13
// 1MB
Contributor

nit: could you specify why you landed at 1MB?

Contributor Author

Good question. TBH I picked this number without data, expecting that it would provide enough headroom without impacting free memory too much.

I've just tried different configurations and it seems that the sweet spot is 256KB; beyond that there is no gain in performance.

2KB 9902ms
4KB 8192ms
8KB 5115ms
16KB 2529ms
32KB 1859ms
64KB 911ms
128KB 801ms
256KB 539ms
512KB 562ms
1MB 565ms

I will change the 1MB max threshold to 256KB.
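
For reference, a hypothetical sketch of how such a sweep could be reproduced; this is not the script actually used, and the cap values and growth logic are assumptions.

var chunk = Buffer.from('x'.repeat(2048))
var total = 10 * 1024 * 1024
var caps = [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024].map(function (kb) { return kb * 1024 })

caps.forEach(function (cap) {
  var buf = Buffer.alloc(0)
  var length = 0
  var start = Date.now()
  for (var written = 0; written < total; written += chunk.length) {
    if (length + chunk.length > buf.length) {
      // Grow at most `cap` bytes per reallocation
      var growth = Math.min(Math.max(buf.length, chunk.length), cap)
      var bigger = Buffer.alloc(Math.max(buf.length + growth, length + chunk.length))
      buf.copy(bigger, 0, 0, length)
      buf = bigger
    }
    chunk.copy(buf, length)
    length += chunk.length
  }
  console.log(cap / 1024 + 'KB ' + (Date.now() - start) + 'ms')
})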

Contributor

Great! Would you mind just adding a comment noting this? You could even link to this if you want to. I just think we should make it clear to someone reading the code how we picked what could be an arbitrary value.

Contributor Author

done

Beyond that number there is no gain in performance.
@joeybaker
Contributor

joeybaker commented Jan 20, 2022

@pfreixes I like this change, but the tests are failing :\ Could you take a look at what broke?

@pfreixes
Contributor Author

@joeybaker I'll take a look!

@pfreixes
Contributor Author

@joeybaker tests fixed; there were a couple of bugs where I missed some corner cases. I also replaced new Buffer with the non-deprecated Buffer.alloc; let's see if it works for all of the Node versions that you support.

Any chance you could trigger the CI again?

Thanks!
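
For context on the API swap mentioned above, a small illustrative comparison (not code from the PR):

// Deprecated: in older Node versions new Buffer(size) returned uninitialized
// memory, and the constructor now emits a deprecation warning (DEP0005).
var legacy = new Buffer(1024)

// Preferred: Buffer.alloc(size) returns a zero-filled buffer of that size.
var zeroed = Buffer.alloc(1024)

// Buffer.allocUnsafe(size) keeps the faster uninitialized allocation for
// callers that will overwrite every byte before reading it.
var unsafe = Buffer.allocUnsafe(1024)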

@joeybaker
Contributor

Nifty! That's why we have the tests :) I think the answer is no, but can you think of any tests we should add with this change?

@pfreixes
Contributor Author

What about this change, @joeybaker [1]?

This specific test is now 5-6x faster than on master. I'm not very familiar with JS and Mocha, and I'm not sure how the timeout thresholds are applied (it seems that tests can even pass when they take two or three times the configured limit), but on my local machine it works when I set a 30ms timeout. I'm not sure whether this might end up being a flaky test. Any other recommendation?

[1] bab41d9

@joeybaker
Contributor

That's a good idea! 30ms is probably too aggressive though, and as you expected leads to flaky tests (see current run). If you're seeing a 5x minimum speed up then maybe we do 1000 / 5 = 200 ms?
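
For illustration, the kind of timing guard being discussed might look like the following in Mocha (the test name and the 200ms threshold are placeholders, not the actual test added in bab41d9):

it('assembles a large payload from small chunks quickly', function () {
  // Ask Mocha to fail this test if it takes longer than 200ms overall.
  // Note: use a regular function, not an arrow function, so `this` is the
  // Mocha test context.
  this.timeout(200)

  // ... feed an ~10MB payload in 2KB chunks to the parser under test here ...
})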

@pfreixes
Contributor Author

Done!

Contributor

@joeybaker joeybaker left a comment


Looks good! Tests are passing. I'll let this sit for a little bit to give other contributors a chance to weigh in. Ping me next week if there's no activity, @pfreixes?

@pfreixes
Contributor Author

pfreixes commented Feb 3, 2022

Any news on this one @joeybaker ?

@joeybaker
Contributor

Yup! Since there are no objections…
