CharacterReader always allocates a 32 KB buffer that can even exceed the document size #1773

Open
chibenwa opened this issue May 16, 2022 · 7 comments · May be fixed by #1800
Comments

@chibenwa

This might be overkill: we could likely decrease the size of this buffer when parsing small strings.

That would reduce memory allocation.

Here is a small async profiler memory flame graph showing this:

[Screenshot: async profiler memory flame graph, 2022-05-16]
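
For illustration, a minimal sketch of the sizing idea, not jsoup's actual code: when the input length is known (e.g. parsing an in-memory String), cap the initial buffer at that length instead of always allocating the full 32 KB. The names below are assumptions, not existing jsoup fields.

```java
// Hypothetical sketch: size the read buffer by the known input length, if any.
final class BufferSizing {
    static final int MaxBufferLen = 32 * 1024; // current fixed allocation size

    // knownLength <= 0 means "unknown", e.g. a stream without a usable Content-Length.
    static char[] allocateBuffer(int knownLength) {
        int size = (knownLength > 0) ? Math.min(knownLength, MaxBufferLen) : MaxBufferLen;
        return new char[size];
    }
}
```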

@jhy
Owner

jhy commented May 17, 2022

I think the difficulty will be in knowing, up front, the expected content length of the response to parse. My understanding is that in practice, even when the Content-Length header is set, it is often incorrect, so browsers typically ignore it / handle larger sizes.

Have you actually profiled this to be a performance issue? I picked that number based on a bit of empirical benchmarking and testing and found it worked reasonably.

Another approach would be to reuse the buffer on subsequent reads, so the allocation doesn't need to happen. We use that pattern with StringBuilders, for example.

@chibenwa
Author

Have you actually profiled this to be a performance issue? I picked that number based on a bit of empirical benchmarking and testing and found it worked reasonably.

No, it's not that critical, though you do get many small HTML emails in an email server.

@jhy
Owner

jhy commented May 17, 2022

OK, yes, in that use case I think it would make sense. A pattern of a thread-local reusable buffer would work, similar to the Builders in StringUtil.
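
A sketch of what such a thread-local borrow/release pattern could look like, in the spirit of the reusable StringBuilders in StringUtil; the class and method names (CharBufferPool, borrow, release) are illustrative assumptions, not existing jsoup API.

```java
// Hypothetical per-thread buffer pool: allocate once per thread, then reuse.
final class CharBufferPool {
    private static final int BufferSize = 32 * 1024;
    private static final ThreadLocal<char[]> threadBuffer =
            ThreadLocal.withInitial(() -> new char[BufferSize]);

    // Returns this thread's buffer; callers must not hold it after release.
    static char[] borrow() {
        return threadBuffer.get();
    }

    // Nothing to reset for a char[]; the method keeps the borrow/release
    // discipline explicit at call sites, matching the StringBuilder pattern.
    static void release(char[] buffer) {
        // no-op
    }
}
```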

@chibenwa
Author

Thinking more about this, is there anything that prevents doing (potentially opt-in) buffer recycling, just like the Jackson JSON parser does?

I am working on a similar approach in MIME4J, and so far it has yielded a 40% allocation reduction, which in micro-benchmarks translated into a 10-15% performance improvement.

I would be happy to carry out such a contribution.
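
As a rough illustration of the recycling idea described here: keep a per-thread SoftReference to the last buffer so it is reused across parses but can still be reclaimed under memory pressure. This mirrors the general approach Jackson takes; the class and method names below are made up for the sketch, not Jackson's or jsoup's API.

```java
import java.lang.ref.SoftReference;

// Hypothetical recycler: buffers survive between parses on the same thread,
// but the GC may drop them if memory gets tight.
final class RecyclingBuffers {
    private static final int BufferSize = 32 * 1024;
    private static final ThreadLocal<SoftReference<char[]>> recycled = new ThreadLocal<>();

    static char[] acquire() {
        SoftReference<char[]> ref = recycled.get();
        char[] buffer = (ref != null) ? ref.get() : null;
        recycled.remove();               // the caller now owns the buffer
        return (buffer != null) ? buffer : new char[BufferSize];
    }

    static void release(char[] buffer) {
        recycled.set(new SoftReference<>(buffer));
    }
}
```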

@jhy
Owner

jhy commented Jun 24, 2022

I'm not familiar with the implementation of the Jackson parser. Recycling the buffer makes sense. I don't know that it should be opt-in, as I would expect most people to miss it if it were.

Would be happy to review a PR!

@chibenwa
Author

Question before getting started: are there JMH benchmarks somewhere already? Would you be interested in getting a set of JMH benchmarks as well?

I will be on vacation all of July, so nothing will likely happen before August.
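
A minimal sketch of what such a JMH benchmark could look like, assuming JMH and jsoup are on the classpath; the class name and the HTML sample are made up. It measures parsing throughput for a small, email-sized document, which is the case discussed above; running it with JMH's gc profiler would also show allocation rates.

```java
import java.util.concurrent.TimeUnit;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

// Hypothetical benchmark: how many small documents can be parsed per second.
@State(Scope.Benchmark)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
public class SmallHtmlParseBenchmark {
    private final String html = "<html><body><p>Hello, world</p></body></html>";

    @Benchmark
    public Document parseSmall() {
        return Jsoup.parse(html);
    }
}
```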

@chibenwa
Author

chibenwa commented Jul 1, 2022

I managed to put together #1800 before leaving.

Looks like massive gains are ahead ;-)

@jhy linked a pull request (#1800) on Jul 1, 2022 that will close this issue