Improve SeqDecoder performance by removing overhead #1719

jordiolivares · 2021-04-17T22:29:16Z

Hi, I was fiddling around to see if there was some way of making Circe faster and found that the current SeqDecoder has a non-trivial amount of overhead.

I believe this is mostly due to cursor object creation and destruction, but I can't be sure because those don't appear in the profiling flame graph I used.
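
For readers unfamiliar with the two traversal styles at play, here is a rough sketch of the contrast. It is not the diff in this PR: `SeqDecodingSketch`, `viaCursors` and `viaValues` are hypothetical names, and only public circe calls (`downArray`, `right`, `tryDecode`, `asArray`, `decodeJson`) are used. Note that `decodeJson` still wraps each element in a fresh top-level cursor, so the second variant only shows the general shape of the idea and loses the array position in the error history.

```scala
import io.circe.{ ACursor, Decoder, DecodingFailure, HCursor }

// Hypothetical illustration only; not the actual SeqDecoder code.
object SeqDecodingSketch {

  // Style 1: walk the array with sibling cursors, one new cursor per element.
  def viaCursors[A](c: HCursor)(implicit decodeA: Decoder[A]): Decoder.Result[List[A]] = {
    val builder = List.newBuilder[A]
    var current: ACursor = c.downArray // fails for empty arrays and non-arrays
    var failure: DecodingFailure = null
    while (failure == null && current.succeeded) {
      decodeA.tryDecode(current) match {
        case Right(a) =>
          builder += a
          current = current.right // move to the next element
        case Left(e) =>
          failure = e
      }
    }
    if (failure != null) Left(failure)
    else if (c.value.isArray) Right(builder.result())
    else Left(DecodingFailure("Expected a JSON array", c.history))
  }

  // Style 2: read the underlying Vector[Json] once and iterate over it directly.
  def viaValues[A](c: HCursor)(implicit decodeA: Decoder[A]): Decoder.Result[List[A]] =
    c.value.asArray match {
      case None =>
        Left(DecodingFailure("Expected a JSON array", c.history))
      case Some(values) =>
        val builder = List.newBuilder[A]
        builder.sizeHint(values.size) // element count is known up front
        var failure: DecodingFailure = null
        val it = values.iterator
        while (failure == null && it.hasNext) {
          decodeA.decodeJson(it.next()) match {
            case Right(a) => builder += a
            case Left(e)  => failure = e
          }
        }
        if (failure == null) Right(builder.result()) else Left(failure)
    }
}
```

Both functions should return the same result for well-formed input; they differ in how many intermediate objects are created along the way and in how much path information a failure carries.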

To compare, I ran sbt 'benchmark/jmh:run -f1 io.circe.benchmark.NumberParsingBenchmark.decodeMany*' before and after the optimization. The results suggest that decoding arrays of primitives gets faster:

# before
[info] Benchmark                                      Mode  Cnt      Score     Error  Units
[info] NumberParsingBenchmark.decodeManyBigDecimals  thrpt    5   1520.022 ±  87.924  ops/s
[info] NumberParsingBenchmark.decodeManyBigInts      thrpt    5   1773.855 ±  12.936  ops/s
[info] NumberParsingBenchmark.decodeManyDoubles      thrpt    5  10004.552 ±  28.427  ops/s
[info] NumberParsingBenchmark.decodeManyLongs        thrpt    5  12490.381 ± 133.195  ops/s

# after
[info] Benchmark                                      Mode  Cnt      Score    Error  Units
[info] NumberParsingBenchmark.decodeManyBigDecimals  thrpt    5   1478.384 ± 62.771  ops/s
[info] NumberParsingBenchmark.decodeManyBigInts      thrpt    5   1799.242 ±  8.659  ops/s
[info] NumberParsingBenchmark.decodeManyDoubles      thrpt    5  10451.115 ± 19.050  ops/s
[info] NumberParsingBenchmark.decodeManyLongs        thrpt    5  13264.556 ± 55.887  ops/s
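
To read the two tables side by side, a small helper along these lines can compute the relative change per benchmark. The names (`ThroughputDelta`, `Score`, `percentChange`, `clearlyDifferent`) are made up for this illustration, and treating JMH's ± column as a plain range is a rough reading rather than a proper significance test.

```scala
// Hypothetical helper for comparing the "before" and "after" JMH rows above.
object ThroughputDelta {

  // One JMH result row: mean throughput and reported error margin, both in ops/s.
  final case class Score(mean: Double, error: Double) {
    def low: Double  = mean - error
    def high: Double = mean + error
  }

  // Relative change of the mean in percent (positive means higher throughput).
  def percentChange(before: Score, after: Score): Double =
    (after.mean - before.mean) / before.mean * 100.0

  // True when the [mean - error, mean + error] ranges do not overlap.
  def clearlyDifferent(before: Score, after: Score): Boolean =
    after.low > before.high || after.high < before.low

  // Scores copied from the two tables above.
  val rows: List[(String, (Score, Score))] = List(
    "decodeManyBigDecimals" -> (Score(1520.022, 87.924), Score(1478.384, 62.771)),
    "decodeManyBigInts"     -> (Score(1773.855, 12.936), Score(1799.242, 8.659)),
    "decodeManyDoubles"     -> (Score(10004.552, 28.427), Score(10451.115, 19.050)),
    "decodeManyLongs"       -> (Score(12490.381, 133.195), Score(13264.556, 55.887))
  )

  def main(args: Array[String]): Unit =
    rows.foreach { case (name, (before, after)) =>
      val change = percentChange(before, after)
      val clear  = clearlyDifferent(before, after)
      println(f"$name%-22s $change%+6.2f%%  rangesDisjoint=$clear")
    }
}
```

With these inputs the Doubles and Longs rows come out roughly 4.5% and 6.2% higher, while the BigDecimals ranges overlap, so that row does not show a clear change in either direction.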

To confirm, I wrote a JMH benchmark in the style of the existing benchmark class, this time decoding an array of booleans:

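import java.util.concurrent.TimeUnit
import io.circe
import io.circe.jawn
import org.openjdk.jmh.annotations._

@State(Scope.Thread)
@BenchmarkMode(Array(Mode.Throughput))
@OutputTimeUnit(TimeUnit.SECONDS)
class ArraysBenchmark {
  // Number of elements in the JSON array under test
  val count = 1000

  // JSON text of the form "[true, true, ..., true]"
  val inputBooleans = "[" + List.fill(count)(true).mkString(", ") + "]"

  @Benchmark
  def decodeManyBoolean(): Either[circe.Error, Array[Boolean]] = jawn.decode[Array[Boolean]](inputBooleans)
}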

and the effects are quite dramatic:

# before
[info] Benchmark                           Mode  Cnt      Score     Error  Units
[info] ArraysBenchmark.decodeManyBoolean  thrpt    5  38027.942 ± 273.341  ops/s

# after
[info] Benchmark                           Mode  Cnt      Score     Error  Units
[info] ArraysBenchmark.decodeManyBoolean  thrpt    5  56613.658 ± 293.886  ops/s

codecov-commenter · 2021-04-17T22:39:50Z

Codecov Report

❗ No coverage uploaded for pull request base (master@70171ba).
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##             master    #1719   +/-   ##
=========================================
  Coverage          ?   86.91%           
=========================================
  Files             ?       59           
  Lines             ?     2415           
  Branches          ?      134           
=========================================
  Hits              ?     2099           
  Misses            ?      316           
  Partials          ?        0

Impacted Files                                          Coverage Δ
...re/shared/src/main/scala/io/circe/SeqDecoder.scala   98.07% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 70171ba...96a7f29.

Improve SeqDecoder performance

96a7f29

jordiolivares requested a review from travisbrown as a code owner April 17, 2021 22:29

Merge branch 'series/0.14.x' into improve-performance

9cbbae4

jordiolivares requested review from zarthross and hamnis as code owners April 22, 2024 11:03

Merge branch 'series/0.14.x' into improve-performance

91dbb36

hamnis approved these changes May 15, 2024

hamnis merged commit ea24df4 into circe:series/0.14.x May 15, 2024
18 checks passed
