Improve SeqDecoder performance by removing overhead #1719

Merged
merged 3 commits into from
May 15, 2024

Conversation

jordiolivares
Contributor

Hi, I was fiddling around to see if there was some way of making Circe faster and found that the current SeqDecoder has a non-trivial amount of overhead.

I believe this is mostly due to cursor object creation and destruction, but I can't be sure, as those don't show up in the profiling flame graph that I used.
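To make the suspected overhead concrete, here is a minimal sketch contrasting two ways of reading a JSON array of booleans with circe's public API: stepping through sibling cursors versus reading the underlying values directly. This is not the diff from this PR and not how SeqDecoder is implemented; the object and method names (`CursorOverheadSketch`, `viaCursors`, `viaValues`) are made up for illustration, and the snippet assumes circe-core is on the classpath.

```scala
import io.circe.{ACursor, Decoder, DecodingFailure, HCursor, Json}

// Hypothetical illustration only; not the SeqDecoder implementation.
object CursorOverheadSketch {

  // Cursor-based walk: downArray and every call to right return a new cursor object,
  // so an array of n elements allocates n short-lived cursors.
  def viaCursors(c: HCursor): Decoder.Result[Vector[Boolean]] =
    if (!c.value.isArray) Left(DecodingFailure("expected a JSON array", c.history))
    else {
      val out = Vector.newBuilder[Boolean]
      var failure: Option[DecodingFailure] = None
      var current: ACursor = c.downArray // fails immediately for an empty array
      while (failure.isEmpty && current.succeeded) {
        current.as[Boolean] match {
          case Right(b) => out += b
          case Left(e)  => failure = Some(e)
        }
        current = current.right // moves to the next sibling, or fails at the end
      }
      failure.toLeft(out.result())
    }

  // Direct walk: fetch the Vector[Json] once and inspect each value without a cursor.
  // The trade-off is a less precise error history on failure.
  def viaValues(c: HCursor): Decoder.Result[Vector[Boolean]] =
    c.value.asArray match {
      case None => Left(DecodingFailure("expected a JSON array", c.history))
      case Some(values) =>
        val out = Vector.newBuilder[Boolean]
        var failure: Option[DecodingFailure] = None
        val it = values.iterator
        while (failure.isEmpty && it.hasNext) {
          it.next().asBoolean match {
            case Some(b) => out += b
            case None    => failure = Some(DecodingFailure("expected a boolean", c.history))
          }
        }
        failure.toLeft(out.result())
    }
}
```

Both functions return the same result for well-formed input; they differ only in how many intermediate objects are allocated per element and in how much positional detail a failure carries.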

To compare, I ran sbt 'benchmark/jmh:run -f1 io.circe.benchmark.NumberParsingBenchmark.decodeMany*' before and after the optimization. Decoding arrays of primitives trends faster (Doubles and Longs gain roughly 4% and 6%), while BigInts and BigDecimals stay roughly flat:

# before
[info] Benchmark                                      Mode  Cnt      Score     Error  Units
[info] NumberParsingBenchmark.decodeManyBigDecimals  thrpt    5   1520.022 ±  87.924  ops/s
[info] NumberParsingBenchmark.decodeManyBigInts      thrpt    5   1773.855 ±  12.936  ops/s
[info] NumberParsingBenchmark.decodeManyDoubles      thrpt    5  10004.552 ±  28.427  ops/s
[info] NumberParsingBenchmark.decodeManyLongs        thrpt    5  12490.381 ± 133.195  ops/s

# after
[info] Benchmark                                      Mode  Cnt      Score    Error  Units
[info] NumberParsingBenchmark.decodeManyBigDecimals  thrpt    5   1478.384 ± 62.771  ops/s
[info] NumberParsingBenchmark.decodeManyBigInts      thrpt    5   1799.242 ±  8.659  ops/s
[info] NumberParsingBenchmark.decodeManyDoubles      thrpt    5  10451.115 ± 19.050  ops/s
[info] NumberParsingBenchmark.decodeManyLongs        thrpt    5  13264.556 ± 55.887  ops/s

Just to confirm, I made a JMH test in the style of the previous benchmark class with booleans:

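import java.util.concurrent.TimeUnit

import io.circe
import io.circe.jawn
import org.openjdk.jmh.annotations._

@State(Scope.Thread)
@BenchmarkMode(Array(Mode.Throughput))
@OutputTimeUnit(TimeUnit.SECONDS)
class ArraysBenchmark {
  // Number of elements in the benchmarked JSON array.
  val count = 1000

  // JSON input of the form "[true, true, ..., true]".
  val inputBooleans = "[" + List.fill(count)(true).mkString(", ") + "]"

  // Parses and decodes the whole array on each invocation.
  @Benchmark
  def decodeManyBoolean(): Either[circe.Error, Array[Boolean]] = jawn.decode[Array[Boolean]](inputBooleans)
}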

and the effect is quite dramatic, roughly a 49% increase in throughput:

# before
[info] Benchmark                           Mode  Cnt      Score     Error  Units
[info] ArraysBenchmark.decodeManyBoolean  thrpt    5  38027.942 ± 273.341  ops/s

# after
[info] Benchmark                           Mode  Cnt      Score     Error  Units
[info] ArraysBenchmark.decodeManyBoolean  thrpt    5  56613.658 ± 293.886  ops/s
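
For readers who want to check the numbers, the following sketch computes the relative throughput change for each benchmark and whether the before and after "score ± error" intervals are disjoint. The scores and errors are copied from the JMH output above; the names `BenchmarkDelta`, `Run`, `changePercent` and `intervalsSeparate` are made up for this illustration. A disjoint interval is only a rough signal, not a formal significance test.

```scala
object BenchmarkDelta {
  // One benchmark with its before/after throughput (ops/s) and the reported JMH error.
  final case class Run(name: String, before: Double, beforeErr: Double, after: Double, afterErr: Double) {
    // Relative change in throughput, in percent (positive means faster).
    def changePercent: Double = (after - before) / before * 100.0

    // True when the two [score - error, score + error] intervals do not overlap.
    def intervalsSeparate: Boolean =
      after - afterErr > before + beforeErr || after + afterErr < before - beforeErr
  }

  // Values taken verbatim from the benchmark output in this PR description.
  val runs: List[Run] = List(
    Run("decodeManyBigDecimals", 1520.022, 87.924, 1478.384, 62.771),
    Run("decodeManyBigInts", 1773.855, 12.936, 1799.242, 8.659),
    Run("decodeManyDoubles", 10004.552, 28.427, 10451.115, 19.050),
    Run("decodeManyLongs", 12490.381, 133.195, 13264.556, 55.887),
    Run("decodeManyBoolean", 38027.942, 273.341, 56613.658, 293.886)
  )

  def main(args: Array[String]): Unit =
    runs.foreach { r =>
      println(f"${r.name}%-22s ${r.changePercent}%+7.2f%%  separated=${r.intervalsSeparate}")
    }
}
```

On these inputs the BigDecimals intervals overlap, so that small drop is within the reported error, while the other four benchmarks show disjoint intervals.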

@codecov-commenter

codecov-commenter commented Apr 17, 2021

Codecov Report

❗ No coverage uploaded for pull request base (master@70171ba).
The diff coverage is 100.00%.


@@            Coverage Diff            @@
##             master    #1719   +/-   ##
=========================================
  Coverage          ?   86.91%           
=========================================
  Files             ?       59           
  Lines             ?     2415           
  Branches          ?      134           
=========================================
  Hits              ?     2099           
  Misses            ?      316           
  Partials          ?        0           
Impacted Files Coverage Δ
...re/shared/src/main/scala/io/circe/SeqDecoder.scala 98.07% <100.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 70171ba...96a7f29.

@hamnis hamnis merged commit ea24df4 into circe:series/0.14.x May 15, 2024
18 checks passed