benchmarks: add tracing-noop comparison; restructure to produce line graphs #194

davidbarsky · 2024-01-26T18:38:07Z

Hi! I'm a tracing maintainer. Overall, I'm really excited to see people working in this space! I have a few disorganized thoughts:

Congrats on creating a new tracing/instrumentation library! I'm sure that we can learn a bunch from one another.
I'm a little worried that the benchmarks are comparing apples to oranges, since minitrace is performing the equivalent of a noop. When I made tracing do the same thing, I found that tracing is a little faster than minitrace. When I compared compare/Tokio Tracing/1000 to compare/minitrace/1000 (after adding a no-op tracing subscriber, but before renaming it in this PR), tracing clocked in at 4.7097μs, while minitrace clocked in at 26.920μs. I'm pretty sure that this can be chalked up to different priorities/approaches, in that minitrace opts to off-load spans to a background thread by default, while tracing does not.
There's lots of low-hanging fruit in tracing-opentelemetry and the Rust opentelemetry crates. Removing the usage of Box and Arc inside of opentelemetry and having a more efficient Registry in tracing-subscriber could go a long way in closing the performance gap you've observed. I'm not even talking about moving span handling/creation into a background thread for tracing-opentelemetry, but that's certainly in the cards to reduce latency.
I'm personally extremely sympathetic to the "no levels in spans" stance that y'all have, and while I don't think tracing can ever get there, I think we can make a default level for spans.
I'd like to emphasize that there's plenty of space for libraries focused on distributed tracing, which tracing isn't necessarily treating as its top priority.
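
To make the "different priorities/approaches" point concrete, here is a minimal std-only sketch of the two strategies: handling a span on the calling thread versus handing it to a background thread. It is not code from tracing or minitrace; `SpanRecord`, `record_inline`, and `Offloader` are hypothetical names invented for illustration, and a plain `std::sync::mpsc` channel stands in for whatever each library actually does.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical finished-span record; not a type from either library.
#[derive(Debug, Clone, PartialEq)]
pub struct SpanRecord {
    pub name: &'static str,
    pub duration_ns: u64,
}

/// Strategy A: handle the span on the calling thread.
/// With a handler that does nothing, this is what a no-op setup measures.
pub fn record_inline(span: SpanRecord, handler: &mut impl FnMut(SpanRecord)) {
    handler(span);
}

/// Strategy B: send the span to a background thread over a channel.
/// The caller pays for the send; the remaining work happens off-thread.
pub struct Offloader {
    tx: Sender<SpanRecord>,
    worker: JoinHandle<Vec<SpanRecord>>,
}

impl Offloader {
    pub fn start() -> Self {
        let (tx, rx) = mpsc::channel::<SpanRecord>();
        // The worker collects spans until every sender has been dropped.
        let worker = thread::spawn(move || rx.into_iter().collect::<Vec<_>>());
        Offloader { tx, worker }
    }

    pub fn record(&self, span: SpanRecord) {
        // A send error only means the worker is gone; ignore it in this sketch.
        let _ = self.tx.send(span);
    }

    /// Close the channel and wait for the worker to drain it.
    pub fn finish(self) -> Vec<SpanRecord> {
        let Offloader { tx, worker } = self;
        drop(tx);
        worker.join().expect("worker thread panicked")
    }
}

fn main() {
    let mut handled = 0usize;
    record_inline(
        SpanRecord { name: "inline", duration_ns: 10 },
        &mut |_span| handled += 1,
    );

    let offloader = Offloader::start();
    offloader.record(SpanRecord { name: "offloaded", duration_ns: 20 });
    let collected = offloader.finish();

    println!("inline spans handled: {handled}, offloaded spans: {}", collected.len());
}
```

A benchmark that times only the calling thread sees the full cost of strategy A but only the hand-off cost of strategy B, which is one reason the two designs are hard to compare with a single number.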

The new structure of the benchmarks allows for creating line charts like the following:

[line chart image]

On my M1 Mac, running the benchmarks produces these results:

cargo bench --bench compare

❯ cargo bench --bench compare
   Compiling minitrace v0.6.3 (/Users/dbarsky/Developer/minitrace-rust/minitrace)
    Finished bench [optimized] target(s) in 26.54s
     Running benches/compare.rs (target/release/deps/compare-78610f33505b00ac)
Gnuplot not found, using plotters backend
Comparison/minitrace-noop/1
                        time:   [219.40 ns 221.13 ns 222.85 ns]
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe
Comparison/tokio/tracing-noop/1
                        time:   [20.680 ns 20.732 ns 20.788 ns]
Found 14 outliers among 100 measurements (14.00%)
  4 (4.00%) low mild
  7 (7.00%) high mild
  3 (3.00%) high severe
Comparison/tokio/tracing-otel/1
                        time:   [1.4626 µs 1.4668 µs 1.4722 µs]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
Comparison/rusttracing/1
                        time:   [644.82 ns 647.31 ns 650.24 ns]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
Comparison/minitrace-noop/10
                        time:   [491.78 ns 495.12 ns 498.75 ns]
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
Comparison/tokio/tracing-noop/10
                        time:   [106.78 ns 107.09 ns 107.43 ns]
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) high mild
  1 (1.00%) high severe
Comparison/tokio/tracing-otel/10
                        time:   [8.2320 µs 8.2961 µs 8.3674 µs]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
Comparison/rusttracing/10
                        time:   [1.7088 µs 1.7154 µs 1.7233 µs]
Comparison/minitrace-noop/100
                        time:   [3.0395 µs 3.0543 µs 3.0702 µs]
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low severe
  5 (5.00%) high mild
Comparison/tokio/tracing-noop/100
                        time:   [977.17 ns 981.52 ns 985.87 ns]
Found 16 outliers among 100 measurements (16.00%)
  11 (11.00%) high mild
  5 (5.00%) high severe
Comparison/tokio/tracing-otel/100
                        time:   [77.290 µs 78.145 µs 78.890 µs]
Comparison/rusttracing/100
                        time:   [11.685 µs 11.712 µs 11.741 µs]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
Comparison/minitrace-noop/1000
                        time:   [26.265 µs 26.501 µs 26.785 µs]
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  3 (3.00%) high severe
Comparison/tokio/tracing-noop/1000
                        time:   [9.6475 µs 9.6658 µs 9.6882 µs]
Comparison/tokio/tracing-otel/1000
                        time:   [753.69 µs 758.48 µs 763.77 µs]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
Comparison/rusttracing/1000
                        time:   [107.25 µs 107.67 µs 108.21 µs]
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) high mild
  5 (5.00%) high severe
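
As a reading aid for the output above, the following sketch computes per-span cost and slowdown relative to tracing-noop from the reported median times (the middle value of each `time:` triple, converted to nanoseconds). The helper names `slowdown` and `per_span_ns` are made up for this example. At 1000 spans, for instance, the minitrace-noop median is about 2.7 times the tracing-noop median.

```rust
/// Median times in nanoseconds, copied from the `cargo bench` output above.
/// Columns: span count, minitrace-noop, tracing-noop, tracing-otel, rusttracing.
const MEDIANS_NS: [(u32, f64, f64, f64, f64); 4] = [
    (1, 221.13, 20.732, 1466.8, 647.31),
    (10, 495.12, 107.09, 8296.1, 1715.4),
    (100, 3054.3, 981.52, 78145.0, 11712.0),
    (1000, 26501.0, 9665.8, 758480.0, 107670.0),
];

/// How many times slower `other_ns` is than `baseline_ns`.
fn slowdown(other_ns: f64, baseline_ns: f64) -> f64 {
    other_ns / baseline_ns
}

/// Average cost of one span when `total_ns` covers `count` spans.
fn per_span_ns(total_ns: f64, count: u32) -> f64 {
    total_ns / f64::from(count)
}

fn main() {
    for &(count, minitrace, tracing_noop, otel, rusttracing) in MEDIANS_NS.iter() {
        println!(
            "{count:>4} spans: tracing-noop {:.2} ns/span; slowdown vs tracing-noop: \
             minitrace-noop {:.1}x, tracing-otel {:.1}x, rusttracing {:.1}x",
            per_span_ns(tracing_noop, count),
            slowdown(minitrace, tracing_noop),
            slowdown(otel, tracing_noop),
            slowdown(rusttracing, tracing_noop),
        );
    }
}
```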

andylokandy · 2024-01-26T19:40:58Z

Thank you for the thoughtful perspective! In fact, minitrace and tokio-tracing do not overlap much. minitrace is designed for performance-critical systems like databases, to improve observability for end users, while tokio-tracing is designed for application developers to debug and produce structured logs. This is reflected in the design: tokio-tracing is more flexible and powerful in functionality, while minitrace has a relatively prescribed usage.

I'm a little worried that the benchmarks are comparing apples to oranges, since minitrace is performing the equivalent of a noop. When I made tracing do the same thing, I found that tracing is a little faster than minitrace. When I compared compare/Tokio Tracing/1000 to compare/minitrace/1000 (after adding a no-op tracing subscriber, but before renaming it in this PR), tracing clocked in at 4.7097μs, while minitrace clocked in at 26.920μs. I'm pretty sure that this can be chalked up to different priorities/approaches, in that minitrace opts to off-load spans to a background thread by default, while tracing does not.

Thanks for adding the new benchmark. However, I think it should be moved to a new group called compare_noop. In the benchmark before this PR, minitrace is not a noop: a real-world, working minitrace performs the same as it does in that benchmark. Moreover, in the compare_noop group, minitrace should be separated from the other minitrace benchmarks (in a new crate or in a different run) and must not call set_reporter; leaving the reporter unset is the true noop mode of minitrace.
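
One plausible reading of the "new crate or different run" request is that reporter state is process-wide: once any benchmark in the process installs a reporter, later benchmarks in that process no longer run in noop mode. The sketch below models that behaviour with a global `OnceLock`. It is an assumption-laden illustration, not minitrace's implementation; `install_reporter`, `is_noop`, `record_span`, and `discard` are hypothetical names, and a real `set_reporter` may behave differently (for example, it may allow replacing the reporter).

```rust
use std::sync::OnceLock;

/// Hypothetical reporter: a plain function that receives a span name.
type Reporter = fn(&str);

/// Process-wide reporter slot (assumed for this sketch). Empty means noop mode.
static REPORTER: OnceLock<Reporter> = OnceLock::new();

/// Hypothetical analogue of a global `set_reporter`.
/// Returns false if a reporter was already installed in this process.
pub fn install_reporter(reporter: Reporter) -> bool {
    REPORTER.set(reporter).is_ok()
}

/// True while no reporter has been installed in this process.
pub fn is_noop() -> bool {
    REPORTER.get().is_none()
}

/// Records a span; returns true only if a reporter actually received it.
pub fn record_span(name: &str) -> bool {
    match REPORTER.get() {
        Some(report) => {
            report(name);
            true
        }
        // Noop mode: nothing is reported, so the call is as cheap as it gets.
        None => false,
    }
}

/// A reporter that throws every span away.
pub fn discard(_name: &str) {}

fn main() {
    // A noop benchmark has to run before anything installs a reporter...
    println!("noop before install: {}", is_noop());
    println!("span reported: {}", record_span("noop-span"));

    // ...because after this point the whole process has left noop mode.
    install_reporter(discard);
    println!("noop after install: {}", is_noop());
    println!("span reported: {}", record_span("reported-span"));
}
```

Under this assumption, a reporter that discards its input is still not the same as having no reporter at all, which is the distinction the compare_noop group is meant to capture.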

coveralls · 2024-01-27T03:41:54Z

Pull Request Test Coverage Report for Build 7672066243

0 of 0 changed or added relevant lines in 0 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage remained the same at 79.458%

Totals
Change from base Build 7664443160:	0.0%
Covered Lines:	1702
Relevant Lines:	2142

💛 - Coveralls

benchmarks: add tracing-noop comparison; restructure to produce line …

0368c24

…graphs

andylokandy mentioned this pull request May 2, 2024

Proposal to Adopt Tokio Tracing as the OTel Tracing API open-telemetry/opentelemetry-rust#1689

Open
