Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add API support for bound instruments #1444

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

KallDrexx
Copy link
Contributor

Fixes #
Design discussion issue (if applicable) #1374

Changes

This PR adds API support for bound instruments for Counter and UpDownCounter. Bound instruments are counters which always change the value for a specific and predefined attribute set. The focus on this change is creating the initial API surface for bound instruments, and the mechanisms to allow non-bound instruments to become bound (since there is no direct path from aggregates to counters, it's all through closures).

To facilitate counters creating bound instruments that can refer back to their source aggregates, aggregates return a BoundMeasureGenerator<T> closure that is provided alongside the Measure<T> for the aggregate. This closure can then be invoked in order to return a BoundMeasure<T>, which provides the mechanism to invoke a measure for the cached attribute set.

This PR is not performance focused. It should not cause regressions but using bound instruments from this PR may not cause increased throughput. Performance enhancements that take advantage of bound instruments will come in a follow-up PR once we have an agreed upon API and generation path.

Benchmarks/Stress Tests

Main

Counter_Add_Sorted      time:   [717.94 ns 723.06 ns 729.17 ns]
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

Counter_Add_Unsorted    time:   [711.21 ns 714.89 ns 719.15 ns]
Found 12 outliers among 100 measurements (12.00%)
  8 (8.00%) high mild
  4 (4.00%) high severe
stress test: metrics
Number of threads: 6
Throughput: 6,367,800 iterations/sec
Throughput: 6,581,800 iterations/sec
Throughput: 6,736,800 iterations/sec
Throughput: 6,720,200 iterations/sec
Throughput: 6,804,600 iterations/sec

PR

Counter_Add_Sorted      time:   [712.98 ns 715.33 ns 717.99 ns]
                        change: [-0.7779% +0.3569% +1.5923%] (p = 0.60 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

Counter_Add_Unsorted    time:   [705.13 ns 707.71 ns 710.82 ns]
                        change: [-2.0962% -0.9670% +0.0233%] (p = 0.09 > 0.05)
                        No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
  5 (5.00%) high mild
  8 (8.00%) high severe

Counter_Bounded         time:   [133.96 ns 134.82 ns 135.86 ns]
Found 12 outliers among 100 measurements (12.00%)
  3 (3.00%) high mild
  9 (9.00%) high severe
stress test: metrics
Number of threads: 6
Throughput: 6,173,200 iterations/sec
Throughput: 6,381,600 iterations/sec
Throughput: 6,250,000 iterations/sec
Throughput: 6,434,000 iterations/sec
stress test: metrics_bounded
Number of threads: 6
Throughput: 6,473,400 iterations/sec
Throughput: 6,459,600 iterations/sec
Throughput: 6,339,200 iterations/sec
Throughput: 6,369,200 iterations/sec

Merge requirement checklist

  • CONTRIBUTING guidelines followed
  • Unit tests added/updated (if applicable)
  • Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
  • Changes in public API reviewed (if applicable)

@KallDrexx KallDrexx requested a review from a team as a code owner December 14, 2023 22:06
@KallDrexx
Copy link
Contributor Author

The api will change slightly when/if #1421 get in to take an attribute set instead of a slice of key value pairs, but I didn't want to couple both together. Bounded performance might end up getting a bit better since cloning() becomes cheaper, but performance isn't the point of this PR :)

Copy link

codecov bot commented Dec 14, 2023

Codecov Report

Attention: 73 lines in your changes are missing coverage. Please review.

Comparison is base (8688e7e) 61.8% compared to head (cd3be86) 62.2%.
Report is 4 commits behind head on main.

Files Patch % Lines
opentelemetry/src/metrics/noop.rs 0.0% 12 Missing ⚠️
...-sdk/src/metrics/internal/exponential_histogram.rs 52.6% 9 Missing ⚠️
...pentelemetry-sdk/src/metrics/internal/histogram.rs 0.0% 9 Missing ⚠️
...entelemetry-sdk/src/metrics/internal/last_value.rs 0.0% 9 Missing ⚠️
opentelemetry-sdk/src/metrics/internal/sum.rs 50.0% 9 Missing ⚠️
opentelemetry-sdk/src/metrics/instrument.rs 80.5% 7 Missing ⚠️
...lemetry/src/metrics/instruments/up_down_counter.rs 46.1% 7 Missing ⚠️
opentelemetry/src/metrics/instruments/counter.rs 60.0% 4 Missing ⚠️
...pentelemetry-sdk/src/metrics/internal/aggregate.rs 95.3% 3 Missing ⚠️
opentelemetry-sdk/src/metrics/pipeline.rs 89.4% 2 Missing ⚠️
... and 2 more
Additional details and impacted files
@@           Coverage Diff           @@
##            main   #1444     +/-   ##
=======================================
+ Coverage   61.8%   62.2%   +0.3%     
=======================================
  Files        144     144             
  Lines      19810   20152    +342     
=======================================
+ Hits       12256   12538    +282     
- Misses      7554    7614     +60     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -10,6 +10,15 @@ use std::{any::Any, convert::TryFrom};
pub trait SyncCounter<T> {
/// Records an increment to the counter.
fn add(&self, value: T, attributes: &[KeyValue]);

/// Creates a child counter that's bound to a specific attribute set
fn bind(&self, attributes: &[KeyValue]) -> Arc<dyn BoundSyncCounter<T> + Send + Sync>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As Bound Instrument is not in specs (yet), should we have them under the feature flag (otel_unstable)? Not sure if this is going to add more noise with cfg macros across.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think as long as on the API side we mark it then we should be ok ?
I don't believe we need to add everywhere. Mainly on the pub interfaces of API and SDK.

This is where having specific feature flags is useful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mainly on the pub interfaces of API and SDK.

Agree, that's what we followed for sync gauge support in #1410 - only the pub interfaces are encapsulated within the feature flag.


/// Creates a child counter that's bound to a specific attribute set
fn bind(&self, attributes: &[KeyValue]) -> Arc<dyn BoundSyncCounter<T> + Send + Sync>;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how relevant is this for the current implementation, but the earlier bound instrument specs also had an unbind() method to release (any) allocated resources - https://github.com/open-telemetry/opentelemetry-specification/blob/v1.1.0/specification/metrics/api.md#bound-instrument-calling-convention

As a consequence of their performance advantage, bound instruments also consume resources in the SDK. Bound instruments MUST support an Unbind() method for users to indicate they are finished with the binding and release the associated resources. Note that Unbind() does

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw that in the spec, but figured we didn't need it in Rust because UnBind() is no different than drop.

Copy link
Contributor

@hdost hdost Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this affect the Bounded Measure Generator being 'static ?
If at all.
Maybe a test is worthwhile.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'static is a lifetime, it's not the static keyword. A lifetime only defines the scope of how long a value can last for. In other words it defines how long a value can last, not how long it will last. Almost all owned values have a 'static lifetime because they can be passed to other threads and can be kept alive for the rest of the program's lifetime if you want. However, when they go out of scope they get drop() called, just like any other owned value.

fn do_stuff(counter: Counter<u64>) {
    // pre_do_stuff
    { 
       let bound_counter = counter.Bind(some_attributes);
       bound_counter.add(5);
    } // <-------------------------------------------------------- Drop is here
    // other stuff
}

Since the bound_counter variable goes out of scope on the Drop is here line, the compiler calls drop() for it. Since drop() doesn't have an explicit implementation, it just calls drop() on all of it's fields, which will drop the Arc, which will either modify the reference count, or drop the underlying bound measure if it's the last instance of it.

I'm not really sure what you would test for. Not only is this all built-in and is fundamental to the whole Rust lifetime system, there's no observable behavior in this PR (there is in the performance oriented PR I have ready to go, but not in this PR).

The only reason 'static is defined in this code is because the Arc is holding a trait, not a concrete object. That means someone could have implemented the BoundSyncCounter<T> trait on a shared reference, in which case it's not valid to pass around because the compiler would not be able to guarantee the shared reference doesn't outlive it's owner. Thus 'static is required to tell the compiler "this is safe to pass around in any context, as this can live up to the end of the application if the consumer wants".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fully utilizes the trait to make sure it's not getting dropped due to non-use: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=87650c8e735c22878046af4c5d567e69

Copy link
Contributor

@hdost hdost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like the general direction, kinda surprised not to see much of a performance gain over standard counters. I feel like (not related to this change) we have a lot of Arc spread through the code that maybe we could avoid. A bit of locking our locks so we can lock more locks.

@KallDrexx
Copy link
Contributor Author

KallDrexx commented Dec 19, 2023

Like the general direction, kinda surprised not to see much of a performance gain over standard counters. I feel like (not related to this change) we have a lot of Arc spread through the code that maybe we could avoid. A bit of locking our locks so we can lock more locks.

Arc contains no locking. It's an atomic number that tracks how many references there are. When you clone() it increases the atomic count, and when you drop() it decrements the atomic count (and if it's the last reference it drops the underlying value). There's no locking involved in cloning, dropping, or accessing an Arc, and a lock is only involved if you access an internal function of the held type that invokes a lock (e.g. a mutex held by the arc).

The reason this doesn't provide any performance benefits is partially because #1421 is not yet merged. Since we don't have that change in, we must do a hard clone of the attribute set every time a bound measure is called (and for each measure in the ResolvedMeasure. Without #1421 it's cloning the whole vector and doing extra allocations.

There's no performance benefit too because this PR has bound instruments as a thin wrapper around normal measurements. Once this PR is merged, the next PR fully takes advantage of bound instruments by creating an atomic that's incremented by the bound measure function. This allows the BoundCounter.add() calls to increment with zero mutex locking and zero hashmap lookups. That's what this PR is setting the stage for.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants