
Define configurable max scale for exponential histogram aggregation #2987

Closed
jack-berg opened this issue Nov 28, 2022 · 6 comments · Fixed by #3017

@jack-berg (Member)
The exponential histogram aggregation defines a max scale without giving concrete advice on what "reasonable" minimum and maximum scales might be.

The max scale is particularly important because, as I've recently found, computing the bucket index with the Logarithm Function is ~9x slower than the bit manipulation technique. The logarithm function is required for positive scales, while the bit manipulation version can be used at scales less than or equal to zero.
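For concreteness, a minimal sketch of the two indexing strategies (illustrative only, not the code of any SDK; the correction needed when a value falls exactly on a bucket boundary is omitted, and subnormals are ignored):

```java
final class BucketIndexers {
    // Positive scales: index via the logarithm (the "Logarithm Function" approach).
    static int logIndex(double value, int scale) { // value > 0, finite
        double scaleFactor = Math.scalb(1.0 / Math.log(2.0), scale); // 2^scale / ln(2)
        return (int) Math.ceil(Math.log(value) * scaleFactor) - 1;
    }

    // Scales <= 0: index via bit manipulation of the IEEE-754 exponent, no log call.
    static int bitIndex(double value, int scale) { // value > 0, finite, scale <= 0
        long bits = Double.doubleToRawLongBits(value);
        int exponent = (int) ((bits >>> 52) & 0x7FF) - 1023; // unbiased base-2 exponent
        return exponent >> -scale; // arithmetic shift handles negative exponents
    }
}
```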

I suggest we:

  1. Make min and max scale configurable. High-performance applications may want to avoid the performance hit of indexing using the log function, as well as unnecessary rescaling.
  2. Provide guidance on the default min and max scale.
  3. Provide guidance on the bounds for min and max scale. Users would not be allowed to configure min and max scale outside of these bounds.

cc @jmacd @jsuereth

@jack-berg jack-berg added the spec:metrics Related to the specification/metrics directory label Nov 28, 2022
@oertl commented Nov 29, 2022

For positive scales, there are significantly faster techniques using lookup tables (mentioned here) that avoid the logarithm. As a drawback, the size of the lookup table scales with 2^scale, which requires limiting the scale to a value of about 10. However, this corresponds to a very small relative error of 2^(1/2^scale) - 1 = 2^(1/2^10) - 1 ≈ 0.00068 = 0.068%, which is sufficient for most practical cases.
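To make the lookup-table idea concrete, here is a hedged sketch (hypothetical class, not taken from NrSketch, DynaHist, or any OpenTelemetry SDK): an oversampled table maps the linear mantissa slice of a value to a candidate sub-bucket, and a single comparison corrects it. It assumes lower-inclusive bucket boundaries and ignores subnormals; the upper-inclusive convention of the OTLP data model would need a small adjustment.

```java
final class LookupIndexer {
    private final int scale;
    private final double[] boundaries;  // 2^(k / 2^scale) for k = 0 .. 2^scale
    private final int[] sliceToBucket;  // oversampled: 2 linear slices per sub-bucket

    LookupIndexer(int scale) { // intended for 1 <= scale <= ~10
        this.scale = scale;
        int subBuckets = 1 << scale;
        boundaries = new double[subBuckets + 1];
        for (int k = 0; k <= subBuckets; k++) {
            boundaries[k] = Math.pow(2.0, (double) k / subBuckets);
        }
        // 2 * subBuckets equal slices of [1, 2): each slice contains at most one
        // boundary, so the stored candidate is off by at most one.
        sliceToBucket = new int[2 * subBuckets];
        int k = 0;
        for (int j = 0; j < sliceToBucket.length; j++) {
            double sliceStart = 1.0 + (double) j / sliceToBucket.length;
            while (k + 1 <= subBuckets && boundaries[k + 1] <= sliceStart) {
                k++;
            }
            sliceToBucket[j] = k;
        }
    }

    int index(double value) { // value > 0, finite, not subnormal
        long bits = Double.doubleToRawLongBits(value);
        int exponent = (int) ((bits >>> 52) & 0x7FF) - 1023;
        double mantissa = Double.longBitsToDouble(
            (bits & 0x000FFFFFFFFFFFFFL) | 0x3FF0000000000000L); // in [1, 2)
        int slice = (int) ((mantissa - 1.0) * sliceToBucket.length);
        int sub = sliceToBucket[slice];
        if (sub + 1 < boundaries.length && mantissa >= boundaries[sub + 1]) {
            sub++; // at most one correction needed
        }
        return (exponent << scale) + sub;
    }
}
```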

There was a discussion about limiting the scale in this thread open-telemetry/opentelemetry-proto#322 (comment) in which I favored an upper limit, not only because the lookup table approach is faster, but also because it would have the potential to standardize the mapping of values to bucket indices, which is not possible with the logarithm due to platform-dependent numerical errors.

@jack-berg (Member, Author)

> For positive scales, there are significantly faster techniques using lookup tables (mentioned here) that avoid the logarithm.

The text mentions that the logarithm method is preferred because it's "nearly as fast and accurate as the lookup table approach". Is there a reference implementation with a performance comparison lying around anywhere?

If the lookup table approach requires limiting scale to ~10, that seems to further indicate that configuring the max scale bounds would be useful.

@oertl commented Nov 30, 2022

@jack-berg
A while ago, I tested the performance of various exponential histogram implementations in Java on my local machine.

The results showed that histograms using a logarithm for indexing (NrSketchSimpleSubBucketLogIndexerRecordingSpeedBenchmark, DynaHistDynamicLogOptimalRecordingSpeedBenchmark, DDSketchPaginatedLogarithmicRecordingSpeedBenchmark) require at least 30 ms to record 1M values, while those using the lookup table approach (NrSketchSimpleSubBucketLookupIndexerRecordingSpeedBenchmark, DynaHistDynamicOpenTelemetryExponentialBucketRecordingSpeedBenchmark) are significantly faster; the fastest of them (DynaHistDynamicOpenTelemetryExponentialBucketRecordingSpeedBenchmark) can even do it in less than 10 ms.

Since adding a value to the histogram also involves range-checks and reallocations of buckets, the indexing costs are only part of it. Therefore, I think the results might be consistent with the 9x performance difference you have reported.

For devices without floating point unit (FPU), the difference could be even more significant.

@jack-berg (Member, Author)

I propose adding a MaxScale configuration parameter. I don't have strong opinions about the default value, but the Java and .NET implementations are currently hardcoded to 20, so that seems like a good starting point.

Implementations that care to optimize can use the log strategy for scales greater than 10, the lookup strategy for scales 10 >= x > 0, and the bit-shifting strategy for scales less than or equal to zero.
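A hedged sketch of that per-scale selection (names invented for this sketch; not an OpenTelemetry SDK API):

```java
final class IndexingStrategies {
    enum Strategy { LOGARITHM, LOOKUP_TABLE, EXPONENT_SHIFT }

    // Mirrors the breakdown above: log above ~10, lookup table for small positive
    // scales, bit manipulation at zero and below.
    static Strategy strategyFor(int scale) {
        if (scale > 10) {
            return Strategy.LOGARITHM;     // a lookup table of ~2^scale entries would be too large
        }
        if (scale > 0) {
            return Strategy.LOOKUP_TABLE;  // log-free, table size grows with 2^scale
        }
        return Strategy.EXPONENT_SHIFT;    // bit manipulation on the IEEE-754 exponent
    }
}
```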

Users who care about performance can choose a MaxScale that minimizes downscaling and permits a bucket indexing strategy faster than the log approach.

@open-telemetry/specs-metrics-approvers WDYT?

@jmacd (Contributor) commented Dec 13, 2022

I think it is reasonable to support a MaxScale parameter, though I also recommend looking into @oertl's technique (or similar by @yzhuge). I support a MaxScale parameter because it allows the user to avoid filling memory with high-resolution data when they would prefer low resolution.

@jmacd (Contributor) commented Dec 13, 2022

As a practical example, consider a case where OTLP histograms are being translated into OpenHistogram (Circllhist). In that histogram, which has fixed resolution, there are values of the OTLP scale parameter that are senselessly high resolution, in which case a user would prefer to limit memory growth, staying below MaxSize instead of filling space with higher-than-necessary precision.
