Add metrics support for Netty 4.x #3742

bclozel · 2023-04-04T19:52:47Z

This commit adds two new MeterBinder implementations for instrumenting
Netty 4.x: NettyAllocatorMetrics and NettyEventExecutorMetrics.

NettyAllocatorMetrics will instrument any ByteBufAllocatorMetricProvider
and gather information about heap/direct memory allocated; additional
metrics are provided for pooled allocators.

NettyEventExecutorMetrics will instrument EventExecutor (typically,
EventLoop instances) and count the number of pending tasks for each.

Metrics and tags are described in the NettyMeters class.

Closes gh-522

franz1981 · 2023-04-04T20:27:14Z

In case you are interested, it is possible to add these additional metrics out of Netty event loops: netty/netty#9080

Sadly I have never completed it, but it should be simple.

And the same here: netty/netty#11293 (comment)
The idea is to use a sentinel periodic task, per event loop, to measure how busy it is.
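
One way to read the sentinel idea is sketched below in plain Java, with a single-threaded `ScheduledExecutorService` standing in for a Netty event loop. `SentinelLagProbe` and its methods are hypothetical names for illustration, not Netty or Micrometer API, and the linked Netty discussion may describe a different mechanism. The sentinel is rescheduled with a fixed delay; any extra time between two runs is time the loop spent on other work.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: a single-threaded scheduler stands in for one event loop.
final class SentinelLagProbe {

    private final AtomicLong maxLagNanos = new AtomicLong();
    // The two fields below are only touched from the loop thread.
    private boolean started;
    private long lastRunNanos;

    // Schedules the sentinel on the loop. Time between two runs beyond the
    // configured period is recorded as lag, i.e. time the loop was busy.
    void attach(ScheduledExecutorService loop, long periodMillis) {
        long periodNanos = TimeUnit.MILLISECONDS.toNanos(periodMillis);
        loop.scheduleWithFixedDelay(() -> {
            long now = System.nanoTime();
            if (started) {
                long lag = Math.max(now - lastRunNanos - periodNanos, 0L);
                maxLagNanos.accumulateAndGet(lag, Math::max);
            }
            started = true;
            lastRunNanos = now;
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    long maxLagMillis() {
        return TimeUnit.NANOSECONDS.toMillis(maxLagNanos.get());
    }

    // Demo helper: blocks the stand-in loop and reports the lag the sentinel saw.
    static long measure(long blockMillis) {
        ScheduledExecutorService loop = Executors.newSingleThreadScheduledExecutor();
        try {
            SentinelLagProbe probe = new SentinelLagProbe();
            probe.attach(loop, 10);
            loop.execute(() -> sleepQuietly(blockMillis));
            sleepQuietly(blockMillis + 100);
            return probe.maxLagMillis();
        } finally {
            loop.shutdownNow();
        }
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Calling `SentinelLagProbe.measure(100)` blocks the stand-in loop for about 100 ms, so the reported lag should be roughly that long minus the 10 ms period. A real binder could expose such a value as a gauge per event loop.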

bclozel · 2023-04-04T20:28:23Z

This PR focuses on Netty 4.x. Netty 5.x is still in alpha, and we can add support for it at any time.
I'll add a few notes here to highlight important points regarding the binder setup and the metrics themselves.

MeterBinder setup

The Reactor team is using a ConcurrentMap cache to avoid binding metrics to the same allocator/event loop multiple times. Since allocator and event loop resources can be configured in multiple ways, the project chose to instrument those lazily at runtime as they are encountered during channel initialization.

I initially adopted that approach but rolled it back for two reasons:

- this is quite unusual in MeterBinder implementations
- it works for Reactor Netty as it relies on a single registry, but supporting multiple registries would need a more complicated setup

Maybe this cache could remain a Reactor Netty opinion while still leveraging the binders provided here?
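
The de-duplication described above, extended to several registries, might look like the following minimal sketch. `BindOnceGuard` is a hypothetical helper, not part of Micrometer or Reactor Netty; registries and resources are typed as `Object` so the example runs without either library on the classpath.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: remembers which (registry, resource) pairs were already bound.
final class BindOnceGuard {

    // Identity-based key, so two distinct instances are never confused
    // even if their classes override equals().
    private static final class Key {
        private final Object registry;
        private final Object resource;

        Key(Object registry, Object resource) {
            this.registry = registry;
            this.resource = resource;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Key)) {
                return false;
            }
            Key that = (Key) other;
            return that.registry == registry && that.resource == resource;
        }

        @Override
        public int hashCode() {
            return 31 * System.identityHashCode(registry) + System.identityHashCode(resource);
        }
    }

    private final Set<Key> bound = ConcurrentHashMap.newKeySet();

    // Runs bindAction only the first time this pair is seen; safe to call
    // concurrently from several channel initializers.
    boolean bindOnce(Object registry, Object resource, Runnable bindAction) {
        if (bound.add(new Key(registry, resource))) {
            bindAction.run();
            return true;
        }
        return false;
    }
}
```

With the binders from this PR, the action would be something like `() -> new NettyAllocatorMetrics(provider).bindTo(registry)`. The guard holds strong references, so an application that creates and discards allocators or event loops over time would need some eviction strategy.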

Metrics names configuration

In my initial proposal I said that I would try to provide a way to customize metric names. Because of the number and structure of metrics, this PR doesn't allow that. Instead, I think that libraries and apps could use a MeterFilter to rewrite metric names on the fly with a custom prefix. Is that acceptable? See the next section for the actual metrics.
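
A name-rewriting rule of that kind can be sketched as a plain function. In Micrometer, the same logic would live in a `MeterFilter` whose `map(Meter.Id)` method returns `id.withName(...)`; the version below works on strings only so that it runs without dependencies. `NettyMeterNames` and the `myapp` prefix are made up for illustration.

```java
import java.util.function.UnaryOperator;

// Hypothetical helper: the renaming rule a MeterFilter could apply to meter names.
final class NettyMeterNames {

    // Prepends customPrefix to every meter whose name starts with "netty."
    // and leaves all other meter names untouched.
    static UnaryOperator<String> withPrefix(String customPrefix) {
        return name -> name.startsWith("netty.") ? customPrefix + "." + name : name;
    }

    public static void main(String[] args) {
        UnaryOperator<String> rename = withPrefix("myapp");
        System.out.println(rename.apply("netty.allocator.memory.used"));
        System.out.println(rename.apply("jvm.memory.used"));
    }
}
```

Because the rule only touches names under `netty.`, other binders registered on the same registry keep their names.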

Metrics

- "netty.allocator.memory.used" - Size of memory used by the allocator, in bytes.
  Tags: "id" (unique id for the allocator), "allocator.type" (pooled, unpooled), "memory.type" (heap, direct)
- "netty.allocator.memory.pinned" - Size of memory used by allocated buffers, in bytes.
  Tags: "id" (unique id for the allocator), "allocator.type" (pooled, unpooled), "memory.type" (heap, direct)
- "netty.allocator.pooled.arenas" - Number of Arenas for a pooled allocator.
  Tags: "id" (unique id for the allocator), "allocator.type" (pooled, unpooled), "memory.type" (heap, direct)
- "netty.allocator.pooled.cache.size" - Size of the cache for a pooled allocator, in bytes.
  Tags: "id" (unique id for the allocator), "allocator.type" (pooled, unpooled), "cache.type" (normal, small)
- "netty.allocator.pooled.threadlocal.caches" - Number of ThreadLocal caches for a pooled allocator.
  Tags: "id" (unique id for the allocator), "allocator.type" (pooled, unpooled)
- "netty.allocator.pooled.chunk.size" - Size of memory chunks for a pooled allocator, in bytes.
  Tags: "id" (unique id for the allocator), "allocator.type" (pooled, unpooled)
- "netty.eventexecutor.tasks.pending" - Number of pending tasks in the event executor.
  Tags: "name" (unique name for the event executor)

This PR does not instrument the DNS infrastructure, and I didn't dig much into that area. I'm not sure we can use a DnsQueryLifecycleObserver to record Timers; maybe only Counters are possible?

Prometheus format sample

Here is a sample of Prometheus format I captured while testing the instrumentation on a running Netty server.

# HELP netty_eventexecutor_tasks_pending  
# TYPE netty_eventexecutor_tasks_pending gauge
netty_eventexecutor_tasks_pending{name="nioEventLoopGroup-3-2",} 0.0
netty_eventexecutor_tasks_pending{name="nioEventLoopGroup-3-1",} 0.0
netty_eventexecutor_tasks_pending{name="nioEventLoopGroup-3-5",} 0.0
netty_eventexecutor_tasks_pending{name="nioEventLoopGroup-3-4",} 0.0
netty_eventexecutor_tasks_pending{name="nioEventLoopGroup-3-3",} 0.0
# HELP netty_allocator_memory_used  
# TYPE netty_allocator_memory_used gauge
netty_allocator_memory_used{allocator_type="pooled",id="814169746",memory_type="heap",} 2.097152E7
netty_allocator_memory_used{allocator_type="pooled",id="814169746",memory_type="direct",} 2.097152E7
# HELP netty_allocator_pooled_cache_size  
# TYPE netty_allocator_pooled_cache_size gauge
netty_allocator_pooled_cache_size{allocator_type="pooled",cache_type="small",id="814169746",} 256.0
netty_allocator_pooled_cache_size{allocator_type="pooled",cache_type="normal",id="814169746",} 64.0
# HELP netty_allocator_memory_pinned  
# TYPE netty_allocator_memory_pinned gauge
netty_allocator_memory_pinned{allocator_type="pooled",id="814169746",memory_type="heap",} 5734400.0
netty_allocator_memory_pinned{allocator_type="pooled",id="814169746",memory_type="direct",} 0.0
# HELP netty_allocator_pooled_arenas  
# TYPE netty_allocator_pooled_arenas gauge
netty_allocator_pooled_arenas{allocator_type="pooled",id="814169746",memory_type="heap",} 16.0
netty_allocator_pooled_arenas{allocator_type="pooled",id="814169746",memory_type="direct",} 16.0
# HELP netty_allocator_pooled_threadlocal_caches  
# TYPE netty_allocator_pooled_threadlocal_caches gauge
netty_allocator_pooled_threadlocal_caches{allocator_type="pooled",id="814169746",} 5.0
# HELP netty_allocator_pooled_chunk_size  
# TYPE netty_allocator_pooled_chunk_size gauge
netty_allocator_pooled_chunk_size{allocator_type="pooled",id="814169746",} 4194304.0
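
The sample follows from the meter definitions listed earlier: dots in meter and tag names become underscores, and tag keys appear in alphabetical order. The sketch below reproduces one sample line under those assumptions. `PrometheusLineSketch` is a hypothetical helper that only mimics the output shape seen here; it is not how Micrometer's Prometheus registry is implemented.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper that mimics the shape of the sample lines above.
final class PrometheusLineSketch {

    // "netty.allocator.memory.used" -> "netty_allocator_memory_used"
    static String toPrometheusName(String dotted) {
        return dotted.replace('.', '_');
    }

    // Renders name{key="value",...,} value with tag keys converted and sorted.
    static String sampleLine(String meterName, Map<String, String> tags, double value) {
        Map<String, String> sorted = new TreeMap<>();
        tags.forEach((key, val) -> sorted.put(toPrometheusName(key), val));

        StringBuilder line = new StringBuilder(toPrometheusName(meterName)).append('{');
        sorted.forEach((key, val) -> line.append(key).append("=\"").append(val).append("\","));
        return line.append("} ").append(value).toString();
    }

    public static void main(String[] args) {
        System.out.println(sampleLine(
                "netty.allocator.memory.used",
                Map.of("id", "814169746", "allocator.type", "pooled", "memory.type", "heap"),
                20971520.0));
    }
}
```

The value 20971520 bytes is printed as `2.097152E7` because Java renders doubles of that magnitude in scientific notation, which matches the sample.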

shakuzen · 2023-04-05T02:30:09Z

> Instead, I think that libraries and apps could use a MeterFilter to rewrite metric names on the fly with a custom prefix. Is that acceptable?

As someone who isn't a Netty expert, I think so. I'd love to hear from others who know more about Netty usage than me, though. In the past, the strong need to customize the metric name at the binder level came from there being multiple instances of the instrumented thing with potentially different tags on it, like ExecutorServiceMetrics. Will there be multiple instances of the binder in an app with a need to distinguish between the metrics from each? Netty being shaded was something I thought about, but if the package is different, these binders won't be usable anyway.

> This PR does not instrument the DNS infrastructure and I didn't dig much in that area.

I think it's fine to leave that as out-of-scope for this PR. If any users would like us to add this, please open an issue requesting it (or a pull request).

shakuzen · 2023-04-05T02:42:15Z

...rc/test/java/io/micrometer/core/instrument/binder/netty4/NettyEventExecutorMetricsTests.java

+            if (eventExecutor instanceof SingleThreadEventExecutor) {
+                SingleThreadEventExecutor singleThreadEventExecutor = (SingleThreadEventExecutor) eventExecutor;
+                names.add(singleThreadEventExecutor.threadProperties().name());
+                new NettyEventExecutorMetrics(eventExecutor).bindTo(this.registry);


Without being a Netty expert, having users loop like this to bind each feels a bit weird to me. Would there be a reason a user would want metrics for some event executors in an EventExecutorGroup but not others? I wonder if we should use a higher level abstraction in NettyEventExecutorMetrics and add metrics for each executor for users rather than make them bind each one individually.

violetagg · 2023-04-05

I don't get this question.
Basically you have an EventLoopGroup with EventLoops. Every EventLoop has a name and a queue with pending tasks. You are proposing to have metrics on the EventLoopGroup, is that correct? Typically the EventLoops are not equally loaded.

shakuzen · 2023-04-05

I'm saying we should have metrics on all of the EventLoops like this test does, but without making users do new NettyEventExecutorMetrics(eventExecutor).bindTo(this.registry) for each individual EventExecutor. Instead we could take the EventLoopGroup as a parameter and register metrics for each EventExecutor, so users only need to call, e.g., new NettyEventExecutorMetrics(eventLoopGroup).bindTo(this.registry) once rather than iterating over each element like now. Basically, the question is whether it makes sense to keep things as granular as they are now. It only makes sense to me if there is a case where you would want metrics for some EventLoops in a group but not all of them. If you always want all of them, we should just take the group as a parameter and iterate internally so users don't have to. Does that make more sense?

violetagg · 2023-04-05

I agree with you; it is just easier in Reactor Netty at this point to do it per EventLoop. You definitely want metrics for all EventLoops in the EventLoopGroup.

bclozel · 2023-04-05

I guess I could add a constructor variant that takes the entire EventLoopGroup?
This implementation was targeting the "lazy" case where the current EventLoop is found in the given channel during its initialization. Something like:

    @Override
    public void initChannel(SocketChannel channel) throws Exception {
        ByteBufAllocator alloc = channel.alloc();
        if (alloc instanceof ByteBufAllocatorMetricProvider) {
            // this concurrent check must be implemented by Micrometer users;
            // bind only if this allocator has not been instrumented yet
            if (!isAllocatorInstrumented(alloc)) {
                new NettyAllocatorMetrics((ByteBufAllocatorMetricProvider) alloc).bindTo(prometheusRegistry);
            }
        }
        // this concurrent check must be implemented by Micrometer users;
        // bind only if this event loop has not been instrumented yet
        if (!isEventLoopInstrumented(channel.eventLoop())) {
            new NettyEventExecutorMetrics(channel.eventLoop()).bindTo(prometheusRegistry);
        }
        channel.pipeline().addLast(new HttpRequestDecoder());
        channel.pipeline().addLast(new HttpResponseEncoder());
        channel.pipeline().addLast(new CustomHttpServerHandler());
    }

violetagg · 2023-04-05T07:38:03Z

> In case you are interested, it is possible to add these additional metrics out of Netty event loops: netty/netty#9080
>
> Sadly I have never completed it, but it should be simple
>
> And the same, here: netty/netty#11293 (comment) To use a sentinel periodic task, per event loop, to measure busyness of it

Yep that's something that Reactor Netty has also in its backlog reactor/reactor-netty#1433

violetagg · 2023-04-05T08:49:31Z

> I think that libraries and apps could use a MeterFilter to rewrite metric names on the fly with a custom prefix. Is that acceptable?

Reactor Netty will need to change the name if we want to keep backwards compatibility ...

bclozel · 2023-04-05T12:40:59Z

> In case you are interested, it is possible to add these additional metrics out of Netty event loops: netty/netty#9080
>
> Sadly I have never completed it, but it should be simple
>
> And the same, here: netty/netty#11293 (comment) To use a sentinel periodic task, per event loop, to measure busyness of it

@franz1981 @violetagg I think this type of instrumentation really belongs in Netty directly. I'd be happy to expand metrics here once this API is available in Netty. We do maintain more involved instrumentations, but they usually rely on official extension points that are not likely to change.

bclozel · 2023-04-05T12:42:22Z

I've just pushed additional changes in a separate commit that:

- the "allocator.type" tag for allocator metrics now holds the actual Java simple class name, e.g. "UnpooledByteBufAllocator"
- the binder for executor metrics now accepts Iterable<EventExecutor>, which means both EventLoopGroup and EventLoop types are compatible
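
Accepting an `Iterable` works for both cases because, in Netty 4.x, an `EventExecutorGroup` is an `Iterable<EventExecutor>` and a single `EventExecutor` is itself a group that iterates over itself. The sketch below models that relationship with stand-in types (`LoopStandIn`, `IterableBinderSketch`, and the map of gauges are all hypothetical) to show why one parameter type covers a group and a single loop alike.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntSupplier;

// Hypothetical model of a binder that accepts any Iterable of executors.
final class IterableBinderSketch {

    // Stand-in for Netty's EventExecutor: a single executor iterates over itself.
    interface LoopStandIn extends Iterable<LoopStandIn> {
        String name();

        int pendingTasks();

        @Override
        default Iterator<LoopStandIn> iterator() {
            return Collections.singleton(this).iterator();
        }
    }

    // Convenience factory for the example.
    static LoopStandIn loop(String name, IntSupplier pending) {
        return new LoopStandIn() {
            @Override
            public String name() {
                return name;
            }

            @Override
            public int pendingTasks() {
                return pending.getAsInt();
            }
        };
    }

    // Registers one "pending tasks" gauge per executor, keyed by executor name.
    // The map stands in for a meter registry.
    static Map<String, IntSupplier> bind(Iterable<? extends LoopStandIn> executors) {
        Map<String, IntSupplier> gauges = new LinkedHashMap<>();
        for (LoopStandIn executor : executors) {
            gauges.put(executor.name(), executor::pendingTasks);
        }
        return gauges;
    }
}
```

Passing a list of loops yields one gauge per loop, while passing a single loop yields exactly one gauge, with no separate overload needed.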

I will squash this commit before merging this PR, once we're done with the review cycle.

franz1981 · 2023-04-05T13:04:37Z

@bclozel

> I think this type of instrumentation really belongs in Netty directly. I'd be happy to expand metrics here once this API is available in Netty

For netty/netty#9080, I think that's fine, but please consider that in Netty we don't modify public APIs unless we mark them as Unstable (at least in Netty 4.1; for Netty 5, I have no idea).

netty/netty#11293 (comment)

This one is different, because it is something that Netty cannot provide or decide on its own, assuming the dynamic that makes it work is clear. If not, I can explain it better here instead.

shakuzen

Looks good to me. I would like to get rid of id on the allocator metrics if possible because it doesn't seem particularly meaningful to a user looking at the metrics. Due to my lack of experience with Netty, I don't know if there would ever be multiple instances of the same type of allocator in the same app to instrument. If not, it seems like we could get rid of id. However, it does seem theoretically possible for there to be multiple allocator instances of the same type.

shakuzen · 2023-04-05T16:10:35Z

...er-core/src/main/java/io/micrometer/core/instrument/binder/netty4/NettyAllocatorMetrics.java

+ * @since 1.11.0
+ * @see NettyMeters
+ */
+public class NettyAllocatorMetrics implements MeterBinder {


Could we add a typical usage example to the Javadoc? I don't know if it will be common knowledge for Netty users, from the API defined here, how or where to get the type to pass to the constructor.

alesj · 2023-04-24

Tests do show this to some extent, but they are far from real usage, so +1 on @shakuzen's suggestion.

bclozel · 2023-04-24

@alesj we've added code snippets in the reference documentation. Does this work for you?

This commit adds two new `MeterBinder` implementations for instrumenting Netty 4.x: `NettyAllocatorMetrics` and `NettyEventExecutorMetrics`.

`NettyAllocatorMetrics` will instrument any `ByteBufAllocatorMetricProvider` and gather information about heap/direct memory allocated; additional metrics are provided for pooled allocators.

`NettyEventExecutorMetrics` will instrument `Iterable<EventExecutor>` (typically, `EventLoop` or `EventLoopGroup` instances) and count the number of pending tasks for all.

Metrics and tags are described in the `NettyMeters` class.

Closes micrometer-metricsgh-522

See micrometer-metricsgh-3742

See gh-3742

bclozel mentioned this pull request Apr 4, 2023

Metrics support for Netty allocators and event executors #522

Closed

shakuzen reviewed Apr 5, 2023

shakuzen approved these changes Apr 5, 2023

bclozel force-pushed the netty-metrics branch from f3cac57 to d985e62 Compare April 6, 2023 13:05

bclozel merged commit d985e62 into micrometer-metrics:main Apr 6, 2023
1 check passed

izeye mentioned this pull request Apr 13, 2023

Polish "Add metrics support for Netty 4.x" #3768

Merged

izeye added a commit to izeye/micrometer that referenced this pull request Apr 13, 2023

Polish "Add metrics support for Netty 4.x"

0db41b7

See micrometer-metricsgh-3742

jonatan-ivanov pushed a commit that referenced this pull request Apr 13, 2023

Polish "Add metrics support for Netty 4.x" (#3768)

23804d5

See gh-3742

bclozel deleted the netty-metrics branch April 24, 2023 08:12
