Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rolling Exceptions from KafkaMetrics with micrometer-core 1.6.2 - kafka-streams (2.5.1, 2.6.0) - spring-boot 2.4.1 #2388

Closed
rvertigo opened this issue Dec 17, 2020 · 2 comments

Comments

@rvertigo
Copy link

Issue:

  • "Rolling" exceptions (stack example below) in Spring Cloud Kafka-Streams Application
    • Spring Boot 2.4.1, spring-cloud-stream-binder-kafka-streams Hoxton.SR9

Repro:

  • The Application has multiple State-Stores in use (which may be the repro-case for this issue).
  • Upgrade from Spring Boot 2.3.7.RELEASE to 2.4.1 (which upgrades to micrometer-core 1.6.2 and kafka-streams 2.6.0)
  • Access /actuator/metrics endpoint, or wait some time.

Result:

  • Continually Rolling Exception, similar to this one:
2020-12-17 01:24:52.947  WARN 29447 --- [r-kafka-metrics] i.m.c.i.binder.kafka.KafkaMetrics        : Failed to bind meter: kafka.stream.state.write.stall.duration.avg [tag(thread.id=etl-index-source-in-0-f84f2a01-d765-471a-83ee-d58f397feb41-StreamThread-1), tag(task.id=0_8), tag(rocksdb.state.id=index-source-state), tag(kafka.version=2.6.0), tag(spring.id=streams.9)].

java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_write_stall_duration_avg
	at io.prometheus.client.CollectorRegistry.register(CollectorRegistry.java:54) ~[simpleclient-0.9.0.jar:na]
	at io.prometheus.client.Collector.register(Collector.java:139) ~[simpleclient-0.9.0.jar:na]
	at io.micrometer.prometheus.PrometheusMeterRegistry.lambda$applyToCollector$16(PrometheusMeterRegistry.java:410) ~[micrometer-registry-prometheus-1.6.2.jar:1.6.2]
	at java.base/java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1947) ~[na:na]
	at io.micrometer.prometheus.PrometheusMeterRegistry.applyToCollector(PrometheusMeterRegistry.java:406) ~[micrometer-registry-prometheus-1.6.2.jar:1.6.2]
	at io.micrometer.prometheus.PrometheusMeterRegistry.newGauge(PrometheusMeterRegistry.java:208) ~[micrometer-registry-prometheus-1.6.2.jar:1.6.2]
	at io.micrometer.core.instrument.MeterRegistry.lambda$gauge$1(MeterRegistry.java:297) ~[micrometer-core-1.6.2.jar:1.6.2]
	at io.micrometer.core.instrument.MeterRegistry.lambda$registerMeterIfNecessary$5(MeterRegistry.java:561) ~[micrometer-core-1.6.2.jar:1.6.2]
	at io.micrometer.core.instrument.MeterRegistry.getOrCreateMeter(MeterRegistry.java:614) ~[micrometer-core-1.6.2.jar:1.6.2]
	at io.micrometer.core.instrument.MeterRegistry.registerMeterIfNecessary(MeterRegistry.java:568) ~[micrometer-core-1.6.2.jar:1.6.2]
	at io.micrometer.core.instrument.MeterRegistry.registerMeterIfNecessary(MeterRegistry.java:561) ~[micrometer-core-1.6.2.jar:1.6.2]
	at io.micrometer.core.instrument.MeterRegistry.gauge(MeterRegistry.java:297) ~[micrometer-core-1.6.2.jar:1.6.2]
	at io.micrometer.core.instrument.Gauge$Builder.register(Gauge.java:190) ~[micrometer-core-1.6.2.jar:1.6.2]
	at io.micrometer.core.instrument.binder.kafka.KafkaMetrics.registerGauge(KafkaMetrics.java:208) ~[micrometer-core-1.6.2.jar:1.6.2]
	at io.micrometer.core.instrument.binder.kafka.KafkaMetrics.bindMeter(KafkaMetrics.java:200) ~[micrometer-core-1.6.2.jar:1.6.2]
	at io.micrometer.core.instrument.binder.kafka.KafkaMetrics.lambda$checkAndBindMetrics$1(KafkaMetrics.java:180) ~[micrometer-core-1.6.2.jar:1.6.2]
	at java.base/java.util.LinkedHashMap.forEach(LinkedHashMap.java:684) ~[na:na]
	at java.base/java.util.Collections$UnmodifiableMap.forEach(Collections.java:1505) ~[na:na]
	at io.micrometer.core.instrument.binder.kafka.KafkaMetrics.checkAndBindMetrics(KafkaMetrics.java:153) ~[micrometer-core-1.6.2.jar:1.6.2]
	at io.micrometer.core.instrument.binder.kafka.KafkaMetrics.lambda$bindTo$0(KafkaMetrics.java:104) ~[micrometer-core-1.6.2.jar:1.6.2]
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[na:na]
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[na:na]
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[na:na]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[na:na]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[na:na]
	at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]

Additional metrics listed in the stacks:

java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_block_cache_data_hit_ratio
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_block_cache_filter_hit_ratio
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_block_cache_index_hit_ratio
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_bytes_read_compaction_rate
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_bytes_read_rate
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_bytes_read_total
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_bytes_written_compaction_rate
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_bytes_written_rate
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_bytes_written_total
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_flush_latency_avg
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_flush_latency_max
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_flush_rate
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_memtable_bytes_flushed_rate
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_memtable_bytes_flushed_total
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_memtable_hit_ratio
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_number_file_errors_total
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_number_open_files
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_put_latency_avg
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_put_latency_max
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_put_rate
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_restore_latency_avg
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_restore_latency_max
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_restore_rate
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_write_stall_duration_avg
java.lang.IllegalArgumentException: Collector already registered that provides name: kafka_stream_state_write_stall_duration_total

Issue Resolved if Micrometer Version is downgraded:

  • Downgrade to <micrometer.version>1.5.9</micrometer.version>
  • Downgrading the micrometer version results in a single warning message:
Failed to bind meter: kafka.stream.state.memtable.bytes.flushed.rate [tag(thread-id=etl-timeline-in-0-187b56ba-35f7-49af-8f7f-21bda652a117-StreamThread-1), tag(task-id=0_9), tag(rocksdb-window-state-id=merge-activity-windows-other-join-store), tag(kafka-version=2.6.0), tag(spring.id=streams.11)]. However, this could happen and might be restored in the next refresh. Note that subsequent logs will be logged at debug level.
@shakuzen
Copy link
Member

The many Collector already registered exceptions should go away with the fix for #2399. You can test with 1.6.4-SNAPSHOT until it is released (soon). That said, there may still be an issue that meters are not being registered with all the expected tags. A minimal project to reproduce the scenario would be very helpful in tracking that down.

@rvertigo
Copy link
Author

Confirmed fixed. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants