
ddelbondio/datadog-autodiscovery-bug


Datadog JMX metrics collection bug after payload pod restart in Kubernetes

When the payload pod is restarted in Kubernetes, JMX metrics collection does not resume. The problem appears to be related to JMXFetch instance initialization.

The relevant error seems to be:

2020-04-13 09:40:12 UTC | CORE | INFO | (pkg/jmxfetch/jmxfetch.go:248 in func1) | 2020-04-13 09:40:12,115 | WARN  | App | No instance could be initiated. Retrying initialization.
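
When triaging, it can help to split a line like the one above into its fields. The following is a minimal sketch, assuming the pipe-separated layout seen in that single sample holds for other agent log lines. `parse_agent_line` and `is_init_failure` are hypothetical helper names and are not part of the Datadog Agent or JMXFetch.

```python
from typing import Dict, Optional

# Substring taken from the warning quoted above.
INIT_FAILURE = "No instance could be initiated"


def parse_agent_line(line: str) -> Optional[Dict[str, str]]:
    """Split an agent log line into fields.

    The layout (timestamp | component | level | source | message) is an
    assumption inferred from one sample line, not a documented format.
    """
    # Split on the first four pipes only, so pipes inside the embedded
    # JMXFetch message stay part of the message field.
    parts = [part.strip() for part in line.split("|", 4)]
    if len(parts) < 5:
        return None
    return {
        "timestamp": parts[0],
        "component": parts[1],
        "level": parts[2],
        "source": parts[3],
        "message": parts[4],
    }


def is_init_failure(line: str) -> bool:
    """Return True if the line carries the JMXFetch initialization warning."""
    parsed = parse_agent_line(line)
    return parsed is not None and INIT_FAILURE in parsed["message"]
```

Note that the agent reports the line at INFO level while the embedded JMXFetch message is a WARN, so filtering on the outer level alone would miss it.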

Versions

Docker

Server: Docker Engine - Community
 Engine:
  Version:          19.03.8
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.17
  Git commit:       afacb8b
  Built:            Wed Mar 11 01:29:16 2020
  OS/Arch:          linux/amd64
  Experimental:     true

Kubernetes

"clientVersion": {
  "major": "1",
  "minor": "15",
  "gitVersion": "v1.15.11",
  "gitCommit": "d94a81c724ea8e1ccc9002d89b7fe81d58f89ede",
  "gitTreeState": "clean",
  "buildDate": "2020-03-12T21:08:59Z",
  "goVersion": "go1.12.17",
  "compiler": "gc",
  "platform": "linux/amd64"
},
"serverVersion": {
  "major": "1",
  "minor": "16+",
  "gitVersion": "v1.16.6-beta.0",
  "gitCommit": "e7f962ba86f4ce7033828210ca3556393c377bcc",
  "gitTreeState": "clean",
  "buildDate": "2020-01-15T08:18:29Z",
  "goVersion": "go1.13.5",
  "compiler": "gc",
  "platform": "linux/amd64"
}

Running the repro case

You need a running Kubernetes cluster on the same host as the test script. The script uses the local Docker daemon for the images it creates and requires kubectl to be configured to connect to the local cluster.
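
Before starting the script, the two command-line prerequisites can be checked up front. This is a rough sketch and not part of the repository: `missing_tools` is a hypothetical helper, and it only verifies that the `docker` and `kubectl` binaries are on `PATH`, not that the daemon or the cluster is actually reachable.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Tools the prerequisites above mention.
REQUIRED_TOOLS = ("docker", "kubectl")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH.

    The lookup function is injectable so the check can be exercised
    without the real binaries installed.
    """
    return [tool for tool in tools if which(tool) is None]
```

Calling `missing_tools()` with no arguments returns an empty list when both binaries are found.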

Datadog logs for the test are written to the logs/ folder.
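
After a run, the log folder can be searched for the initialization warning. The sketch below makes no assumption about file names inside `logs/` (they are not documented here), so it scans every file recursively. `count_init_failures` is a hypothetical helper, and the marker string is copied from the warning reported in this README.

```python
from pathlib import Path
from typing import Dict

# Substring of the JMXFetch warning reported in this README.
MARKER = "No instance could be initiated"


def count_init_failures(log_dir: str = "logs") -> Dict[str, int]:
    """Count lines containing the marker, per file, under log_dir."""
    counts: Dict[str, int] = {}
    for path in sorted(Path(log_dir).rglob("*")):
        if not path.is_file():
            continue
        # Tolerate odd bytes in log files instead of failing on decode.
        with path.open(encoding="utf-8", errors="replace") as handle:
            hits = sum(1 for line in handle if MARKER in line)
        if hits:
            counts[str(path)] = hits
    return counts
```

Files without a match are left out of the result, so an empty dictionary means the warning was not logged.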

Test results

The tests were performed with the versions listed above on Debian (9.12) and a Kubernetes cluster running on Docker Desktop with the WSL 2 backend.

Build image
sha256:6868f97297db0e3826154d982fb9cd8a80b85c474ee944498c83a5642f86c96e
sha256:d353cafb59a45efa98c88d6d403a268a411b7c1f61e384c690685d07722f465a


Setting up
namespace/datadog-agent created
serviceaccount/datadog-agent created
clusterrole.rbac.authorization.k8s.io/datadog-agent unchanged
clusterrolebinding.rbac.authorization.k8s.io/datadog-agent unchanged
deployment.apps/fake-datadog created
service/fake-datadog created


Running tests

[12:38:21] Running test
-------------------------
[12:38:24] Agent: running, JVM: no restart => success


[12:39:17] Agent: running, JVM: restart => fail


shutting down
namespace "datadog-agent" deleted
