This repository has been archived by the owner on May 26, 2022. It is now read-only.

Cleanup prometheus metrics more predictably #81

Closed
aschmahmann opened this issue Jul 9, 2021 · 0 comments · Fixed by #82

At the moment we have a global variable for TCP metrics:

var collector *aggregatingCollector

which has a map of connections:

conns map[uint64] /* id */ *tracingConn

that we only clear out when the metrics are collected:

func (c *aggregatingCollector) Collect(metrics chan<- prometheus.Metric) {

We should clear out these connections more predictably (e.g. on connection close, or from a background goroutine).
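
A minimal sketch of the "clean up on connection close" option. The `connRegistry` and `trackedConn` types below are hypothetical stand-ins for `aggregatingCollector` and `tracingConn`, not the actual go-tcp-transport code, and the real `Close` would also need to close the underlying `net.Conn`:

```go
package main

import (
	"fmt"
	"sync"
)

// connRegistry is a hypothetical stand-in for aggregatingCollector.
// It tracks live connections by id.
type connRegistry struct {
	mu     sync.Mutex
	nextID uint64
	conns  map[uint64]*trackedConn
}

func newConnRegistry() *connRegistry {
	return &connRegistry{conns: make(map[uint64]*trackedConn)}
}

// trackedConn is a hypothetical stand-in for tracingConn.
type trackedConn struct {
	id        uint64
	reg       *connRegistry
	closeOnce sync.Once
}

// add registers a new connection under a fresh id and returns it.
func (r *connRegistry) add() *trackedConn {
	r.mu.Lock()
	defer r.mu.Unlock()
	c := &trackedConn{id: r.nextID, reg: r}
	r.conns[c.id] = c
	r.nextID++
	return c
}

// remove drops a connection from the map; removing an unknown id is a no-op.
func (r *connRegistry) remove(id uint64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.conns, id)
}

// size reports how many connections are currently tracked.
func (r *connRegistry) size() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.conns)
}

// Close deregisters the connection right away instead of waiting for the
// next metrics collection. sync.Once makes repeated Close calls harmless.
func (c *trackedConn) Close() error {
	c.closeOnce.Do(func() { c.reg.remove(c.id) })
	return nil
}

func main() {
	reg := newConnRegistry()
	a, b := reg.add(), reg.add()
	_ = a.Close()
	_ = a.Close() // second Close is a no-op
	fmt.Println("tracked after closing a:", reg.size())
	_ = b.Close()
	fmt.Println("tracked after closing b:", reg.size())
}
```

With this shape the map size follows the number of open connections, regardless of how often (or whether) Prometheus scrapes the collector.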


There might also be a bug in the cleanup itself:

go-tcp-transport/metrics.go

Lines 110 to 120 in 1b96803

var bytesSent, bytesRcvd uint64
for _, conn := range c.conns {
	info, err := conn.getTCPInfo()
	if err != nil {
		if strings.Contains(err.Error(), "use of closed network connection") {
			c.closedConn(conn)
			continue
		}
		log.Errorf("Failed to get TCP info: %s", err)
		continue
	}

Here, if getTCPInfo fails with any error other than "use of closed network connection", the connection is never cleaned up. For example, on Windows I see the following error:

2021-07-08T21:39:47.925-0400    ERROR   tcp-tpt go-tcp-transport@v0.2.2/metrics.go:118  Failed to get TCP info: raw-control tcp 192.168.1.6:4001: getsockopt: not implemented
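
One possible way to avoid leaking such entries, sketched under assumptions: drop a connection from the map on any lookup failure, and log only the failures that are not a plain closed connection. The `infoSource` type, `sweep`, and `sampleConns` are hypothetical names for illustration, and this is not necessarily what the eventual fix does. The sketch uses `errors.Is(err, net.ErrClosed)` (available since Go 1.16) in place of matching on the error string:

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// infoSource is a hypothetical stand-in for tracingConn. Its getTCPInfo
// only reports whether the lookup succeeded.
type infoSource struct {
	getTCPInfo func() error
}

// sweep queries every connection and removes each one whose lookup fails,
// whatever the error, so failing entries cannot accumulate in the map.
// It returns the number of removed connections.
func sweep(conns map[uint64]*infoSource) int {
	removed := 0
	for id, c := range conns {
		err := c.getTCPInfo()
		if err == nil {
			continue
		}
		// A closed connection is expected; anything else is worth logging.
		if !errors.Is(err, net.ErrClosed) {
			fmt.Printf("failed to get TCP info for conn %d: %v\n", id, err)
		}
		delete(conns, id) // deleting during range is safe in Go
		removed++
	}
	return removed
}

// sampleConns builds one healthy, one closed, and one unsupported connection.
func sampleConns() map[uint64]*infoSource {
	return map[uint64]*infoSource{
		1: {getTCPInfo: func() error { return nil }},
		2: {getTCPInfo: func() error { return net.ErrClosed }},
		3: {getTCPInfo: func() error { return errors.New("getsockopt: not implemented") }},
	}
}

func main() {
	conns := sampleConns()
	removed := sweep(conns)
	fmt.Println("removed:", removed, "remaining:", len(conns))
}
```

Whether dropping a connection on an unsupported-platform error is the right policy is a design choice: on a platform where the lookup can never succeed, it may be simpler not to track connections at all.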