Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds Shutdown to Metrics API #132

Merged
merged 3 commits into from Apr 26, 2022
Merged

Adds Shutdown to Metrics API #132

merged 3 commits into from Apr 26, 2022

Conversation

ggambetti
Copy link
Member

@ggambetti ggambetti commented Apr 14, 2022

Short lived applications run the risk of losing metrics that are
generated near the end of their lifetimes because MetricSinks can
and do buffer data locally. The exact amount of data loss depends on
MetricSink implementation, caller configuration, and the timing of
the last metric sync.

This adds a Shutdown function to the package API and the MetricSink
interface. Shutdown flushes locally buffered data to storage, and
then frees resources allocated to the MetricSink.

Currently not all MetricSinks support exiting, and so resource
release when calling Shutdown is best effort. Since Shutdown is
intended for use immediately prior to application exit this is deemed
acceptable, since resource leakage is minimized in this case.

Short lived applications run the risk of losing metrics that are
generated near the end of their lifetimes because MetricSinks can
and do buffer data locally. The exact amount of data loss depends on
MetricSink implementation, caller configuration, and the timing of
the last metric sync.

This adds a Shutdown function to the package API and the MetricSink
interface. Shutdown flushes locally buffered data to storage, and
then frees resources allocated to the MetricSink.

Currently not all MetricSinks support exiting, and so resource
release when calling Shutdown is best effort. Since Shutdown is
intended for use immediately prior to application exit this is deemed
acceptable, since resource leakage is minimized in this case.
Copy link
Contributor

@acpana acpana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening this 💯 It looks good; I took a level 1 pass and added some questions and comments.

circonus/circonus.go Outdated Show resolved Hide resolved
const_unix.go Outdated Show resolved Hide resolved
datadog/dogstatsd.go Show resolved Hide resolved
func Shutdown() {
m := globalMetrics.Load().(*Metrics)
// Replace global metrics with the BlackholeSink like how init setup the library.
globalMetrics.Store(&Metrics{sink: &BlackholeSink{}})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this line 100% <-- would you mind helping me understand why we need to do this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is to explicitly break the following example, so that library consumers build their applications knowing that nothing is collected post Shutdown, and don't accidentally become dependant on collecting metrics post-Shutdown.

import "github.com/armon/go-metrics"

func main() {
  // The implementation of the metrics library means
  // that {"hello", "world"} will only ever have the value 1.
  metrics.New(...)
  metrics.IncrCounter([]string{"hello", "world"}, 1)
  metrics.Shutdown()
  metrics.IncrCounter([]string{"hello", "world"}, 1)
}

Resetting the library to the BlackholeSink isn't strictly necessary. But between applications not cleaning up goroutines (that generate calls to the library), and MetricSink implementations not supporting shutdown/close, it's possible that applications collect metrics and then upload them in the small window between Shutdown and exit. This makes it pretty explicit that this isn't intended or supported.

There is a race condition here: callers getting a pointer to the underlying Metrics struct and then completing their call into the MetricSink after Shutdown finishes. MetricSink must be thread-safe, so this is likely fine.

Copy link
Contributor

@acpana acpana Apr 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for laying out for me here 💯 I see now the issue that could happen;

start.go Show resolved Hide resolved
Copy link
Contributor

@acpana acpana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey @ggambetti this is great! Thanks for addressing all the my comments 💯 🤝

I think this is ready to merge 🚀 . AFAIK, @banks is the person I know that can help merge and maybe cut a release for these changes 🙈 . I hope I'm not volunteering Paul here.

Copy link
Member

@banks banks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ggambetti this looks like a clean solution to this issue!

@banks banks merged commit 5d4d6f5 into hashicorp:master Apr 26, 2022
@heucuva
Copy link

heucuva commented May 11, 2022

Was there a reason why this was a backwards compatibility breaking change with only a patch-level version increase (i.e: v0.3.10 -> v0.3.11)? Was a little caught off guard by it.

@banks
Copy link
Member

banks commented May 11, 2022

@heucuva Apologies, that's my fault. Since this was purely additive I didn't think it was a breaking change. I missed that since it was a public interface addition, other callers implementing their own interfaces would be impacted. Honestly, even without the interface change this probably should have been a minor version at least. My mistake.

This repo is in the process of being moved to an official team rather than a handful of us and Armon who happen to have commit access. Hopefully that will help with a more thoughtful release processes.

I'm going to talk with a few folks and decide what the best path forward is - perhaps re-release as a new major version and remove the 0.3.11 release from GH.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants