Use a worker-specific random source to remove lock contention. #178

Merged
2 commits merged into DataDog:master on Jan 18, 2021

Conversation

matthewdale (Contributor)

What does this PR do?

Use a separate random number source per worker. Currently, the worker method shouldSample() calls rand.Float64(), which takes a lock to serialize all calls to the package-global pseudorandom number generator, creating lock contention among all worker instances.

Changes:

  • Create and seed a separate pseudorandom number generator for every worker, removing the lock contention described in "Use of rand.Float64() reduces parallelism" #80 (see the sketch after this list)
  • Add a test for the shouldSample() function to assert that it returns true the expected percentage of the time
  • Add a benchmark for the shouldSample() function to measure improvement from removing lock contention
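
The heart of the change can be sketched as below. This is a minimal illustration rather than the exact code in statsd/worker.go: the newWorker constructor and the time-based seed are assumptions for this sketch, while the random field and the shouldSample logic mirror the diff shown later in this thread.

package statsd

import (
	"math/rand"
	"time"
)

// worker is a trimmed-down stand-in for the real worker struct in
// statsd/worker.go; only the field relevant to sampling is shown.
type worker struct {
	random *rand.Rand
}

// newWorker seeds a dedicated PRNG for this worker so that shouldSample
// never touches the lock-protected package-global source in math/rand.
// (Seeding with the current time is an assumption for this sketch.)
func newWorker() *worker {
	return &worker{
		random: rand.New(rand.NewSource(time.Now().UnixNano())),
	}
}

// shouldSample reports whether a metric with the given sample rate
// should be processed rather than dropped.
func (w *worker) shouldSample(rate float64) bool {
	if rate < 1 && w.random.Float64() > rate {
		return false
	}
	return true
}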

Benchmarks against master

name            old time/op  new time/op  delta
ShouldSample     444ns ± 0%   190ns ± 1%  -57.26%  (p=0.000 n=8+9)
ShouldSample-2   876ns ± 2%    94ns ± 0%  -89.27%  (p=0.000 n=10+9)
ShouldSample-4  1.17µs ± 1%  0.09µs ± 2%  -92.59%  (p=0.000 n=8+8)
ShouldSample-8  1.25µs ± 3%  0.09µs ± 4%  -92.77%  (p=0.000 n=9+10)
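
The ShouldSample-2/-4/-8 rows correspond to running the benchmark with different -cpu values. The real benchmark lives in statsd/worker_test.go and may be organized differently; a minimal parallel benchmark in this spirit, reusing the hypothetical newWorker helper from the sketch above, could look like:

package statsd

import "testing"

// Run with: go test -bench=ShouldSample -cpu=1,2,4,8
func BenchmarkShouldSample(b *testing.B) {
	b.RunParallel(func(pb *testing.PB) {
		// One worker per benchmark goroutine, mirroring the
		// one-random-source-per-worker design, so the unsynchronized
		// *rand.Rand is never shared between goroutines.
		w := newWorker()
		for pb.Next() {
			w.shouldSample(0.5)
		}
	})
}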

hush-hush (Member) left a comment

Thanks for the PR, this looks great. I added a few nit-picks, but otherwise it looks ready to go.

(Resolved review threads on statsd/worker_test.go and statsd/worker.go.)
(A follow-up commit runs the TestShouldSample subtests in parallel.)
hush-hush (Member) left a comment

Thanks for the PR!

hush-hush merged commit f6b3e86 into DataDog:master on Jan 18, 2021
@@ -59,7 +71,7 @@ func (w *worker) processMetric(m metric) error {
 }

 func (w *worker) shouldSample(rate float64) bool {
-	if rate < 1 && rand.Float64() > rate {
+	if rate < 1 && w.random.Float64() > rate {

@matthewdale @hush-hush Shouldn't we have a local lock here? From the docs:

The default Source is safe for concurrent use by multiple goroutines, but Sources created by NewSource are not.

I think this is causing race conditions.

matthewdale (Contributor, Author) replied:

You're correct. When MutexMode is enabled, the call to worker.processMetric() is made from the same goroutine as the call to the top-level metric emit function (e.g. Count). I was previously under the impression that all calls to worker.processMetric() would be made from a single goroutine per worker, but that's not the case.

I'm working on a fix for the data race bug and will submit it when it's ready.
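
For context, one possible shape for such a fix is sketched below; the randomLock field and the lock-per-worker approach are assumptions here, not necessarily the follow-up change that was actually submitted.

package statsd

import (
	"math/rand"
	"sync"
)

type worker struct {
	random     *rand.Rand
	randomLock sync.Mutex // hypothetical: guards random against concurrent callers
	// ... other fields omitted
}

// shouldSample keeps a per-worker source, so workers still don't contend
// with each other, but serializes access within a single worker because
// sources created by rand.NewSource are not safe for concurrent use.
func (w *worker) shouldSample(rate float64) bool {
	if rate >= 1 {
		return true
	}
	w.randomLock.Lock()
	roll := w.random.Float64()
	w.randomLock.Unlock()
	return roll <= rate
}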
