New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
profiler: collect profiles concurrently #1282
Conversation
The upcoming C allocation profile will work by starting a profile, waiting for the profile period to elapse, then stopping and collecting the output. This won't work if the profiles are collected synchronously since the CPU profile is also collected this way, meaning there won't be enough time in a single profile period to collect both. So the profiles need to be collected concurrently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM in general, but see comments.
bat.addProfile(prof) | ||
} | ||
wg.Add(1) | ||
go func(t ProfileType) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT/🐞: enabledProfileTypes()
goes out of its way to return the profiles in a useful order (see comment). Collecting the profiles concurrently essentially circumvents that.
Idea for quick-fix (from Nick): Put interruptibleSleep(profilingPeriod) in all the non-blocking profile types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, and I had to change a few of the tests since they would now block for the default profiling period (1 minute!)
With concurrent profile collection, all profiles should block so that they're collected at the end of the profile period. This preserves the property that all profiles in a batch cover the same time, so events in one profile correspond to events in another. Along the way, a few tests needed to be fixed to not take 1 minute (the default profling period) to complete. There was also a potential panic in TestAllUploaded if the sync.WaitGroup count goes negative due to multiple uploads going through before profiling is stopped. Also fixed unbounded growth of the collected profiles slice. Whoops! Made TestAllUploaded check the *second* batch of profiles to catch this bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🙏
_, params, err := mime.ParseMediaType(r.Header.Get("Content-Type")) | ||
if err != nil { | ||
t.Fatalf("bad media type: %s", err) | ||
return | ||
} | ||
mr := multipart.NewReader(r.Body, params["boundary"]) | ||
profiles = profiles[:0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: I guess this means we'll only verify the second profile batch at the bottom of the test? Ideally we could verify both, but I'm okay to leave this as-is.
If you want to change it, maybe change the type to profiles [][]string
and then perform the check below in a loop to make sure each first level item has the right profile names in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I was thinking we could verify the profilers in the handler (see this cleanup PR: #1267) which would mean all uploads get checked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I review #1267 now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll ping you when it's ready for review since I have to fix some merge conflicts for that PR (caused by this one)
The upcoming C allocation profile will work by starting a profile,
waiting for the profile period to elapse, then stopping and collecting
the output. This won't work if the profiles are collected synchronously
since the CPU profile is also collected this way, meaning there won't be
enough time in a single profile period to collect both. So the profiles
need to be collected concurrently.