Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gcp/observability: Implement tracing/metrics via OpenCensus #5372

Merged
merged 4 commits into from Jun 7, 2022

Conversation

lidizheng
Copy link
Contributor

@lidizheng lidizheng commented May 19, 2022

As title.

This PR includes changes from #5352 and #5347

RELEASE NOTES:

  • gcp/observability: add experimental OpenCensus tracing/metrics support

@lidizheng
Copy link
Contributor Author

The flake looks unrelated, filed #5375

@lidizheng lidizheng marked this pull request as ready for review May 20, 2022 20:38
@lidizheng
Copy link
Contributor Author

@dfawley @menghanl PTAL. This PR will be all the things we need for PPII, I'm okay to work to merge each child PR or to merge this one directly.

@zasweq zasweq requested a review from dfawley May 24, 2022 17:33
@lidizheng
Copy link
Contributor Author

@dfawley PTAL when you got a chance.

CC @fengli79

@dfawley dfawley self-assigned this Jun 2, 2022
@dfawley dfawley added this to the 1.48 Release milestone Jun 2, 2022
@dfawley
Copy link
Member

dfawley commented Jun 2, 2022

Looks like there are compilation errors here now:

Error: stats/stats_test.go:179:1: expected '}', found '<<'

@lidizheng
Copy link
Contributor Author

Good catch 😵 I'll rebase to clean up the commits once the other two PRs are merged into master.

@lidizheng
Copy link
Contributor Author

@dfawley PTAL. This PR is now up-to-date with master branch.

gcp/observability/observability.go Outdated Show resolved Hide resolved
gcp/observability/opencensus.go Outdated Show resolved Hide resolved
Comment on lines 730 to 731
deadline := time.Now().Add(defaultTestTimeout)
for time.Now().Before(deadline) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW IMO it's nicer to use a context for things like this.

ctx, cancel := context.WithTimeout(context.Backgound(), defaultTestTimeout)
defer cancel()
for ctx.Err() == nil {
...

The reason for this is that contexts are more versatile (you might need to pass one into function calls anyway), so you may already have one, and it's good to follow a consistent pattern.

That said this is fine, so no need to change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, this way feels more Golang. I didn't find the second usage of the new context though.

TIL if a context is timed out, the timeout error will be in Err().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find the second usage of the new context though.

That's why I was saying no need to change for this PR; there is no other context usage here. Many tests need contexts already for other purposes, so using that same context for a loop timeout works well.

ctx.Err() is either Canceled or DeadlineExceeded or nil.

time.Sleep(100 * time.Millisecond)
}
if !validClientViews || !validServerViews || !validSpans {
t.Fatalf("Invalid OpenCensus export data: validClientViews=%v validServerViews=%v validSpans=%v", validClientViews, validServerViews, validSpans)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with this is it doesn't show what values were received instead.

IMO it would be better to save the actual values and log them here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Using error as the recording type instead of bool.

gcp/observability/observability_test.go Outdated Show resolved Hide resolved
gcp/observability/observability_test.go Outdated Show resolved Hide resolved
Copy link
Contributor Author

@lidizheng lidizheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! All comments should be addressed.

gcp/observability/observability.go Outdated Show resolved Hide resolved
gcp/observability/observability_test.go Outdated Show resolved Hide resolved
gcp/observability/observability_test.go Outdated Show resolved Hide resolved
Comment on lines 730 to 731
deadline := time.Now().Add(defaultTestTimeout)
for time.Now().Before(deadline) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, this way feels more Golang. I didn't find the second usage of the new context though.

TIL if a context is timed out, the timeout error will be in Err().

time.Sleep(100 * time.Millisecond)
}
if !validClientViews || !validServerViews || !validSpans {
t.Fatalf("Invalid OpenCensus export data: validClientViews=%v validServerViews=%v validSpans=%v", validClientViews, validServerViews, validSpans)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Using error as the recording type instead of bool.

gcp/observability/opencensus.go Outdated Show resolved Hide resolved
@lidizheng
Copy link
Contributor Author

All checks are passing, PTALA. @dfawley

Comment on lines 730 to 731
deadline := time.Now().Add(defaultTestTimeout)
for time.Now().Before(deadline) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find the second usage of the new context though.

That's why I was saying no need to change for this PR; there is no other context usage here. Many tests need contexts already for other purposes, so using that same context for a loop timeout works well.

ctx.Err() is either Canceled or DeadlineExceeded or nil.

validClientViews = false
validServerViews = false
validSpans = false
var clientViewsError, serverViewsError, spansError error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about var errs []error? It might be simpler. (Or []string would work for this, too.)

var errs []error
for ctx.Err() == nil {
	errs = nil
	if ...; value != TypeOpenCensusViewCount {
		errs = append(errs, fmt.Errorf("this error %v", value))
	}
	...
	if len(errs) == 0 {
		break
	}
}
if len(errs) > 0 {
	t.Fatalf("Invalid OpenCensus export data: %v", errs)
}

@dfawley dfawley assigned lidizheng and unassigned dfawley Jun 7, 2022
@dfawley dfawley changed the title o11y: Implement tracing/metrics via OpenCensus gcp/observability: Implement tracing/metrics via OpenCensus Jun 7, 2022
@dfawley dfawley merged commit 9ee2f14 into grpc:master Jun 7, 2022
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 5, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants