Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Prometheus with OpenTelemetry #1920

Open
sagikazarmark opened this issue Jan 14, 2021 · 7 comments
Open

Replace Prometheus with OpenTelemetry #1920

sagikazarmark opened this issue Jan 14, 2021 · 7 comments

Comments

@sagikazarmark
Copy link
Member

Is your feature request related to a problem?

OpenTelemetry will be ready soon (it's already in RC).

Describe the solution you'd like to see

We should replace the Prometheus lib with OpenTelemetry.

We should probably keep the OpenMetrics export, but we should add an (configurable) OpenTelemetry agent target as well.

@nabokihms
Copy link
Member

There are a few words about OpenTelemetry Prometheus compatible exporter.

I managed to replace our current implementation of metrics endpoint with the combination of an otel-prometheus-exporter and a handler wrapper. It seems like both metrics exporter and traces exporter works fine. During my test, I looked through traces via stdout exporter.

@sagikazarmark If it is ok for you, I would like to be in charge of this issue.

The plan is:

  • Replace prometheus with otel without losing observability
  • Write a (design?) doc about traces exporting
  • Add ability to export traces to jaeger
  • Add ability to export traces to otl-collector

@sagikazarmark
Copy link
Member Author

Sure, feel free to take it. A couple thoughts:

Honestly, I'd start with a minimum replacement of the prometheus lib with open telemetry and go from there.

@nabokihms
Copy link
Member

@sagikazarmark played with it a little last weekend.

Some concerns/problems I have faced:

  • GRPC version in dex is v1.26.0, but for otel instrumentation, it is required to be around v1.35.0
  • It looks impossible to count errors with otel net/http instrumentation. This method's code shows that we can propagate a "labeler" to add custom labels. We can extract info from the request (such as method name) and add it to labels, but unfortunately, we can't add status code or other things from the response.

@sagikazarmark
Copy link
Member Author

* GRPC version in dex is v1.26.0, but for otel instrumentation, it is required to be around [v1.35.0](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/v0.18.0/instrumentation/google.golang.org/grpc/otelgrpc/go.mod#L15)

Once we can upgrade etcd to 3.5, we can upgrade gRPC as well.

* It looks impossible to count errors with otel net/http instrumentation. [This method's code](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/v0.18.0/instrumentation/net/http/otelhttp/handler.go#L164) shows that we can propagate a "labeler" to add custom labels. We can extract info from the request (such as method name) and add it to labels, but unfortunately, we can't add status code or other things from the response.

Metrics is probably not yet fully ready.

@nabokihms
Copy link
Member

I opened the draft, but there are still some problems.

@sagikazarmark
Copy link
Member Author

One possible workaround is to use OpenCensus with the available bridge.

@sagikazarmark
Copy link
Member Author

Now that the status code label thing is resolved, we should probably revive this topic as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants