Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CURATOR-695] Open Telemetry tracing #494

Draft
wants to merge 15 commits into
base: master
Choose a base branch
from
Draft

Conversation

teabot
Copy link

@teabot teabot commented Dec 11, 2023

This changes adds an Open Telemetry tracing driver. To make this more effective, it also decomposes OperationTrace events so that the start and end of the event can be handled distinctly.

Curator's OperationTraces are mapped to OTel Spans. When an OperationTrace is started (created), a corresponding OTel span is started and made current. The span is closed when the OperationTrace is committed. OperationTrace members are mapped to OTel span attributes, and result codes are mapped to appropriate OTel representations.

Any Curator EventTraces are attributed to the current OTel span.

teabot and others added 6 commits December 11, 2023 11:24
This changes adds an Open Telemetry tracing driver. To make this more effective, it also decomposes OperationTrace events so that the start and end of the event can be managed separately.
This changes adds an Open Telemetry tracing driver. To make this more effective, it also decomposes OperationTrace events so that the start and end of the event can be managed separately.
Co-authored-by: shenjianeng <sjn@dxy.cn>
@teabot
Copy link
Author

teabot commented Dec 27, 2023

Any thoughts on how this looks @eolivelli ?

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work.
I left some comment about files that we don't need.

Apart from that I think it is better to also add a test case that uses some basic curator functions and verify that otel tracks them

@@ -0,0 +1,202 @@

Apache License
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this file

@@ -0,0 +1,5 @@
Apache Curator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this file

@@ -0,0 +1,202 @@

Apache License
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this file

@@ -0,0 +1,5 @@
Apache Curator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need

<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-client</artifactId>
<version>${project.version}</version>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be at scope 'provided'

@@ -0,0 +1,5 @@
Apache Curator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The target directory must not be committed

tisonkun and others added 9 commits January 2, 2024 22:36
Signed-off-by: tison <wander4096@gmail.com>
This changes adds an Open Telemetry tracing driver. To make this more effective, it also decomposes OperationTrace events so that the start and end of the event can be managed separately.
# Conflicts:
#	curator-client/src/main/java/org/apache/curator/drivers/OperationTrace.java
#	curator-drivers/open-telemetry/pom.xml
#	curator-drivers/open-telemetry/src/main/java/org/apache/curator/drivers/opentelemetry/OpenTelemetryTracingDriver.java
#	curator-drivers/pom.xml
@@ -26,11 +26,11 @@
<parent>
<groupId>org.apache.curator</groupId>
<artifactId>apache-curator</artifactId>
<version>5.5.1-SNAPSHOT</version>
<version>5.6.1-SNAPSHOT</version>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should bump the version in a separate commit, can you please revert ?

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM

but please revert the change to the pom versions

@eolivelli
Copy link
Contributor

I have asked for review on the mailing list

@eolivelli
Copy link
Contributor

also please check CI

@kezhuw kezhuw self-requested a review January 29, 2024 13:07
Copy link
Member

@kezhuw kezhuw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the direction and attach my comments inline.

*/
public abstract void addTrace(OperationTrace trace);
public abstract Object startTrace(OperationTrace trace);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to introduce a new sub-class/sub-interface of TracerDriver so we won't break third-party usage of AdvancedTracerDriver.

}
// We will close this in endTrace
@SuppressWarnings("MustBeClosedChecker")
Scope scope = span.makeCurrent();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curator invokes callbacks asynchronously.

The timing of callback invocation is indeterministic. Span::makeCurrent in startTrace and Scope::close in endTrace probably will not form try-with-resources as suggested in javadoc of Span::makeCurrent.

Also, endTrace is probably not invoked in thread where startTrace called as far as I known. Given CuratorFramework::getChildren as an example, startTrace is invoked in Curator threads while endTrace is invoked in ZooKeeper thread.

Since startTrace and endTrace are the only interceptions, I think we can drop scope/context propagation here.

);
}

span.setAttribute(LATENCY_MS_ATTRIBUTE, trace.getRequestBytesLength());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be trace.getLatencyMs() ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants