-
Notifications
You must be signed in to change notification settings - Fork 492
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(otelgin): add support for recording panics #5090
Open
asecondo
wants to merge
2
commits into
open-telemetry:main
Choose a base branch
from
asecondo:5088_otelgin-catch-record-panics
base: main
Could not load branches
Branch not found: {{ refName }}
Could not load tags
Nothing to show
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This behavior is already in the default SDK's
End
method:https://github.com/open-telemetry/opentelemetry-go/blob/e8973b75b230246545cdae072a548c83877cba09/sdk/trace/span.go#L403-L420
Why is this being added here? It seems like an SDK concern not an instrumentation one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a really good question. the general pattern/setup i typically deal with have a middleware pattern that look something like this:
panics show up in logs via the panic recovery middleware described above and also show up in the spans via the mechanism you linked. thinking about this more, the gap i currently have is less that panics get added to this specific span somehow, and more that the spans generated by the otelgin middleware don't have a span status as error and don't set the
http.status_code
to anything. when an alert fires for an increase in 5XX error codes, i usually first go to Jaeger and type something likehttp.status_code=500
orerror=true
. in the cases of panics, those types of queries don't work.given that, i think this would leave a few options:
http.status_code
field not being guaranteed, but that's not necessarily the responsibility of this middleware to set that. the span status will still be set to error, which is a strong signal.i'm leaning towards 2, but i'd love to get other opinions as well :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the latest semantic conventions, the
error.type
attribute should be used here to capture the panic information.It is a known thing that our HTTP instrumentation is quite behind on semantic conventions, but I think this should be our target. Instead of having configuration to enable this, we should add the
error.type
attribute if a panic occurred always.This will require this instrumentation to go through the process of upgrading semconv which will require a migration path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the late reply - getting back to this now. i think that makes sense. i'm not familiar with these migrations, but i'll happily give it a shot. @MrAlias i may poke you for help with the migrations if i run into issues or have questions. i'll apply these changes, as well as the suggestions from @dmathieu.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually it looks like #5092 will take care of the migration plan. i might wait until that PR is merged and then make the changes to this package on top of that. how does that sound?