semantic conventions: add structured stacktrace to exception attributes #2841

Closed
wants to merge 1 commit into from
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -29,6 +29,11 @@ release.
- Add `process.context_switches`, and `process.open_file_descriptors`, to the
metrics semantic conventions
([#2706](https://github.com/open-telemetry/opentelemetry-specification/pull/2706))
- Add structured stacktrace to exception attributes. The stacktrace is broken up
to 4 attributes: `exception.structured_stacktrace.function_names`,
Comment on lines +32 to +33
Member

Over the past few months we have seen multiple use cases that call for attributes with map values. I would argue that if such values are widely needed, we should lift the restriction and allow map values. Vendors who don't support map values can flatten the data in their exporters.
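
As a rough illustration of the exporter-side flattening suggested here, a minimal Python sketch follows. The function name `flatten_attributes`, the dotted-key separator, and the sample attribute names are assumptions made for illustration; neither this PR nor the specification defines how flattening should work.

```python
from typing import Any, Dict


def flatten_attributes(attrs: Dict[str, Any], prefix: str = "", sep: str = ".") -> Dict[str, Any]:
    """Flatten nested dict values into dotted keys (hypothetical helper).

    Non-dict values, including lists, are copied through unchanged.
    """
    flat: Dict[str, Any] = {}
    for key, value in attrs.items():
        full_key = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into map values and merge their flattened keys.
            flat.update(flatten_attributes(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat
```

An exporter for a backend without map support could call this once per span or event before serializing; backends that do support maps would skip the step.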

Member Author

Agreed. I've also been meaning to find the discussion or notes on why this was left out. Was it simply that some vendors don't support it?

Member

Here is the issue: #376

Member Author

Ah, that issue. Thanks, I saw you already included this PR in the comments.

Contributor

I support map-valued attributes (and specifying how to flatten them). I can see how we could use semantic conventions to convey source locations in stacktraces, but this seems to me a very expensive option. Looking at the other fields in the OpenCensus stacktrace message, I find this mechanism for avoiding repeated stacktraces:

  // The hash ID is used to conserve network bandwidth for duplicate
  // stack traces within a single trace.
  //
  // Often multiple spans will have identical stack traces.
  // The first occurrence of a stack trace should contain both
  // `stack_frames` and a value in `stack_trace_hash_id`.
  //
  // Subsequent spans within the same request can refer
  // to that stack trace by setting only `stack_trace_hash_id`.
  //
  // TODO: describe how to deal with the case where stack_trace_hash_id is
  // zero because it was not set.
  uint64 stack_trace_hash_id = 2;

And this is where most of my concerns lie: a structured stack is a more expensive way to encode stacktraces than a simple string, unless you have a way to avoid repetition.
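
To make the OpenCensus mechanism quoted above concrete, here is a minimal Python sketch of hash-based deduplication. It is not the OpenCensus implementation: the class name `StackTraceDeduplicator`, the frame tuple layout, and the choice of the first 8 bytes of a SHA-256 digest as the 64-bit ID are all assumptions for illustration. The sketch also assumes one deduplicator instance per trace, and it does not address the zero-ID case that the quoted TODO leaves open.

```python
import hashlib
from typing import Dict, List, Set, Tuple

# Hypothetical frame layout: (function name, file name, line number).
Frame = Tuple[str, str, int]


def stack_trace_hash_id(frames: List[Frame]) -> int:
    """Derive a 64-bit ID from the frames (assumption: SHA-256, first 8 bytes)."""
    digest = hashlib.sha256(repr(frames).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class StackTraceDeduplicator:
    """Emits full frames the first time a stack is seen, then only its hash ID."""

    def __init__(self) -> None:
        self._seen: Set[int] = set()

    def encode(self, frames: List[Frame]) -> Dict[str, object]:
        hash_id = stack_trace_hash_id(frames)
        if hash_id in self._seen:
            # Repeat occurrence: refer to the earlier stack by ID only.
            return {"stack_trace_hash_id": hash_id}
        self._seen.add(hash_id)
        return {"stack_trace_hash_id": hash_id, "stack_frames": list(frames)}
```

The saving comes entirely from the repeat case, which is why the cost concern above depends on how often spans in one trace share a stack.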

Member Author

Right. We could do something similar to the OpenCensus approach and avoid repeating parts of stacktraces by using a hash ID. But this is still an optional field that doesn't need to be filled in for languages whose stacktraces vendors can already parse. Instead of asking vendors to parse languages outside a certain subset, we only ask that they fall back on a single structure that all the unsupported languages will use.

It is a bit of a pain to say which languages need to send the structured version, so some may want to make it a user-defined option, except for major players like Java, .NET and Go, which can pretty much guarantee support.

`exception.structured_stacktrace.filenames`,
`exception.structured_stacktrace.line_numbers`,
`exception.structured_stacktrace.column_numbers` ([#2841](https://github.com/open-telemetry/opentelemetry-specification/pull/2841))

### Compatibility

24 changes: 24 additions & 0 deletions semantic_conventions/trace/exception.yaml
@@ -26,6 +26,30 @@ groups:
          at com.example.GenerateTrace.methodB(GenerateTrace.java:13)\n
          at com.example.GenerateTrace.methodA(GenerateTrace.java:9)\n
          at com.example.GenerateTrace.main(GenerateTrace.java:5)'
      - id: structured_stacktrace.function_names
        type: string[]
        brief: >
          The fully-qualified names that uniquely identify the function or
          method that is active in the frame.
        examples: ["com.example.GenerateTrace.methodB", "com.example.GenerateTrace.methodA",
          "com.example.GenerateTrace.main"]
        note: |-
          The `structured_stacktrace` attribute takes precedence over `stacktrace` when both appear. Each array must start with the top of the stack.
      - id: structured_stacktrace.filenames
        type: string[]
        brief: >
          The source file names where each function call appears.
        examples: ["GenerateTrace.java", "GenerateTrace.java", "GenerateTrace.java"]
      - id: structured_stacktrace.line_numbers
        type: int[]
        brief: >
          The line number in file where each function call appears.
        examples: [13, 9, 5]
      - id: structured_stacktrace.column_numbers
        type: int[]
        brief: >
          The column number, if available, in each file where the function call appears.
        examples: [8, 4, 4]
      - id: escaped
        type: boolean
        brief: >
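
The four attributes above are parallel arrays, so a consumer has to zip them back into per-frame records. A minimal Python sketch of that step follows. The `Frame` type and `frames_from_attributes` helper are hypothetical, and the equal-length check is an assumption: the proposed convention does not say what to do when the arrays differ in length or when `column_numbers` is absent.

```python
from typing import Any, Dict, List, NamedTuple, Optional

PREFIX = "exception.structured_stacktrace."


class Frame(NamedTuple):
    function_name: str
    filename: str
    line_number: int
    column_number: Optional[int]


def frames_from_attributes(attrs: Dict[str, Any]) -> List[Frame]:
    """Rebuild frames (top of stack first) from the parallel arrays."""
    functions = attrs[PREFIX + "function_names"]
    filenames = attrs[PREFIX + "filenames"]
    lines = attrs[PREFIX + "line_numbers"]
    # Assumption: a missing column array means "no column for any frame".
    columns = attrs.get(PREFIX + "column_numbers") or [None] * len(functions)
    if not (len(functions) == len(filenames) == len(lines) == len(columns)):
        raise ValueError("structured stacktrace arrays must have equal length")
    return [Frame(*parts) for parts in zip(functions, filenames, lines, columns)]
```

With the example values from the YAML, the first frame would be `methodB` in `GenerateTrace.java` at line 13, column 8.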
10 changes: 8 additions & 2 deletions specification/trace/semantic_conventions/exceptions.md
@@ -47,9 +47,15 @@ The event name MUST be `exception`.
| `exception.type` | string | The type of the exception (its fully-qualified class name, if applicable). The dynamic type of the exception should be preferred over the static type in languages that support it. | `java.net.ConnectException`; `OSError` | See below |
| `exception.message` | string | The exception message. | `Division by zero`; `Can't convert 'int' object to str implicitly` | See below |
| `exception.stacktrace` | string | A stacktrace as a string in the natural representation for the language runtime. The representation is to be determined and documented by each language SIG. | `Exception in thread "main" java.lang.RuntimeException: Test exception\n at com.example.GenerateTrace.methodB(GenerateTrace.java:13)\n at com.example.GenerateTrace.methodA(GenerateTrace.java:9)\n at com.example.GenerateTrace.main(GenerateTrace.java:5)` | Recommended |
-| `exception.escaped` | boolean | SHOULD be set to true if the exception event is recorded at a point where it is known that the exception is escaping the scope of the span. [1] | | Recommended |
+| `exception.structured_stacktrace.function_names` | string[] | The fully-qualified names that uniquely identify the function or method that is active in the frame. [1] | `[com.example.GenerateTrace.methodB, com.example.GenerateTrace.methodA, com.example.GenerateTrace.main]` | Recommended |
+| `exception.structured_stacktrace.filenames` | string[] | The source file names where each function call appears. | `[GenerateTrace.java, GenerateTrace.java, GenerateTrace.java]` | Recommended |
+| `exception.structured_stacktrace.line_numbers` | int[] | The line number in file where each function call appears. | `[13, 9, 5]` | Recommended |
+| `exception.structured_stacktrace.column_numbers` | int[] | The column number, if available, in each file where the function call appears. | `[8, 4, 4]` | Recommended |
+| `exception.escaped` | boolean | SHOULD be set to true if the exception event is recorded at a point where it is known that the exception is escaping the scope of the span. [2] | | Recommended |

-**[1]:** An exception is considered to have escaped (or left) the scope of a span,
+**[1]:** The `structured_stacktrace` attribute takes precedence over `stacktrace` when both appear. Each array must start with the top of the stack.
+
+**[2]:** An exception is considered to have escaped (or left) the scope of a span,
if that span is ended while the exception is still logically "in flight".
This may be actually "in flight" in some languages (e.g. if the exception
is passed to a Context manager's `__exit__` method in Python) but will
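
Note [1] in the table says `structured_stacktrace` takes precedence over `stacktrace` when both appear. A minimal Python sketch of how a consumer might apply that rule is below. The `render_stacktrace` helper, its `at function(file:line)` output format, and the `?` placeholder for missing entries are assumptions for illustration, not part of the proposed convention.

```python
from typing import Any, Dict, Optional

PREFIX = "exception.structured_stacktrace."


def render_stacktrace(attrs: Dict[str, Any]) -> Optional[str]:
    """Prefer the structured arrays; fall back to the plain string attribute."""
    functions = attrs.get(PREFIX + "function_names")
    if functions:
        filenames = attrs.get(PREFIX + "filenames", [])
        lines = attrs.get(PREFIX + "line_numbers", [])
        rendered = []
        for i, function in enumerate(functions):
            # Tolerate short or missing arrays with a placeholder (assumption).
            filename = filenames[i] if i < len(filenames) else "?"
            line = lines[i] if i < len(lines) else "?"
            rendered.append(f"at {function}({filename}:{line})")
        return "\n".join(rendered)
    # No structured data: use the language-native string, if any.
    return attrs.get("exception.stacktrace")
```

This mirrors the fallback idea from the review thread: backends that cannot parse a given language's native string can still display the structured form in one uniform way.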