Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prohibit attribute value from evolving to contain complex types #3858

Merged
merged 6 commits into from
Mar 6, 2024

Conversation

jack-berg
Copy link
Member

If we aren't going to accept complex attribute types (#2888) we should explicitly rule them out of future designs. Doing so cements the idea that attributes are "metadata" instead of "data", since if attributes were data, we would not want to artificially limit their structure. Once its clear that attributes are metadata and restricted to a limited set of types, its easy to determine that use cases which require complex types (like event payloads) should seek to put the data elsewhere (like in a log record body).

While I was in favor of supporting complex attribute types (#2888) I believe its more important that we commit one way or the other. The uncertainty around the question of whether this type of evolution will occur has muddied the waters of several related conversations.

There was consensus on codifying this in the 1/30/24 spec SIG meeting. We should capitalize on this momentum and get this over the finish line. Stalling out just to revisit this same debate in the future is a bad use of time.

specification/common/README.md Outdated Show resolved Hide resolved
specification/common/README.md Outdated Show resolved Hide resolved
Copy link
Contributor

@tedsuo tedsuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this!

specification/common/README.md Outdated Show resolved Hide resolved
@tigrannajaryan
Copy link
Member

I am a bit reluctant with this PR. It is atypical for our spec to prohibit future evolution with strict normative language. Goes against "never say never" philosophy in my book :-)

I understand the reasoning for making this clarification and the need to avoid uncertainty. Could we instead just add it is a soft guideline and explain why we chose to do so?

We don't know if sometime in the future we don't discover some new strong argument in favour of supporting complex types, so why close the door completely?

@jack-berg
Copy link
Member Author

I am a bit reluctant with this PR. It is atypical for our spec to prohibit future evolution with strict normative language. Goes against "never say never" philosophy in my book :-)

Hmm.. I'm torn on this. Often normative language state's what a thing is in such a way that there is no way to evolve that definition without it being considered a breaking change. Consider this sentence:

The attribute key MUST be a non-null and non-empty string.

Relaxing the non-null and non-empty requirement would be considered breaking, so we're restricting future evolution possibilities. In our case we're discussing whether its appropriate to say what a thing is not, and will never be. I'm struggling to see the difference between saying that attribute values must never be complex types and this example of saying that attribute keys must never be null or empty string.

To me, its a question of whether we really want to firmly take complex attribute types off the table or not. If we do, I think its perfectly appropriate to use normative language to state what a thing will not be, since the difference between what a thing is and is not are often blurred (i.e as I argued above, by stating clearly what a thing is you can end up stating something about what it will not ever be).

Personally, I think closing the door completely on this question is a constructive set of guardrails to steer design. I worry that without clearly shutting the door, we'll have to repeatedly answer the question of: what is a sufficiently strong new argument to warrant re-opening the discussion?

@pyohannes
Copy link
Contributor

I am a bit reluctant with this PR. It is atypical for our spec to prohibit future evolution with strict normative language. Goes against "never say never" philosophy in my book :-)

Hmm.. I'm torn on this. Often normative language state's what a thing is in such a way that there is no way to evolve that definition without it being considered a breaking change.

Maybe it would be less controversial if we'd prevent the wording of "prohibiting evolution", but just make it clear that any additions to the list below would be a breaking change (and maybe add some reasoning):

  • The attribute value is either:
    • A primitive type: string, boolean, double precision floating point (IEEE 754-1985) or signed 64 bit integer.
    • An array of primitive type values. The array MUST be homogeneous, i.e., it MUST NOT contain values of different types.

Complex types such as maps or heterogeneous arrays MUST NOT be supported as attribute values.

This wouldn't mean that such types would never be supported, but that adding support for them would imply a breaking change to the specification.

This would also remove the awkwardness or addressing specification authors instead of implementors.

@Oberon00
Copy link
Member

Oberon00 commented Feb 8, 2024

Maybe we could state that adding or allowing a new attribute type would be considered a breaking change.

@tigrannajaryan
Copy link
Member

What's the effect of this change on possible future signals? I assume the restriction does not apply to those, right? Hypothetically speaking a new signal can choose to use the definition of attributes as defined by logs, for example, or even define its own attribute type if there is a strong need.

@tigrannajaryan
Copy link
Member

Just to be clear why I am asking about the new signals, two current examples that deviate from common attribute definition:

  • Events most likely need to use the same complex attribute definition as logs.
  • Profiling may need to record units with attribute values.

There are also unknown future signals that may need something else.

I would be strictly against this PR if we in any way assume the restriction it brings must also apply to new future signals.

If we are all in agreement that the restriction only applies to resource attributes, span and span event attributes and metric attributes then I don't mind.

@jack-berg
Copy link
Member Author

If we are all in agreement that the restriction only applies to resource attributes, span and span event attributes and metric attributes then I don't mind.

I imagine us formalizing the notion of "common" or "general" attributes, which are used in most places. If some part of a data model requires (like logs) requires a special definition, they must describe how the definition differs from the common definition. Most cases should use the common definition, since doing so means getting to take advantage of the existing definition, and probably re-using a common representation in the particular language.

I imagine language maintainers should will prefer the consistency of having a common attributes representation, and offer some resistance to attempts to define new special definitions in the spec.

@jack-berg
Copy link
Member Author

Hi all - thanks for the feedback. I've pushed a commit which attempts to incorporate it:

  • Add a note that extending the set of allowable attribute value types is a breaking change
  • Add a new section designating a "standard attribute" definition, which SHOULD be used in data modeling unless there is strong justification to diverge, noting the motivation for the log record attributes divergence

Please take a look.

@tedsuo
Copy link
Contributor

tedsuo commented Feb 20, 2024

I suggest that we go further with this PR, and include language describing the reasons why we have a limitation to the types available to standard attributes. Here's the language I would suggest:

OpenTelemetry limits the definition of standard attributes for two reasons.

One reason is practical. We have existing API fields which conform to this subset of attribute types. For these fields, backends do not expect additional attribute types. It would be a breaking change to expose additional value types on these fields because we would be breaking a contract we have established with our users.

The second reason is a design choice. In OpenTelemetry we have made a decision regarding how we define metadata. Any time we need metadata in an API, we create use standard attributes, ideally in a field named "Attributes." OpenTelemetry defines “metadata” as a field explicitly intended to be used to create search indexes and statistical data.

We have made the design choice that flat, simple data is appropriate for this use case. We do not want end users and backends to have to dig into some kind of complex type in order to create these indexes and statistics. It should be possible for an automated system to make use of metadata without having to make a decision as to how to handle complex types. We believe that simple, flat data best enables these use cases.

To that end, OpenTelemetry SHOULD NOT define semantic conventions for standard attributes that are essentially complex data type encoded into a simple data types. For example, a string encoded JSON attribute is not an acceptable semantic convention for a standard attribute in OpenTelemetry.

@carlosalberto carlosalberto merged commit 4f71c16 into open-telemetry:main Mar 6, 2024
7 checks passed
@carlosalberto carlosalberto mentioned this pull request Mar 8, 2024
carlosalberto pushed a commit that referenced this pull request Mar 12, 2024
## Changes

Resolves open comments from PR (#3858)

1. Clarify the "why" w/o redefining metadata (See
[comment](#3858 (comment)))

2. Keep all the new standard attribute content together in one place

This addresses issue (#3925).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet