Java implementation of the File Configuration #6170

Open
brunobat opened this issue Jan 23, 2024 · 20 comments
Labels
Feature Request Suggest an idea for this project

Comments

@brunobat
Contributor

brunobat commented Jan 23, 2024

This issue is intended as an umbrella to steer the implementation of the File Configuration in the Java SDK and related projects.

The file configuration of the SDK has been added to the project in this PR: #5831
This work must follow the guidelines established under the OpenTelemetry specification for the configuration of the SDK defined here: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/configuration/sdk-configuration.md
On the spec, it's mentioned that:

All other configuration mechanisms SHOULD be built on top of this interface.

The programmatic interface

The programmatic interface in the Java SDK is the properties supplier defined in AutoConfiguredOpenTelemetrySdkBuilder as Supplier<Map<String, String>> propertiesSupplier.

The file configuration MUST provide a map of properties to this supplier and MUST NOT override the existing auto-configuration interfaces. That is, if a configuration file is provided, it MUST NOT take precedence over the provided programmatic configuration (oTelConfigs in this example):

AutoConfiguredOpenTelemetrySdk.builder()
        .setResultAsGlobal()
        .disableShutdownHook()
        .addPropertiesSupplier(() -> oTelConfigs)
        .build()
        .getOpenTelemetrySdk();

This includes existing signal builders, providers and customisers.

Config sources

Multiple configuration sources are defined in the spec without a sense of priorities. However, in Java it's common practice to have a hierarchy of configuration sources.

We can see that it's common practice for configurations to be sourced in many different ways and usually the same property can be set in many different sources.

Major Java frameworks and cloud-based systems in Java give environment variables and system properties precedence over other configuration methods. This is a common, accepted and even expected practice.
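
To make the precedence idea concrete, here is a minimal sketch of merging flat property sources in priority order. The class name ConfigSourceMerger, its method names, and the ordering used in main (file, then environment variables, then system properties) are assumptions made for illustration; this is not the behaviour of the autoconfigure module.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: merges flat property maps, lowest priority first. */
class ConfigSourceMerger {

  /** Later maps in the list override earlier ones for the same key. */
  static Map<String, String> merge(List<Map<String, String>> sourcesLowToHigh) {
    Map<String, String> merged = new LinkedHashMap<>();
    for (Map<String, String> source : sourcesLowToHigh) {
      merged.putAll(source);
    }
    return merged;
  }

  /** Maps an env var style name (OTEL_FOO_BAR) to a property style key (otel.foo.bar). */
  static String normalizeEnvKey(String envName) {
    return envName.toLowerCase(Locale.ROOT).replace('_', '.');
  }

  public static void main(String[] args) {
    // Assumed ordering for this sketch: file < env vars < system properties.
    Map<String, String> fileValues = Map.of("otel.service.name", "from-file");
    Map<String, String> envVars = Map.of(normalizeEnvKey("OTEL_SERVICE_NAME"), "from-env");
    Map<String, String> sysProps = Map.of("otel.traces.exporter", "otlp");

    Map<String, String> merged = merge(List.of(fileValues, envVars, sysProps));
    System.out.println(merged.get("otel.service.name")); // the env var value wins over the file
  }
}
```

Whatever ordering is chosen, the point is that it has to be stated explicitly, because the merge itself is just a sequence of overwrites.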

Configuration collisions and unavailability.

If a source is not able to provide an unambiguous value for a particular configuration value, that configuration will be unavailable in that source. This behaviour must be documented and a default value must be provided. Example:

OTEL_TRACES_EXPORTER=otlp
OTEL_EXPORTER_OTLP_ENDPOINT=http://yet-another-endpoint:4318

And the user has this code in the application:

SdkTracerProvider.builder()
        .addSpanProcessor(BatchSpanProcessor.builder(OtlpGrpcSpanExporter.builder().setEndpoint("http://some-endpoint:4318").build()).build())
        .addSpanProcessor(BatchSpanProcessor.builder(OtlpGrpcSpanExporter.builder().setEndpoint("http://some-other-endpoint:4318").build()).build())
        .build();

In this case it is not possible to use the env vars to configure 2 different exporters and they will end up with the same address. This must be documented. In the future, if required, support for this could be added by implementing new env. vars.
It should be noted however that frameworks integrating OpenTelemetry could find a solution of their own for this problem.

Configuration source independence

According to the OTel spec, a configuration must not be exclusive to a particular configuration source, namely the file configuration.

Broad support

By principle, libraries are much better off not forcing a specific way of configuration on users, but let those decisions be driven by frameworks - this makes library usage a breeze in the frameworks that integrate with said libraries, but also allows power users to provide arbitrary configuration options if desired.
Providing a file configuration alternative to "Java main()" standalone applications or the Java agent shouldn't interfere with other types of systems.

Maybe some aspects of the file configuration should be part of the Java agent and not the SDK itself.


There are related discussions on this PR: #5912
And these issues:

@brunobat brunobat added the Feature Request Suggest an idea for this project label Jan 23, 2024
@brunobat
Contributor Author

I'm not sure about the config collisions part. If we don't allow env. var. configs for particular configurations, it seems to me it will be difficult to find unique keys to identify all the property names in the PropertiesSupplier Map.

@jack-berg
Member

The programmatic interface in the Java SDK is the properties supplier defined in AutoConfiguredOpenTelemetrySdkBuilder as Supplier<Map<String, String>> propertiesSupplier.

Incorrect. The programmatic interface is the set of builders for the specific components: BatchSpanProcessor, OtlpGrpcSpanExporter, Resource, PeriodicMetricReader, etc. The autoconfigure module is an abstraction which interprets the environment variable configuration scheme and configures the SDK using the programmatic configuration interfaces of individual components.

The file configuration MUST provide a map of properties to this supplier and MUST NOT override the existing auto-configuration interfaces. That is, if a configuration file is provided, it MUST NOT take precedence over the provided programmatic configuration (oTelConfigs in this example):

You're using normative language here, but it's not taken from the specification. I assume you're expressing a strong opinion then?

If a source is not able to provide an unambiguous value for a particular configuration value, that configuration will be unavailable in that source. This behaviour must be documented and a default value must be provided. Example:

In this case it is not possible to use the env vars to configure 2 different exporters and they will end up with the same address. This must be documented. In the future, if required, support for this could be added by implementing new env. vars.
It should be noted however that frameworks integrating OpenTelemetry could find a solution of their own for this problem.

I'm not sure what you're trying to convey here. The example you include is taken out of context. I wrote that example because Diego and I were talking about how some SDK implementations directly embed interpretation of environment variables into the components (in contrast to opentelemetry-java, which extracts that to the separate autoconfigure artifact). We were discussing that example, wondering how those other SDK implementations handle these types of environment variable conflicts today.

According to the OTel spec, a configuration must not be exclusive to a particular configuration source, namely the file configuration.

What spec language are you referencing here?

By principle, libraries are much better off not forcing a specific way of configuration on users, but let those decisions be driven by frameworks

Whether or not file configuration supports environment variable overrides (still not decided and being actively debated), there would be nothing forcing a user or a framework to use file configuration, or the environment variable scheme for that matter. A user or framework will always be able to simply not use the autoconfigure module and use their own configuration scheme with the low level programmatic APIs.

@brunobat
Contributor Author

brunobat commented Jan 24, 2024

The programmatic interface in the Java SDK is the properties supplier defined in AutoConfiguredOpenTelemetrySdkBuilder as Supplier<Map<String, String>> propertiesSupplier.

Incorrect. The programmatic interface is the set of builders for the specific components: BatchSpanProcessor, OtlpGrpcSpanExporter, Resource, PeriodicMetricReader, etc. The autoconfigure module is an abstraction which interprets the environment variable configuration scheme and configures the SDK using the programmatic configuration interfaces of individual components.

We should clarify this, because that is not my understanding. The properties supplier is generic enough and does not depend on any "environment variable" in order to work. Whether entries come from env. vars. or any other place, the map doesn't care.
On a broader note, will this mean that, if I want to stay independent of additional configuration methods and I want to control the configuration, I will need to rewrite the bootstrap of the OTel SDK in order to avoid the Autoconfiguration?
Who should use the Autoconfiguration, then?

The file configuration MUST provide a map of properties to this supplier and MUST NOT override the existing auto-configuration interfaces. That is, if a configuration file is provided, it MUST NOT take precedence over the provided programmatic configuration (oTelConfigs in this example):

You're using normative language here, but it's not taken from the specification. I assume you're expressing a strong opinion then?

This part is a proposal. Open for debate.

According to the OTel spec, a configuration must not be exclusive to a particular configuration source, namely the file configuration.

What spec language are you referencing here?

Here: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/configuration/sdk-configuration.md
There, the programmatic interface is the foundation of all other configuration methods or sources, and all configuration methods are on par in terms of priority.
We can argue about whether priority should be introduced, but that's being discussed here: open-telemetry/opentelemetry-specification#3752

We definitely need to clarify and define somewhere what is considered the programmatic interface in the case of Java.

By principle, libraries are much better off not forcing a specific way of configuration on users, but let those decisions be driven by frameworks

Whether or not file configuration supports environment variable overrides (still not decided and being actively debated), there would be nothing forcing a user or a framework to use file configuration, or the environment variable scheme for that matter. A user or framework will always be able to simply not use the autoconfigure module and use their own configuration scheme with the low level programmatic APIs.

True, if frameworks weren't already using the Autoconfiguration to create an SDK. This effectively means a breaking change for existing systems and a message to not use the Autoconfiguration.

@asafm
Contributor

asafm commented Jan 24, 2024

Can you please clarify what problem this issue is trying to solve? I read the entire issue and couldn't figure out the pain points the issue is trying to solve.

From my understanding, the programmatic interface is the *Builder classes, allowing you to build OpenTelemetrySdk and everything it needs: MetricReader, View, Exporter, etc.

AutoConfiguredOpenTelemetrySdkBuilder allows you to build the SDK entirely from environment variables, system properties, and configuration files. It also allows you to modify any part configured from those sources before it's built.
The only part that needs to be added is the ability to merge the configuration you got from the environment variables and system properties with the configuration from the files (views and general config). Today, it's either from configuration files or from environment variables and system properties. That is what I would state as a proposal/problem to solve.

I read the SDK specification you linked, and it answers that precisely.
I have to say the design @jack-berg made here is brilliant - elegant and clean. It's x10 better than other libraries I've used, being metric libraries (Dropwizard, Micrometer, Prometheus Java client) or even general-purpose libraries.

@brunobat
Contributor Author

@asafm these are some of the points to clarify:

  1. Who should use the AutoConfiguredOpenTelemetrySdkBuilder?
    Framework integrators have been relying on this because it provides a standardised set of properties. What you say leads me to believe that we should have relied on the *Builder classes and ignored any standard Java SDK configuration properties, and that we should have come up with our own. I don't agree with this vision. We had that and we changed the properties to the ones used in the autoconfiguration.
  2. Where is the functionality scope for the autoconfiguration defined?
    Right now it is referred to as an alternative way to configure the SDK with env vars, but we are adding file config to it.
  3. If current Autoconfigure properties are not considered part of the programmatic interface, what standard configuration abstraction (what property names? what config entities?) should we use?
    If you tell me builders, it means anyone can come up with whatever property names they want to configure the SDK, because the names are not normalised at that level and they only exist because the Autoconfiguration is the de-facto standard to create an SDK. It can be argued that having common properties across the board is a good thing for users.
  4. The Autoconfiguration is being changed in a way that breaks existing integrations if the file config is used. We need to know if there isn't another way to do this.
  5. Overall, I think it would be an advantage to include the autoconfiguration under the umbrella of the programmatic interface.
  6. The problem with the merge of properties coming from different sources.

I don't question the merits of the tech solution proposed by @jack-berg. His work is an example to all of us.
I think the file configuration "epic" should have been completed to a reasonable degree at the spec level and also discussed under a design issue to assess the impacts on the existing user base.
This issue tries to fuel this discussion.

@brunobat
Contributor Author

CC @kittylyst

@asafm
Contributor

asafm commented Jan 24, 2024

  1. Can you please clarify or give an example of what exactly are Framework integrators? Maybe a specific use case? I need to understand why you need a set of standard properties.
  2. Regarding the documentation you're asking for, I defer this to @jack-berg.
  3. I need a concrete example to understand your use case.
  4. "The Autoconfiguration is being changed i" - changed by who? I don't understand.
  5. No No. The whole idea is that the programmatic interface is the basic building block. It's typed and documented. It is straightforward to use in many bespoke cases. The auto-configuration is essentially a translation layer from the config to the method calls of the builders (the programmatic API). To me, it's very standard architecture.
  6. Can you please elaborate more on this? What exactly is the problem of merging? Maybe specify an example so I can understand.

I think they were trying as much as possible not to be overly specific in the spec, as it can create havoc later. Okay, let's work through your answers to the questions to get to the bottom of the problems you see.

@brunobat
Contributor Author

  1. Can you please clarify or give an example of what exactly are Framework integrators? Maybe a specific use case? I need to understand why you need a set of standard properties.

Quarkus, Wildfly, OpenLiberty, Payara, Spring, etc...

  2. Regarding the documentation you're asking for, I defer this to @jack-berg.
  3. I need a concrete example to understand your use case.

These are the Autoconfigure properties: https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk-extensions/autoconfigure/README.md#general-configuration
If I cannot use the Autoconfigure, I'm free to define my own properties as I please. For example, instead of otel.exporter.otlp.endpoint I could very well define telemetry.out.channel1.endpoint.
This would totally break portability.
I'm totally in favor of portability, but I don't want to reimplement something that is already done in Autoconfigure.

  4. "The Autoconfiguration is being changed i" - changed by who? I don't understand.

Changed on the PR mentioned in the description. The code in here
As you can see, if you have a file configuration, the current Autoconfiguration is ignored.

  5. No No. The whole idea is that the programmatic interface is the basic building block. It's typed and documented. It is straightforward to use in many bespoke cases. The auto-configuration is essentially a translation layer from the config to the method calls of the builders (the programmatic API). To me, it's very standard architecture.

I disagree because of the complexity of setting up an SDK. Have you tried to do it without the Autoconfigure for a real-world app (Multithreaded+REST+DB+health+Security+Resources+Dependency Injection+etc)?
The OTel SDK is more complex than most HTTP servers and all of them provide some higher level abstraction... You don't need to build your own thread pool in order to use it. Sensible defaults for crosscutting functionalities should be in place. This includes standard configuration, which implies the use of the Autoconfigure.

  6. Can you please elaborate more on this? What exactly is the problem of merging? Maybe specify an example so I can understand.

You have the same property with different values in the file and on an env. var. Who wins?

I think they were trying as much as possible not to be overly specific in the spec, as it can create havoc later. Okay, let's work through your answers to the questions to get to the bottom of the problems you see.

@jack-berg
Member

The properties supplier is generic enough and does not depend on any "environment variable" in order to work. Whether entries come from env. vars. or any other place, the map doesn't care.

No, it's not generic enough. There are many things that cannot be expressed using flat properties. For example, non-trivial processor configurations, views, and non-trivial sampler configurations. The customization options in autoconfiguration are largely in direct response to the inadequate expressive power of the flat scheme we have today.
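
A small sketch of the expressiveness gap, using only plain Java collections. The class name, method names, and the shape of the structured entries are hypothetical and chosen for illustration; only the property name otel.exporter.otlp.endpoint comes from the existing flat scheme.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FlatVersusStructured {

  /** Flat scheme: one key holds one value, so a second endpoint replaces the first. */
  static Map<String, String> flatConfig() {
    Map<String, String> flat = new HashMap<>();
    flat.put("otel.exporter.otlp.endpoint", "http://some-endpoint:4318");
    flat.put("otel.exporter.otlp.endpoint", "http://some-other-endpoint:4318");
    return flat;
  }

  /** Structured scheme (hypothetical shape): a list keeps both processors distinct. */
  static List<Map<String, String>> structuredProcessors() {
    return List.of(
        Map.of("exporter", "otlp", "endpoint", "http://some-endpoint:4318"),
        Map.of("exporter", "otlp", "endpoint", "http://some-other-endpoint:4318"));
  }

  public static void main(String[] args) {
    System.out.println(flatConfig().size());           // prints 1
    System.out.println(structuredProcessors().size()); // prints 2
  }
}
```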

On a broader note, will this mean that, if I want to stay independent of additional configuration methods and I want to control the configuration, I will need to rewrite the bootstrap of the OTel SDK in order to avoid the Autoconfiguration?
Who should use the Autoconfiguration, then?

I think that's a decision frameworks will have to make:

  • they can use autoconfiguration, understanding the customization hooks and behavior, and in doing so benefit from being able to defer parts of the configuration scheme to the otel specification.
  • they can reject autoconfiguration, and have full control, but in doing so will have to take on some of the configuration burden handled by autoconfiguration.

This decision should be analogous to how frameworks handle log configuration files today:

  • Log frameworks like log4j and logback allow you to express your configuration in files.
  • Frameworks like spring and others don't want to re-invent the wheel for logging, and they typically provide multiple mechanisms for configuring logging:
    • A user can use their log4j / logback configuration file
    • A user can encode similar information into the framework's centralized configuration mechanism

Why shouldn't the story with opentelemetry be the same?

We definitely need to clarify and define somewhere what is considered the programmatic interface in the case of Java.

This is not ambiguous. The module description is:

Autoconfigure OpenTelemetry SDK from env vars, system properties, and SPI

The first line of the autoconfigure readme (present since > 3 years ago) states:

This artifact implements environment-based autoconfiguration of the OpenTelemetry SDK. This can be
an alternative to programmatic configuration using the normal SDK builders.

True, if frameworks weren't already using the Autoconfiguration to create an SDK. This effectively means a breaking change for existing systems and a message to not use the Autoconfiguration.

Incorrect. There is no behavior change when the input (i.e. environment variables, system properties and SPIs) is the same. The conditions to trigger a change are explicitly setting otel.config.file and explicitly including the opentelemetry-sdk-extension-incubator dependency. To consider that breaking is to consider the addition of any new property breaking, which is not aligned with industry standards. Furthermore, customization options exist that allow frameworks to nullify any user attempt to set otel.config.file.

This genie can't be put back in the bottle. The OpenTelemetry community is committed to having a file format. The TC put a moratorium on enhancements to the env variable scheme, a dedicated working group was spun up, an OTEP was long debated and approved, a bunch of work has been debated and merged to the spec regarding file configuration, and there are prototypes in at least 3 languages. The wheels are in motion - the opentelemetry community will not be limited to a flat env var / system property style configuration scheme.

There are open questions about how file configuration interacts with the environment variable config scheme, and for Java specifically, it's appropriate to debate how file configuration should interact with autoconfigure tooling. But file configuration is happening, and over time it's likely to supplant the environment variable scheme as the dominant configuration mechanism.

@jack-berg
Member

With that aside, let's get to the crux of what I think we ought to care about, which is how exactly file configuration should interact with the autoconfigure mechanism.

I've uploaded two diagrams to help talk through things.

Autoconfigure without file configuration

Here we see how autoconfigure works without file configuration. There's a merging of different sources which contribute to ConfigProperties. The ConfigProperties are read from in the process of configuring Resource, TracerProvider, MeterProvider, LoggerProvider, and Propagators. There are SPIs for providing custom plugin components (i.e. exporters, samplers, etc), which read from ConfigProperties. There are SPIs for customizing the result of any of these plugin components, and further for customizing TracerProvider, MeterProvider, and LoggerProvider.

Note the many hooks in place that are essentially escape hatches for the lack of expressiveness of the flat configuration scheme. We can't easily wrap one exporter in another, so we allow exporter customizers, we can't register views, so we expose a MeterProvider customizer, etc, etc.

[Diagram: autoconfigure flow without file configuration]

Proposal for autoconfigure with file configuration

Here's my current working model on how file configuration interacts with this. We still resolve ConfigProperties from multiple contributing sources, but we have a key fork point in the flow based on whether otel.config.file is defined. Note that customizers contributing to ConfigProperties can influence this fork. If otel.config.file is not set, we continue as we do today. If otel.config.file is set, we take a different path: we parse the contents of the config file to an in-memory configuration model, and then we have a phase where the configuration model is customized:

  • We potentially override the model with the environment variable config schema according to merge semantics, using ConfigProperties to read the environment variable config schema values
  • We provide a new SPI callback hook allowing arbitrary customization of the configuration model

We interpret the resolved model, invoking any of the plugin extension SPIs to provide exporters, samplers, processors which are not built-in.
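
The fork described above can be sketched roughly as follows. This is a toy model under stated assumptions: the configuration model is represented as a plain Map, the returned String stands in for the configured SDK, and the class name, method name, and the traces_exporter key are all hypothetical. Only the otel.config.file and otel.traces.exporter property names come from the discussion.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.UnaryOperator;

class AutoconfigureForkSketch {

  /** Chooses between the flat-properties path and the configuration-model path. */
  static String configure(
      Map<String, String> configProperties,
      Function<String, Map<String, Object>> fileParser,
      List<UnaryOperator<Map<String, Object>>> modelCustomizers) {
    String configFile = configProperties.get("otel.config.file");
    if (configFile == null || configFile.isEmpty()) {
      // Path 1: no file configured, keep interpreting the flat properties as today.
      return "flat path, traces exporter = " + configProperties.get("otel.traces.exporter");
    }
    // Path 2: parse the file into a model, then let each customizer adjust it.
    Map<String, Object> model = fileParser.apply(configFile);
    for (UnaryOperator<Map<String, Object>> customizer : modelCustomizers) {
      model = customizer.apply(model);
    }
    return "model path, traces exporter = " + model.get("traces_exporter");
  }

  public static void main(String[] args) {
    // Stand-in parser: ignores the path and returns a fixed model.
    Function<String, Map<String, Object>> parser = path -> {
      Map<String, Object> parsed = new HashMap<>();
      parsed.put("traces_exporter", "console");
      return parsed;
    };
    // Stand-in customizer: overrides one value in a copy of the model.
    UnaryOperator<Map<String, Object>> forceOtlp = original -> {
      Map<String, Object> copy = new HashMap<>(original);
      copy.put("traces_exporter", "otlp");
      return copy;
    };
    System.out.println(configure(Map.of("otel.traces.exporter", "otlp"), parser, List.of(forceOtlp)));
    System.out.println(configure(Map.of("otel.config.file", "sdk.yaml"), parser, List.of(forceOtlp)));
  }
}
```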

This seems well balanced to me. Existing usages of autoconfigure would continue working as normal, but we have a toolkit to evolve to a more mature solution where we're able to encode complex configuration in a structured configuration model. We continue our practice of providing customization hooks.

Note that once we have a configuration model, we have a reliable and simpler alternative for frameworks to use their own configuration mechanism compared with environment configuration OR the programmatic interface. A framework can synthesize its own configuration sources into a configuration model, and be able to reliably configure an SDK from that model.

[Diagram: proposed autoconfigure flow with file configuration]

@brunobat
Contributor Author

brunobat commented Jan 25, 2024

Thanks @jack-berg this makes much more sense now.

I think that's a decision frameworks will have to make

Are you aware of any Java framework that starts the SDK without the Autoconfiguration? Is this really a practical choice or even something that we want to incentivise?

Why shouldn't the story with opentelemetry be the same?

The files can be used but also other methods. In Quarkus those files are not needed.

This is not ambiguous. The module description is:

Autoconfigure OpenTelemetry SDK from env vars, system properties, and SPI

But it doesn't mention file config, does it?... This was not under the radar when the Autoconfiguration was created, I acknowledge.

Let me suggest a design alternative to highlight my pain points.
Ideally, from my point of view, there should be backward compatibility (which seems assured) but also a path forward for existing systems to use the new configuration model without massive rewrites.
I imagine some of the current functionality and all future features will use only the new config model.

We could rearrange things in a way that allows existing use but also provides a path forward to existing Autoconfiguration users to use the new, richer model.
[Diagram: Autoconfigure_altB]

You can argue that current configurations will not map perfectly to the new model, but does this mapping have to be perfect? Maybe this transformation can happen painlessly in more than 90% of the cases, no?

@jkwatson
Contributor

I haven't been following all the details here, but one thing that is non-negotiable... everything must be configurable, by an end-user writing code using public APIs. We cannot have "hidden", non-public methods for configuring the SDK that are only accessible via file or env-var configuration.

@jack-berg
Member

Are you aware of any Java framework that starts the SDK without the Autoconfiguration?

Spring, and specifically the opentelemetry spring boot starter.

I don't have a list of all the frameworks that integrate with OpenTelemetry and how they do configuration.

The files can be used but also other methods.

The same is true here.

But it doesn't mention file config, does it?... This was not under the radar when the Autoconfiguration was created, I acknowledge.

It doesn't need to.

All of file configuration can be implemented as an AutoConfigurationCustomizerProvider. Nothing needs to be built into the autoconfigure module for file configuration to completely override the autoconfiguration output. I just need to implement a customizer with a high order number, and replace the SdkTracerProviderBuilder, SdkMeterProviderBuilder, and SdkLoggerProviderBuilder.
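
The "high order number" idea can be illustrated with a toy model. The Customizer interface below is a hypothetical stand-in, not the real AutoConfigurationCustomizerProvider, and a String stands in for the provider builder being customized. The only assumption carried over is that customizers with a lower order run first, so the one with the highest order has the last word.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.UnaryOperator;

class OrderedCustomizerSketch {

  /** Hypothetical customizer: lower order() runs first. */
  interface Customizer {
    int order();

    String customize(String tracerProviderSetup);
  }

  /** Convenience factory for building a customizer from an order and a function. */
  static Customizer customizer(int order, UnaryOperator<String> change) {
    return new Customizer() {
      @Override
      public int order() {
        return order;
      }

      @Override
      public String customize(String tracerProviderSetup) {
        return change.apply(tracerProviderSetup);
      }
    };
  }

  /** Sorts by order and applies each customizer to the output of the previous one. */
  static String applyInOrder(String initial, List<Customizer> customizers) {
    List<Customizer> sorted = new ArrayList<>(customizers);
    sorted.sort(Comparator.comparingInt(Customizer::order));
    String current = initial;
    for (Customizer c : sorted) {
      current = c.customize(current);
    }
    return current;
  }

  public static void main(String[] args) {
    Customizer tweak = customizer(0, setup -> setup + "+extra-processor");
    Customizer fileConfig = customizer(Integer.MAX_VALUE, setup -> "provider-from-file");
    // Registration order does not matter; the highest order replaces everything before it.
    System.out.println(applyInOrder("provider-from-env", List.of(fileConfig, tweak))); // prints provider-from-file
  }
}
```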

I imagine some of the current functionality and all future features will use only the new config model.
We could rearrange things in a way that allows existing use but also provides a path forward to existing Autoconfiguration users to use the new, richer model.

Can you elaborate? I can't tell what you mean by this.

@jack-berg
Member

Also see this document explaining some of the original thinking behind SDK configuration design: https://github.com/open-telemetry/opentelemetry-java/blob/main/docs/sdk-configuration.md#goals-and-non-goals

Note that all the emphasis is on the builders as the configuration mechanism.

Note that listed as a "non-goal" is:

Make sure everything is auto-configurable. This is out of the scope of the SDK, and instead is left to auto-configuration layers, which are also described below but not as part of the core SDK. The SDK provides an autoconfiguration extension as an option which is not internal to the main SDK components.

@brunobat
Contributor Author

I imagine some of the current functionality and all future features will use only the new config model.
We could rearrange things in a way that allows existing use but also provides a path forward to existing Autoconfiguration users to use the new, richer model.

Can you elaborate? I can't tell what you mean by this.

If the rich file model allows configurations that are not possible with env vars, would this mean some of the configurations could only be performed using the file?

@jack-berg
Member

If the rich file model allows configurations that are not possible with env vars, would this mean some of the configurations could only be performed using the file?

This is a loaded question.

Whether or not all components of a configuration file are representable with env vars is currently being debated. As discussed in this comment, if we merge the existing environment variable scheme, it's unlikely everything in file configuration will be representable. But if we ignore the existing environment variable scheme and invent a new one with names that are derived from the configuration model, then it's likely that everything in file configuration will be representable with environment variables. If this is important to you then I suggest you go advocate for that.

But I want to emphasize that the file configuration mechanism comes paired with a configuration model, which is produced as a result of parsing a configuration file, but which can also be programmatically constructed or edited.

@brunobat
Contributor Author

brunobat commented Jan 26, 2024

But I want to emphasize that the file configuration mechanism comes paired with a configuration model, which is produced as a result of parsing a configuration file, but which can also be programmatically constructed or edited.

This sounds good to me in principle, but the current discussion seems to be file centric and requiring a file to work. At least in the Autoconfigure it currently requires a file to work.

Could we get a method where we pass not a file but a configuration object to create the SDK in the same fashion the file does?

@trask
Member

trask commented Jan 26, 2024

Could we get a method where we pass not a file but a configuration object to create the SDK in the same fashion the file does?

Is this what you are looking for, or something slightly different?

#6170 (comment)

  • We provide a new SPI callback hook allowing arbitrary customization of the configuration model

@brunobat
Contributor Author

Could we get a method where we pass not a file but a configuration object to create the SDK in the same fashion the file does?

Is this what you are looking for, or something slightly different?

#6170 (comment)

  • We provide a new SPI callback hook allowing arbitrary customization of the configuration model

I need to see the details, but that could work.

@jack-berg
Member

This sounds good to me in principle, but the current discussion seems to be file centric and requiring a file to work. At least in the Autoconfigure it currently requires a file to work.
Could we get a method where we pass not a file but a configuration object to create the SDK in the same fashion the file does?

See the file configuration spec requirement for a configuration model, and for a method called create which accepts a configuration model and returns configured SDK components.

The Java embodiment of the model is a (currently generated) class called OpenTelemetryConfiguration, which this create method accepts.

The idea behind an SPI would be something like:

public interface ConfigurationModelCustomizerProvider {
  // Receives the parsed model and returns the (possibly modified) model to use.
  OpenTelemetryConfiguration customize(OpenTelemetryConfiguration model);
}

Giving implementations full ability to customize the configuration model before it is used to configure the SDK.
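
As a rough illustration of how such a hook could be used, here is a self-contained sketch. ConfigModel is a hypothetical stand-in for the generated OpenTelemetryConfiguration class (its fields are invented for the example), and ModelCustomizer mirrors the shape of the proposed interface. How implementations would be discovered (for example through ServiceLoader) is left out and would be an assumption.

```java
import java.util.ArrayList;
import java.util.List;

class ModelCustomizerSketch {

  /** Stand-in for the generated configuration model; the fields are hypothetical. */
  static final class ConfigModel {
    final String serviceName;
    final List<String> spanExporters;

    ConfigModel(String serviceName, List<String> spanExporters) {
      this.serviceName = serviceName;
      this.spanExporters = List.copyOf(spanExporters);
    }
  }

  /** Same shape as the proposed SPI: takes a model, returns the model to use. */
  interface ModelCustomizer {
    ConfigModel customize(ConfigModel model);
  }

  /** Applies each customizer in turn before the model would be handed to create. */
  static ConfigModel applyAll(ConfigModel parsed, List<ModelCustomizer> customizers) {
    ConfigModel current = parsed;
    for (ModelCustomizer customizer : customizers) {
      current = customizer.customize(current);
    }
    return current;
  }

  public static void main(String[] args) {
    ConfigModel fromFile = new ConfigModel("from-file", List.of("otlp"));
    // A framework could override or extend what the file provided.
    ModelCustomizer frameworkDefaults = model -> {
      List<String> exporters = new ArrayList<>(model.spanExporters);
      exporters.add("console");
      return new ConfigModel("set-by-framework", exporters);
    };
    ConfigModel result = applyAll(fromFile, List.of(frameworkDefaults));
    System.out.println(result.serviceName + " " + result.spanExporters); // prints set-by-framework [otlp, console]
  }
}
```

The same applyAll step would also work with a model that was constructed programmatically rather than parsed from a file, which is the case asked about above.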


5 participants