Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow configuring a formatter with extra fields to include #1531

Open
bjchambers opened this issue Sep 1, 2021 · 27 comments · May be fixed by #2943
Open

Allow configuring a formatter with extra fields to include #1531

bjchambers opened this issue Sep 1, 2021 · 27 comments · May be fixed by #2943

Comments

@bjchambers
Copy link

Feature Request

Crates

tracing-subscriber

Motivation

See #1481 and open-telemetry/opentelemetry-rust#625 (comment).

For connecting logs and distributed traces it would be useful to include the distributed trace ID (for instance, from OpenTelemetry) in the logs written by the JSON formatter.

This doesn't clearly fit in either tracing or opentelemetry, since it is about connecting them. One idea that emerged was if there was a method to register additional fields (or a callback to get those additional fields) with the JSON formatter, then this could be implemented by the user as adding a callback to the JSON formatter that gets the OpenTelemetry trace ID and adds that as a field.

Proposal

Add something like:

tracing_subscriber::fmt()
        .json()
        .with_max_level(tracing::Level::TRACE)
        .with_current_span(false)
        .with_extra_fields(|span| span.context().span().span_context().trace_id())
        .init();

Specifically, the callback is given the tracing Span, and then is free to use methods from OpenTelemetry, such as context()

This seems like a useful extension mechanism for augmenting log information from spans (and perhaps with_current_span and with_span_list could even be built using that mechanism).

Alternatives

The other option would be to allow a layer to add fields to the current span for the purposes of logging. Then the OpenTelemetry layer could be registered before the log layer, and add the OpenTelemetry trace-id as a field. This might be easier to use (would happen automatically when the two layers were used together) but wouldn't allow for other use cases.

@excentris
Copy link

I'm using the same setup described in #1481 and facing a similar challenge, and an additional motivation to the one described in this issue. I would also like to have the possibility to add some generic fields to all tracing events (so they appear on both spans AND logs) like a "stage=acc/test/prod" for structured logging.
I'm currently exploring the possibility to write a custom layer with the necessary visitors to handle this, but I am pretty new to both rust and tracing and progress is slow...

@hobietje
Copy link

We are facing this as well. We use Datadog and are trying to connect our Datadog logs and traces with opentelemetry-rust.

From the Datadog docs: It’s necessary to manually patch your particular logging module or library with a processor that adds the aforementioned translated TraceId and SpanId as Log attributes marked dd.trace_id and dd.span_id, respectively.

@mojtab23
Copy link

@hobietje I had a similar issue, and tried a few ways to add it programmatically but it's hard and at the end you end up creating a new crate like tracing-bunyan-formatter. For now maybe you can use a tool like vector.dev to add or remove a field or change the log structure with Vector, And it's from Datadog so it might have a good integration with it.

@hawkw
Copy link
Member

hawkw commented Nov 18, 2021

I'm happy to approve a pull request to add this functionality to tracing-subscriber, if anyone is interested in implementing it.

@andrden
Copy link

andrden commented Dec 6, 2021

It's not that difficult writing a custom FormatEvent for this:

trace_id_format.rs.txt

let fmt_layer = tracing_subscriber::fmt::layer()
    //.json()
    .event_format(TraceIdFormat);

@bjchambers
Copy link
Author

That seems like a good work around. Gave that a try, although it seems like the trace ID is missing in cases where it has been set by a propagator or when its on a separate thread. Trying to see if I can figure anything specific out, but wanted to mention in case that triggered any ideas.

Any concerns around the performance of looking up the trace ID on each log? Would it be possible (and if so better) to have the opentelemtry have an extension that adds a field?

@andrden
Copy link

andrden commented Dec 7, 2021

trace_id_format.rs.txt
Yes, trace ID was missing for nested spans, this is a fixed version, using parent() to reach the root and find the trace ID there

@bjchambers
Copy link
Author

Have you tried this? I've gotten it to include some ID in the logs, but it doesn't match the ID being used in the actual traces reported via open telemetry. Not sure how to debug.

@RemiKalbe
Copy link

@bjchambers it seems to work for me, the trace id is the right one.

@bjchambers
Copy link
Author

@RemiKalbe interesting. I'm noticing that it gets an ID, but it doesn't map to any traces that are stored. I'm trying to figure out what is causing the problem. I'm using async and tonic, with the actual trace ID being propagated from the client to the server. So it seems like it could be a few different possibilities:

  1. Something about async (could see this since I think some of the tracing is thread-specific)
  2. Something about tonic (not sure how this could happen)
  3. Something about the propagation of the trace ID from the RPC call to the RPC handler?

Are you using any of those (so I can rule them out while digging into this)?

@RemiKalbe
Copy link

@bjchambers

  1. I'm using Tokio
  2. Also using tonic

Here is my code

let filter = EnvFilter::try_from_default_env()
    .or_else(|_| EnvFilter::try_new("info"))
    .unwrap();

let opentelemetry_endpoint = env::var("OPENTELEMETRY_ENDPOINT").expect("Missing env var 'OPENTELEMETRY_ENDPOINT''");

let exporter = opentelemetry_otlp::new_exporter()
    .tonic()
    .with_endpoint(opentelemetry_endpoint);

let config = opentelemetry::sdk::trace::config()
    .with_resource(opentelemetry::sdk::Resource::new(vec![
        opentelemetry::KeyValue::new("service.name", "media-server"),
        opentelemetry::KeyValue::new("service.namespace", "cloud-conference"),
]));

let tracer = opentelemetry_otlp::new_pipeline()
    .tracing()
    .with_exporter(exporter)
    .with_trace_config(config)
    .install_batch(opentelemetry::runtime::Tokio)
    .expect("Failed to create opentelemtry pipeline");

let telemetry = tracing_opentelemetry::layer().with_tracer(tracer);

let subscriber =                         
    tracing_subscriber::fmt::layer().json().with_current_span(true).event_format(TraceIdFormat);

tracing_subscriber::registry()
    .with(telemetry)
    .with(filter)
    .with(subscriber)
    .try_init()
    unwrap();

@bjchambers
Copy link
Author

@bjchambers

1. I'm using Tokio

2. Also using tonic

Here is my code

let filter = EnvFilter::try_from_default_env()
    .or_else(|_| EnvFilter::try_new("info"))
    .unwrap();

let opentelemetry_endpoint = env::var("OPENTELEMETRY_ENDPOINT").expect("Missing env var 'OPENTELEMETRY_ENDPOINT''");

let exporter = opentelemetry_otlp::new_exporter()
    .tonic()
    .with_endpoint(opentelemetry_endpoint);

let config = opentelemetry::sdk::trace::config()
    .with_resource(opentelemetry::sdk::Resource::new(vec![
        opentelemetry::KeyValue::new("service.name", "media-server"),
        opentelemetry::KeyValue::new("service.namespace", "cloud-conference"),
]));

let tracer = opentelemetry_otlp::new_pipeline()
    .tracing()
    .with_exporter(exporter)
    .with_trace_config(config)
    .install_batch(opentelemetry::runtime::Tokio)
    .expect("Failed to create opentelemtry pipeline");

let telemetry = tracing_opentelemetry::layer().with_tracer(tracer);

let subscriber =                         
    tracing_subscriber::fmt::layer().json().with_current_span(true).event_format(TraceIdFormat);

tracing_subscriber::registry()
    .with(telemetry)
    .with(filter)
    .with(subscriber)
    .try_init()
    unwrap();

Hmm. The only things I see different then (which I'll try to spend some time looking into):

  1. Possibly the use of tracing propagation to initialize the spans on the server.
  2. The use of json().with_current_span(true). I don't think this is the issue since I believe the event_format(TraceIdFormat) supersedes the JSON format configuration, but maybe it doesn't and somehow changes the behavior?

I'll try to look more deeply and see if I can post any more details.

@lfdominguez
Copy link

you tried this? I've gotten it to include some ID in the logs, but it doesn't match the ID being used in the actual tr

Hi I'm implementing with axum this tracing + log and i want to show the trace_id on logs... Used this code example, but this

image

on trace_id_format.rs dont get the current context... even when the opentracing_otlp is sending to the agent.

#[tokio::main]
async fn main() {
    if env::var("RUST_LOG").is_err() {
        env::set_var("RUST_LOG", "debug");
    }

    let opentelemetry_endpoint = env::var("OPENTELEMETRY_ENDPOINT").expect("Missing env var 'OPENTELEMETRY_ENDPOINT'");

    let exporter = opentelemetry_otlp::new_exporter()
      .tonic()
      .with_endpoint(opentelemetry_endpoint);

    let config = opentelemetry::sdk::trace::config()
      .with_resource(opentelemetry::sdk::Resource::new(vec![
          opentelemetry::KeyValue::new("service.name", "svc_a"),
          opentelemetry::KeyValue::new("service.namespace", "test"),
      ]));

    let tracer = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(exporter)
        .with_trace_config(config)
        .install_batch(opentelemetry::runtime::Tokio)
        .expect("Failed to create opentelemetry pipeline");

    let telemetry = tracing_opentelemetry::layer().with_tracer(tracer);
    let subscriber = tracing_subscriber::fmt::layer().json().with_current_span(true).event_format(TraceIdFormat);

    tracing_subscriber::registry()
        .with(telemetry)
        .with(tracing_subscriber::EnvFilter::new(
            env::var("RUST_LOG").expect("RUST_LOG env variable not set"),
        ))
        .with(subscriber)
        .try_init()
        .expect("Initialization of tracing failed");

    let app = Router::new()
        .layer(TraceLayer::new_for_http())
        .route("/greet", post(greet));

    let addr = SocketAddr::from(([0, 0, 0, 0], 3000));
    tracing::info!("listening on {}", addr);

    axum::Server::bind(&addr)
        .serve(app.into_make_service())
        .await
        .unwrap();

    global::shutdown_tracer_provider();
}

@nicmue
Copy link

nicmue commented May 25, 2022

You can't get the SpanBuilder directly but the OtelData from which you then can retrieve the SpanBuilder. With that you don't even need to recursively traverse the parent SpanRef. Here's a snippet how I solved it on my side:

use opentelemetry::trace::{SpanId, TraceContextExt};
use tracing::Subscriber;
use tracing_opentelemetry::OtelData;
use tracing_subscriber::registry::{LookupSpan, SpanRef};

pub struct TraceInfo {
    pub trace_id: String,
    pub span_id: String,
}

fn lookup_trace_info<S>(span_ref: &SpanRef<S>) -> Option<TraceInfo>
where
    S: Subscriber + for<'a> LookupSpan<'a>,
{
    span_ref.extensions().get::<OtelData>().map(|o| TraceInfo {
        trace_id: o.parent_cx.span().span_context().trace_id().to_string(),
        span_id: o.builder.span_id.unwrap_or(SpanId::INVALID).to_string(),
    })
}


//    ...

if let Some(ref span_ref) = ctx.lookup_current() {
    if let Some(trace_info) = lookup_trace_info(span_ref) {
        serializer.serialize_entry("span_id", &trace_info.span_id)?;
        serializer.serialize_entry("trace_id", &trace_info.trace_id)?;
    }
}

//   ...

@lfdominguez
Copy link

Works for span_id... but check how the trace_id is a lot of 000000... (this happen with the other loop method)
image

@nicmue
Copy link

nicmue commented May 25, 2022

In my case I have a remote trace_id which I feed into my spans with:

span.set_parent(opentelemetry::Context::new().with_remote_span_context(
    opentelemetry::trace::SpanContext::new(...),
));

With this the OtelData.parent_ctx.span().span_context() will return this said SpanContext with the remote trace_id. If you don't rely on remote trace ids you need to get the trace_id just like the span_id and do the traversal through the SpanRef.parent() till you find a valid trace_id.

However I need to admit I don't know how/if remote trace ids are fed into spans by the approach you used (TraceLayer::new_for_http()).

@lfdominguez
Copy link

In my case I have a remote trace_id which I feed into my spans with:

span.set_parent(opentelemetry::Context::new().with_remote_span_context(
    opentelemetry::trace::SpanContext::new(...),
));

With this the OtelData.parent_ctx.span().span_context() will return this said SpanContext with the remote trace_id. If you don't rely on remote trace ids you need to get the trace_id just like the span_id and do the traversal through the SpanRef.parent() till you find a valid trace_id.

However I need to admit I don't know how/if remote trace ids are fed into spans by the approach you used (TraceLayer::new_for_http()).

Thanks, do you have any public code where i can review how do you use that???

@nicmue
Copy link

nicmue commented May 25, 2022

Thanks, do you have any public code where i can review how do you use that???

No sorry that's extracted and adapted from a closed source project. However I can create a more detailed gist if your problem persists.

@lfdominguez
Copy link

Thanks, do you have any public code where i can review how do you use that???

No sorry that's extracted and adapted from a closed source project. However I can create a more detailed gist if your problem persists.

I would appreciate it.

@bjchambers
Copy link
Author

Ah-hah, I think this helped me get it working too. I based my code on #1531 (comment).

I'm using Tonic with a gRPC trace propagation. The part that I was missing was that the set_parent is setting the OTEL parent, not the tracing parent. So once you've found the "OTEL SPAN", you still need to ask for it's parent to get a trace ID. Not sure if it would also work using the SpanBuilder, but seems to work for me by going "tracing Span -> OtelData -> Otel parent -> trace ID".

@gunnigylfa
Copy link

gunnigylfa commented Feb 7, 2023

I am using this solution as well and it is working for me. Thank you. When I apply the custom event_format it stops being able to apply flatten_event. I would like to be able to pull the fields on the event onto the root json log structure. Is there a way to implement the flatten_event method for TraceIdFormat I am having a hard time figuring it out.

Edit:

Was able to get this working by converting the SerializeFieldMap into a serde_json::Value:

for (name, value) in serde_json::to_value(event.field_map())
    .unwrap()
    .as_object()
    .unwrap()
    .iter()
{
    serializer.serialize_entry(name, value)?;
}

I suspect nested values will need some special handle but this is sufficient for my use case for now.

@FarhadF
Copy link

FarhadF commented May 21, 2023

a gist or a complete example would be appreciated

@nicmue
Copy link

nicmue commented Jun 24, 2023

a gist or a complete example would be appreciated

I just remembered about this thread and created a small example project to showcase how we achieve a stable trace_id in our json logs: https://github.com/nicmue/tracing-stable-trace-id-example. Sorry for the delay...

@hseeberger
Copy link

But with this approach (fmt::layer().json().event_format(json::Json)) you are giving up on the original (from tracing_subscriber) Json and roll your own.

I'd prefer to use the original Json and just use the suggested with_extra_fields method.

Still interested in a PR?

@hseeberger
Copy link

@hawkw:

Still interested in a PR?

@hseeberger
Copy link

hseeberger commented Jul 23, 2023

I have a working prototype for only adding the OTel trace ID here: #2664.

That is used in this example app: hseeberger/hello-tracing-rs#19.

@jonkight
Copy link

FWIW, I tried a few of the methods in this issue and couldn't get any of them to work. My current setup is Axum + Tracing + OTEL crates. I have an OTEL collector set up locally that's exporting traces to Grafana Tempo and exporting logs (filelog) to Grafana Loki.

Code for my tracing setup looks something like below:

fn init_tracing() -> TracingGuards {
    let global_filter = tracing_subscriber::EnvFilter::try_from_default_env().unwrap_or_else(|_| {
        "my_crate=trace".into()
    });

    let tracer = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(opentelemetry_otlp::new_exporter().tonic().with_env())
        .install_batch(opentelemetry::runtime::Tokio)
        .expect("couldn't create OTLP tracer");
    let telemetry_layer = tracing_opentelemetry::layer().with_tracer(tracer);

    let file_appender = tracing_appender::rolling::daily("C:/logs", "events.log");
    let (file_appender, file_appender_guard) = tracing_appender::non_blocking(file_appender);
    let file_appender_layer = tracing_subscriber::fmt::layer()
        .json()
        .event_format(event_format::Json)
        .with_writer(file_appender);

    let stdout_layer = tracing_subscriber::fmt::layer()
        .compact()
        .with_writer(std::io::stdout);

    tracing_subscriber::registry()
        .with(global_filter)
        .with(telemetry_layer)
        .with(stdout_layer)
        .with(file_appender_layer)
        .init();

    TracingGuards {
        file_appender_guard,
    }
}

I found using the OtelData propagation method @nicmue mentioned would successfully push the trace_id and span_id into my logs, but then my OTEL traces would stop being sent to Grafana Tempo. I played around with it trying to get it to work, but had no success. I then tried the SpanBuilder method @andrden mentioned, but that didn't work either.

What did work was my altered version of their solutions, which seems less elegant and likely has problems, but appears to be correct with my testing:

    fn format_event(
        &self,
        _ctx: &FmtContext<'_, S, N>,
        mut writer: Writer<'_>,
        event: &Event<'_>,
    ) -> std::fmt::Result
    where
        S: Subscriber + for<'a> LookupSpan<'a>,
    {
        let meta = event.metadata();

        let mut visit = || {
            let mut serializer = serde_json::Serializer::new(WriteAdaptor::new(&mut writer));
            let mut serializer = serializer.serialize_map(None)?;
            serializer.serialize_entry("timestamp", &Utc::now().to_rfc3339())?;
            serializer.serialize_entry("level", &meta.level().as_serde())?;
            serializer.serialize_entry("fields", &event.field_map())?;
            serializer.serialize_entry("target", meta.target())?;

            let current_span = tracing::Span::current();
            let context = current_span.context();
            let span_ref = context.span();
            let span_context = span_ref.span_context();

            let trace_id = span_context.trace_id();
            if trace_id != TraceId::INVALID {
                serializer.serialize_entry("trace_id", &trace_id.to_string())?;

                let span_id = span_context.span_id();
                if span_id != SpanId::INVALID {
                    serializer.serialize_entry("span_id", &span_id.to_string())?;
                }
            }

            serializer.end()
        };

        visit().map_err(|_| std::fmt::Error)?;
        writeln!(writer)
    }

Regardless of this working, I'd also love to have a simpler way to alter the default Json formatter vs. creating my own, as the OP mentioned and @hseeberger created a PR for.

A collaboration with the OTEL team to have a straight forward method and example of propagating the ids to JSON logs would also be welcomed as I'm sure other people are naturally going to run into this problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.