Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Usability concerns regarding logging enrichment and tags collision #4947

Open
NatMarchand opened this issue Feb 15, 2024 · 12 comments
Open

Usability concerns regarding logging enrichment and tags collision #4947

NatMarchand opened this issue Feb 15, 2024 · 12 comments

Comments

@NatMarchand
Copy link

Hi!
I'm currently trying log enrichment on an app but I think there's a problem of usability.
In the simple case where I have an AspNet server endpoint which makes a HttpClient request however all the tags collide for example http.request.header, http.response.header. In my logs I don't know if the header comes from the incoming request or outgoing request.
I do understand that there are OTEL conventions however they seem to apply only for tracing not logs.

@geeknoid
Copy link
Member

geeknoid commented Mar 7, 2024

@klauco @xakep139 Any insights to provide here?

@xakep139
Copy link
Member

xakep139 commented Mar 8, 2024

Hello @NatMarchand, thank you for reaching out. I agree that some tags share same names in HttpClient and HTTP logging components, but you can easily distinguish them using:

  • log category:
    • incoming logs will be emitted with Microsoft.AspNetCore.HttpLogging.HttpLoggingMiddleware category
    • outgoing logs will have Microsoft.Extensions.Http.Logging.HttpClientLogger category
  • log message:
    • incoming logs will start with Request and Response: message
    • outgoing logs will have format {httpMethod} {httpServer}/{HttpPath}, e.g. GET localhost/api/users
  • event id:
    • incoming logs have "EventId": 9
    • outgoing logs have "EventId": 1
  • other tags (e.g. incoming HTTP logs have Scheme and Protocol tags emitted by default, outgoing HTTP logs don't emit those)

Additionally, you can also add your custom enricher (one that implements IHttpLogEnricher or IHttpClientLogEnricher) to add other tags that will help you distinguish the logs.

@klauco any other thoughts here?

@NatMarchand
Copy link
Author

NatMarchand commented Mar 21, 2024

When adding AddRequestHeadersLogEnricher(), an enricher is registered and then all the logs in the scope of the incoming http request will get the headers of the incoming request as tags, including logs from the outgoing call. I can show you on a small repro.

image
In the image above, the log is from the category Microsoft.Extensions.Http.Logging.HttpClientLogger (as you said). The http.response.status_code property is the status code from the outgoing call. The http.request.header.user-agent is the user agent from the incoming request. It's hard to know whether the http.request.* properties are coming from the outgoing call or the incoming request scope.

@xakep139
Copy link
Member

xakep139 commented Apr 2, 2024

@NatMarchand, I see where the confusion is coming from.
If log data you shared came from HttpClientLogger - that means that it contains data about an outgoing call only. In the log tags http.request.* tags correspond to the HttpRequestMessage that was sent by an HttpClient. Respectively, all http.response.* tags show the data dedicated to the HttpResponseMessage that was received by the HttpClient.
Does that explain you concerns?

@NatMarchand
Copy link
Author

If log data you shared came from HttpClientLogger - that means that it contains data about an outgoing call only.

That's not totally true. I've pinpointed the problem:

builder.Services
    .AddRequestHeadersLogEnricher(options =>
    {
#pragma warning disable EXTEXP0003
        options.HeadersDataClasses["User-Agent"] = DataClassification.None;
#pragma warning restore EXTEXP0003
    });

Issue the following log (with otel console exporter):

info: Microsoft.Extensions.Http.Logging.HttpClientLogger[1]
      GET bing.com/REDACTED
LogRecord.Timestamp:               2024-04-02T09:59:06.1827531Z
LogRecord.TraceId:                 0c5ba511208bc2d193a8f0197a394354
LogRecord.SpanId:                  eb8c401d9dfc55be
LogRecord.TraceFlags:              None
LogRecord.CategoryName:            Microsoft.Extensions.Http.Logging.HttpClientLogger
LogRecord.Severity:                Info
LogRecord.SeverityText:            Information
LogRecord.FormattedMessage:        GET bing.com/REDACTED
LogRecord.Body:                    GET bing.com/REDACTED
LogRecord.Attributes (Key:Value):
    http.request.method: GET
    server.address: bing.com
    url.path: REDACTED
    Duration: 487
    http.response.status_code: 200
    http.request.header.user-agent: Test/1.0
    http.request.header.user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36 Edg/123.0.0.0
LogRecord.EventId:                 1
LogRecord.EventName:               OutgoingRequest
LogRecord.ScopeValues (Key:Value):
[Scope.0]:SpanId: eb8c401d9dfc55be
[Scope.0]:TraceId: 0c5ba511208bc2d193a8f0197a394354
[Scope.0]:ParentId: 0000000000000000
[Scope.1]:ConnectionId: 0HN2IVE1H3S44
[Scope.2]:RequestId: 0HN2IVE1H3S44:00000005
[Scope.2]:RequestPath: /bing

Resource associated with LogRecord:
telemetry.sdk.name: opentelemetry
telemetry.sdk.language: dotnet
telemetry.sdk.version: 1.7.0
service.name: unknown_service:ReproTelemetry

See the http.request.header.user-agent added twice, one from my browser calling the server and the other from the httpclient outgoing. When using otlp exporter, only the last "survive"

Repro code
using Microsoft.Extensions.Compliance.Classification;
using Microsoft.Extensions.Compliance.Redaction;
using OpenTelemetry.Logs;

var builder = WebApplication.CreateBuilder(args);
builder.Logging
    .AddOpenTelemetry(otelBuilder =>
    {
        otelBuilder.AddConsoleExporter();
        otelBuilder.IncludeFormattedMessage = true;
        otelBuilder.IncludeScopes = true;
        otelBuilder.ParseStateValues = true;
    })
    .EnableEnrichment();
builder.Services
    .AddRequestHeadersLogEnricher(options =>
    {
#pragma warning disable EXTEXP0003
        options.HeadersDataClasses["User-Agent"] = DataClassification.None;
#pragma warning restore EXTEXP0003
    });
builder.Services
    .AddHttpClient("Bing", o =>
    {
        o.BaseAddress = new("https://bing.com");
        o.DefaultRequestHeaders.UserAgent.Add(new("Test", "1.0"));
    });
builder.Services
    .AddExtendedHttpClientLogging(o =>
    {
        o.RequestHeadersDataClasses.Add("User-Agent", DataClassification.None);
    });
builder.Services.AddSingleton<IRedactorProvider, NullRedactorProvider>();
var app = builder.Build();
app.MapGet("/", () => Results.Ok("hi"));
app.MapGet("/bing", (IHttpClientFactory httpClientFactory) => httpClientFactory.CreateClient("Bing").GetAsync(""));
app.Run();

@KalleOlaviNiemitalo
Copy link

What is the expected behaviour -- completely omit the incoming HTTP request headers from the outgoing HTTP request logging, or include them with different names?

@NatMarchand
Copy link
Author

If I had a magic wand, as I want to have both information, I'd make it on different prefix :
http.request.* for incoming request attributes
http_client.* or http.outgoing_request.* or something else for the outgoing request

@xakep139
Copy link
Member

xakep139 commented Apr 2, 2024

Okay, that happens because you added RequestHeadersLogEnricher - and it will augment all your logs emitted within the processing time of an incoming request, not only HTTP- or HttpClient-specific ones.
Do you really need User-Agent to be added to all these logs (even coming from Microsoft.AspNetCore.Routing.EndpointMiddleware category)?

With regards to the magic wand, you can always create your custom processor that derives from BaseProcessor<LogRecord> and rename/remove any attributes you want.

@xakep139
Copy link
Member

xakep139 commented Apr 2, 2024

Alternatively, you can use AddHttpLoggingRedaction() instead of AddRequestHeadersLogEnricher() - that way you'll get one nice log record per each incoming HTTP request without any ambiguity.

@NatMarchand
Copy link
Author

Do you really need User-Agent to be added to all these logs (even coming from Microsoft.AspNetCore.Routing.EndpointMiddleware category)?

Yes, it's what we find really interesting with enrichment: we have some headers that allow us to troubleshoot or correlate (we use UserAgent to know which of our service make the call or the cloudflare RayId). That avoid us to make a first call to find the TraceId from them and then filter on this (these) TraceId.

With regards to the magic wand, you can always create your custom processor that derives from BaseProcessor and rename/remove any attributes you want.

I'm not sure if in the BaseProcessor I'm able to determine if such attribute comes from the enrichment of the request and delete it or from the outgoing call and keep it.
In addition, we already have a BaseProcessor we'd like to get rid of which was added to workaround a bug between Compliance package and OTLP batch exporter (see open-telemetry/opentelemetry-dotnet#5276)

Alternatively, you can use AddHttpLoggingRedaction() instead of AddRequestHeadersLogEnricher() - that way you'll get one nice log record per each incoming HTTP request without any ambiguity.

See my first point, what we found really interesting is to filter logs against headers without join.

@xakep139
Copy link
Member

xakep139 commented Apr 4, 2024

I would encourage you to rely on TraceId to correlate logs for a particular request.
Regardless, @lmolkova are there any plans in terms of adding HTTP-specific tags in SemConv for logs?

@lmolkova
Copy link

lmolkova commented Apr 4, 2024

All OTel attributes can be used on traces, metrics, or logs, we don't differentiate attributes by signal.
We don't plan to add log-specific conventions and it would be hard to justify or push anyone to implement given all the prior art in the industry.

Here it seems we have an interesting issue - when enriching logs the new attributes clash with existing ones - I guess the enricher should be responsible for the conflict resolution and/or use dedicated namespace to add attributes to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants