
RFC: Canonical Wide Log utility for emitting a single log entry per lambda execution or processed transaction (batch) #3878

Open
walmsles opened this issue Mar 3, 2024 · 2 comments
Labels
need-customer-feedback (requires more customer feedback before making or revisiting a decision), RFC

Comments

@walmsles
Contributor

walmsles commented Mar 3, 2024

Is this related to an existing feature request or issue?

No response

Which Powertools for AWS Lambda (Python) utility does this relate to?

Logger

Summary

The logger is great and offers a heap of functionality. It is very useful for developer logging - debug, info, and error logging - but I don't think it's great for observability logging. Some ideas have been circulating in the observability community around canonical logging and, more recently, wide events. The idea of a single log entry per Lambda function invocation holding ALL the meta-data for what occurred during that execution is a very useful concept, and something I have used for tracking, tracing, and observing transactions across a distributed system.

Use case

I want to be able to gather meta-data during transaction processing and store it somewhere so I can write it all out in a single place, representing everything that happened to the transaction as it was being processed by my code. I want to be able to add to this meta-data from shared-libraries and want to enable developers to easily throw meta-data into the bucket for collection when transaction processing is done.

Log attributes do this to a degree already, but they are focused on applying the meta-data to every log output, not just a single one at the end of processing.

Proposal

Existing capability:

from aws_lambda_powertools import Logger
from aws_lambda_powertools.utilities.typing import LambdaContext

logger = Logger()


@logger.inject_lambda_context
def lambda_handler(event: dict, context: LambdaContext) -> str:
    logger.append_keys(customer=event["customer_id"])

    try:
        logger.info({"operation": "collect_payment", "charge_id": event["charge_id"]})

        # do more processing

    finally:
        logger.info("WIDE")  # log WIDE log entry along with collected attributes from 'append_keys'

    return "hello world"

Using a Wide Logging utility:

from aws_lambda_powertools import Logger
from aws_lambda_powertools.utilities.typing import LambdaContext

logger = Logger()


@logger.inject_lambda_context
@logger.wide_logger()
def lambda_handler(event: dict, context: LambdaContext) -> str:
    logger.wide.add(customer=event["customer_id"])

    logger.debug({"operation": "collect_payment", "charge_id": event["charge_id"]})
    logger.wide.add(charge_id=event.get("charge_id"))
    logger.wide.add(operation="collect_payment")

    # do more processing

    return "hello world"

Using middleware to emit the wide log makes it a set-and-forget operation, which adds value for developers who are already overloaded.
Being able to add meta-data as transaction processing happens encourages thoughtful use of meta-data attributes.
Collating them in a central object that is emitted at the end keeps the meta-data you collect for observability separate from the attributes you collate for debug logging - two very distinct purposes - and separating those concerns encourages clearer thinking about observability versus debugging.
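
To make this concrete, here is a minimal sketch of what the "bucket plus middleware" idea could look like. Everything here (WideLog, wide_logger, logger.wide) is a hypothetical name used for illustration only; none of it exists in Powertools today:

from typing import Any, Callable, Dict

from aws_lambda_powertools import Logger


class WideLog:
    """Hypothetical attribute bucket: collected during execution, emitted once."""

    def __init__(self, logger: Logger):
        self._logger = logger
        self._attributes: Dict[str, Any] = {}

    def add(self, **kwargs: Any) -> None:
        # Collect meta-data; nothing is written to the log output yet
        self._attributes.update(kwargs)

    def flush(self) -> None:
        # Emit a single wide entry holding everything collected so far
        self._logger.info({"log_type": "WIDE", **self._attributes})
        self._attributes.clear()


def wide_logger(logger: Logger) -> Callable:
    """Hypothetical middleware: flush the wide entry when the handler exits, even on error."""
    wide = WideLog(logger)
    logger.wide = wide  # attach the bucket so handler code can call logger.wide.add(...)

    def decorator(handler: Callable) -> Callable:
        def wrapper(event, context):
            try:
                return handler(event, context)
            finally:
                wide.flush()  # set and forget: one wide entry per invocation

        return wrapper

    return decorator

Whether the bucket hangs off the Logger instance or is a standalone object is an open design question; the key property is that nothing is written until the wrapper's finally block runs.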

Out of scope

Nothing.

Potential challenges

Wide logging needs to be considered at the transaction level, so it is not always a single per-function-execution statement: functions that process data through a batch mechanism - SQS, SNS, or a stream (DynamoDB, Kafka, Kinesis, etc.) - handle several transactions per invocation. The middleware would need to apply to both batch record handlers and Lambda handlers (see the sketch below).
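
As an illustration of the batch case, a hypothetical per-record flush could sit inside the record handler rather than around the whole invocation. The BatchProcessor/process_partial_response pieces below are real Powertools APIs; logger.wide is still the hypothetical bucket from the proposal:

from aws_lambda_powertools import Logger
from aws_lambda_powertools.utilities.batch import BatchProcessor, EventType, process_partial_response
from aws_lambda_powertools.utilities.data_classes.sqs_event import SQSRecord
from aws_lambda_powertools.utilities.typing import LambdaContext

logger = Logger()
processor = BatchProcessor(event_type=EventType.SQS)


def record_handler(record: SQSRecord):
    # Hypothetical: one wide entry per processed record (transaction), not per invocation
    logger.wide.add(message_id=record.message_id)
    try:
        # process the record
        logger.wide.add(outcome="success")
    finally:
        logger.wide.flush()  # emit a single wide entry for this transaction


@logger.inject_lambda_context
def lambda_handler(event: dict, context: LambdaContext):
    return process_partial_response(
        event=event, record_handler=record_handler, processor=processor, context=context
    )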

Dependencies and Integrations

This should be a natural extension of the Logger utility - a slightly different function from plain logging. I liken the idea to the Metrics implementation, where meta-data is collated during code execution and output once.
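
For comparison, the existing Metrics utility already follows this gather-then-flush-once pattern (these are real Powertools APIs; the namespace and metric names are just example values). A wide-log bucket would do the analogous thing for meta-data:

from aws_lambda_powertools import Metrics
from aws_lambda_powertools.metrics import MetricUnit

metrics = Metrics(namespace="Payments", service="checkout")


@metrics.log_metrics  # everything collected below is flushed as a single EMF blob when the handler returns
def lambda_handler(event: dict, context) -> str:
    metrics.add_metric(name="PaymentCollected", unit=MetricUnit.Count, value=1)
    metrics.add_metadata(key="charge_id", value=event["charge_id"])
    return "hello world"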

Alternative solutions

No response

Acknowledgment

@walmsles added the RFC and triage (pending triage from maintainers) labels Mar 3, 2024
@walmsles walmsles changed the title RFC: Canonical Wide Log utility for emitting a single log entry per lambda execution. RFC: Canonical Wide Log utility for emitting a single log entry per lambda execution or processed transaction (batch) Mar 3, 2024
@leandrodamascena
Contributor

Hi, @walmsles! Thank you for opening this RFC and bringing an updated take on canonical logs to the table. Over the last few months I've seen this topic (and WIDE logs) raised again in many forums, and seen how it specialises observability into a telemetry and transaction path, which differentiates it from general-purpose observability logging.

This is the first interaction in this RFC, so I have some questions/thoughts to share with you to see if they make sense.

1 - If I got it correctly, the idea of this RFC is to create a middleware to add the ability to create WIDE logs per transaction/lambda execution. Could you confirm if the idea here is to add support for WIDE logs, canonical format logs, or both?

2 - You mention creating a middleware to not only provide WIDE logging per handler execution but also propagate this logger instance across batch/idempotence/event handler executions. I have some difficulty seeing how we can share this instance between all the utilities, perhaps using a decorator on all the functions you want to use this WIDE logger instance?

3 - What is the log level of this WIDE log? INFO by default? Can customers change this at runtime? Should it inherit the log level of the Logger() instance? In your example you are using DEBUG and WIDE (INFO?) with the same Logger instance, and I imagine the middleware respects the log level configured at the constructor level.

4 - We have some parameters at the Logger constructor level and we also have some standalone functions, such as append_keys. In this WIDE logs middleware, should we consider and respect everything that was defined in the Logger() instance? For example: if the customer adds keys via append_keys or sets a correlation ID via set_correlation_id, should we propagate this to the middleware, or not? This could result in duplicate or unwanted keys in the WIDE logs, because a customer can call append_keys anywhere, for example.

5 - When will we flush the WIDE log to output? Right after the Lambda execution has completely finished? And if the customer is processing Idempotency and Batch Processing in the same Lambda execution, wouldn't this cause some confusion between the data the customer wants in the WIDE logs?

For now, these are my thoughts on this RFC. I'm very excited to see the community asking us to add new features that will help others.
I believe that after we evolve a little in this RFC and perhaps a PoC, we can schedule a conversation to move forward with this implementation.

Thanks as always for all your help @walmsles! ❤️

@leandrodamascena added the need-customer-feedback label and removed the triage label Mar 8, 2024
@walmsles
Contributor Author

walmsles commented Mar 10, 2024

1 - If I got it correctly, the idea of this RFC is to create a middleware to add the ability to create WIDE logs per transaction/lambda execution. Could you confirm if the idea here is to add support for WIDE logs, canonical format logs, or both?

Conceptually, Canonical and Wide logs are similar in that they are written once to your log output; I would say they are actually the same thing. Names are important, which is why I personally like “Wide Logs” - it says what it is. Nobody understands what a Canonical log should be, IMO 🙃. Let’s agree they are the same thing and have the same logical functionality. From a Powertools perspective we should choose the name we will stay with and that most customers understand.

2 - You mention creating a middleware to not only provide WIDE logging per handler execution but also propagate this logger instance across batch/idempotence/event handler executions. I have some difficulty seeing how we can share this instance between all the utilities, perhaps using a decorator on all the functions you want to use this WIDE logger instance?

When I think about using this kind of logging, I am generally looking to write a single log entry per transaction that gets processed. This is the complexity I was raising: to provide the utility such that it is usable in all of the following contexts:

  • Whole Lambda execution where a single API transaction is processed, e.g. Event Handler
  • In a batch setting (BatchProcessor)
  • Any others I have forgotten?

I'm not really thinking of implementation here, but highlighting the contexts where I feel it should be used.

3 - What is the log level of this WIDE log? INFO by default? Customers can change this at runtime? This should inherit the same log level of the Logger() instance? In your example, you are using DEBUG and WIDE (INFO?) with the same Logger instance and I can imagine the middleware is respecting the log level configured at the constructor level.

I have a TypeScript implementation and I output it as “INFO” with a “WIDE” tag for filtering. I have been thinking more about whether it should be a different log level altogether in case INFO is not turned on - I feel WIDE or Canonical logs are ones you always want to write out, so we may need to think more about this from a level perspective: are they logs at all, and what does that mean from an AWS Lambda perspective?
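
One way to explore the "different level altogether" idea is to reserve a dedicated level with the standard logging module, so wide entries survive filtering even when INFO is disabled. This is just a stdlib sketch, independent of how Powertools would actually expose it:

import logging

# Register a dedicated "WIDE" level above CRITICAL (50) so conventional
# level filtering never suppresses it; the numeric value 55 is arbitrary
WIDE = 55
logging.addLevelName(WIDE, "WIDE")

logger = logging.getLogger("wide-demo")
logger.setLevel(logging.WARNING)  # INFO is disabled here...

# ...but the wide entry is still emitted; extra keys attach to the LogRecord
# and a JSON formatter would render them alongside the message
logger.log(WIDE, "wide entry", extra={"customer": "42", "operation": "collect_payment"})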

4 - We have some parameters at the Logger constructor level and we also have some standalone functions, such as append_keys for example. In this WIDE logs middleware, should we consider and respect everything that was defined in the Logger() instance? For example: if the customer adds keys via append_keys or set correlationid via set_correlation_id, should we propagate this to this Middleware? We should not? This can result in duplicate or unwanted keys in the WIDE logs because a customer can call append_keys anywhere, for example.

My thought here is that the Wide (or Canonical) log functionality should operate from a separate attribute “bucket” that gathers wide-log data during processing. This is how my TypeScript implementation works; I liken it to the Metrics functionality, which enables gathering of data and writes it out once.

5 - When will we flush the WIDE log to output? Right after the Lambda execution has completely finished? And if the customer is processing Idempotency and Batch Processing in the same Lambda execution, wouldn't this cause some confusion between the data the customer wants in the WIDE logs?

Let’s tackle this one in pieces - there is more than one question here. I will focus on the whys and what I would like, not the HOW - that needs to come later, after we agree on the contexts and function.

When will we flush the WIDE log output?
Wide log output should be flushed logically at the end of the Lambda execution. This makes the most sense and is how my TypeScript implementation works via Middy middleware.

If the customer is processing in batch?
Refer back to the context for the functionality - I agree this is a complexity. For me I would want a WIDE log entry per transaction. I would also want a mix of attributes which are common to each item in a batch as well as attributes that are ONLY for a single item.
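
A trivial sketch of that mix, merging batch-wide attributes with per-item ones so each transaction still gets exactly one wide entry (all names and values are illustrative):

from aws_lambda_powertools import Logger

logger = Logger()

# Attributes common to every item in the batch
batch_common = {"source_queue": "payments-queue", "batch_size": 2}

# Illustrative records; in practice these come from the batch event
records = [{"messageId": "a1", "charge_id": "ch_1"}, {"messageId": "b2", "charge_id": "ch_2"}]

for record in records:
    # One wide entry per transaction: common attributes plus item-only ones
    logger.info({"log_type": "WIDE", **batch_common, "message_id": record["messageId"], "charge_id": record["charge_id"]})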

If the customer is using Idempotency Utility?
This is another complexity: if the request being processed is a duplicate identified by Idempotency, you would not want a WIDE log entry output, since it has already been processed. If it is not a duplicate transaction, then you would want Wide logging to occur as normal.

At the end of the day, we should provide a utility that customers can use as they need and desire, regardless of the context of operation. The functionality should be consistent and simple in each use case so customers can choose how to use it.
