Add easy support for Datadog and possibly other observability solutions #1433

petarlishov · 2022-08-10T09:35:58Z

Use case

I have played around with the Datadog and the AWS Powertools Lambda layers, and as someone who needs to integrate with Datadog, I find the Datadog Lambda layer a good choice for getting that integration set up flawlessly.

But I also really enjoy some of the features that AWS Powertools have incorporated into their Lambda layer and as a developer I find it to be a very useful tool. Last time I checked (a while ago), the AWS Powertools layer was more lightweight as well.

Sadly, because both layers use similar packages (boto3 for example among other things), I believe they are not exactly compatible with each other. And either way, adding both would make our Lambdas' start times much worse as both layers are not exactly light either.

I have one collection of Lambdas using the Datadog layer and another collection using Powertools. I have thus noticed some of the differences that make Powertools tricky to integrate with Datadog, despite it being the more useful tool purely from a developer's perspective:

The log format is slightly different and it does not allow Datadog to easily query logs by things like request ID. It would be nice to be able to specify a Datadog log format that we can easily use within Powertools without having to create a custom one ourselves. I believe compatibility can be easily achieved here and all we need is a special logger format that mimics the one used by Datadog.
Datadog does not ingest AWS CloudWatch metrics. I believe the easiest way to emit a custom metric that gets interpreted by Datadog is to send logs in this format: {"m": "Metric name", "v": "Metric value", "e": "Unix timestamp (seconds)", "t": "Array of tags"}. I know this is different from the embedded metric format which AWS already provides - https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-cloudwatch-launches-embedded-metric-format but maybe we can have a way to support both based on some configuration setting? Sadly, the metrics that Powertools provides at the moment are not easily ingestible into Datadog.
I have not tested it sufficiently, but if there is some incompatibility with the way Datadog handles traces, maybe someone else who has that knowledge can keep in mind that some adaptations would be useful there too? From what I can understand, there should be no problems there as traces are ingested straight from X-Ray, but I have sadly under-utilised that functionality.
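As a rough illustration of the metric log format quoted above, here is a minimal standard-library sketch. The helper name `datadog_metric_line` and the sample metric name and tags are hypothetical; only the four keys (`m`, `v`, `e`, `t`) come from the format described in this issue.

```python
import json
import time


def datadog_metric_line(name, value, tags=None, timestamp=None):
    """Serialize one custom metric as a single JSON log line.

    Hypothetical helper: the keys follow the format quoted in this issue
    (m = metric name, v = value, e = Unix seconds, t = list of tags).
    """
    payload = {
        "m": name,
        "v": value,
        # Fall back to "now" when the caller does not supply a timestamp.
        "e": int(timestamp if timestamp is not None else time.time()),
        "t": list(tags or []),
    }
    return json.dumps(payload)


# Printing to stdout places the line in the function's log stream.
print(datadog_metric_line("orders.created", 1, tags=["env:dev"], timestamp=1660124158))
```

Keeping the serializer as a pure function that returns a string makes it easy to unit test separately from whatever ends up writing the line to the logs.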

Solution/User Experience

Provide a configuration or an option to define the format when setting up loggers, metrics and traces, which would allow for better integration with observability solutions other than AWS' CloudWatch and X-Ray.

Alternative solutions

It may be possible to do all these already. Maybe a separate package that adds this compatibility can be created that works well with Powertools, as long as Powertools already has easy methods to manipulate its behaviour in the required ways to allow for the requested integration.

Acknowledgment

This feature request meets Lambda Powertools Tenets
Should this be considered in other Lambda Powertools languages? i.e. Java, TypeScript


boring-cyborg · 2022-08-10T09:35:59Z

Thanks for opening your first issue here! We'll come back to you as soon as we can.

heitorlessa · 2022-08-11T18:21:31Z

It is so rare to receive such a high quality feature request like this that I want us to take time to reply to you accordingly - please bear with us for an answer next week.

Until then, we're laying the groundwork in our E2E and integration test framework to give us confidence to offer what you're asking for -- either native support or exposing the mechanisms we already have for customers to build them.

For anyone else reading this, please please add your +1 to the author to help us prioritise it.

Thank you for taking the time to share such rich detail.

heitorlessa · 2022-08-31T16:16:43Z

Hey @petarlishov, our new (internal) E2E framework took a lot longer than I expected to refactor, so I'm replying tomorrow morning to address your questions and a few asks as we started v2 in parallel.

heitorlessa · 2022-09-01T14:03:02Z

I'll break down my answer in categories to make it easier to parse later.

Making Lambda Powertools more lightweight

Sadly, because both layers use similar packages (boto3 for example among other things)
And either way, adding both would make our Lambdas' start times much worse as both layers are not exactly light either.

I bet you'll be excited to hear that we've started working on v2 (minor breaking changes) to cut down the final package size to ~464K (compressed) 🎉

In v2, we are making all dependencies optional, e.g. boto3, AWS X-Ray SDK and fastjsonschema, bringing a >90% reduction. The Powertools feature set today works with the older boto3 version that Lambda provides at runtime (~7-8 months old), and if a feature requires a newer boto3 our docs will suggest how to upgrade it accordingly.

@rubenfonseca is leading V2. We'll create an RFC to discuss trade-offs of relying on Lambda runtime's packages, and how we're thinking of our Lambda Layer v2.

NOTE: Our current thinking for the Lambda Layer is to likely remove boto3 while bundling all optional dependencies to ease distribution for consumers - botocore alone is ~67M today (uncompressed).

Modularization is our medium-term game

This is an intermediate stage towards modularization in V3. That would need a major structural change but would allow customers to pick and choose what they need, going as far as a ~16K package size if one wants to. That however needs a ton of research and testing to make sure it's stable and maintainable - we plan to draft an RFC next year once we're comfortable with v2 outcomes.

Despite being a major version, we want to prevent disruptions as much as possible to our customers. We're working on our first upgrade guide, and for v3 we even have the ambition to create a linter plugin to help you upgrade faster.

Long-term, this will give us the structure we need to add support to non-Lambda runtimes like Fargate, Glue jobs, etc. We have some customers using it that way, but we haven't put a "it's supported" stamp on it yet. We could even expose our private integ/e2e testing utilities as a package for customers ;)

Datadog Log format

It would be nice to be able to specify a Datadog log format that we can easily use within Powertools without having to create a custom one ourselves. I believe compatibility can be easily achieved here and all we need is a special logger format that mimics the one used by Datadog

I'm not sure if you've tried, but Logger supports Bringing Your Own Logging Formatter without forgoing Logger features and UX. We recommend that option for customers looking to only change the final format without having to maintain a different Logger implementation altogether.
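To illustrate the general idea of swapping only the output format, here is a minimal sketch that uses just the standard library `logging` module. It is not the Powertools API: the class `JsonLineFormatter`, the helper `build_logger`, and field names such as `request_id` are assumptions made for illustration.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonLineFormatter(logging.Formatter):
    """Hypothetical formatter: renders each record as one JSON object."""

    def format(self, record):
        log = {
            # ISO 8601 in UTC; the exact timestamp shape a vendor wants is an assumption.
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(
                timespec="milliseconds"
            ),
            "level": record.levelname,
            "message": record.getMessage(),
            # Populated through `extra={"request_id": ...}` at the call site.
            "request_id": getattr(record, "request_id", None),
        }
        return json.dumps(log)


def build_logger(stream):
    """Attach the custom formatter without changing how callers log."""
    logger = logging.getLogger("byo_formatter_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.info("order created", extra={"request_id": "abc-123"})
print(buffer.getvalue().strip())
```

The call sites stay the same (`log.info(...)`); only the formatter decides the final shape, which is the property that makes a vendor-specific format cheap to maintain.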

I'm not fully aware of what Datadog expects from structured logs or why it has difficulty querying a JSON field. That said, we're more than happy to investigate any non-breaking change we could make on our side if this spans more than Datadog - feature request please!

In V3, we'll be able to create a providers package where we could use community help to get these quirks addressed without forcing everyone to receive a copy of a provider (e.g., Datadog) they don't use.

Datadog Metrics format

I believe the easiest way to do a custom metric that gets interpreted by Datadog is to send logs in this format {"m": "Metric name", "v": "Metric value", "e": "Unix timestamp (seconds)", "t": "Array of tags"}

We could make this more extensible quite easily - wanna create an RFC?

An RFC will help us agree on a contract for exposing a Metric Provider (a simple sink pattern), so that customers can use the same UX but have different outputs and validation mechanisms. As of now, we haven't invested much here, and this part of the code could be easily rearranged until we have a proper Provider - https://github.com/awslabs/aws-lambda-powertools-python/blob/develop/aws_lambda_powertools/metrics/base.py#L139
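A minimal sketch of what such a sink pattern could look like, under stated assumptions: the names `MetricsProvider`, `DatadogLogProvider`, and `PlainTextProvider` are hypothetical and do not reflect any contract agreed in this thread. The point is only that `add_metric` and `flush` stay the same while `serialize` varies per provider.

```python
import json
import time
from abc import ABC, abstractmethod


class MetricsProvider(ABC):
    """Hypothetical contract: buffer metrics, then serialize and flush them."""

    def __init__(self):
        self._metrics = []

    def add_metric(self, name, value, **tags):
        self._metrics.append({"name": name, "value": value, "tags": tags})

    @abstractmethod
    def serialize(self, metric, timestamp):
        """Turn one buffered metric into the provider's output line."""

    def flush(self, timestamp=None):
        ts = int(timestamp if timestamp is not None else time.time())
        lines = [self.serialize(metric, ts) for metric in self._metrics]
        self._metrics.clear()
        for line in lines:
            print(line)  # stdout ends up in the function's logs
        return lines


class DatadogLogProvider(MetricsProvider):
    """Emits the m/v/e/t log format quoted earlier in this issue."""

    def serialize(self, metric, timestamp):
        tags = [f"{key}:{value}" for key, value in metric["tags"].items()]
        return json.dumps({"m": metric["name"], "v": metric["value"], "e": timestamp, "t": tags})


class PlainTextProvider(MetricsProvider):
    """A second, made-up sink to show that only serialization changes."""

    def serialize(self, metric, timestamp):
        return f"{timestamp} {metric['name']}={metric['value']}"


for provider in (DatadogLogProvider(), PlainTextProvider()):
    provider.add_metric("orders.created", 1, env="dev")
    provider.flush(timestamp=1660124158)
```

Validation (for example limits on metric names or tag counts) would naturally live in each provider's `add_metric` or `serialize`, which is one reason the contract is worth agreeing on up front.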

Datadog Trace compatibility

I have not tested it sufficiently, but if there is some incompatibility with the way Datadog handles traces, maybe someone else that has that knowledge can keep in mind that some adaptations would be useful there too?

Tracing has an undocumented BaseProvider for that purpose, but we haven't been able to put more thought into it. Now that we're making the X-Ray SDK optional in v2, this becomes a more interesting conversation to have; otherwise we'd be forcing customers to install the X-Ray SDK even when they use a Datadog provider.

The hardest part in Tracing is patching modules and nomenclature (e.g., segment/span, patching only X but not Y lib) --- wanna create an RFC for what minimum feature set the BaseProvider should support?
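As a starting point for that discussion, here is a deliberately small sketch of what a minimum feature set might contain. It is not the existing Powertools BaseProvider: the names `TracerProvider`, `InMemoryProvider`, `start_span`, `end_span`, `put_annotation`, and `in_span` are all hypothetical, and module patching is left out entirely.

```python
import time
from abc import ABC, abstractmethod
from contextlib import contextmanager


class TracerProvider(ABC):
    """Hypothetical minimum contract: open a span, annotate it, close it."""

    @abstractmethod
    def start_span(self, name):
        """Create and return a provider-specific span (or segment)."""

    @abstractmethod
    def end_span(self, span):
        """Close the span and hand it to the provider's backend."""

    @abstractmethod
    def put_annotation(self, span, key, value):
        """Attach a key/value pair to the span."""

    @contextmanager
    def in_span(self, name):
        # Shared behaviour: always close the span, and record errors on the way out.
        span = self.start_span(name)
        try:
            yield span
        except Exception as exc:
            self.put_annotation(span, "error", repr(exc))
            raise
        finally:
            self.end_span(span)


class InMemoryProvider(TracerProvider):
    """Toy backend that keeps finished spans in a list, useful for tests."""

    def __init__(self):
        self.finished = []

    def start_span(self, name):
        return {"name": name, "start": time.time(), "annotations": {}}

    def end_span(self, span):
        span["end"] = time.time()
        self.finished.append(span)

    def put_annotation(self, span, key, value):
        span["annotations"][key] = value


provider = InMemoryProvider()
with provider.in_span("handler") as span:
    provider.put_annotation(span, "order_id", "42")
print(provider.finished[0]["name"], provider.finished[0]["annotations"])
```

Whether the abstraction is called a segment or a span, and how far patching of third-party libraries should go, are exactly the open questions an RFC would need to settle; this sketch takes no position on them.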

I initially wanted to have a drop-in replacement Tracing Provider, but after digging into 3-4 tracing providers' libs I saw how much custom logic each provider has and I became less sure of it - an RFC can help us get there ;)

We also looked into OpenTelemetry but the cold start was too significant, and it was a moving target in terms of changes too. I think exposing our BaseProvider is a good first step.

Overall

We're heading in that direction, but we'd love help from the community in defining a good contract for Providers (Tracing already has one).

Right now, our main focus is on operational excellence (E2E test) to ensure V2 can be smooth sailing, and pave the road for our future modularization story. We'll continue to respond to feature requests and greatly appreciate any help we can get - we can't wait to create new utilities and new extensibility mechanisms, but first we need confidence large changes can be made ;)

Once again, thank you for creating such a comprehensive issue. Issues like this make me personally happy: they show that we have a lot to do, but also that we've got a community that cares ;)

Hopefully that answers your questions and remarks, please let us know otherwise!

PS: Join us on Discord, we'd love to have you if you aren't there already.

cc @rubenfonseca @leandrodamascena @mploski @am29d @saragerion @sliedig

heitorlessa · 2023-03-07T09:46:49Z

Update: We estimate one RFC per feature (Tracer, Metrics, Logger) starting in mid-April/early May. This will help focus the discussion on a standard interface to help customers bring their own provider.

Since we launched V2, the only difference for Tracer is that we'd stick with AWS X-Ray SDK provider as the default, while providing a built-in provider for OpenTelemetry - other 3rd party providers (e.g., Datadog, Lumigo, NewRelic, etc.) would be owned by them where we'd be happy to collaborate/coordinate.

Thank you all!!

leandrodamascena · 2023-04-11T22:46:18Z

I'm removing the help wanted and need-customer-feedback labels because we already have work in progress, so anyone can comment on these RFCs.

heitorlessa · 2023-05-01T13:06:29Z

UPDATE: We're adding support in Logger for Datadog as a start. We're working on a POC for Metrics, and adding last refinements for Tracer Providers.

For Logger, Datadog was the only provider that required a custom timestamp, so we've added a Formatter and documented our recommendation to use Lambda Extensions to avoid impacting performance.

The only reason we're not adding OTel Log output now is because it's not Final yet - please feel free to open a feature request when that happens (whoever is reading and needs that).

heitorlessa · 2023-08-04T09:13:53Z

Quick update: Datadog made the bugfix release we needed, we're tentatively looking to launch Datadog Metrics Provider by the end of next week ;)

Then we move on to adding Observability Provider to Tracer (rinse and repeat).

leandrodamascena · 2023-08-14T19:15:04Z

Hey everyone! We are very happy to announce that we have merged PR (#2906) to add support for the Datadog Metrics provider and create a way for adding more observability providers. We are planning a release for this week.

For now we are going to work on adding external observability providers to Tracer. As soon as we have news I'll update this issue.

Thank you to everyone who is working hard to make Powertools for AWS Lambda (Python) even better.

heitorlessa · 2024-02-26T13:47:14Z

Quick update -- @roger-zhangg is working on the last feature: Observability Provider for Tracer. Once that's done, we'll close this issue, and start investigating an alternative solution for OTel as cold starts haven't significantly improved.

petarlishov added the feature-request and triage labels on Aug 10, 2022

saragerion mentioned this issue Aug 11, 2022

Feature Request - add support to other observability providers (Support New Relic) aws-powertools/powertools-lambda-typescript#646

Open

heitorlessa added the need-customer-feedback and help wanted labels and removed the triage label on Sep 1, 2022

heitorlessa pinned this issue Sep 9, 2022

heitorlessa mentioned this issue Sep 28, 2022

Integrate with AWS Distro for OpenTelemetry aws-powertools/powertools-lambda#1

Open

heitorlessa mentioned this issue Oct 19, 2022

RFC: Lambda Powertools for Python v2 #1459

Closed

8 tasks

sthulb unpinned this issue Jan 5, 2023

sthulb pinned this issue Jan 5, 2023

heitorlessa mentioned this issue Jan 31, 2023

RFC: Sensitive data masking utility #1858

Closed

2 tasks

seshubaws mentioned this issue Mar 15, 2023

RFC: Support for external observability providers - Logging #2014

Closed

2 tasks

roger-zhangg mentioned this issue Mar 15, 2023

RFC: Support for external observability providers - Metrics #2015

Closed

2 tasks

Vandita2020 mentioned this issue Mar 21, 2023

RFC: Support for external observability providers - Tracer #2030

Open

2 tasks

leandrodamascena added the logger, metrics, and tracer labels and removed the help wanted and need-customer-feedback labels on Apr 11, 2023

leandrodamascena assigned heitorlessa Apr 28, 2023

This was linked to pull requests May 3, 2023

feat(metrics): support to bring your own metrics provider #2194

Merged

feat(logger): add DatadogLogFormatter and observability provider #2183

Merged

heitorlessa added this to the Observability Provider milestone Nov 13, 2023

heitorlessa assigned roger-zhangg and unassigned leandrodamascena Feb 26, 2024

This was referenced Apr 5, 2024

Roadmap update reminder - April #4076

Closed

Roadmap update reminder - April #4077

Closed

This was referenced Apr 10, 2024

CHANGELOG.* not picked up by @dependabot #35

Closed

[Test] Roadmap update reminder - April #4104

Closed

heitorlessa mentioned this issue Apr 18, 2024

Roadmap update reminder - undefined #4155

Closed

This was referenced Apr 29, 2024

Roadmap update reminder - April aws-powertools/actions#2

Closed

Roadmap update reminder - Python aws-powertools/actions#3

Open

Roadmap update reminder - May #4251

Closed

Roadmap update reminder - .NET aws-powertools/actions#11

Open

dreamorosi mentioned this issue May 7, 2024

Roadmap update reminder - May aws-powertools/actions#23

Open
