
Enabling profiling at runtime #17583

Open · wants to merge 14 commits into master from allow-profiling-integrations-at-runtime

Conversation

@GustavoCaso (Member) commented May 16, 2024

What does this PR do?

During the Agent innovation week, @iglendd, @coignetp, and I are working on enabling continuous profiling for the Agent and the integrations at runtime. The main idea is that we would be able to toggle continuous profiling when requesting a flare.

Currently the integrations support profiling via the integration_profiling option in the Agent configuration (datadog.yaml). The profiler is only enabled once, at init time.

This PR introduces a naive profiling util that allows us to enable and disable profiling at any point within a check.

The Profiling class is a singleton that can be used in a check's code to enable and disable profiling.

Within the Check#run function we check the datadog_agent configuration for the integration_profiling value; if it is enabled, we start profiling. The main idea of using a singleton is to avoid creating many profiler instances and to ensure that at most one profiler instance is running at any time.
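The flow described above could look roughly like this. This is a minimal sketch, not the PR's actual code: cProfile stands in for whatever profiler backend the PR uses, and the start/stop method names are assumptions, though _profiler, _mutex, _running, and status appear in the diff snippets below.

```python
import threading
import cProfile  # hypothetical stand-in for the real profiler backend


class Profiling:
    _instance = None
    _instance_lock = threading.Lock()

    def __new__(cls):
        # Double-checked locking: only one instance is ever created, so every
        # check in the process shares a single profiler.
        if cls._instance is None:
            with cls._instance_lock:
                if cls._instance is None:
                    instance = super().__new__(cls)
                    instance._profiler = cProfile.Profile()
                    instance._mutex = threading.Lock()
                    instance._running = False
                    cls._instance = instance
        return cls._instance

    def start(self):
        # The mutex serializes start/stop so two checks toggling profiling
        # concurrently cannot enable the profiler twice.
        with self._mutex:
            if not self._running:
                self._profiler.enable()
                self._running = True

    def stop(self):
        if not self._running:
            return
        with self._mutex:
            if self._running:
                self._profiler.disable()
                self._running = False

    def status(self):
        return "running" if self._running else "stopped"
```

A check would then call Profiling().start() when integration_profiling is set and Profiling().stop() afterwards; both calls always hit the same instance.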

Disclaimer

The PR is in its early stages and is missing tests. I want to get some feedback from the integrations team before investing more time in it. Once validated, I will add tests and any other changes necessary to make the PR production-ready.

Motivation

Additional Notes

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Changelog entries must be created for modifications to shipped code
  • Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
  • If you need to backport this PR to another branch, you can add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged

@GustavoCaso GustavoCaso changed the title Initial naive implementation for supporting enabling profiling at run… Initial naive implementation for supporting enabling profiling at runtime May 16, 2024

The validations job has failed; please review the Files changed tab for possible suggestions to resolve.

@GustavoCaso GustavoCaso marked this pull request as ready for review May 27, 2024 15:50
@GustavoCaso GustavoCaso requested a review from a team as a code owner May 27, 2024 15:50
@GustavoCaso GustavoCaso changed the title Initial naive implementation for supporting enabling profiling at runtime Enabling profiling at runtime May 27, 2024
@GustavoCaso GustavoCaso force-pushed the allow-profiling-integrations-at-runtime branch from 1403f62 to c120c03 on May 27, 2024 16:09
Comment on lines +29 to +32
if not self._running:
return

with self._mutex:
Contributor:

Why do we need self._mutex when checking if self._running is True but not when checking if it's False?

Could you please add an inline comment somewhere saying why mutex is necessary in the first place?
You can probably include also why this needs to be a singleton while you're at it.

And if you really want to convey more clearly that this should not be initialized outside of the module, you could make the class private as well s/Profiling/_Profiling/.

Contributor:

My 2 cents:

I know the convention is to add tests for all modules in utils. I appreciate that you stuck to it.
At the same time, unless you plan to expose datadog_checks.base.utils.profiling outside of the base package (which right now it doesn't sound like you do), I'd treat that module as internal, i.e. nix this test and try to cover as much ground as you can with test_agent_check.py.

If we do decide to keep this test, let's not reach into the class and patch the private attribute _profiler; instead, make the external dependency explicit with either a public attribute or (better) something we pass to __init__.
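Constructor injection as suggested here might look like the following. This is a hypothetical sketch of the suggestion, not the PR's code; cProfile and the StubProfiler name are assumptions for illustration.

```python
import cProfile  # hypothetical default backend


class Profiling:
    def __init__(self, profiler=None):
        # The profiler backend is passed in explicitly; production code gets a
        # real profiler by default, tests can pass a stub.
        self._profiler = profiler if profiler is not None else cProfile.Profile()
        self._running = False

    def start(self):
        if not self._running:
            self._profiler.enable()
            self._running = True


# In a test, the dependency is explicit -- no patching of private attributes:
class StubProfiler:
    def __init__(self):
        self.enabled = False

    def enable(self):
        self.enabled = True


stub = StubProfiler()
profiling = Profiling(profiler=stub)
profiling.start()
assert stub.enabled
```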

self._running = False

def status(self):
return "running" if self._running else "stopped"
Contributor:

Do we anticipate more status types? If not, we could turn this into running(self) -> bool, right?


@pytest.mark.parametrize(
"integration_profiling",
["true", "false"],
Contributor:

Suggested change:
-    ["true", "false"],
+    [True, False],

That way you can later just say if integration_profiling:
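The trap the suggestion avoids: with string parameters, a plain truthiness check silently misbehaves, because any non-empty string (including "false") is truthy. A quick illustration, with should_profile as a hypothetical helper:

```python
def should_profile(integration_profiling):
    # With real booleans, the guard is a plain truthiness check,
    # i.e. the test body can just say `if integration_profiling:`
    return bool(integration_profiling)


# Any non-empty string is truthy, so "false" would still enable profiling:
assert should_profile("false") is True
assert should_profile(False) is False
assert should_profile(True) is True
```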

5 participants