Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Trace Cache Iteration Crash #692

Closed
wants to merge 3 commits into from
Closed

Conversation

TimPansino
Copy link
Contributor

Before contributing, please read our contributing guidelines and code of conduct.

Overview

Describe the changes present in the pull request

Related Github Issue

Include a link to the related GitHub issue, if applicable

Testing

The agent includes a suite of tests which should be used to
verify your changes don't break existing functionality. These tests will run with
Github Actions when a pull request is made. More details on running the tests locally can be found
here,
For most contributions it is strongly recommended to add additional tests which
exercise your changes.

@github-actions
Copy link

github-actions bot commented Nov 16, 2022

🦙 MegaLinter status: ❌ ERROR

Descriptor Linter Files Fixed Errors Elapsed time
✅ PYTHON bandit 1 0 3.0s
✅ PYTHON black 1 0 0 0.54s
❌ PYTHON flake8 1 18 0.33s
✅ PYTHON isort 1 0 0 0.19s
✅ PYTHON pylint 1 0 2.74s

See errors details in artifact MegaLinter reports on CI Job page
Set VALIDATE_ALL_CODEBASE: true in mega-linter.yml to validate all sources, not only the diff

MegaLinter is graciously provided by OX Security

@TimPansino TimPansino marked this pull request as ready for review November 16, 2022 18:34
@TimPansino TimPansino requested a review from a team as a code owner November 16, 2022 18:34
Copy link
Contributor

@hmstepanek hmstepanek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I apologize that I didn't see this earlier. I had this one on my todo list but I didn't see you'd already opened a PR for it until just now. I do not believe this fix will work as the crash was during the list() cast prior to the loop running. If the list casting didn't fix this, I do not believe a copy will fix it either. My comment on the issue was a bit rushed and not thought through. This type of fix would only work if the code inside the loop was where the _cache was being modified (the list casting would have also fixed this kind of issue). However, since the crash is still happening, I do not think that's what's happening here (nor do I see any code inside the loop that modifies the size of the _cache). I think this is being modified in a separate thread so the two threads are accessing the _cache simultaneously. I believe we need to add a lock here. I've opened a draft PR that does that here: https://github.com/newrelic/newrelic-python-agent/pull/702/files. We can either open that draft PR or we can modify this fix to be like what's in that draft.

@TimPansino
Copy link
Contributor Author

@hmstepanek

If the list casting didn't fix this, I do not believe a copy will fix it either.

The issue with a list is that it's not casting, it's iterating on the dict.items into a list, then iterating the list. The crash has less area to happen but can still happen if the change is made during the short time when the dict.items is iterated over by the list constructor (instead of the longer running for loop).

A shallow copy doesn't iterate as far as I'm aware, this is a patch that's been used in other places I've seen to guard against iteration issues.

That being said, I'm not opposed to using locking but I wonder if that causes potential performance issues. It's certainly harder to maintain. I think the simpler solution might be more appropriate considering this is only affecting event loop wait time and debug logic, as opposed to potential impacts to any time that the trace cache is accessed.

@TimPansino
Copy link
Contributor Author

Superseded by #704

@TimPansino TimPansino closed this Dec 1, 2022
@TimPansino TimPansino deleted the fix-trace-cache-crash branch December 1, 2022 00:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants