Memory Leak Python 3.12.2 #628

Open
0x78f1935 opened this issue Feb 14, 2024 · 8 comments

Comments

@0x78f1935

Summary

I conducted several tests with Python 3.11 with success and recently migrated to Python 3.12. I observed that on my Debian system, the SSH connection is lost after running my GitHub Actions pipeline. This issue consistently occurs after all tests have completed, and the results are about to be fetched.

Upon further investigation, I was able to reproduce this problem on a Windows system running Visual Studio Code.

As the repository is private, I will provide as much information as possible in this ticket to aid in troubleshooting.

image

Note: This issue persists until all available memory has been used.

The terminal is stuck at:

image

When I develop on my remote server and this happens, SSH simply times out. When I run this on my desktop, I can barely use my computer due to the memory leak, which prevents me from stopping the generation of the coverage.
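One way to keep a runaway run from taking the whole machine down on the Debian side is to cap the memory of the test process, so that it fails instead of exhausting RAM. A minimal sketch, assuming a Unix system (the resource module is not available on Windows); make_limiter, run_capped, and the 8 GiB default are hypothetical choices, not part of the original setup:

```python
import resource  # Unix only; not available on Windows
import subprocess
import sys


def make_limiter(limit_bytes):
    """Return a function that caps the address space of the calling process."""
    def apply_limit():
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))
    return apply_limit


def run_capped(args, limit_bytes=8 * 1024**3):
    """Run pytest in a child process that cannot grow past limit_bytes."""
    cmd = [sys.executable, "-m", "pytest", *args]
    # preexec_fn runs in the child just before exec, so only the child is capped.
    result = subprocess.run(cmd, preexec_fn=make_limiter(limit_bytes), check=False)
    return result.returncode
```

With a cap like this, a leaking run would be expected to end with an allocation failure in the child process rather than a lost SSH session.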

During my investigation, I toggled various settings within my test suite and did the same with libraries. I removed xdist because I thought it might be related to multiprocessing, but this was not the case. What remained was:

django-coverage-plugin==3.1.0
    # via -r requirements.in
pytest==8.0.0
    # via
    #   pytest-cov
    #   pytest-django
    #   pytest-sugar
pytest-cov==4.1.0
    # via -r requirements.in
pytest-django==4.8.0
    # via -r requirements.in
pytest-sugar==1.0.0
    # via -r requirements.in
python-dotenv==1.0.1
    # via -r requirements.in

I configure pytest in pyproject.toml and use several flags to start my tests.

[tool.pytest.ini_options]
addopts = "--exitfirst -vs --junitxml htmlcov/pytest.xml --cov --cov-report html --cov-report xml --cov-report term"

which should result in: python -m pytest --exitfirst -vs --junitxml htmlcov/pytest.xml --cov --cov-report html --cov-report xml --cov-report term

Note: When running on Python 3.11 this works fine.

When running on Python 3.12 with only the flags python -m pytest --exitfirst -vs, everything runs fine.
image

You might have noticed by now that this is a Django application. When running the same tests with the Django test runner instead, everything works fine, even on Python 3.12. The command looks like python manage.py test --debug-mode --noinput --pythonpath backend
image

Note: The last two images do not generate coverage data. (Django test suite and Pytest without coverage)

Expected vs actual result

I expect my tests to run on Python 3.12 just as they do on Python 3.11, without memory issues.

Reproducer

I don't necessarily have a reproducible environment, but I do have a base image which I use for my Django application. All requirements are in there, even though I have already listed them. This image doesn't include any tests.

Versions

django-coverage-plugin==3.1.0
    # via -r requirements.in
pytest==8.0.0
    # via
    #   pytest-cov
    #   pytest-django
    #   pytest-sugar
pytest-cov==4.1.0
    # via -r requirements.in
pytest-django==4.8.0
    # via -r requirements.in
pytest-sugar==1.0.0
    # via -r requirements.in
python-dotenv==1.0.1
    # via -r requirements.in

Python 3.12.2

Config

Do note the usage of -n 8, which was originally employed for xdist and has already been removed from my configuration to exclude xdist from the problem.

Original config

[tool.pytest.ini_options]
addopts = "-n 8 --exitfirst -vs --junitxml htmlcov/pytest.xml --cov --cov-report html --cov-report xml --cov-report term"
testpaths = [
    "tests.py",
    "test_*.py",
    "*_tests.py",
]
DJANGO_SETTINGS_MODULE = "backend.application.settings"
norecursedirs = "frontend/src/*"

[tool.coverage.run]
branch = true
command_line = "-m pytest"
concurrency = [
    "multiprocessing",
    "thread",
]
parallel = true
source = [
    "backend",
]
omit = [
    "*.html",
    "*.txt",
    "*.log",
    "*.js",
    "*.cjs",
    "*.jsx",
    "*.json",
    "*.ts",
    "*.tsx",
    "*.css",
    "*.sass",
    "backend/application/management/commands/clear_migrations.py",
    "backend/application/pagination.py",
    "backend/application/serializers.py",
]
plugins = ["django_coverage_plugin", ]

[tool.coverage.report]
fail_under = 90
ignore_errors = false
precision = 2
show_missing = true
skip_covered = false
skip_empty = true
sort = "Cover"

[tool.coverage.html]
directory = "htmlcov"
show_contexts = true
skip_covered = false
skip_empty = true
title = "Backend-code Coverage"

[tool.coverage.xml]
output = "htmlcov/coverage.xml"

[tool.coverage.django_coverage_plugin]
template_extensions = 'html, txt, tex, email'

Code

An example of what my tests might look like:

# -*- mode: python ; coding: utf-8 -*-
"""
Unit Test: Application
----------------------
"""
from django.test import TestCase
from backend.application import get_logger
from uuid import uuid4


class LoggerTests(TestCase):
    def test_obtaining_logger(self):
        """
        Check if application is able to import the logger.
        """
        name = str(uuid4())[20:]
        logger = get_logger(name)
        self.assertEqual(logger.name, name)
@nicoddemus
Member

There's been some known performance problems with coverage on Python 3.12, might be related:

nedbat/coveragepy#1665 (comment)
python/cpython#107674

Perhaps try using COVERAGE_CORE=sysmon from 7.4.0, might help.

@0x78f1935
Author

0x78f1935 commented Feb 16, 2024

Hi @nicoddemus! Thank you for referring to those references.

I've played around with my configuration, and so far, I have the following to document.

Attempt 1

Perhaps try using COVERAGE_CORE=sysmon from 7.4.0, might help.

This was the most straightforward thing to try. I changed my .vscode config to the following.

{
    "name": "Test: Pytest",
    "type": "python",
    "request": "launch",
    "module": "pytest",
    "env": {
        "PYDEVD_DISABLE_FILE_VALIDATION": "1",
        "COVERAGE_CORE": "sysmon"
    },
    "justMyCode": true
}
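Outside VS Code, the same environment variable can be set from a small script. A minimal sketch, assuming coverage 7.4.0 or newer is installed alongside pytest; build_env and run_tests are hypothetical helper names:

```python
import os
import subprocess
import sys


def build_env(core="sysmon", base=None):
    """Return a copy of the environment with COVERAGE_CORE set."""
    env = dict(os.environ if base is None else base)
    env["COVERAGE_CORE"] = core
    return env


def run_tests(extra_args=(), core="sysmon"):
    """Run `python -m pytest` in a child process with the chosen core."""
    cmd = [sys.executable, "-m", "pytest", *extra_args]
    return subprocess.run(cmd, env=build_env(core), check=False).returncode
```

Calling run_tests() picks up addopts from pyproject.toml as usual; only the environment of the child process changes.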

And made sure to disable "branch" in my pyproject.toml file.

[tool.coverage.run]
# Whether to measure branch coverage in addition to statement coverage
branch = false

The documentation states that it doesn't work for branch coverage, therefore I turned it off.

While keeping an eye on my memory usage, I'm writing this message with 18% in use while my tests are running on my Windows 11 system. After my tests hit 100%, the memory usage skyrockets like before.

image

Conclusion

I don't think COVERAGE_CORE=sysmon solves my issue. But let's leave it on for now.

Attempt 2

There's been some known performance problems with coverage on Python 3.12, might be related:

nedbat/coveragepy#1665 (comment)
python/cpython#107674

I'm aware that CPython underwent significant changes from version 3.10 to 3.11 to 3.12. It could be related, but it's uncertain at this point... :C. Let's try removing coverage from my stack and see where this goes!

I removed pytest-cov==4.1.0 from my stack, including django-coverage-plugin==3.1.0.

I removed --cov --cov-report html --cov-report xml --cov-report term from addopts located in my pyproject.toml.

[tool.pytest.ini_options]
addopts = "--exitfirst -vs --junitxml htmlcov/pytest.xml"

I also removed the django plugin "django_coverage_plugin" from my toml file.

plugins = []

Time to try again.

Conclusion

image

That worked. Something is up with either pytest-cov==4.1.0 or django-coverage-plugin==3.1.0.

Attempt 3

I'll reinstall pytest-cov==4.1.0, but let's run the tests without any additional flags.

Installing collected packages: pytest-cov
Successfully installed pytest-cov-4.1.0

Let's fire it up!

Conclusion

image

That also works. Let's add the first flag back but without adding the django plugin.

Attempt 4

I added --cov --cov-report html to the addopts variable located in my pyproject.toml and re-ran my tests.

Conclusion

image

So we are back at square one.

Attempt 5

Let's remove --cov --cov-report html and add --cov --cov-report xml instead; maybe that makes a difference.

Conclusion

image

No luck.

Attempt 6

Perhaps the --cov command is the issue here; let's try to remove that instead and try again.

My addopts looks like this right now:

[tool.pytest.ini_options]
addopts = "--exitfirst -vs --junitxml htmlcov/pytest.xml --cov-report xml"

Conclusion

I'm surprised...

image

I was already writing my overall conclusion, but let's try to add the HTML report back!

Attempt 7

My addopts looks like this right now:

[tool.pytest.ini_options]
addopts = "--exitfirst -vs --junitxml htmlcov/pytest.xml --cov-report html --cov-report xml"

Conclusion

HOLY MOLY, THAT WORKED! ... is what I thought.

image

image

Only the XML file was actually generated.

Overall Conclusion

As mentioned in this ticket, I could run the Django test suite without coverage, but that doesn't work for my pipeline. I need the XML file for my pipeline! So just a few adjustments in the addopts variable should do the trick for me and solve my issue.

I'm not sure if I can be convinced that CPython is the issue here. My testing really points to the HTML generation of pytest-cov, which I like to utilize locally for development purposes.

Edit: I just discovered that junitxml also isn't working the way I thought with my band-aid in place... whoops.
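The attempts above amount to walking through combinations of --cov and the --cov-report targets by hand. A sketch of how that matrix could be enumerated and run automatically, assuming a timeout is an acceptable stand-in for "memory kept growing"; flag_matrix, run_case, and the 600-second limit are hypothetical, and -o addopts= is used on the assumption that it clears the addopts from pyproject.toml for that run:

```python
import itertools
import subprocess
import sys

BASE = ["--exitfirst", "-vs", "--junitxml", "htmlcov/pytest.xml"]
REPORTS = ["html", "xml", "term"]


def flag_matrix():
    """Yield every combination of --cov on/off and report subset."""
    for use_cov in (False, True):
        for size in range(len(REPORTS) + 1):
            for reports in itertools.combinations(REPORTS, size):
                args = list(BASE)
                if use_cov:
                    args.append("--cov")
                for report in reports:
                    args += ["--cov-report", report]
                yield args


def run_case(args, timeout=600):
    """Run one combination; None means the run hit the timeout."""
    cmd = [sys.executable, "-m", "pytest", "-o", "addopts=", *args]
    try:
        return subprocess.run(cmd, timeout=timeout, check=False).returncode
    except subprocess.TimeoutExpired:
        return None
```

Looping over flag_matrix() and recording the result of run_case for each entry gives a table of which flag sets finish and which do not.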

@nicoddemus
Member

Hi @0x78f1935,

Sorry it is late so I kinda skimmed through your post, but before going to bed I decided to leave another suggestion you can try: have you tried using coverage directly (coverage run -m pytest), without using pytest-cov? This would help nail down if this is something related to pytest-cov or coverage itself.
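For reference, a coverage-only run can be scripted as a sequence of coverage commands. A minimal sketch, assuming the pytest-cov flags have been removed from addopts and that parallel = true is set as in the config earlier in this issue (hence the combine step); coverage_steps and run_steps are hypothetical helper names:

```python
import subprocess
import sys


def coverage_steps(parallel=True, xml_path="htmlcov/coverage.xml"):
    """Return the commands for a coverage-only run, in order."""
    cov = [sys.executable, "-m", "coverage"]
    steps = [cov + ["run", "-m", "pytest"]]
    if parallel:
        # With parallel = true each process writes its own data file.
        steps.append(cov + ["combine"])
    steps.append(cov + ["xml", "-o", xml_path])
    return steps


def run_steps(steps):
    """Run each command in turn, stopping at the first failure."""
    for cmd in steps:
        result = subprocess.run(cmd, check=False)
        if result.returncode != 0:
            return result.returncode
    return 0
```

If memory still climbs during the run or xml step here, pytest-cov is out of the picture and the problem sits in coverage or one of its plugins.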

@0x78f1935
Author

Hi @nicoddemus,

No worries! I really appreciate your time. Thank you for your response!

I tried your suggestion!

Firstly, I added the following section to my .vscode/launch.json:

{
    "name": "Test: Pytest Coverage directly",
    "type": "python",
    "request": "launch",
    "module": "coverage",
    "args": [
        "run",
        "-m",
        "pytest",
    ],
    "env": {
        "PYDEVD_DISABLE_FILE_VALIDATION": "1",
        "COVERAGE_CORE": "sysmon"
    },
    "justMyCode": true
},

My addopts currently looks like this:

[tool.pytest.ini_options]
addopts = "--exitfirst -vs --junitxml htmlcov/pytest.xml --cov-report xml"

image

We know that --cov triggered the memory leak, so let's add that back, because right now some of the files that should be generated are missing.

Running it once more results in:

image

Unfortunately, the same issue. To be doubly sure, I'll take Visual Studio Code out of the equation, in the hope that I can Ctrl+C out of the memory leak.

Running coverage run -m pytest in my terminal results in:

image

And it keeps rising.

I completely forgot to remove pytest-cov like you suggested, so let's head back to VS Code and uninstall pytest-cov. I also removed --cov --cov-report xml from my pyproject.toml.

[tool.pytest.ini_options]
addopts = "--exitfirst -vs --junitxml htmlcov/pytest.xml"

Let's run it again.

    Test session starts (platform: win32, Python 3.12.2, pytest 8.0.0, pytest-sugar 1.0.0)
    cachedir: .pytest_cache
    django: version: 5.0.2, settings: backend.application.settings (from ini)
    rootdir: xxx
    configfile: pyproject.toml
    plugins: anyio-4.2.0, django-4.8.0, sugar-1.0.0

image

That worked.

Let's add back --cov --cov-report xml

[tool.pytest.ini_options]
addopts = "--exitfirst -vs --junitxml htmlcov/pytest.xml --cov --cov-report xml"

And try again!

ERROR: usage: __main__.py [options] [file_or_dir] [file_or_dir] [...]
__main__.py: error: unrecognized arguments: --cov --cov-report
  inifile: xxx\pyproject.toml
  rootdir: xxx

So let's remove --cov --cov-report xml again and see if junitxml generates.

image

image

Unfortunately no luck

@nicoddemus
Member

Indeed seems related to pytest-cov itself then.

@ionelmc
Member

ionelmc commented Feb 18, 2024

Hey, I'm trying to make sense of this and I have some questions:

  • What's the actual architecture you run this on? You have pointed to a docker image (annihilator708/django-react-base) without any build detail (how is it built, where is the dockerfile?) but all the details you show are from windows task manager? Do you use that image at all? What is the actual architecture you run the project on?
  • You have shown aggregate memory stats for VS Code. I would assume that includes memory used by VS Code and subprocesses. It's not clear what is using what and how much of that memory is private bytes. Can you add more details using https://learn.microsoft.com/en-us/sysinternals/downloads/process-explorer ?
  • You have pointed out that you have a problem with the xml generation but it's all mixed up with the memory leak problem. Can you make a separate issue for that?
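As a complement to Process Explorer, Python's own tracemalloc can show which source lines hold memory inside the test process. A minimal sketch of a conftest.py, assuming the growth comes from Python-level allocations (native allocations are not traced) and that output is not captured (the -s in -vs); the helper names and the 10-second interval are hypothetical:

```python
import threading
import tracemalloc


def format_sample():
    """Describe current and peak traced memory in MiB."""
    current, peak = tracemalloc.get_traced_memory()
    return f"traced={current / 2**20:.1f} MiB peak={peak / 2**20:.1f} MiB"


def top_sites(limit=5):
    """Return the source lines holding the most traced memory."""
    snapshot = tracemalloc.take_snapshot()
    return snapshot.statistics("lineno")[:limit]


def _sampler(stop, interval):
    # Event.wait returns False on timeout, so this loops until stop is set.
    while not stop.wait(interval):
        print(format_sample(), flush=True)
        for stat in top_sites():
            print("   ", stat, flush=True)


_stop = threading.Event()


def pytest_configure(config):
    """pytest hook: start tracing and a background sampler thread."""
    tracemalloc.start()
    threading.Thread(target=_sampler, args=(_stop, 10.0), daemon=True).start()


def pytest_unconfigure(config):
    """pytest hook: stop the sampler when the session ends."""
    _stop.set()
    tracemalloc.stop()
```

Tracing slows the run down noticeably, so this is meant for a one-off diagnostic run, not for the regular pipeline.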

@0x78f1935
Author

0x78f1935 commented Feb 19, 2024

Hey, I'm trying to make sense of this and I have some questions:

I'm happy to answer!

* What's the actual architecture you run this on?

This issue occurs on both Windows and Unix systems, specifically Debian and Alpine. It doesn't seem to depend on the execution environment, as it happens consistently on all of them (all 64-bit). Alpine is my production environment, Debian is my GitHub Actions environment, and Windows is my local development environment.

You have pointed to a docker image (annihilator708/django-react-base) without any build detail (how is it built, where is the dockerfile?)

I have to acknowledge that the files are private, and the Python environment gets overwritten in the pipeline, although it's my image.

You could use the same environment. Create a new Dockerfile, touch a new file, and tail it in the entrypoint.

FROM annihilator708/django-react-base AS baselayer
RUN touch tmp.tmp
ENTRYPOINT ["tail", "-f", "tmp.tmp"]

When starting a container like this, you can use exec to run and check the Python environment.
The main container layer is python:alpine3.19 since my migration. Before the migration, it ran on alpine:3.18.2.

docker exec <container_id> python3.12 --version
docker exec <container_id> python3.12 -m pip freeze

But then again, the environment gets overwritten, so I don't really think the image is very helpful in this case.

* You have shown aggregate memory stats for VS Code. I would assume that includes memory used by VS Code and subprocesses. It's not clear what is using what and how much of that memory is private bytes. Can you add more details using https://learn.microsoft.com/en-us/sysinternals/downloads/process-explorer ?

Yes, I'll put my addopts back to how it works in Python 3.11.

addopts = "--exitfirst -vs --junitxml htmlcov/pytest.xml --cov --cov-report html --cov-report xml --cov-report term"

Results:

Test session starts (platform: win32, Python 3.12.2, pytest 8.0.0, pytest-sugar 1.0.0)
cachedir: .pytest_cache
django: version: 5.0.2, settings: backend.application.settings (from ini)
rootdir: xxx
configfile: pyproject.toml
plugins: anyio-4.2.0, cov-4.1.0, django-4.8.0, sugar-1.0.0
collected 42 items

image

image

I guarantee that this issue also arises outside Visual Studio Code. When using the top command in Unix, I also observe my memory being consumed. I have 64GB of DDR4 RAM. The screenshots I shared of my Windows environment are merely indicative of the issue occurring again while I guide you through the steps I took to exclude other libraries. I hope this helps!

* You have pointed out that you have a problem with the xml generation but it's all mixed up with the memory leak problem. Can you make a separate issue for that?

If that issue still persists after this main issue has been resolved (the memory leak), I'll create another ticket. However, I'm very confident that those XML files will generate after this has been resolved, just like my environment in Python 3.11, which functions normally.

Edit: I'll look into a minimalistic environment, where I perhaps can reproduce the issue.

@0x78f1935
Author

I can't put my finger on it. While attempting to reproduce the issue with a fresh Django application and some tests, I'm unable to replicate the exact problem. When removing my tests from the stack, pytest-cov actually finishes with all the files.

At this point, I'm wondering if there's something within my own tests causing the issue. It's strange that I don't have this problem with Python 3.11.

I think the best thing I can do is start from scratch and slowly port all my components one by one until I have more details to share.
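The port-one-by-one approach can also be run in reverse: start from the full suite and halve it until a single test file remains. A minimal sketch, assuming exactly one file is responsible and that a timeout is a fair signal for the leak; find_culprit, hangs_under_coverage, and the 300-second limit are hypothetical:

```python
import subprocess
import sys


def find_culprit(items, triggers):
    """Binary-search for one item whose presence makes triggers() true.

    Assumes a single item is responsible and that triggers(items) is true
    for the full list.
    """
    candidates = list(items)
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        candidates = first if triggers(first) else second
    return candidates[0]


def hangs_under_coverage(test_files, timeout=300):
    """Hypothetical predicate: treat a timeout as 'the leak was triggered'."""
    cmd = [sys.executable, "-m", "pytest", "--cov", "--cov-report", "xml", *test_files]
    try:
        subprocess.run(cmd, timeout=timeout, check=False)
    except subprocess.TimeoutExpired:
        return True
    return False
```

With 42 collected tests spread over a handful of files, find_culprit(test_files, hangs_under_coverage) needs only a few runs to narrow things down, provided the single-culprit assumption holds.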

3 participants