Excessive memory usage to scan big projects #1495
Comments
Thanks for submitting an issue! I want to give it a shot, although I have never really memory-profiled a Python app. Any tips would be appreciated.
@rogalski you may want to try https://pypi.python.org/pypi/memory_profiler. I have never used it myself, though.
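Besides memory_profiler, the standard library's `tracemalloc` module can give a quick picture of Python-level allocations. A minimal sketch, where `work()` is just a stand-in for a lint run, not any pylint API:

```python
import tracemalloc

def work():
    # Placeholder workload standing in for parsing/linting modules.
    return [str(i) * 10 for i in range(100_000)]

tracemalloc.start()
data = work()
current, peak = tracemalloc.get_traced_memory()  # bytes (current, peak)
tracemalloc.stop()
print(f"current: {current / 1e6:.1f} MB, peak: {peak / 1e6:.1f} MB")
```

Note that `tracemalloc` only tracks allocations made through Python's allocator, so C-extension memory will not show up; for whole-process numbers, RSS from the OS is more honest.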
I just had a look at this memory issue as well. Memory is quite a critical point for our project too, and all we needed was a simple syntax check on each file; there was no need for a global check, since each file could be checked stand-alone. I could have run the executable on each file, but that increased the processing time quite a bit. Here is the solution I finally found, which reduced the memory used from 1.4 GB to less than 400 MB: we called pylint from a Python script directly, not with the executable, and cleared the astroid cache between files.
If you would like the …
Well, I was a bit too enthusiastic: it does reduce the memory used, indeed, but it multiplies the processing time quite a bit as well. I guess the cache is quite useful :D. But it caches too much in our case. I will check what exactly takes so much space in this cache, and see if it can be disabled somehow.
New POC:
There were two things that consumed a lot of memory for our project:
I guess you have to choose between time and memory, by setting a greater LRU cache or not. In our case it is memory that we have less of, and 200 items in the cache seems to be a good setup for us. The fact I overrode the … Of course, for the …
I am no expert on this library; what I say could be wrong :D. I am open to your remarks :)
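The trade-off described above can be illustrated with a stand-alone sketch of an LRU-bounded cache (hypothetical names; this is not pylint's or astroid's actual code): keeping, say, 200 recently used module entries instead of every module ever parsed.

```python
from collections import OrderedDict

class LRUCache:
    """Dict-like cache that evicts the least recently used entry
    once maxsize is exceeded (e.g. maxsize=200, as discussed above)."""

    def __init__(self, maxsize=200):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def __setitem__(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # drop least recently used

    def __getitem__(self, key):
        self._data.move_to_end(key)  # mark as recently used
        return self._data[key]

    def __contains__(self, key):
        return key in self._data

cache = LRUCache(maxsize=2)
cache["mod_a"] = "ast of a"
cache["mod_b"] = "ast of b"
_ = cache["mod_a"]           # touch a, so b becomes least recently used
cache["mod_c"] = "ast of c"  # exceeds maxsize, evicts mod_b
print("mod_b" in cache)      # → False
```

A smaller `maxsize` means less memory but more re-parsing of evicted modules, which matches the time/memory trade-off observed above.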
Without looking much at the code, I think we can safely avoid storing the entire nodes in the …
Btw., is it a regression from 1.6.x? We started to infer results of …
We just found out why we had to empty … manually. Modules are appended to this … which is called in the two methods below: …
These are called when the messages below are activated: …
There is indeed a method which empties this list regularly: … However, it seems to be called only for the three messages below: …
This makes us conclude that this list can be filled but never emptied, depending on which messages you activate in the linter. That is our case: we activated, for instance, the message … Maybe … should be put on …
Well, at least not the whole method, as it checks things for the above three messages/checks, but at least the part …
That said, as I am no expert on this library, maybe emptying the list only for these 3 messages is by design. It would be odd, though, as by enabling a new check in the linter, I end up using less memory.
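The pattern described above can be sketched with a toy example (hypothetical names; this is not pylint's actual checker code): per-module state accumulates in a list, but the cleanup that empties it is gated on certain messages being enabled, so enabling only unrelated checks leaks.

```python
class ToyChecker:
    # Cleanup only runs when one of these messages is enabled
    # (hypothetical names, mirroring the 3-message gating above).
    CLEANUP_MESSAGES = {"msg-a", "msg-b", "msg-c"}

    def __init__(self, enabled_messages):
        self.enabled = set(enabled_messages)
        self.modules = []  # grows with every linted module

    def visit_module(self, module):
        self.modules.append(module)

    def leave_module(self):
        # Bug pattern: clearing is conditional on unrelated messages,
        # so with other checks enabled, self.modules grows without bound.
        if self.enabled & self.CLEANUP_MESSAGES:
            self.modules.clear()

leaky = ToyChecker(enabled_messages={"some-other-check"})
fixed = ToyChecker(enabled_messages={"msg-a"})
for m in range(1000):
    for checker in (leaky, fixed):
        checker.visit_module(m)
        checker.leave_module()
print(len(leaky.modules), len(fixed.modules))  # → 1000 0
```

This also explains the counter-intuitive observation above: enabling one more check can *reduce* memory, because it happens to trigger the cleanup path.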
Our issues (cf. @beledouxdenis's comments) happen on 1.6.4.
See pylint-dev/pylint#1495

The pylint test now uses less memory:
- thanks to an LRU cache for the astroid cache (caching the modules' bodies)
- by activating the check `ungrouped-imports` and then ignoring its messages, due to a memory leak if it is not activated

See pylint-dev/pylint#1495 (comment)
Attempts to help with pylint-dev#1495
Coming back to this issue: we have been gradually increasing our VM requirement for pylint checking from 1 GB to 2 GB to 4 GB during this year. Now we are starting to have instabilities again; I am not sure whether it is OOM or just a 20-minute no-output timeout. It would be nice to have progress output in order to avoid no-output timeouts. It looks like there is no such option, is there?
@tardyp right now there is no option for progress output, but it could definitely be useful.
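Until such an option exists, a driver script can emit its own progress to keep CI no-output timeouts from firing. A minimal sketch; `lint_one` is a placeholder callback, not a real pylint API:

```python
import sys

def lint_with_progress(files, lint_one):
    """Run lint_one on each file, printing [i/N] progress to stderr
    so a CI watchdog always sees recent output."""
    total = len(files)
    for i, path in enumerate(files, 1):
        print(f"[{i}/{total}] {path}", file=sys.stderr, flush=True)
        lint_one(path)

results = []
lint_with_progress(["a.py", "b.py"], results.append)
print(results)  # → ['a.py', 'b.py']
```

Printing to stderr with `flush=True` matters: buffered stdout can sit unflushed long enough for a no-output timeout to trigger anyway.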
The last time we tried this upgrade we encountered timeouts on the quality job, which it now appears were due to the worker running pylint common running out of memory and killing the Jenkins process. Switching to a different worker type with double the RAM (8 GB vs. 4 GB) seems to have fixed this; about 5.5 GB was used. Upstream is aware of the high memory usage on large projects, it's apparently due primarily to a cache of parsed modules: pylint-dev/pylint#1495 . Even after disabling some of the new checks that have been added, the new version of pylint found about twice as much to complain about. Just bumping the threshold for now to unblock the Django upgrade, we can try automated utilities like pyupgrade to fix some of these later.
I ran into this again, with the current version 2.6.0. The …
(Not specific to podman -- you can use docker, or a VM, or anything where you can control the amount of RAM) and inside:

```shell
dnf install -y python3-pip git
pip install pylint
git clone --depth=1 https://github.com/rhinstaller/anaconda
cd anaconda
pylint -j0 pyanaconda/
```

In …
pylint hits 2 GiB per process pretty often [1], thus still causing OOM failures [2]. Bump the divisor. [1] pylint-dev/pylint#1495 [2] pylint-dev/pylint#3899
pylint hits 2 GiB per process pretty often [1], thus still causing OOM failures [2]. Bump the divisor. [1] pylint-dev/pylint#1495 [2] pylint-dev/pylint#3899 Cherry-picked from master PR rhinstaller#2923 Related: rhbz#1885635
max() was bogus -- we want *both* RAM and number of CPUs to limit the number of jobs, not take whichever is highest. [1] pylint-dev/pylint#1495 [2] pylint-dev/pylint#3899 Cherry-picked from master PR rhinstaller#2923 Related: rhbz#1885635
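The fix described in the commit above amounts to capping the job count by *both* constraints with `min()` rather than `max()`. A sketch under the commit's own assumption of roughly 2 GiB per pylint process (function name hypothetical):

```python
GIB = 1024 ** 3

def pylint_jobs(ram_bytes, cpu_count):
    """Limit jobs by BOTH available RAM (~2 GiB per pylint process)
    and CPU count -- min(), not the bogus max()."""
    by_ram = max(1, ram_bytes // (2 * GIB))
    return max(1, min(by_ram, cpu_count))

print(pylint_jobs(8 * GIB, 16))   # → 4 (RAM-limited)
print(pylint_jobs(64 * GIB, 4))   # → 4 (CPU-limited)
```

With `max()` instead of `min()`, a 16-core/8 GiB box would have spawned 16 workers needing ~32 GiB, exactly the OOM scenario the commit describes.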
@martinpitt thanks for your message. Note that there is currently a PR, pylint-dev/astroid#837, that addresses this problem.
@martinpitt PR pylint-dev/astroid#847 has been merged into master. Can you please test whether it improves your situation?
Sorry for taking so long to get back to this! With parallel processes (…)
I applied the patch from pylint-dev/astroid#847:
and that makes the memory usage much more reasonable indeed, about 400 MB RSS. Many thanks!
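For quick before/after comparisons like the one above, peak RSS can be read from the standard library on Unix (note the unit quirk: `ru_maxrss` is KiB on Linux but bytes on macOS):

```python
import resource

# Peak resident set size of the current process so far.
usage = resource.getrusage(resource.RUSAGE_SELF)
peak_kib = usage.ru_maxrss  # KiB on Linux (bytes on macOS)
print(f"peak RSS: {peak_kib / 1024:.0f} MiB")
```

Unlike `tracemalloc`, this counts the whole process, including memory held by C extensions, so it matches what an OOM killer actually sees.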
@martinpitt thanks for your feedback. I'll close this issue, then.
Steps to reproduce
Current behavior
pylint consumes up to 1GB of memory
Expected behavior
pylint should use less memory
pylint --version output
pylint 1.7.1,
astroid 1.5.2
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609]
We are looking for tips to improve the situation, as our CI Docker machines start to be randomly killed by the OOM killer.