Python 3.12 perf import #4918

Open
jendap opened this issue Feb 15, 2024 · 3 comments

jendap commented Feb 15, 2024

Python 3.12 added support for Linux perf, and the Firefox Profiler is the best UI for viewing the results. Using them together with the recommended perf script -F +pid > test.perf works, but it could be better. Python is widely used. Would you be interested in improvements for it in the profiler codebase?

The generated stack trace (by perf script) looks like:

python 1810767/1810767 2023391.719579:     363492 cycles:P:
        560e73c7e194 PyMem_RawRealloc+0x4 (/tmp/Python-3.12.2/python)
        560e73bef02d cfunction_vectorcall_O+0x6d (/tmp/Python-3.12.2/python)
        560e73bc7e55 PyObject_Vectorcall+0x35 (/tmp/Python-3.12.2/python)
        560e73b39053 _PyEval_EvalFrameDefault.cold+0xa95 (/tmp/Python-3.12.2/python)
        7f90b4a42945 py::Process.username:/tmp/test/top/psutil/__init__.py+0x6 (/tmp/perf-1810767.map)
        560e73bc7e55 PyObject_Vectorcall+0x35 (/tmp/Python-3.12.2/python)
        560e73b39053 _PyEval_EvalFrameDefault.cold+0xa95 (/tmp/Python-3.12.2/python)
        7f90b4a3d775 py::<module>:/tmp/test/top/simple_table.py+0x6 (/tmp/perf-1810767.map)
        560e73ca8269 PyEval_EvalCode+0xa9 (/tmp/Python-3.12.2/python)
        560e73cc7ed3 run_eval_code_obj+0x53 (/tmp/Python-3.12.2/python)
        560e73cc7e4a run_mod+0x6a (/tmp/Python-3.12.2/python)
        560e73cc8473 pyrun_file+0x83 (/tmp/Python-3.12.2/python)
        560e73cc820a _PyRun_SimpleFileObject+0x1ca (/tmp/Python-3.12.2/python)
        560e73cc801f _PyRun_AnyFileObject+0x4f (/tmp/Python-3.12.2/python)
        560e73cd08bc Py_RunMain+0x33c (/tmp/Python-3.12.2/python)
        560e73cd040c Py_BytesMain+0x3c (/tmp/Python-3.12.2/python)
        7f90b4828150 __libc_start_call_main+0x80 (/usr/lib/x86_64-linux-gnu/libc.so.6)

Possible improvements from here:

  1. Remove the noisy _PyEval_EvalFrameDefault.cold and PyObject_Vectorcall frames. Similar (and even bigger) noise surrounds generators and async code. Ideally all CPython interpreter frames would go.
  2. The perf import turns py::Process.username:/tmp/test/top/psutil/__init__.py+0x6 (/tmp/perf-1810767.map) into the function name py::Process.username:/tmp/test/top/psutil/__init__.py in the file /tmp/perf-1810767.map. It would be better to show the function Process.username in the file /tmp/test/top/psutil/__init__.py.
  3. It would be nice to detect when a frame belongs to a native extension (non-Python code such as numpy). From eyeballing a few stack traces, a cfunction_vectorcall frame appearing before any py::* frame may indicate that, but it may not be trivial.
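
The heuristic in item 3 could be sketched roughly as follows. This is only an illustration of the idea, not a tested classifier: the function name looks_like_native_call is hypothetical, and treating any symbol that starts with cfunction_vectorcall as a marker is an assumption based on the single sample stack above.

```python
def looks_like_native_call(symbols):
    """Guess whether the leaf of a sample is running C code called from Python.

    `symbols` holds the frame symbol names of one sample, leaf first,
    which is the order perf script prints them in.
    Assumption: a cfunction_vectorcall* frame seen before the first py::*
    frame means the innermost Python function called into C code.
    """
    for symbol in symbols:
        if symbol.startswith("py::"):
            # Reached Python code without seeing a C function call marker.
            return False
        if symbol.startswith("cfunction_vectorcall"):
            return True
    return False


# Symbols from the top of the sample stack above, leaf first.
sample_symbols = [
    "PyMem_RawRealloc",
    "cfunction_vectorcall_O",
    "PyObject_Vectorcall",
    "_PyEval_EvalFrameDefault.cold",
    "py::Process.username:/tmp/test/top/psutil/__init__.py",
]
is_native = looks_like_native_call(sample_symbols)
```

Note that this flags any C function called from Python, including CPython builtins, so telling a third-party extension apart would probably also need the binary path of the frames above the marker.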

I quickly tried perf script -F +pid | ./python_perf_cleaner.py > test.perf, where python_perf_cleaner.py is:

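#!/usr/bin/env python3
"""Filter `perf script` output: rewrite py:: frames, drop interpreter frames."""

import re
import sys

# Matches frames such as:
#   7f90b4a3d775 py::<module>:/tmp/test/top/simple_table.py+0x6 (/tmp/perf-1810767.map)
# Groups: address (with indent), function name, source file, map file.
PY_FRAME_PATTERN = re.compile(r'^(\s*[0-9A-Fa-f]+) py::([^:]+):([^\+]+)\+0x[0-9A-Fa-f]+ \(([^)]*)\)')

# True once the current sample has produced a py:: frame.
py_frame_seen = False

for line in sys.stdin:
    # A blank line separates samples, so reset the per-sample state.
    if line == '\n':
        py_frame_seen = False
    m = PY_FRAME_PATTERN.match(line)
    if m is not None:
        # Emit the Python function with its source file in place of the map file.
        prefix, function, file, _ = m.groups()
        sys.stdout.write(f'{prefix} {function} ({file})\n')
        py_frame_seen = True
    elif not py_frame_seen:
        # Keep sample headers, blank lines, and frames printed before the
        # first py:: frame; later non-py:: frames are dropped as noise.
        sys.stdout.write(line)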

This is a big improvement for me. It solves the first two issues!
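
To make the effect concrete, here is the same filter wrapped in a function and run on a shortened copy of the sample stack. The function name clean_perf_lines and the shortened sample are illustrative; the regular expression and the logic are taken from the script above.

```python
import re

PY_FRAME_PATTERN = re.compile(
    r'^(\s*[0-9A-Fa-f]+) py::([^:]+):([^\+]+)\+0x[0-9A-Fa-f]+ \(([^)]*)\)'
)


def clean_perf_lines(lines):
    """Yield filtered lines using the same rules as the script above."""
    py_frame_seen = False
    for line in lines:
        if line == '\n':
            # Blank line: the next sample starts, reset the state.
            py_frame_seen = False
        m = PY_FRAME_PATTERN.match(line)
        if m is not None:
            prefix, function, file, _ = m.groups()
            yield f'{prefix} {function} ({file})\n'
            py_frame_seen = True
        elif not py_frame_seen:
            yield line


# Shortened version of the sample stack from this issue.
SAMPLE = (
    "python 1810767/1810767 2023391.719579:     363492 cycles:P:\n"
    "        560e73c7e194 PyMem_RawRealloc+0x4 (/tmp/Python-3.12.2/python)\n"
    "        560e73bc7e55 PyObject_Vectorcall+0x35 (/tmp/Python-3.12.2/python)\n"
    "        7f90b4a42945 py::Process.username:/tmp/test/top/psutil/__init__.py+0x6 (/tmp/perf-1810767.map)\n"
    "        560e73b39053 _PyEval_EvalFrameDefault.cold+0xa95 (/tmp/Python-3.12.2/python)\n"
    "        7f90b4a3d775 py::<module>:/tmp/test/top/simple_table.py+0x6 (/tmp/perf-1810767.map)\n"
    "        560e73cd040c Py_BytesMain+0x3c (/tmp/Python-3.12.2/python)\n"
    "\n"
)

cleaned = list(clean_perf_lines(SAMPLE.splitlines(keepends=True)))
print("".join(cleaned), end="")
```

One thing this shows: interpreter frames printed before the first py:: frame of a sample (PyObject_Vectorcall here) are kept, because the filter only starts dropping frames once a py:: frame has been seen.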

For the third issue, identifying native frames: looking at linux-perf.js, I don't see an easy way to rewrite the stack traces even if one figured out how to recognize them.

The question is whether anybody but me cares about this, and if so, how to do it. By hacking the perf import in this codebase directly? Or by publishing an improved Python reprocessing script on GitHub? In that case the script would have to start writing your Profile format.

BTW: Java is similar (but much easier). It is also a big ecosystem, so it might be worth a few extra lines to make it nicer.
BTW: Is there a converter (in both directions) between your Profile format and Google's pprof profile.proto (https://github.com/google/pprof/blob/main/proto/profile.proto)?

jrmuizel commented Mar 7, 2024

As an alternative to perf, you could try my branch of py-spy at https://github.com/jrmuizel/py-spy/tree/gecko-profiler and then load the resulting JSON using samply load.

Here's an example: https://share.firefox.dev/4c5iJDy

Contributor

julienw commented Mar 8, 2024

We forgot to answer the initial question: yes we're interested! It should be possible to detect a python-origin linux-perf profile and do some pre-filtering or even completely move the parsing to a different function.
We also have a bunch of random scripts in bin/, maybe we could put your cleaning script there already, along with some documentation to docs-user/. The new documentation could also mention @jrmuizel's option.
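
The detection step mentioned above might look something like this. It is a sketch in Python for consistency with the cleaning script, whereas the real importer is JavaScript. The name looks_like_python_perf and the max_lines cutoff are hypothetical, and the check assumes that CPython's perf frames always appear as py::... symbols resolved through a /tmp/perf-<pid>.map file, as in the sample in this issue.

```python
import re

# Assumption: a Python perf trampoline frame looks like
#   <address> py::<qualname>:<file>+0x<offset> (/tmp/perf-<pid>.map)
PY_TRAMPOLINE_FRAME = re.compile(
    r"^\s*[0-9A-Fa-f]+ py::.+ \(/tmp/perf-\d+\.map\)\s*$"
)


def looks_like_python_perf(text, max_lines=10000):
    """Return True if perf script output contains at least one py:: frame.

    Only the first `max_lines` lines are inspected so that very large
    profiles do not need to be scanned in full.
    """
    for index, line in enumerate(text.splitlines()):
        if index >= max_lines:
            break
        if PY_TRAMPOLINE_FRAME.match(line):
            return True
    return False


python_profile = (
    "python 1810767/1810767 2023391.719579:     363492 cycles:P:\n"
    "        560e73bc7e55 PyObject_Vectorcall+0x35 (/tmp/Python-3.12.2/python)\n"
    "        7f90b4a3d775 py::<module>:/tmp/test/top/simple_table.py+0x6 (/tmp/perf-1810767.map)\n"
)
native_profile = (
    "python 1810767/1810767 2023391.719579:     363492 cycles:P:\n"
    "        560e73cd040c Py_BytesMain+0x3c (/tmp/Python-3.12.2/python)\n"
)
detected = looks_like_python_perf(python_profile)
```

A profile that passes this check could then be routed through the pre-filtering or a dedicated parsing function, while other perf profiles keep the current path.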

Author

jendap commented Mar 9, 2024

@jrmuizel nice, although it may need some more work. Your branch reported "rote ... Samples: 1452" and the file was 20 kB. But look at the profile: https://share.firefox.dev/49KblMm

Py-spy may be cute, but perf is unrivaled (on Linux).

I will look into the bin directory. I may send a PR (in a few weeks).

BTW: Are there any plans to make the profiler UI more generic (less Firefox specific)?
