Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

string_processing: Performance regression #2314

Open
JelleZijlstra opened this issue Jun 7, 2021 · 5 comments
Open

string_processing: Performance regression #2314

JelleZijlstra opened this issue Jun 7, 2021 · 5 comments
Labels
C: performance Black is too slow. Or too fast. C: preview style Issues with the preview and unstable style. Add the name of the responsible feature in the title. F: strings Related to our handling of strings

Comments

@JelleZijlstra
Copy link
Collaborator

% time black --diff ~/some/file.py
All done! ✨ 🍰 ✨
1 file would be left unchanged.
black --diff ~/some/file.py  17.78s user 0.18s system 99% cpu 18.067 total
% time black --experimental-string-processing --diff ~/some/file.py
< many changes snipped >
would reformat /Users/jelle/some/file.py
All done! ✨ 🍰 ✨
1 file would be reformatted.
black --experimental-string-processing --diff   389.93s user 3.25s system 99% cpu 6:35.03 total

I can't share the file since it's private code, but it's 14000 lines of a single list display.

@JelleZijlstra JelleZijlstra added C: performance Black is too slow. Or too fast. F: strings Related to our handling of strings labels Jun 7, 2021
@JelleZijlstra JelleZijlstra self-assigned this Jun 7, 2021
@JelleZijlstra
Copy link
Collaborator Author

Assigning to myself to do profiling to figure out the cause

@JelleZijlstra
Copy link
Collaborator Author

Here's some profiling output on the latest release:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000 1161.996 1161.996 /home/jelle/ans/venv/lib/python3.6/site-packages/black/__init__.py:720(format_file_in_place)

        1    0.034    0.034 1161.996 1161.996 /home/jelle/ans/venv/lib/python3.6/site-packages/black/__init__.py:820(format_file_contents)

        2    0.095    0.047 1150.835  575.418 /home/jelle/ans/venv/lib/python3.6/site-packages/black/__init__.py:850(format_str)

1577774/28514    4.211    0.000 1092.089    0.038 /home/jelle/ans/venv/lib/python3.6/site-packages/black/linegen.py:296(transform_line)

105815/46    0.483    0.000 1091.390   23.726 /home/jelle/ans/venv/lib/python3.6/site-packages/black/linegen.py:947(run_transformer)

37028/123    0.998    0.000 1070.323    8.702 {method 'extend' of 'list' objects}

    88012    0.519    0.000  958.530    0.011 /home/jelle/ans/venv/lib/python3.6/site-packages/black/trans.py:126(__call__)

  2761738   48.296    0.000  942.670    0.000 /home/jelle/ans/venv/lib/python3.6/site-packages/black/lines.py:535(append_leaves)

 10635526   92.206    0.000  737.378    0.000 /home/jelle/ans/venv/lib/python3.6/site-packages/black/lines.py:48(append)

      244    0.142    0.001  624.820    2.561 /home/jelle/ans/venv/lib/python3.6/site-packages/black/trans.py:683(do_transform)

       92    0.001    0.000  321.532    3.495 /home/jelle/ans/venv/lib/python3.6/site-packages/black/trans.py:284(do_transform)

       46    3.479    0.076  311.581    6.773 /home/jelle/ans/venv/lib/python3.6/site-packages/black/trans.py:347(_merge_string_group)

  8858176   32.683    0.000  246.932    0.000 /home/jelle/ans/venv/lib/python3.6/site-packages/black/nodes.py:173(whitespace)

 10608999    9.382    0.000  193.363    0.000 /home/jelle/ans/venv/lib/python3.6/site-packages/black/lines.py:252(has_magic_trailing_comma)

 11556672   26.623    0.000  188.826    0.000 /home/jelle/ans/venv/lib/python3.6/site-packages/blib2to3/pytree.py:205(prev_sibling)

   128104  183.214    0.001  183.214    0.001 /home/jelle/ans/venv/lib/python3.6/site-packages/black/nodes.py:584(is_one_tuple_between)

  7546763   90.481    0.000  156.591    0.000 /home/jelle/ans/venv/lib/python3.6/site-packages/blib2to3/pytree.py:368(update_sibling_maps)

  8668937   22.493    0.000  134.083    0.000 /home/jelle/ans/venv/lib/python3.6/site-packages/black/nodes.py:482(replace_child)

  5376073   14.792    0.000  132.370    0.000 /home/jelle/ans/venv/lib/python3.6/site-packages/black/nodes.py:406(preceding_leaf)

 10608999   52.253    0.000   87.302    0.000 /home/jelle/ans/venv/lib/python3.6/site-packages/black/brackets.py:68(mark)

140427171   81.462    0.000   81.462    0.000 {built-in method builtins.id}

The problem is mostly that append_leaves() takes forever, and we repeatedly do that, probably once for every string in the logical line. My file is a 14000-line logical line with a lot of strings in it, so that adds up.

Perhaps this can be improved by adding more preformatted=True to append_leaves() calls, but I tried in a few places and got test failures.

@ichard26
Copy link
Collaborator

ichard26 commented Jun 11, 2021

Because I'm bored, here's a test case generator for this (don't @ me if this turns out to be garbage :p):

import random

lines = []

lines.append("a = [")

for _ in range(300):
    size = random.randint(30, 120)
    string = "".join([chr(random.randint(65, 122)) if random.random() < 0.8 else " " for _ in range(size)])
    clean = string
    for no in r'#"\{}':
        clean = clean.replace(no, "")
    comma = "," if random.random() > 0.5 else ""
    lines.append(f'    "{clean}"{comma}')

lines.append("]")

with open("out.py", "w", encoding="utf8") as f:
    f.write("\n".join(lines))

@JelleZijlstra JelleZijlstra removed their assignment Jun 18, 2021
@JelleZijlstra JelleZijlstra added the C: preview style Issues with the preview and unstable style. Add the name of the responsible feature in the title. label Dec 20, 2022
@JelleZijlstra
Copy link
Collaborator Author

#3467 makes this better but I'm still seeing about a 4.5x perf regression (though without mypyc) on an internal file:

$ git checkout .; rm -rf ~/.cache/black/
Updated 57 paths from the index
$ time black --preview path/to/file.py
reformatted path/to/file.py

All done! ✨ 🍰 ✨
1 file reformatted.

real    1m33.226s
user    1m32.556s
sys     0m0.468s
$ git checkout .; rm -rf ~/.cache/black/
Updated 1 path from the index
$ time black path/to/file.py
All done! ✨ 🍰 ✨
1 file left unchanged.

real    0m19.135s
user    0m18.740s
sys     0m0.176s

@JelleZijlstra
Copy link
Collaborator Author

With mypyc-compiled on an M1 Mac it's much better after the initial reformat:

% time python -m black path/to/file.py
All done! ✨ 🍰 ✨
1 file left unchanged.
python -m black path/to/file.py  2.05s user 0.05s system 95% cpu 2.190 total
% time python -m black --preview path/to/file.py
reformatted path/to/file.py

All done! ✨ 🍰 ✨
1 file reformatted.
python -m black --preview path/to/file.py  10.23s user 0.15s system 98% cpu 10.489 total
% rm -rf ~/Library/Caches/black                                               
% time python -m black --preview path/to/file.py
All done! ✨ 🍰 ✨
1 file left unchanged.
python -m black --preview path/to/file.py  3.09s user 0.07s system 95% cpu 3.297 total
% rm -rf ~/Library/Caches/black                                               
% time python -m black --preview path/to/file.py
All done! ✨ 🍰 ✨
1 file left unchanged.
python -m black --preview path/to/file.py  2.99s user 0.07s system 95% cpu 3.206 total
% python -m black --version
python -m black, 22.12.1.dev24+g3feff21 (compiled: yes)
Python (CPython) 3.11.1

So from 2 s to 3 s, still a significant slowdown but not as concerning.

@JelleZijlstra JelleZijlstra changed the title Performance regression with experimental string processing string_processing: Performance regression Jan 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: performance Black is too slow. Or too fast. C: preview style Issues with the preview and unstable style. Add the name of the responsible feature in the title. F: strings Related to our handling of strings
Projects
None yet
Development

No branches or pull requests

2 participants