Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fixes: slow multicore performance on task runners #1077

Merged
merged 2 commits into from Dec 6, 2023

Conversation

Eein
Copy link
Contributor

@Eein Eein commented Oct 4, 2023

closes #1058

This PR removes the unnecessary worker code that was lowering performance on multicore systems.

I've used async streams instead of map/each on async await and updated the runner code to use an async stream keeping the exec.max_concurrent_check_runs to keep the same expected behaviour and make the code more modern.

This should resolve slow runs on multicore machines :)

Feel free to add any suggestions.

@rrrene
Copy link
Owner

rrrene commented Oct 5, 2023

Thanks for putting this together. 👍

What did you do to ensure that this change does lead to a slow down in other OTP/Elixir combinations than the ones mentioned in the linked issue? i.e. how do we make sure that this change does not improve performance for some setups while worsening it for others?

@Eein
Copy link
Contributor Author

Eein commented Oct 5, 2023

I'd be happy to generate some type of benchmark if that helps?

For example, a script that generates a large amount of modules, then we can run each elixir combination against those projects against said set of modules.

In other case:

this PR:
image

other PR that was just merged in:
image

It doesn't seem like theres a massive difference to compare to in CI (limited cores) - I'd be happy to try and run these tests in varying environments as well if that helps smoke test this PR.

@Eein
Copy link
Contributor Author

Eein commented Oct 6, 2023

master is on the credo codebase (d1d76ea)
large is a project with 6266 files (26138 mods/funcs)

All tests run on a Radeon 5900x CPU (12c/24t) on Linux Kernel 6.5.3

Credo codebase took 0.1s to load, large 0.9s in all cases
following times are hand tested full analysis times (including load/running)

I could not install erlang <= 23 - guessing due to crypto changes?

Version master master large PR PR large
elixir-1.15.6-otp26.1.1 1.6s 11.0s 1.5s 8.4s
elixir-1.15.6-otp25.3.2.6 1.6s 11.3s 1.5s 8.5s
elixir-1.15.6-otp24.3.4.13 1.7s 13.5s 1.6s 10.6s
elixir-1.14.5-otp25.3.2.6 1.5s 11.2s 1.4s 8.5s
elixir-1.14.5-24.3.4.13 1.8s 13.6s 1.5s 10.1s
elixir-1.14.5-otp23 x x x x
elixir-1.13.4-otp25.3.2.6 1.6s 10.6s 1.4s 8.0s
elixir-1.13.4-24.3.4.13 1.8s 12.1s 1.6s 10.4s
elixir-1.13.4-otp23 x x x x
elixir-1.12.2-24.3.4.13 1.7s 12.9s 1.6s 10.6s
elixir-1.12.2-otp23 x x x x
elixir-1.12.2-otp22 x x x x
elixir-1.11.4-24.3.4.13 1.9s 12.4s 1.6s 10.4s
elixir-1.11.4-otp23 x x x x
elixir-1.11.4-otp22 x x x x
elixir-1.11.4-otp21 x x x x

Note: This does not include single/split-core performance common on ci systems.

In CI for 25.3/1.15.6:
This PR in smoke test - Analysis took 6.2 seconds (0.4s to load, 5.8s running 55 checks on 430 files)
master in smoke test - Analysis took 7.9 seconds (0.5s to load, 7.4s running 55 checks on 431 files)

After rereading my initial issues in the issue i created, it does seem like whatever performance regression that existed in 1.15 or otp 26 may have been fixed prior to me running this test...

Edit:
After doing some extra research, the actual issue with the ticket was related to a specific OTP Version.. Ran some additional tests:

elixir 1.15.3 - otp 26.0 - Problem exists
elixir 1.15.3 - otp 26.0.1 - Problem exists
elixir 1.15.3 - otp 26.0.2 - Problem exists
elixir 1.15.3 - otp 26.1 - FIXED

This branch also does fix the issue in 26.0-26.0.2
Master: Gave up after 5 minutes or so
PR: Analysis took 11.4 seconds (0.9s to load, 10.4s running 55 checks on 6266 files)
Potentially related to: erlang/otp#7595

@Eein
Copy link
Contributor Author

Eein commented Oct 6, 2023

It may be also worth removing exec.max_concurrent_check_runs completely in this PR, as async_await takes care of this for us:

:max_concurrency - sets the maximum number of tasks to run at the same time. Defaults to System.schedulers_online/0

See: https://hexdocs.pm/elixir/1.15.0/Task.html#async_stream/5-options

@justincy
Copy link

justincy commented Nov 6, 2023

I'm still seeing significant slowness with OTP 26.1

@rrrene rrrene merged commit 925923a into rrrene:master Dec 6, 2023
10 checks passed
@rrrene
Copy link
Owner

rrrene commented Dec 6, 2023

@Eein Will publish an RC with some other patches shortly 👍

@ulissesalmeida
Copy link

@Eein thank you for your effort and complete report ❤️

@rrrene
Copy link
Owner

rrrene commented Dec 7, 2023

@Eein @justincy @ulissesalmeida This is published as part of v1.7.2-rc.2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Slow credo runs since 1.15.x
4 participants