Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Try to parallelize linting with amphp #4838

Closed
wants to merge 11 commits into from

Conversation

staabm
Copy link
Contributor

@staabm staabm commented Feb 23, 2020

trying to work on some parallelization, so linting could happen on several CPUs

@staabm staabm changed the base branch from 2.16 to master February 23, 2020 13:36
@staabm staabm changed the title Try to parallelize linting with amphp WIP: Try to parallelize linting with amphp Feb 23, 2020
@staabm
Copy link
Contributor Author

staabm commented Feb 23, 2020

@kelunik @trowski could someone of you give me a hand on how this could work.

it seems to work more or less, but I am not sure it works in parallel as it should be.
also I am seeing the following errors on shutdown:

$ time php php-cs-fixer fix ../redaxo/redaxo/src/core/
Loaded config default from "C:\xampp7.1.4\htdocs\PHP-CS-Fixer\.php_cs.dist".
Paths from configuration file have been overridden by paths provided as command arguments.
string(30) "PhpCsFixer\Linter\PooledLinter"

Fixed all files in 35.574 seconds, 16.000 MB memory used
PHP Fatal error:  Uncaught ErrorException: proc_terminate(): supplied resource is not a valid process resource in C:\xampp7.1.4\htdocs\PHP-CS-Fixer\vendor\amphp\process\lib\Internal\Windows\Runner.php:154
Stack trace:
#0 [internal function]: {closure}(2, 'proc_terminate(...', 'C:\\xampp7.1.4\\h...', 154, Array)
#1 C:\xampp7.1.4\htdocs\PHP-CS-Fixer\vendor\amphp\process\lib\Internal\Windows\Runner.php(154): proc_terminate(Resource id #634)
#2 C:\xampp7.1.4\htdocs\PHP-CS-Fixer\vendor\amphp\process\lib\Process.php(141): Amp\Process\Internal\Windows\Runner->kill(Object(Amp\Process\Internal\Windows\Handle))
#3 C:\xampp7.1.4\htdocs\PHP-CS-Fixer\vendor\amphp\parallel\lib\Context\Process.php(368): Amp\Process\Process->kill()
#4 C:\xampp7.1.4\htdocs\PHP-CS-Fixer\vendor\amphp\parallel\lib\Worker\Internal\WorkerProcess.php(57): Amp\Parallel\Context\Process->kill()
#5 C:\xampp7.1.4\htdocs\PHP-CS-Fixer\vendor\amphp\parallel\lib\Worker\TaskWorker.php(60): Amp\Parallel\Worker\Internal\WorkerProcess->kill()
#6 [internal function]: Amp\Parallel in C:\xampp7.1.4\htdocs\PHP-CS-Fixer\vendor\amphp\process\lib\Internal\Windows\Runner.php on line 154

Fatal error: Uncaught ErrorException: proc_terminate(): supplied resource is not a valid process resource in C:\xampp7.1.4\htdocs\PHP-CS-Fixer\vendor\amphp\process\lib\Internal\Windows\Runner.php:154
Stack trace:
#0 [internal function]: {closure}(2, 'proc_terminate(...', 'C:\\xampp7.1.4\\h...', 154, Array)
#1 C:\xampp7.1.4\htdocs\PHP-CS-Fixer\vendor\amphp\process\lib\Internal\Windows\Runner.php(154): proc_terminate(Resource id #634)
#2 C:\xampp7.1.4\htdocs\PHP-CS-Fixer\vendor\amphp\process\lib\Process.php(141): Amp\Process\Internal\Windows\Runner->kill(Object(Amp\Process\Internal\Windows\Handle))
#3 C:\xampp7.1.4\htdocs\PHP-CS-Fixer\vendor\amphp\parallel\lib\Context\Process.php(368): Amp\Process\Process->kill()
#4 C:\xampp7.1.4\htdocs\PHP-CS-Fixer\vendor\amphp\parallel\lib\Worker\Internal\WorkerProcess.php(57): Amp\Parallel\Context\Process->kill()
#5 C:\xampp7.1.4\htdocs\PHP-CS-Fixer\vendor\amphp\parallel\lib\Worker\TaskWorker.php(60): Amp\Parallel\Worker\Internal\WorkerProcess->kill()
#6 [internal function]: Amp\Parallel in C:\xampp7.1.4\htdocs\PHP-CS-Fixer\vendor\amphp\process\lib\Internal\Windows\Runner.php on line 154

real    0m44.422s
user    0m0.015s
sys     0m0.030s

do you have any hints for me?

@@ -14,9 +14,10 @@
}
],
"require": {
"php": "^5.6 || ^7.0",
"php": "^7.0",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will work on how the compat with old php can/should look like, after I get it working on current php versions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to see #4810.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't want PHP 5, then why do we even need a parallel linter? We can use the tokenising linter on PHP 7?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we even need a parallel linter? We can use the tokenising linter on PHP 7?

I had the theory that tokenized linting in parallel process is faster then doing it in a single process

@staabm
Copy link
Contributor Author

staabm commented Feb 23, 2020

it looks like the above reported errors are gone.

it seems the linter does no longer report errors, when invalid files are checked though.

a collegue told me he is getting a "too many open files" error when running the fixer on macOS.

@staabm
Copy link
Contributor Author

staabm commented Feb 23, 2020

on the current state of this PR I am getting

image


public function __construct()
{
$this->processPool = new DefaultPool();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would allow a Amp\Parallel\Worker\Pool instance to be passed via the constructor. If one isn't passed, use the one returned from Amp\Parallel\Worker\pool().

@trowski
Copy link
Contributor

trowski commented Feb 23, 2020

a collegue told me he is getting a "too many open files" error when running the fixer on macOS.

This can happen if ulimit is set too low for the number of processes and open files being created. The default on macOS is a bit low and can be increased using ulimit -n 8192, for example. However, since this should be portable, I would reduce the number of workers the pool creates. The default is 32, try reducing it to 16 or 8.

on the current state of this PR I am getting [image]

On a quick glance, I'm assuming some implementation of LintingResultInterface is returning a result that contains a Closure that cannot be serialized. Looks like you may need to inspect what is returned and convert it to another implementor of LintingResultInterface that can be serialized.

it seems the linter does no longer report errors, when invalid files are checked though.

Seems the errors are reported as exceptions that likely cannot be serialized, such as TokenizerLintingResult. Another LintingResultInterface implementation that reports it in a serializable way should take care of that, shouldn't be too complicated. Look at ContextPanicError in the current release of amphp/parallel for an example of how exception serialization is handled in parallel.

@julienfalque
Copy link
Member

Just wondering, wouldn't it be easier to make PHP CS Fixer more friendly with multiple parallel instances so that it can be used with e.g. GNU Parallel? Maybe it already is but I never tried it myself.

*/
public function run(Environment $environment) {
$linter = new TokenizerLinter();
return $linter->lintFile($this->path);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return $linter->lintFile($this->path);
return $linter->lintFile($this->path)->check();

*/
public function run(Environment $environment) {
$linter = new TokenizerLinter();
return $linter->lintSource($this->source);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return $linter->lintSource($this->source);
return $linter->lintSource($this->source)->check();

private $promise;

/**
* @param Promise<TokenizerLintingResult> $promise
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param Promise<TokenizerLintingResult> $promise
* @param Promise $promise

/**
* @var TokenizerLintingResult
*/
$result = \Amp\Promise\wait($this->promise);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
$result = \Amp\Promise\wait($this->promise);
// will throw on error
\Amp\Promise\wait($this->promise);

/**
* {@inheritdoc}
*/
public function check()
Copy link
Contributor Author

@staabm staabm Feb 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should return :void

{
if (null === $this->isSuccessful) {
/**
* @var TokenizerLintingResult
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @var TokenizerLintingResult
* @var bool

composer.json Outdated Show resolved Hide resolved
its not easy to handle a exception thrown within a task on the caller site.
therefore we transform it to a string a re-create the exception later on.
@staabm
Copy link
Contributor Author

staabm commented Feb 25, 2020

As far as I can tell it seems to work now. After I changed the tasks to no longer throw Exception accross process boundaries it works.

actually it got not faster on my 2 core workstation.. need to test again on a machine with more cpu cores.

with this patch:

$ php php-cs-fixer fix ../redaxo/redaxo/src/core/
Loaded config default from "C:\xampp7.1.4\htdocs\PHP-CS-Fixer\.php_cs.dist".
Paths from configuration file have been overridden by paths provided as command
arguments.
string(30) "PhpCsFixer\Linter\PooledLinter"

Fixed all files in 32.071 seconds, 16.000 MB memory used

Files that were not fixed due to errors reported during linting before fixing:
   1) C:\xampp7.1.4\htdocs\redaxo\redaxo\src\core\lib\list.php

without this patch:

$ php php-cs-fixer fix ../redaxo/redaxo/src/core/
Loaded config default from "C:\xampp7.1.4\htdocs\PHP-CS-Fixer\.php_cs.dist".
Paths from configuration file have been overridden by paths provided as command
arguments.
string(33) "PhpCsFixer\Linter\TokenizerLinter"

Fixed all files in 29.905 seconds, 14.000 MB memory used

Files that were not fixed due to errors reported during linting before fixing:
   1) C:\xampp7.1.4\htdocs\redaxo\redaxo\src\core\lib\list.php

@keradus
Copy link
Member

keradus commented Mar 26, 2020

@staabm , you won't see an improvement when parallelizing TokenizerLinter ( ref https://github.com/FriendsOfPHP/PHP-CS-Fixer/pull/4838/files#r386713529 )

you can see improvement If you would parallelize fixing (one process/thread fixing one file with all rules, other process/thread fixing another files)

@staabm
Copy link
Contributor Author

staabm commented Mar 31, 2020

you won't see an improvement when parallelizing TokenizerLinter ( ref #4838 (files)

the comment referenced does not mention why parallel linting cannot be faster?

you can see improvement If you would parallelize fixing (one process/thread fixing one file with all rules, other process/thread fixing another files)

I tried this before.. the main problem we face is, that we cannot use Exception objects in IPC messages. that means that we would need to do the exception handling in the workers and concurrently write into the cacheManager (or alternatively use a separate "error-object" which supports IPC but would then dont include a proper stacktrace etc).

@staabm
Copy link
Contributor Author

staabm commented Jan 14, 2021

Superceeded by #5390

@staabm staabm closed this Jan 14, 2021
@staabm staabm deleted the amp branch January 14, 2021 19:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants