Callback as child processes #97

Open
ClosetGeek-Git opened this issue Aug 3, 2022 · 8 comments

Comments

@ClosetGeek-Git

ClosetGeek-Git commented Aug 3, 2022

Swoole's process abstraction uses a callback rather than calling exec, and then provides an exec method that can be used within the callback for executing commands if that's desired within the child process. See https://openswoole.com/docs/modules/swoole-process-construct . This gives the option of executing PHP code within the child process.

Every time a child process is created within PHP you fork the entire PHP instance, extensions and all. Yet to run any PHP code within the ReactPHP child you need to call the php/php-cli command, which then creates yet another PHP instance, this one without access to any of the original parent's resources. This is both inefficient and restrictive. Keep in mind that if the intent is to purposefully restrict the spawned PHP environment, php/php-cli can still be called via the exec method, as can other procedures that further reduce the privileges of that process (if the parent process has the privileges to do so).
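To make the idea concrete, here is a minimal sketch of running a callback in a forked child. It is not part of ReactPHP or Swoole: `runInChild()` is a hypothetical helper, and it assumes a Unix-like system, the CLI SAPI, and the pcntl extension.

```php
<?php

/**
 * Hypothetical helper: run $callback in a forked child process and
 * return the child's exit code to the parent. Assumes ext-pcntl.
 */
function runInChild(callable $callback): int
{
    if (!function_exists('pcntl_fork')) {
        throw new RuntimeException('The pcntl extension is required');
    }

    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('fork failed');
    }

    if ($pid === 0) {
        // Child: a copy of the parent, so loaded code and variables are already there.
        $code = (int) $callback();
        exit($code); // shutdown functions and destructors still run in the child
    }

    // Parent: block until the child finishes, then read its exit status.
    pcntl_waitpid($pid, $status);

    return pcntl_wifexited($status) ? pcntl_wexitstatus($status) : -1;
}

if (function_exists('pcntl_fork')) {
    $config = ['answer' => 42]; // state created in the parent before forking

    $exitCode = runInChild(function () use ($config): int {
        // No second PHP start-up: $config is visible here without any serialization.
        return $config['answer'] === 42 ? 0 : 1;
    });

    echo "child exited with {$exitCode}\n";
}
```

The blocking `pcntl_waitpid()` call keeps the sketch short. Inside an event loop it would stall everything else, so a real integration would presumably reap children without blocking, for example with the `WNOHANG` option.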

@ClosetGeek-Git
Author

This would be greatly beneficial. ReactPHP's event loop can then be used as a core reactor much like how Swoole functions, but in PHP userspace (see here https://github.com/swoole/swoole-src/tree/master/src/reactor)

@ClosetGeek-Git
Author

Their process object (https://github.com/swoole/swoole-src/blob/master/src/server/process.cc) is passed "server", which is really just needed for integrating access to the eventloop within the parent.

@ClosetGeek-Git
Author

ClosetGeek-Git commented Aug 3, 2022

If you look here (https://github.com/ClosetMonkey/ReactPHPZMQ/blob/main/parallel_ratchet.php), I've done something similar to create a php parallel based thread pool for handling Ratchet websocket messages, but this is not ideal given the experimental nature of the parallel extension. The same thing can be done fairly easily with child processes, though, to create stable solutions.

@SimonFrings
Member

Hey @ClosetMonkey, thanks for bringing this up 👍

Sounds like you're suggesting that we implement some kind of fork support. This sounds like a good idea and we're always happy about suggestions or PRs! It would also mean that we'd have to add extensions to this package to make this work. In ReactPHP we try to avoid dependencies if possible, which is why our current implementation works the way it does. For most use cases this should work just fine, but I am interested in your ideas.

If you already have some implementation for this I am happy to take a look.

You can also take a peek at some answers written by @clue in event-loop#184 for more information on this topic and what others have to say about it.

There's also the clue/reactphp-pq project, maybe this fits into your use case.

Let me know what you think about this.

@ClosetGeek-Git
Author

Thank you for your reply @SimonFrings . My input can be scattered at times. I didn't look deep enough to see that child_process doesn't fork. I've been away from userspace PHP for a while to be honest.

Are you familiar with PHP's start up process and lifecycle? Here's a quick read if not.

PHP starts up in two phases, module init and then request init. The majority of PHP's startup time and startup CPU overhead is within the module init phase. The request init phase can then be called multiple times, which is the basic architecture of most servers: call module initialization once, then request init in a loop for each request, in either threads or forked processes (with minit called before forking). This allows requests to be processed more quickly and, more importantly when it comes to workers, it allows for persistent resources that survive across multiple requests, such as database connection pools. This is also how php-cgi is designed to work, typically by handling FastCGI requests when in production. However, when passed a file, php-cgi will usually execute the file and then fully shut down, going through the whole PHP lifecycle in one go, similar to php-cli. The -T arg for php-cgi instead causes it to execute a file T times, each time within a new request, meaning without having to fully reinitialize PHP and without losing persistent resources. T doesn't have a hard-set max, so hypothetically it is the system's INT_MAX; if INT_MAX executions took 1 microsecond each, php-cgi would be able to handle these 1us requests for slightly under 30,000 decades on a 64-bit system. 1us isn't a baseline for anything; it can just as easily run indefinitely. The -q flag is used to prevent php-cgi from spitting out HTTP headers (since it's not running in the context of an HTTP request handler).
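The "slightly under 30,000 decades" figure can be checked with a few lines of arithmetic. This sketch assumes a 64-bit PHP build, where `PHP_INT_MAX` is the 64-bit signed maximum, and it also prints the 32-bit signed case for comparison. It does not verify which integer width php-cgi actually uses for the -T counter, and `iterationsToYears()` is a hypothetical helper name.

```php
<?php

// Convert a number of iterations (each taking $microsecondsEach) into years.
function iterationsToYears(float $iterations, float $microsecondsEach = 1.0): float
{
    $seconds = $iterations * $microsecondsEach / 1e6;

    return $seconds / (365.25 * 24 * 3600); // Julian year of 365.25 days
}

// 64-bit signed maximum (assumes a 64-bit PHP build).
$years64 = iterationsToYears((float) PHP_INT_MAX);

// 32-bit signed maximum, expressed in minutes instead of years.
$minutes32 = (2 ** 31 - 1) / 1e6 / 60;

printf("64-bit limit at 1us each: about %d decades\n", $years64 / 10);
printf("32-bit limit at 1us each: about %.1f minutes\n", $minutes32);
```

On a 64-bit build the first line comes out at a little over 29,000 decades, which matches the estimate above. If the counter were only a 32-bit integer, the same 1us iterations would run out after roughly 36 minutes.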

I think php-cgi should be of interest for ReactPHP because, as a high-performance IO framework, ReactPHP can communicate with php-cgi the way it is meant to be used in production, over FastCGI. I don't mean in a traditional HTTP/CGI context, but as a means to create and control long-running worker/child processes with less overhead and with persistent, shareable resources. It's also cross-platform.

Alternatively, php-cli has an undocumented argument --repeat that is similar to php-cgi's -T arg. After a few days of searching I found that it is used heavily in php's test suite (such as make test after compiling PHP). However it didn't surface until PHP7. With the lack of documentation there's no telling if/how it may change.

This all leads me to a few questions, but one is regarding your statement about dependencies. Does this include PHP userland dependencies that can be resolved with composer?

@ClosetGeek-Git
Author

One that I'm looking at is opis/closure. If both the calling instance and php-cgi (or php-cli) have the same includes, and if the parent knows the state of the child (fork, cli instance, etc.), it's conceivable to pass callables into the child.

@ClosetGeek-Git
Author

The -T argument was added to php-cgi in PHP-5.3 in 2007 and doesn't seem to be going anywhere php/php-src@6f7b738

The --repeat argument for php-cli was added in php-8.1.0RC1 for testing JIT and noted in the source that it might be changed php/php-src@1b3b430

I wrote up a quick PHP extension here to demonstrate the effect of the -T and --repeat arguments. It prints a notice when major initialization and teardown procedures are completed. It also maintains a counter as a persistent resource across requests.
Note that php_request_shutdown(void *dummy) here is where the entire PHP userspace is torn down and freed, not just garbage collected. This is the point where server SAPIs release everything created by PHP userspace when PHP is done processing an HTTP request, although the function itself is not bound to any form of HTTP context, as can be seen in php-cli.

This can still be tested without installing an extension. The fact that resources persist across iterations can be tested with standard PHP streams.

Create a basic tcp server using ReactPHP
react_server_example.php

<?php

require __DIR__ . '/vendor/autoload.php';

$socket = new React\Socket\SocketServer('127.0.0.1:8080');

$socket->on('connection', function (React\Socket\ConnectionInterface $connection) {
    echo "New connection \n";

    $connection->on('data', function ($data) use ($connection) {
        echo "Got data: {$data} \n";
    });

    $connection->on('close', function () {
        echo "Connection Closed \n";
    });

    $connection->on('error', function (Exception $e) {
        echo 'error: ' . $e->getMessage() . "\n";
    });
});

Then run the following example using php-cgi -T 10 pfsockopen_example.php
pfsockopen_example.php

<?php

$psock = pfsockopen("127.0.0.1", 8080, $errno, $errstr, 30);

if (!$psock) {
    echo "{$errstr} ({$errno}) \n";
} else {
    fwrite($psock, "YO!");
}

This results in the following output from the server, demonstrating one connection shared across 10 individual script executions:

New connection 
Got data: YO! 
Got data: YO! 
Got data: YO! 
Got data: YO! 
Got data: YO! 
Got data: YO! 
Got data: YO! 
Got data: YO! 
Got data: YO! 
Got data: YO! 
Connection Closed 

As for performance, which can also be seen without an extension: running a basic script such as echo "Hi"; 50,000 times using php-cgi -T 50000 echo.php takes 1403 milliseconds (1.40 seconds) on my machine. Running the same script 50,000 times without the -T or --repeat arguments, by calling php-cli in a loop, took much, much longer.
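For anyone who wants to reproduce the comparison, a rough timing harness could look like the sketch below. The helper names `timeCommandMs()` and `timeRepeatedMs()` are hypothetical, `hrtime()` requires PHP 7.3 or later, and the numbers will of course vary by machine.

```php
<?php

// Run a shell command once and return the wall-clock time in milliseconds.
function timeCommandMs(string $command): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    exec($command . ' 2>&1', $output, $exitCode);
    $elapsedMs = (hrtime(true) - $start) / 1e6;

    if ($exitCode !== 0) {
        throw new RuntimeException("Command failed ({$exitCode}): {$command}");
    }

    return $elapsedMs;
}

// Run the same command $times times in a row, one fresh process per run,
// and return the summed wall-clock time in milliseconds.
function timeRepeatedMs(string $command, int $times): float
{
    $total = 0.0;
    for ($i = 0; $i < $times; $i++) {
        $total += timeCommandMs($command);
    }

    return $total;
}
```

Assuming php-cgi and php are on the PATH and echo.php is in the current directory, `timeCommandMs('php-cgi -q -T 50000 echo.php')` measures the single long-lived process, while `timeRepeatedMs('php echo.php', 50000)` measures a fresh start-up per run. The second call can take a long time, so a smaller count may be enough to see the difference.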

@SimonFrings
Member

Nice read, I can tell you put quite some time into this topic 👍

This all leads me to a few questions, but one is regarding your statement about dependencies. Does this include PHP userland dependencies that can be resolved with composer?

I meant that we only want to include extensions if they're really necessary, to avoid overloading the whole project (that's not the only reason, though).


2 participants