Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert CURL resources to objects #5402

Closed
wants to merge 12 commits into from

Conversation

kocsismate
Copy link
Member

@kocsismate kocsismate commented Apr 16, 2020

This is a train-wreck yet... So I'd like to ask for a preliminary feedback before I start converting all the functions to use object.

@kocsismate kocsismate changed the title Convert CURL resources to objects [WIP] Convert CURL resources to objects Apr 16, 2020
@kocsismate
Copy link
Member Author

kocsismate commented Apr 16, 2020

@cmb69 Could you please have a look? Unfortunately, I don't understand a couple of things:

  • do I have to define the get_gc handler?
  • shouldn't I free the class entries in PHP_MSHUTDOWN()? I couldn't really find such precedents in other extension so my guess is that it's taken care by the engine.
  • should I move initialization logic from interface.c to somewhere else (e.g. a new php_curl.c file)?
  • I saw that Nikita made the xml_parser structure an object during the migration of ext/xml. Should (can) I do the same with the php_curl struct?

@cmb69
Copy link
Contributor

cmb69 commented Apr 16, 2020

Thanks for working on this!

  • I think a get_gc handler is necessary, to make postfields available to the GC.

  • The engine takes care of releasing the class entries, to my knowledge.

  • interface.c is relatively large, and splitting it up may be desireable, but I think just moving the initialization and perhaps all the object stuff doesn't bring much. If we add a "object-oriented API" sometime, it would make sense to move the classes to separate files.

  • Having php_curl contained curl_object saves a heap allocation, and inlining the members yields better readable code, IMO.

@kocsismate
Copy link
Member Author

@cmb69 Thank you very much! I'll try to align the PR according to your suggestions :)

@cmb69 cmb69 added the Feature label Apr 17, 2020
@kocsismate kocsismate force-pushed the curl-resource branch 3 times, most recently from ff5eb09 to 33f4713 Compare April 19, 2020 15:55
Copy link
Member Author

@kocsismate kocsismate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR is in much better shape now, some tests even pass locally. However, for example the ext/curl/tests/bug48203_multi.phpt test hangs indefinitely. I haven't manage to find out the root cause, but I see that the 2nd consecutive curl_init() invocation makes it behave like that.

ext/curl/interface.c Show resolved Hide resolved
ext/curl/php_curl.h Show resolved Hide resolved
ext/curl/interface.c Show resolved Hide resolved
ext/curl/interface.c Outdated Show resolved Hide resolved
ext/curl/interface.c Outdated Show resolved Hide resolved
ext/curl/interface.c Outdated Show resolved Hide resolved
ext/curl/multi.c Show resolved Hide resolved
ext/curl/share.c Outdated Show resolved Hide resolved
ext/curl/share.c Outdated Show resolved Hide resolved
ext/curl/share.c Outdated Show resolved Hide resolved
ext/curl/share.c Outdated Show resolved Hide resolved
@nikic
Copy link
Member

nikic commented Apr 20, 2020

As a general note, we cannot assume that ch->cp is non-null (similar for share / multi), because the object might have been created while bypassing the constructor / curl_init. Though I'm not sure how exactly to do the bypassing if we have serialization denied. Does anyone know a trick for that?

@kocsismate kocsismate force-pushed the curl-resource branch 2 times, most recently from c2030b3 to abb4182 Compare April 21, 2020 22:35
Copy link
Member Author

@kocsismate kocsismate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I addressed most of the existing review comments (cloning support is underway). Unfortunately, I have no idea how to trick cp to be null. What should I do with it?

ext/curl/multi.c Outdated Show resolved Hide resolved
@kocsismate kocsismate changed the title [WIP] Convert CURL resources to objects Convert CURL resources to objects Apr 21, 2020
@kocsismate
Copy link
Member Author

kocsismate commented Apr 23, 2020

Now, only 3 curl tests fail locally.

ext/curl/interface.c Outdated Show resolved Hide resolved
ext/curl/interface.c Outdated Show resolved Hide resolved
ext/curl/interface.c Outdated Show resolved Hide resolved
ext/curl/interface.c Outdated Show resolved Hide resolved
@kocsismate kocsismate force-pushed the curl-resource branch 2 times, most recently from bdc7120 to 318c386 Compare April 27, 2020 09:16
@kocsismate kocsismate force-pushed the curl-resource branch 2 times, most recently from e56cb9a to 6a1841b Compare April 27, 2020 11:41
Copy link
Member

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't see anything obviously wrong in the curl multi implementation.

ext/curl/multi.c Outdated Show resolved Hide resolved
ext/curl/multi.c Show resolved Hide resolved
ext/curl/multi.c Outdated Show resolved Hide resolved
@nikic
Copy link
Member

nikic commented Apr 28, 2020

I've pushed some fixed for leaks and a segfault. However, I'm not sure the fix is ideal: https://github.com/php/php-src/pull/5402/files#diff-ab7a9164033bd55887d45d0a6cb837eeR542-R548

The problem is that the multi handle holds on to the easy handles, but the easy handles may end up being freed first. The "verify handle" loop then causes use after free. We can check detect this situation, but I'm still a bit uncertain about safety. In particular, wouldn't curl_multi_cleanup still try to access the easy handles? As I'm not getting valgrind errors anymore, something must protect against that, but I'm not sure what.

@nikic
Copy link
Member

nikic commented Apr 28, 2020

Hm, I guess that can't happen, because curl will automatically remove the easy handle from the multi handle if the easy handle is destroyed. So this should be safe.

I think it may make sense to make curl_multi_close() not a complete no-op though, and do a loop to remove all easy handles from the multi handle in there. If cycles are involved, one would not be able to get predictable cleanup order otherwise.

@kocsismate
Copy link
Member Author

@nikic Thank you very much for your efforts to fix this! I'll try to implement the suggested improvement for curl_multi_close() a bit later.

@kocsismate
Copy link
Member Author

kocsismate commented Apr 29, 2020

@nikic I've just added the for-loop, but I don't get why it's better than calling zend_llist_clean() directly? Or did I just misunderstand you?

Besides, undortunately, the test still fails for me with a segfault (and a memory leak). I ran ./run-tests.php -m and the .mem file shows the following (I'm copy-pasting it in the hope that it's useful):

--54603-- UNKNOWN fcntl 101!
--54603-- UNKNOWN fcntl 101! (repeated 2 times)
--54603-- UNKNOWN fcntl 101! (repeated 4 times)
--54603-- UNKNOWN fcntl 101! (repeated 8 times)
--54603-- UNKNOWN fcntl 101! (repeated 16 times)
--54603-- UNKNOWN fcntl 101! (repeated 32 times)
--54603-- UNKNOWN fcntl 101! (repeated 64 times)
--54603-- UNKNOWN fcntl 101! (repeated 128 times)
==54603== 
==54603== Process terminating with default action of signal 4 (SIGILL)
==54603==  Illegal opcode at address 0x1042A1A73
==54603==    at 0x1042A1A73: __pthread_init.cold.2 (in /usr/lib/system/libsystem_pthread.dylib)
==54603==    by 0x10137472D: libSystem_initializer (in /usr/lib/libSystem.B.dylib)
==54603==    by 0x100DB01E2: ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==54603==    by 0x100DB05ED: ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==54603==    by 0x100DAB00A: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==54603==    by 0x100DAAF75: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==54603==    by 0x100DA9013: ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==54603==    by 0x100DA90B3: ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) (in /usr/lib/dyld)
==54603==    by 0x100D9759F: dyld::initializeMainExecutable() (in /usr/lib/dyld)
==54603==    by 0x100D9CAF7: dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) (in /usr/lib/dyld)
==54603==    by 0x100D96226: dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) (in /usr/lib/dyld)
==54603==    by 0x100D96024: _dyld_start (in /usr/lib/dyld)

@nikic
Copy link
Member

nikic commented Apr 30, 2020

Looks like there's an assertion failure when running guzzle tests:

80) GuzzleHttp\Tests\PoolTest::testExecutesPendingWhenWaiting
php: /home/nikic/php-src/Zend/zend_types.h:1147: zend_gc_delref: Assertion `p->refcount > 0' failed.
Aborted

@nikic
Copy link
Member

nikic commented Jun 16, 2020

If I revert the last commit and apply this patch to guzzle, I get a clean test run under asan: https://gist.github.com/nikic/107010824a818d638998494925711b4f

@kocsismate
Copy link
Member Author

@nikic Thank you very much for the help! I have a good friend at Guzzle, I'll talk to him about the migration. :) What's the problem with the last commit?

P.S: I've run the regular tests locally, and the ext/curl/tests/bug76675.phpt test still fails. I was trying to hunt down the issue in the previous days, but I've not had success yet.

@kocsismate
Copy link
Member Author

kocsismate commented Jun 16, 2020

Ah, sorry, I thought that the last commit was yours. But in fact, it was mine. That said, I'll revert it.

@nikic
Copy link
Member

nikic commented Jun 16, 2020

P.S: I've run the regular tests locally, and the ext/curl/tests/bug76675.phpt test still fails. I was trying to hunt down the issue in the previous days, but I've not had success yet.

The valgrind log you posted doesn't really look related. Just to double check, it does not occur on master when running under valgrind?

@nikic
Copy link
Member

nikic commented Jun 16, 2020

I pushed a fix for the last commit, so no need to revert anymore.

@kocsismate
Copy link
Member Author

kocsismate commented Jun 16, 2020

it does not occur on master when running under valgrind?

@nikic Haha, you're right, it occurs :) If I knew this before that also it fails on master...

ext/curl/curl.stub.php Outdated Show resolved Hide resolved
UPGRADING Show resolved Hide resolved
UPGRADING Show resolved Hide resolved
ext/curl/interface.c Show resolved Hide resolved
ext/curl/interface.c Show resolved Hide resolved
ext/curl/multi.c Show resolved Hide resolved
@php-pulls php-pulls closed this in b516566 Jun 17, 2020
@kocsismate kocsismate deleted the curl-resource branch June 17, 2020 14:14
@carusogabriel
Copy link
Contributor

@kocsismate Mind to label these PRs with resources or something, the same way we are doing with stubs? That will help me organize some stuff for PHP 8.0 presentations

@kocsismate
Copy link
Member Author

@carusogabriel Unfortunately, I can't create new labels.

@gmponos
Copy link

gmponos commented Jun 17, 2020

btw is this merged.. ? I can see that it is, right? Not asking in bad way.. but doesn't this need an RFC? Maybe one exists but I can't find it here..

@kocsismate
Copy link
Member Author

@gmponos It's merged. See the comment from php-pulls (b516566).

This doesn't need an RFC because resources are a legacy data structure which should be eliminated. Since there is no resource type declaration, this change is mostly backward compatible (not counting the is_resource() checks and a few smaller things).

@twose
Copy link
Member

twose commented Jun 29, 2020

Hi @kocsismate @nikic
We have rewritten curl as an object to support the coroutine feature, and I think that some community feedback we collected can be shared with you here

  1. is_resource()
  2. (int) $curl
  3. (string) $curl (I didn’t see you talking about it)

The above problems have caused many community components down

Solving the backward compatibility problem of is_resource is the most difficult, I also want to know how you plan to solve it (It is not possible to do that at the extension layer, is_rousure is kind of inline)

I'm not fond of the idea (fix them) too, but I am obliged to admit the fact that BC break has a huge impact on the PHP community

@nikic
Copy link
Member

nikic commented Jun 29, 2020

@twose The situation for us is probably a bit different than for swoole. Swoole is at a bit of a disadvantage here, because most libraries will not be willing to make changes, even very simple changes, to make code work with swoole, if it works with normal PHP. On the other hand, they don't have much choice when it comes to PHP support :) Of the most important curl consumers, composer did not require updates, Symfony is already compatible with objects and Guzzle is being made compatible in guzzle/guzzle#2715. Of course, there is a long tail of less significant libraries. But at least based on the examples we have, the effort to be compatible with PHP 8 is quite minimal, as far as curl is concerned.

@GrahamCampbell
Copy link
Contributor

Note that that PR is somewhat in complete. All phpdoc saying "resource" will need to be careful checked. Because there are no resource types in the language, we won't have errors shown to us by runtime typing.

@nikic
Copy link
Member

nikic commented Jun 29, 2020

@GrahamCampbell Changes to phpdoc type hints are not relevant to this discussion, because that's something we cannot mitigate from the php-src side. You'll want to talk with the phpstan maintainer on how you can avoid false-positive warnings, or just disable checks.

@carusogabriel carusogabriel added this to the PHP 8.0 milestone Jun 29, 2020
@twose
Copy link
Member

twose commented Jun 29, 2020

@nikic You're right, PHP kernel changes will force the community to make changes, it's good for us too, Thank you for your hard work :)
BTW, I have never understood why there is a resource type before, I recently knew that it's a historical burden...

andypost added a commit to skilld-labs/xhprof that referenced this pull request Aug 20, 2020
In 8+ curl using objects instead of resources php/php-src#5402
SammyK added a commit to DataDog/dd-trace-php that referenced this pull request Nov 19, 2020
The curl extension API was upgraded from a resource to an object in php/php-src#5402.
SammyK added a commit to DataDog/dd-trace-php that referenced this pull request Nov 19, 2020
The curl extension API was upgraded from a resource to an object in php/php-src#5402.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants