Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Probable memory leak in PHP extension 1.37.0 #26085

Closed
lurkie opened this issue Apr 26, 2021 · 6 comments
Closed

Probable memory leak in PHP extension 1.37.0 #26085

lurkie opened this issue Apr 26, 2021 · 6 comments

Comments

@lurkie
Copy link

lurkie commented Apr 26, 2021

What version of gRPC and what language are you using?

gRPP version 1.37.0 for PHP 7.2.24

What operating system (Linux, Windows,...) and version?

Ubuntu 18.04
uname:
Linux $hostname 5.4.0-72-generic #80~18.04.1-Ubuntu SMP Mon Apr 12 23:26:25 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

What runtime / compiler are you using (e.g. python version or version of gcc)

PHP 7.2.24-0ubuntu0.18.04.7 (cli) (built: Oct 7 2020 15:24:25) ( NTS )
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0

Also tested on Ubuntu 20.04:
PHP 7.4.3 (cli) (built: Oct 6 2020 15:47:56) ( NTS )
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0

What did you do?

When querying GCP's Cloud Datastore using the gRPC PECL extension, it will return (in our case) 29k entities containing XML snippets, totaling for ~125 MB of XML data. We got error messages stating out-of-memory errors in our PHP application. Given the two files described below, I query GCP's Datastore. This example does depend on a preconfigured and pre-filled GCP Project though.

composer.json:

{
    "require": {
        "php": "^7.2",
        "ext-grpc": "*",
        "ext-protobuf": "*",
        "google/cloud-datastore": "^1.12"
    }
}

test.php:

<?php
require __DIR__ . '/vendor/autoload.php';
$keyFile = '...';
$kind = '...';
$parentId = '...';
ini_set('memory_limit', 250 * 1024 * 1024);

/**
 * @property string $xml_detail
 */
class TestModel extends Google\Cloud\Datastore\Entity{}

function dumpMemoryUsage(int $i): void {
    $n1 = memory_get_usage(true);
    $n2 = memory_get_peak_usage(true);
    printf(
        "<!-- Item #: %d, usage: %d MB, peak usage: %d MB -->\n",
        $i, intdiv($n1, 1024 ** 2), intdiv($n2, 1024 ** 2)
    );
}

function testMemoryLeak(): void {
    global $keyFile, $kind, $parentId;
    $client = new Google\Cloud\Datastore\DatastoreClient(['keyFilePath' => $keyFile]);
    $parentKey = $client->key($kind, $parentId);
    $query = $client->query()->kind($kind)->hasAncestor($parentKey);
    $entityIterator = $client->runQuery($query, ['className' => TestModel::class]);

    $i = 0;
    // Iterates over ~29k entities
    foreach ($entityIterator as $entity) {
        /** @var TestModel $entity */
        $i++;
        // xml_detail contains ~ 4600 bytes avg
//        echo $entity->xml_detail;
    }
    dumpMemoryUsage($i);
}
testMemoryLeak();

Script has been run using CLI.

What did you expect to see?

I expected to see reasonably memory usage.

What did you see instead?

Memory usage is very high, considering PHP output buffering in pretty small (even disabled for CLI) and no entities are being held in memory by this code. Using grpc pecl extension, memory usages is up to 204 MB:
<!-- Item #: 28702, usage: 204 MB, peak usage: 204 MB -->

Anything else we should know about your project / environment?

  • We use GCP Datastore using a hierarchy (parent/child entities). I did not try to query a kind without hierarchy.
  • GCP also uses PHP extension protobuf. It did not matter whether the protobuf extension was enabled or not.
  • I did not yet try to run with GRPC_VERBOSITY or GRPC_TRACE configured, as I'm not that knowledgeable in grpc
  • I did try to use the composer package grpc/grpc instead of the PHP extension. When using this, memory usage stays at a reasonably low level:
    <!-- Item #: 28702, usage: 14 MB, peak usage: 18 MB -->
  • I did search for existing issues, finding issues describing memory leaks for other programming languages. Not sure if they are related.
@stanley-cheung
Copy link
Contributor

Will look into this.

@HannahShiSFB
Copy link
Collaborator

It only happens when protobuf extension is enabled.

Here is the report of valgrind:

ZEND_DONT_UNLOAD_MODULES=1 USE_ZEND_ALLOC=0 valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes php test.php
...
==68953== 51,073,952 bytes in 39,579 blocks are definitely lost in loss record 147 of 147
==68953== at 0x483577F: malloc (vg_replace_malloc.c:299)
==68953== by 0x5E2448: __zend_malloc (in /usr/local/bin/php)
==68953== by 0x8F28842: zend_string_alloc (zend_string.h:133)
==68953== by 0x8F28842: zend_string_init (zend_string.h:155)
==68953== by 0x8F28842: Convert_UpbToPhp (convert.c:466)
==68953== by 0x8F2DC1A: zim_Message_readOneof (message.c:1014)
==68953== by 0x69066D: execute_ex (in /usr/local/bin/php)
==68953== by 0x5FCBA5: zend_call_function (in /usr/local/bin/php)
==68953== by 0x626EDC: zend_call_method (in /usr/local/bin/php)
==68953== by 0x6273DA: zend_user_it_move_forward (in /usr/local/bin/php)
==68953== by 0x680A20: ??? (in /usr/local/bin/php)
==68953== by 0x6896C0: execute_ex (in /usr/local/bin/php)
==68953== by 0x691152: zend_execute (in /usr/local/bin/php)
==68953== by 0x60B2A2: zend_execute_scripts (in /usr/local/bin/php)
==68953==
==68953== LEAK SUMMARY:
==68953== definitely lost: 53,389,824 bytes in 51,070 blocks
==68953== indirectly lost: 336 bytes in 5 blocks
==68953== possibly lost: 143,376 bytes in 23 blocks
==68953== still reachable: 1,211,224 bytes in 12,948 blocks
==68953== of which reachable via heuristic:
==68953== length64 : 669,766 bytes in 12,877 blocks
==68953== suppressed: 0 bytes in 0 blocks
==68953==
==68953== For counts of detected and suppressed errors, rerun with: -v
==68953== ERROR SUMMARY: 1748553 errors from 125 contexts (suppressed: 0 from 0)

It may be caused by
https://github.com/protocolbuffers/protobuf/blob/master/php/ext/google/protobuf/message.c#L1012-L1017
where Convert_UpbToPhp created a PHP object ret and RETURN_ZVAL(&ret, 1, 0); created another copy for return but didn't destroy the ret. The memory leak went away after changed it to RETURN_ZVAL(&ret, 0 /*copy*/, 1 /*dtor*/);

It appears to be the same as protocolbuffers/protobuf#8525 and haberman is working on it for a complete solution as there seems to be some other similar leaks as well.

@lurkie
Copy link
Author

lurkie commented May 12, 2021

I was quite sure protobuf did not matter... However, I did some tests again, and it seems to confirm that protobuf is indeed the culprit.

Memory usage (MB)

gRPC:
Protobuf: Disabled 1.37.0
Disabled 16 18
3.15.8 16 214
3.16.0 16 214

However, I kept also track of required time to retrieve and process all XML data. It seems enabling gRPC increases the required time, especially when protobuf is disabled. In that case, the CPU utilization of the PHP process also reaches 100%. This was a bit of a surprise to me.

Avg time (seconds).
Average time required on at least 3 runs using the script mentioned it the issue description:

gRPC:
Protobuf: Disabled 1.37.0
Disabled 37.09 121.65
3.15.8 37.39 49.86
3.16.0 36.02 43.95

Should I report this in a new issue?

@stanley-cheung
Copy link
Contributor

Yes it's expected - the protobuf extension is supposed to provide a much better performance boost. It's less efficient to use the protobuf composer package.

(Btw, the 2 tables above were slightly mis-aligned - it's a bit hard to read)

@haberman
Copy link
Contributor

The fixes in protocolbuffers/protobuf#8614 make the protobuf C extension Valgrind-clean (except for a few edge cases involving exceptions that I was not able to figure out).

This should be released in the next week or two as a patch release of 3.17.x.

@lurkie
Copy link
Author

lurkie commented May 14, 2021

@stanley-cheung

Yes it's expected - the protobuf extension is supposed to provide a much better performance boost.

I do indeed expect less performance when disabling the protobuf extension. I did not expect an increased performance when I disable both. This should be a bit easier to read:

  1. gRPC & protobuf enabled: 43.95s (for protobuf version 3.16.0)
  2. only gRPC enabled: 121.65s
  3. both disabled: 37.09s

(time required to fetch ~125 MB of XML data)

tjholm added a commit to nitrictech/php-sdk that referenced this issue Jun 11, 2021
Need to address memory leak: grpc/grpc#26085
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants