Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High memory usage of preg_match_all() in SQLParserUtils::collectPlaceholders() #4261

Closed
sgrossberndt opened this issue Sep 8, 2020 · 13 comments

Comments

@sgrossberndt
Copy link

Bug Report

When executing a lot of queries (6.000+ in my case) with one named parameter each this leads to a very high memory usage (250MB in my case in one request) due to the preg_match_all() statement in the SQLParserUtils::collectPlaceholders() method. When using positional parameters for the query instead this high memory usage does not happen.

Q A
BC Break no
Version 2.10.2

Current behaviour

How to reproduce

Product Version
PHP 7.2.33-1+0~20200807.47+debian9~1.gbpcb3068
PCRE Library 8.43 2019-02-23
CREATE TABLE tx_my_x_y_mm (
 uid_local int(10) unsigned NOT NULL DEFAULT '0',
 uid_foreign int(10) unsigned NOT NULL DEFAULT '0',
 PRIMARY KEY uid_local (uid_local,uid_foreign),
 KEY uid_foreign (uid_foreign)
);

CREATE TABLE tx_my_y (
 uid int(10) unsigned NOT NULL auto_increment,
 deleted tinyint(3) unsigned NOT NULL DEFAULT '0',
 PRIMARY KEY (uid)
);

Execute many queries with one named parameter and profile the memory usage

SELECT "tx_my_x_y_mm".*, "tx_my_y".*
FROM "tx_my_x_y_mm" "tx_my_x_y_mm"
LEFT JOIN "tx_my_y" "tx_my_y" ON "tx_my_x_y_mm"."uid_foreign" = "tx_my_y"."uid"
WHERE ("tx_my_x_y_mm"."uid_local" = :dcValue1) AND (("tx_my_y"."deleted" = 0) OR ("tx_my_y"."uid" IS NULL)) AND ("tx_my_y"."deleted" = 0)
ORDER BY "tx_my_x_y_mm"."sorting" ASC

Exceute many queries with one positional parameter and profile the memory usage

SELECT "tx_my_x_y_mm".*, "tx_my_y".*
FROM "tx_my_x_y_mm" "tx_my_x_y_mm"
LEFT JOIN "tx_my_y" "tx_my_y" ON "tx_my_x_y_mm"."uid_foreign" = "tx_my_y"."uid"
WHERE ("tx_my_x_y_mm"."uid_local" = ?) AND (("tx_my_y"."deleted" = 0) OR ("tx_my_y"."uid" IS NULL)) AND ("tx_my_y"."deleted" = 0)
ORDER BY "tx_my_x_y_mm"."sorting" ASC

Compare the memory usages and you see a huge difference due to the preg_match_all

Expected behaviour

Memory usage it much lower although named parameters are used.

@greg0ire
Copy link
Member

greg0ire commented Sep 8, 2020

Link for the lazy:

private static function collectPlaceholders(
string $statement,
string $match,
string $token,
callable $collector
): array {
if (strpos($statement, $match) === false) {
return [];
}
$carry = [];
foreach (self::getUnquotedStatementFragments($statement) as $fragment) {
preg_match_all('/' . $token . '/', $fragment[0], $matches, PREG_OFFSET_CAPTURE);
foreach ($matches[0] as $placeholder) {
$collector($placeholder[0], $placeholder[1], $fragment[1], $carry);
}
}
return $carry;
}

@morozov
Copy link
Member

morozov commented Sep 8, 2020

Given that the query being processed is relatively small and the fact that the issue is reproducible at a certain scale, it looks like a memory leak.

@sgrossberndt could you provide a script that could be used to reproduce this high memory usage?

@sgrossberndt
Copy link
Author

sgrossberndt commented Sep 9, 2020

<?php

$connection = DriverManager::getConnection(array(/*..*/));

$parameterTypeForQueryBuilder = 'createNamedParameter';

if ($parameterTypeForQueryBuilder === 'createNamedParameter') {
	for ($uid = 1; $uid < 6000; $uid++) {
		$queryBuilder = $connection->createQueryBuilder();
		$records = $queryBuilder
			->select('x.uid', 'mm.uid_foreign')
			->from('tx_my_x_y_mm', 'mm')
			->join('mm', 'tx_my_y', 'x', $queryBuilder->expr()->eq('x.uid', $queryBuilder->quoteIdentifier('mm.uid_local')))
			->where($queryBuilder->expr()->eq('x.uid', $queryBuilder->createNamedParameter($uid, \PDO::PARAM_INT)))
			->orderBy('x.uid')
			->addOrderBy('mm.uid_foreign')
			->execute()
			->fetchAll();
		var_dump($records);
		// has a Excl. MemUse of 243 MB for preg_match_all() where 238MB are calls from Doctrine\DBAL\SQLParserUtils::getUnquotedStatementFragments
	}
} elseif ($parameterTypeForQueryBuilder === 'createPositionalParameter') {
	for ($uid = 1; $uid < 6000; $uid++) {
		$queryBuilder = $connection->createQueryBuilder();
		$records = $queryBuilder
			->select('x.uid', 'mm.uid_foreign')
			->from('tx_my_x_y_mm', 'mm')
			->join('mm', 'tx_my_y', 'x', $queryBuilder->expr()->eq('x.uid', $queryBuilder->quoteIdentifier('mm.uid_local')))
			->where($queryBuilder->expr()->eq('x.uid', $queryBuilder->createPositionalParameter($uid, \PDO::PARAM_INT)))
			->orderBy('x.uid')
			->addOrderBy('mm.uid_foreign')
			->execute()
			->fetchAll();
		var_dump($records);
	}
	// has a Excl. MemUse of 1,4 MB for preg_match_all() where 1,3 MB are calls from Doctrine\DBAL\SQLParserUtils::getUnquotedStatementFragments
}

Here are the xhprof profiles of both runs:
createNamedParameters.doctrine_dbal.zip and createPositionalParameters.doctrine_dbal.zip

@norkunas
Copy link

norkunas commented Oct 1, 2021

I'm also facing this memory leak when running a query in a long running process.. 😞

@morozov
Copy link
Member

morozov commented Oct 1, 2021

What DBAL version are you using? The SQL parser was completely reworked in 3.0.0 (#4397).

@norkunas
Copy link

norkunas commented Oct 3, 2021

I'm on 2.13 because orm blocks from upgrading to v3

@greg0ire
Copy link
Member

greg0ire commented Oct 3, 2021

Currently trying to release 2.10.0, which will allow it :)
Follow my progress at doctrine/orm#9061

@norkunas
Copy link

norkunas commented Oct 3, 2021

But why it was not fixed in v2? :) The rework in my opinion was not a feature but a bugfix

@greg0ire
Copy link
Member

greg0ire commented Oct 3, 2021

I don't know 🤷‍♂️ , and I'm not sure we should care if I can manage to release 2.10.0 (which is harder than it sounds)

@norkunas
Copy link

norkunas commented Oct 4, 2021

doctrine/migrations is another blocker :)

@greg0ire
Copy link
Member

greg0ire commented Oct 4, 2021

Feel free to work on either a fix or migration of doctrine/migrations (it's already underway)

@morozov
Copy link
Member

morozov commented Oct 30, 2021

Closing as irrelevant as of #4397.

@github-actions
Copy link

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants