New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
5.1 unbuffered queries #17649
5.1 unbuffered queries #17649
Conversation
Unbufferred results are only available when directly iterating the SelectQuery instance. Calling SelectQuery::all() with always returned a buffered result set.
} | ||
|
||
$this->closeCursor(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I see how this and fetchAll() should co-exist. Won't the statement be empty if we call one or the other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One needs to pick one of the options, either call fetchAll() or iterate the instance.
Hmm.. an ORM test which uses unbuffered queries and contains a hasMany assoc. (which cause a new query) fails on mysql with
I tried the same on 4.x and there's no error. Trying to walk through the 4.x code I haven't been able to figure out why the same error doesn't occur on 4.x. |
src/ORM/EagerLoader.php
Outdated
$results = (new Collection($results)) | ||
->map(function ($row) use ($query, $external, $assocData) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting solution.
I figured out why the error doesn't occur on 4.x. Even if you have bufferring turned off, if you contain hasMany/belongsToMany associations the eagerloader will implicitly start buffering by using the cakephp/src/ORM/EagerLoader.php Line 809 in fe40e4f
|
Since new queries can't be executed for the same connection for unbuffered MySQL queries, containing externally loaded associations will implicitly buffer results.
That sounds right. |
f76adfa
to
2d1d1af
Compare
Fix errors reported by static analysis
2d1d1af
to
de5c56b
Compare
This is ready for another round of review. If associations which require additional queries are contained then even for unbuffered queries all results will be pre-fetched since MySQL doesn't allow running new queries when an unbuffered query is already active. I'll work on adding more tests after this. |
@nicosp Can you please try out this branch and provide feedback whether it restores support for your use case? |
Sure. I will try it in a couple of days. Thank you for attempting to fix this |
|
||
if ($query instanceof SelectQuery) { | ||
try { | ||
$this->getPdo()->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, $query->isBufferedResultsEnabled()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we did the the same in 4.x, so this should be ok. I'm wondering if setting the attribute per query is necessary or affects performance at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it's the same as 4.x, no idea about the performance implications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets not overcomplicate things. If this worked in 4.x it will be good enough here too. If it creates a performance regression we can patch around it.
@@ -30,184 +28,8 @@ | |||
* @template T of \Cake\Datasource\EntityInterface|array | |||
* @implements \Cake\Datasource\ResultSetInterface<T> | |||
*/ | |||
class ResultSet implements ResultSetInterface | |||
class ResultSet extends Collection implements ResultSetInterface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approve of extending Collection directly here. I'm interested in making Collection extensible and more customizable with collection iterators instead of CollectionInterface.
Not sure what ResultSetInterface means in the future, but I think this is ok for 5.1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what ResultSetInterface means in the future, but I think this is ok for 5.1.
ResultSetInterface
doesn't really hold much meaning. Maybe in the future we can just drop it and make ORM\Query\SelectQuery:all()
's return type as Collection
/CollectionInterface
.
$results = (new Collection($results)) | ||
->map(function ($row) use ($data) { | ||
return $this->groupResult($row, $data); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully users don't iterator over the result set multiple times. Should we unwrap this before passing to ResultSet()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the non-array path the unbuffered statement so we can't unwrap this? Should we check it's a statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes a non array path means it's an buffered statement. But since the Collection
can handle any iterable, I don't want to narrow this down unnecessarily using a Statement
check.
@@ -111,13 +119,6 @@ public function all(): iterable | |||
{ | |||
if ($this->_results === null || $this->_dirty) { | |||
$this->_results = $this->execute()->fetchAll(StatementInterface::FETCH_TYPE_ASSOC); | |||
if ($this->_resultDecorators) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving this to the statement seems good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, Statement
was already decorating the records using the typemap so moving user provided decorators there too makes sense, so all decorations are now in one place.
With light testing on a preminary 5.x branch it's working for me. Let me know if there is anything more I can do to help |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work 👏
|
||
if ($query instanceof SelectQuery) { | ||
try { | ||
$this->getPdo()->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, $query->isBufferedResultsEnabled()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets not overcomplicate things. If this worked in 4.x it will be good enough here too. If it creates a performance regression we can patch around it.
Co-authored-by: Mark Story <mark@mark-story.com>
I was hoping to add some more tests but have been quite busy and won't be able to get to it anytime soon. So if we are happy with the changes this can be merged. |
Do you have an idea what kind of tests you wanted? |
Haven't got around to even that yet :). It's just that there's currently only 1 ORM test with results buffering turned off. Would be nice a few tests at DBAL and ORM levels to check that actually only 1 record is fetched at a time with buffering disabled. Also have ORM tests which fetch hasmany/belongstomany associations with buffering turned off, to show we get the expected results (albeit with buffering implicitly turned off). |
Let's leave the PR open for a bit to see if one of us finds time to confirm no issues. |
Refs #17517
I have been playing around to add back support for unbuffered queries and have come up with a POC. The implementation is a bit different than earlier since we got rid of statement decoration in 5.0.
Digging into the topic I learned that at PDO level postgres is always buffered, sqlite is always unbuffered, only mysql has the option to toggle buffered/unbuffered at connection level.
Here are a few noteworthy points about the implementation:
Statement
instance is iterated which returns a generator. Since generators can't be rewind, trying to iterate the resultset again or using any collection method which internally iterates will throw an exception. Personally I think that's a reasonable limitation. An alternative would be to try and transparently re-execute the query.Statement::getIterator()
directly return thePDOStatment
instance failed as the results are later wrapped in collection for decoration andCollection::unwrap()
has return typeIterator
instead ofTraversable
.ResultSet
simply extendCollection
.TODO: