You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
An early draft/POC exists ([GH-1569] Optimize eager fetch for collections to batch query #8391) that might allow to get away with one extra query per collection field, at least for one-to-many associations; but this cannot be enabled on a per-query basis and/or does not allow for context-sensitive optimization.
In order to initialize collections, use fetch joins.
That blog post also shows the usage of PARTIAL, e. g. SELECT PARTIAL user.{id}, sessions FROM User user JOIN user.sessions sessions to avoid loading all of the user.* data in this query, since all that data has already been loaded (we're only interested in initializing the collection in this step). But partial objects have been deprecated for good reasons in Deprecate Partial Objects #8471.
When not using PARTIAL, overhead data may be multiplied in the database result set. For example, given a Customer → Order → OrderLineItem association chain with all → being to-many, initializing the Order::$items collection means fetching unnecessary Order data once for every item, and unnecessary Customer data for as many times as the Customer has total OrderLineItems.
Suggestion
Provide a mechanism, helper class, new DQL query type or keyword or other extension that
makes it easy (or easier) for users to initialize to-many collections,
can be used to conveniently perform multi-step hydration
reduces database (= result set size) overhead as much as possible
can be maintained without having to provide partial object support
at best, can be used given arbitrary queries
To be clear, the goal of this feature is to only initialize uninitialized collection fields of entities already loaded/being in the identity map.
Users must be able to determine for which of those entities initialization shall occur, probably based on a DQL query expression.
It would be acceptable to initialize only one field at a time, and to perform a constant number (O(1) ) of database queries per such field.
The new mechanism is not required to return any data (entities) when invoked, although it may do so if the semantics can be clearly defined. This might limit the applicability of DQL SELECT clauses.
At this time, I have no insight into whether a new DQL query type, query hints, new SELECT keywords or even helper classes (with no DQL changes required) would be needed to implement the suggestion.
Illustrative examples
Let's assume we have the User with to-many Session and SocialAccount associations as in the blog post.
SELECT u FROM User u WHERE u.created >= ... – we're interested in a subset of Users only.
Example 1
A new INITIALIZED keyword means "select identifying columns only, and additionally skip result row when not in identity map":
SELECT INITIALIZED u, s FROM User u LEFT JOIN u.socialAccount s WHERE u.created >= ... SELECT INITIALIZED u, s FROM User u LEFT JOIN u.sessions s WHERE u.created >= ...
Basically, this is like PARTIAL but avoids declaring the identifying columns u.{id, ...} and does not need to create partial objects, since objects not in the identity map may be skipped.
Example 2
A new INITIALIZE query type might make it more expressive that we're aiming at collection fields, and may specify only one such field at a time?
INITIALIZE u.socialAccount FROM User u WHERE u.created >= ...
Maybe this would not even require to start on the root entity and/or specify the entire association chain from there. With the association chain described above:
INITIALIZE o.items FROM Customer c LEFT JOIN c.orders o WHERE ...
Under the hood, the multi-step hydration would happen through SELECT PARTIAL _t1.{id}, _t2 FROM Order _t1 LEFT JOIN o.items _t2 WHERE _t1 IN (SELECT IDENTITY(o) FROM Customer c LEFT JOIN c.orders o WHERE ...) (does that even work?).
The text was updated successfully, but these errors were encountered:
I believe that the selection should be native
something like SELECT u.title, u.children FROM User
if the children field is specified in my entity, then it should be logical that I can get it from DQL
It's a mystery to me why it still doesn't work
the only place where I see a problem is possible looping, but this is solved by skipping related entities in the requested
I am now starting to think about how, make a fork and implement this mechanism on my own
If there are already some solutions, make selections by type SELECT u.title, u.order, u.children, u.parent FROM User
Please tell me
In the meantime, in such a request, I get messages about the absence of fields children and parent
Feature Request
Background
fetch=EAGER
does not make sense for to-many collections, since currently one additional query per loaded entity is needed to initialize to-many collections (DDC-3905: Eager fetching yields N+1 queries in *ToMany associations when using DQL #4762)PARTIAL
, e. g.SELECT PARTIAL user.{id}, sessions FROM User user JOIN user.sessions sessions
to avoid loading all of theuser.*
data in this query, since all that data has already been loaded (we're only interested in initializing the collection in this step). But partial objects have been deprecated for good reasons in Deprecate Partial Objects #8471.PARTIAL
, overhead data may be multiplied in the database result set. For example, given aCustomer → Order → OrderLineItem
association chain with all→
being to-many, initializing theOrder::$items
collection means fetching unnecessaryOrder
data once for every item, and unnecessaryCustomer
data for as many times as theCustomer
has totalOrderLineItem
s.Suggestion
Provide a mechanism, helper class, new DQL query type or keyword or other extension that
To be clear, the goal of this feature is to only initialize uninitialized collection fields of entities already loaded/being in the identity map.
Users must be able to determine for which of those entities initialization shall occur, probably based on a DQL query expression.
It would be acceptable to initialize only one field at a time, and to perform a constant number (
O(1)
) of database queries per such field.The new mechanism is not required to return any data (entities) when invoked, although it may do so if the semantics can be clearly defined. This might limit the applicability of DQL
SELECT
clauses.At this time, I have no insight into whether a new DQL query type, query hints, new
SELECT
keywords or even helper classes (with no DQL changes required) would be needed to implement the suggestion.Illustrative examples
Let's assume we have the
User
with to-manySession
andSocialAccount
associations as in the blog post.SELECT u FROM User u WHERE u.created >= ...
– we're interested in a subset ofUser
s only.Example 1
A new
INITIALIZED
keyword means "select identifying columns only, and additionally skip result row when not in identity map":SELECT INITIALIZED u, s FROM User u LEFT JOIN u.socialAccount s WHERE u.created >= ...
SELECT INITIALIZED u, s FROM User u LEFT JOIN u.sessions s WHERE u.created >= ...
Basically, this is like
PARTIAL
but avoids declaring the identifying columnsu.{id, ...}
and does not need to create partial objects, since objects not in the identity map may be skipped.Example 2
A new
INITIALIZE
query type might make it more expressive that we're aiming at collection fields, and may specify only one such field at a time?INITIALIZE u.socialAccount FROM User u WHERE u.created >= ...
Maybe this would not even require to start on the root entity and/or specify the entire association chain from there. With the association chain described above:
INITIALIZE o.items FROM Customer c LEFT JOIN c.orders o WHERE ...
Under the hood, the multi-step hydration would happen through
SELECT PARTIAL _t1.{id}, _t2 FROM Order _t1 LEFT JOIN o.items _t2 WHERE _t1 IN (SELECT IDENTITY(o) FROM Customer c LEFT JOIN c.orders o WHERE ...)
(does that even work?).The text was updated successfully, but these errors were encountered: