Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partial objects get cached and break "actual" objects #7633

Closed
Amunak opened this issue Mar 2, 2019 · 13 comments
Closed

Partial objects get cached and break "actual" objects #7633

Amunak opened this issue Mar 2, 2019 · 13 comments
Milestone

Comments

@Amunak
Copy link

Amunak commented Mar 2, 2019

Bug Report

Q A
BC Break no
Version 2.6.3

Summary

When using partial objects and second-level cache, the partial objects can get cached and then they get retrieved from the cache instead of querying for the "full" object.

Current behavior

Doctrine doesn't properly recognize query as fetching a partial object, caches it and then later when asked to query the actual object it only retreives the partial from cache.

How to reproduce

The following snippet would save a partial object to your second-level cache:

class Object {
	public $id;
	public $field1;
	public $field2;
}

/** @var $qb  Doctrine\ORM\QueryBuilder */
$qb
	->select('partial o.{id, field1}')
	->from(Object::class, 'o')
	->setCacheable(true)
;

// when second level cache is enabled, this caches the resulting partial object(s)
$partial = $qb->getQuery()->getResult();

Then later running this code:

/** @var $om Doctrine\Common\Persistence\ObjectManager */
$real = $om->find(Object::class, 1);

$real would actually contain only the partial object data - property $real->field2 would be null and if you could compare the $partial and $real objects you would notice that they are essentialy the same.

Expected behavior

Ideally Doctrine would cache partial objects separately. While you could argue "why use partial objects when you cache the actual objects anyways and save on speed there" they are great for optimizations and in some cases you don't want to load hundreds of potentially big objects as a whole when you need only a small portion of them.

However at the very least it should recognize that the query above is indeed a partial query and throw CacheException - as it does when you modify the snippet like this:

/** @var $qb  Doctrine\ORM\QueryBuilder */
$qb
	->select('partial o.{id, field1}')
	->from(Object::class, 'o')
	->setCacheable(true)
;
$query = $qb->getQuery();
$query->setHint(Query::HINT_FORCE_PARTIAL_LOAD, 1);

$partial = $query->getResult();

Obviously that makes the query unusable as it now throws an exception, so the actual solution is to just not mark it as cacheable.

No matter what solution will be implemented I would strongly advice to also mention this caveat of partial objects in their documentation - there is no mention of them not being cacheable anywhere.


Writing this I realized that this bug has probably the same cause as #2094. However this is an important and annoying edge case that proved hard to debug and I feel like it should be documented.

Also as a side note, how the hell is this bug not fixed after 8 years? It seems pretty major.

@FabianKoestring
Copy link

This issue still open? We have the same problem over here. Is there a fix already or some workaround?

@Amunak
Copy link
Author

Amunak commented Sep 5, 2019 via email

@SenseException
Copy link
Member

You actually also get the partial object even without caching. What is the behavior of $om->reload($partialEntity); with active cache?

Partial objects might look useful for improving database SQL query performance, but mostly are harder to handle in the domain code. If you don't need the whole values of an entity, you might want to use a custom value object that fits to a use case and fetch them with the NEW operator: https://www.doctrine-project.org/projects/doctrine-orm/en/latest/reference/dql-doctrine-query-language.html#new-operator-syntax

@yura3d
Copy link

yura3d commented Sep 12, 2019

I have the same problem, too.

The idea of separately storing of partial entities, like @Amunak suggested above, is interesting. Current EntityCacheKey class (that represents key for entities data in cache) has 2 parameters:

$entityKey = new EntityCacheKey($entityName, $identifier);

Why not extend it with parameter that contains required entity fieldset for the current query? For example:
EntityCacheKey($entityName, $identifier, $fieldset);

I see that nothing was done for resolve this problem in future 2.7 and 3.0 (master) branches. It's very sad.

@lcobucci
Copy link
Member

Also as a side note, how the hell is this bug not fixed after 8 years? It seems pretty major.

I see that nothing was done for resolve this problem in future 2.7 and 3.0 (master) branches. It's very sad.

@Amunak @yura3d @FabianKoestring this is definitely an edge-case that hasn't been anticipated during L2C's design.
I believe it makes no sense to use L2C for partial loading. It creates more complexity to bind query cache region and entity region, without much benefit.

Not sure if @guilhermeblanco agrees but I'd rather prevent L2C usage for partial queries and recommend you to use the result set cache. That works well since the cache is bound to each query and set of parameters, the only drawback is that it's up to you to control TTL or eviction.

@Amunak
Copy link
Author

Amunak commented Oct 28, 2019

Not sure if @guilhermeblanco agrees but I'd rather prevent L2C usage for partial queries and recommend you to use the result set cache. That works well since the cache is bound to each query and set of parameters, the only drawback is that it's up to you to control TTL or eviction.

Again, while not ideal it's better than having this edge case exist and confuse people :)

@flaushi
Copy link
Contributor

flaushi commented Feb 21, 2020

I think I am running into the same issue.

Question: are DQL-queries cached by default?
My queries look like

if($this->prefetchingEnabled && $this->partial) {
        $this->em->createQuery('SELECT partial d.{id, name, table, reportingInterval, intervalConfined} FROM App\Entity\SampledDataCategory d')->getResult();
        $this->em->createQuery('SELECT partial d.{id, name, table, cat1, cat2, type} FROM App\Entity\CompoundDataCategory d')->getResult();
        $this->em->createQuery('SELECT partial d.{id, name, table, computeAverage, onlyMinMax} FROM App\Entity\MultiCompoundDataCategory d')->getResult();
        $this->em->createQuery('SELECT d FROM App\Entity\TimeAveragingDataCategory d')->getResult();
        $this->em->createQuery('SELECT partial d.{id, name, table, selectedTags, computeAverage, onlyMinMax} FROM App\Entity\TagDataCategory d')->getResult();
        $this->em->createQuery('SELECT partial d.{id, name, table, dataCategory, type, param1, param2} FROM App\Entity\UnaryDataCategory d')->getResult();
        }

and there is no 'setCachable'.

I'd like to write a test on this. Can someone show me how to "look into the SLC"? I mean how I can query the SLC cache provider directly to inspect the behaviour.

@beberlei beberlei added this to the 2.7.2 milestone Feb 21, 2020
@beberlei
Copy link
Member

This should be easy to fix, DQL Queries have a hint for partial set when a partial query is done. This would need to be checked in the second level query cache during the "save" part and ignored.

@flaushi
Copy link
Contributor

flaushi commented Feb 22, 2020

Do you mean I have to set $query->setHint(Query::HINT_FORCE_PARTIAL_LOAD, true)? Does this prevent the result to be saved to SLC ? If yes, then I‘d create a PR for the documentation repo. This is definitely an important detail.

@flaushi
Copy link
Contributor

flaushi commented Feb 23, 2020

This is weird. I looked into it now.

cache does show wrong infos

I have created such a test case (symfony framework):

class SLCPartialObjectTest extends KernelTestCase
{
    public function testPartialQuery()
    {
        $this->em->persist(SampledDataCategory::create('A', 'A description'));
        $this->em->flush();
        $pool = static::$container->get('doctrine.slc_cache_pool');
        $pool->clear();
        $this->em->clear();
        $this->em->getCache()->evictEntityRegions(); // has no effect
        $this->em->getCache()->evictCollectionRegions(); // has no effect
        $this->em->getCache()->evictQueryRegions(); // has no effect

        $query = $this->em->createQuery("SELECT partial d.{id, description} FROM App\Entity\DataCategory d WHERE d.name = 'A'");
        $A = $query->getSingleResult();

        $this->assertNotNull($A->description); // OK
        $this->assertNotNull($A->getId()->toString()); // OK
        $this->assertNull($A->name); // OK
        
        $this->em->clear();
        $this->assertFalse($this->em->getCache()->containsEntity(SampledDataCategory::class, $A->getId()->toString())); // OK

        
        /** 
         *  Now query via reposotry. Should save to SLC cache! 
         */
        $AA = $this->em->getRepository(SampledDataCategory::class)->findOneBy(['name' => 'A']);

        $this->assertNotNull($AA->description); // OK
        $this->assertNotNull($AA->getId()->toString()); // OK
        $this->assertNotNull($AA->name); // OK

        foreach($pool->getItems() as $i) // empty
            dump($i);

        $this->assertTrue($this->em->getCache()->containsEntity(SampledDataCategory::class, $AA->getId()->toString())); // fails!
    }
}

However, looking into redis directly, I do see the cache entry for 'A'.
Seems there is a problem in the symfony <-> doctrine cache bridge?
My (symfony 4.3) config is as follows:

doctrine:
    orm:
       second_level_cache:
            regions:
                default:
                    lifetime: 0
                    cache_driver:
                        type: service
                        id: doctrine.slc_cache_provider
            region_cache_driver:
                type: service
                id: doctrine.slc_cache_provider
            region_lifetime: 0
            log_enabled:          true
            enabled:              true
framework:
    cache:
        default_redis_provider: 'redis://%env(REDIS_HOST)%/%env(REDIS_DB)%'
        pools:
            doctrine.query_cache_pool:
                adapter: cache.adapter.redis
                default_lifetime: 0
            doctrine.metadata_cache_pool:
                adapter: cache.adapter.redis
                default_lifetime: 0
            doctrine.slc_cache_pool:
                adapter: cache.adapter.redis
                default_lifetime: 0  
services:
    doctrine.query_cache_provider:
        class: Symfony\Component\Cache\DoctrineProvider
        public: false
        arguments:
            - '@doctrine.query_cache_pool'

    doctrine.metadata_cache_provider:
        class: Symfony\Component\Cache\DoctrineProvider
        public: false
        arguments:
            - '@doctrine.metadata_cache_pool'

    doctrine.slc_cache_provider:
        class: Symfony\Component\Cache\DoctrineProvider
        public: false
        arguments:
            - '@doctrine.slc_cache_pool'

@beberlei
Copy link
Member

beberlei commented Mar 1, 2020

@flaushi no, i mean that Doctrine internally needs to check for the partial queries and not store them in SLC.

beberlei added a commit to beberlei/doctrine2 that referenced this issue Mar 1, 2020
There was a check in DefaultQueryCache that prevented partial queries,
because they are not supported. However the checked hint
Query::HINT_FORCE_PARTIAL_LOAD is optional, so cant be used to prevent
caching partial DQL queries.

Introduce a new hint that the SqlWalker sets on detecing a PARTIAL
query and throw an exception in the DefaultQueryCache if thats found.
@beberlei
Copy link
Member

beberlei commented Mar 1, 2020

@flaushi So i have made a PR #8050 that correctly prevents partial queries from being cached. The fix for you now (and in the future) is to remove all setCacheable(true) from queries that contain the PARTIAL keyword because this is not supported.

beberlei added a commit to beberlei/doctrine2 that referenced this issue Mar 1, 2020
beberlei added a commit that referenced this issue Mar 15, 2020
* [GH-7633] Bugfix: Partial queries were stored in 2LC.

There was a check in DefaultQueryCache that prevented partial queries,
because they are not supported. However the checked hint
Query::HINT_FORCE_PARTIAL_LOAD is optional, so cant be used to prevent
caching partial DQL queries.

Introduce a new hint that the SqlWalker sets on detecing a PARTIAL
query and throw an exception in the DefaultQueryCache if thats found.

* Housekeeping: CS

* [GH-7633] HINT_FORCE_PARTIAL_LOAD still needs to be checked.

* Housekeeping: Fix CS
@beberlei
Copy link
Member

Merged in #8050

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants