Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON columns throws Invalid token type: 'StartObject' exception with AsNoTrackingWithIdentityResolution() #33073

Open
DieBunnyxxx opened this issue Feb 12, 2024 · 6 comments · May be fixed by #33101
Assignees
Labels
area-json area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported regression type-bug
Milestone

Comments

@DieBunnyxxx
Copy link

I have issues using Json columns.
I'm getting this error via single or split query while trying to query entities with JsonColumns

[ERR] [Microsoft.EntityFrameworkCore.Query] An exception occurred while iterating over the results of a query for context type 
System.InvalidOperationException: Invalid token type: 'StartObject'.
   at Microsoft.EntityFrameworkCore.Query.RelationalShapedQueryCompilingExpressionVisitor.ShaperProcessingExpressionVisitor.IncludeJsonEntityCollection[TIncludingEntity,TIncludedCollectionElement](QueryContext queryContext, Object[] keyPropertyValues, JsonReaderData jsonReaderData, TIncludingEntity entity, Func`4 innerShaper, Action`1 getOrCreateCollectionObject, Action`2 fixup, Boolean trackingQuery)
   at lambda_method14418(Closure, QueryContext, DbDataReader, ResultContext, SingleQueryResultCoordinator)
   at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.MoveNextAsync()
System.InvalidOperationException: Invalid token type: 'StartObject'.
   at Microsoft.EntityFrameworkCore.Query.RelationalShapedQueryCompilingExpressionVisitor.ShaperProcessingExpressionVisitor.IncludeJsonEntityCollection[TIncludingEntity,TIncludedCollectionElement](QueryContext queryContext, Object[] keyPropertyValues, JsonReaderData jsonReaderData, TIncludingEntity entity, Func`4 innerShaper, Action`1 getOrCreateCollectionObject, Action`2 fixup, Boolean trackingQuery)
   at lambda_method14418(Closure, QueryContext, DbDataReader, ResultContext, SingleQueryResultCoordinator)
   at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.MoveNextAsync()

Error occurs if multiple instances of same object is queried. Works with AsNoTracking() but error with AsNoTrackingWithIdentityResolution()

modelBuilder.Entity<Customer>().OwnsMany(c => c.ContactInfo, ownedNavigationBuilder =>
{
    ownedNavigationBuilder.ToJson();
});
await context.Set<CaseCustomer>().AsNoTrackingWithIdentityResolution().Include(cc => cc.Customer).Where(e => e.CaseId == 1).ToListAsync();

project to reproduce issue:
https://github.com/DieBunnyxxx/EFCoreJsonException/tree/master

Include provider and version information

EF Core version: 8.0.1
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: NET 8.0

@ajcvickers
Copy link
Member

Note for triage: still repros on latest daily; regression from EF7.

@maumar
Copy link
Contributor

maumar commented Feb 13, 2024

for queries with JSON (when we are streaming), we still need to go through the entire stream even if we have an entry in the change tracker already. In case of TrackAll we have special logic that stores entity from the change tracker into a variable and we store bool value indicating that we already have entity instance. We loop through entire stream and and the end we discard the object that we constructed from the stream and instead use the one from change tracker.

For NoTrackingWithIdentityResolution we don't do that. Instead, when we find that the object is in the change tracker already we short circuit. This leaves JSON stream in unexpected state, since the data (that is there and waiting to be read) is pending but we already move to processing next entity/whatnot.

maumar added a commit that referenced this issue Feb 14, 2024
… exception with AsNoTrackingWithIdentityResolution()

Problem was that we were treating NoTrackingWithIdentityResolution queries as regular NoTracking in the context of JSON queries. However due to how JSON is materialized (streaming + nested includes are part of the parent materialization, rather than each entity materialized separately) we have special provisions when ChangeTracker is being used.
In NoTrackingWithIdentityResolution, ChangeTracker is being used but the materializer didn't adjust to that, which lead to errors.

Fix is to generate JSON materializer code based on whether query uses Change Tracker rather than if it's a Tracking/NoTracking query.

Fixes #33073
@maumar maumar added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Feb 14, 2024
maumar added a commit that referenced this issue Feb 14, 2024
… exception with AsNoTrackingWithIdentityResolution()

Problem was that we were treating NoTrackingWithIdentityResolution queries as regular NoTracking in the context of JSON queries. However due to how JSON is materialized (streaming + nested includes are part of the parent materialization, rather than each entity materialized separately) we have special provisions when ChangeTracker is being used.
In NoTrackingWithIdentityResolution, ChangeTracker is being used but the materializer didn't adjust to that, which lead to errors.

Fix is to generate JSON materializer code based on whether query uses Change Tracker rather than if it's a Tracking/NoTracking query.

Fixes #33073
@maumar maumar removed the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Feb 14, 2024
@maumar
Copy link
Contributor

maumar commented Feb 15, 2024

There is a fundamental problem with using NTWIR and not projecting root entity. For Tracking queries we don't allow that (i.e. we throw), but currently for NTWIR we do. When using change tracker, the problem we can get ourselves into is that collection navigation results are being populated in the order they are being materialized in the query. When parent entity is present we process it in a separate code path (Include), and results of that processing always come first in the materializer code. So, "full" JSON collection projections always are discovered and materialized first and therefore all subsequent instances of those collections are populated in the correct ("original") order.

For no tracking we always materialize everything separately so the problem doesn't exist.

For NTWIR, we can have a situation right now where a collection element is projected first, and only later the full collection is projected. In that scenario, in the "full" collection, that entity discovered earlier would come first in the list. This is an issue for JSON, because collections are supposed to be ordered.

Example problematic query:

            ss => ss.Set<JsonEntityBasic>().Select(
                x => new
                {
                    x.Id,
                    Duplicate = x.OwnedReferenceRoot.OwnedCollectionBranch[1],
                    Original = x.OwnedReferenceRoot,
                }).AsNoTrackingWithIdentityResolution(),

in the Original, OwnedCollectionBranch will contain OwnedCollectionBranch[1], before the rest of the elements.

@maumar
Copy link
Contributor

maumar commented Feb 15, 2024

We can either do sophisticated ordering of which elements get materialized before which (i.e. full collections before individual element accesses), or block the scenario where you don't project root in NTWIR, just like we do for tracking query.
In long term the correct fix is to implement Ordered Collections: #9067, then the issue goes away.

@roji
Copy link
Member

roji commented Feb 15, 2024

OTOH blocking this scenario (for now) doesn't seem too bad to me. But given that this does work correctly for tracked queries, is it simply an implementation difficulty for the NTWIR code path or something? I mean, if tracking queries can do this correctly without ordered collections, it seems odd that we wouldn't be able to do the same for NTWIT no?

In any case, we may really want to have a design discussion on whether we want to continue investing in owned entity JSON mapping - as opposed to complex types. Unless I'm mistaken, this wouldn't be a problem with complex types, since we don't do identity resolution on them anyway (i.e. we'd project different instances, just like in the no-tracking case?).

@maumar
Copy link
Contributor

maumar commented Feb 15, 2024

NTWIT works fine for the case where you also project parent entity (just like regular Tracking), after fix from #33101 is applied. Cases where we don't project the parent are blocked for Tracking queries, so there is no issue. If you removed the block, we would be hitting the same problem. NTWIT doesn't have the same block so the problem gets exposed.

maumar added a commit that referenced this issue Mar 5, 2024
… exception with AsNoTrackingWithIdentityResolution()

Problem was that we were treating NoTrackingWithIdentityResolution queries as regular NoTracking in the context of JSON queries. However due to how JSON is materialized (streaming + nested includes are part of the parent materialization, rather than each entity materialized separately) we have special provisions when ChangeTracker is being used.
In NoTrackingWithIdentityResolution, ChangeTracker is being used but the materializer didn't adjust to that, which lead to errors.
Fix is to generate JSON materializer code based on whether query uses Change Tracker rather than if it's a Tracking/NoTracking query.

However, this uncovered some issue with NoTrackingWithIdentityResolution - depending on the order in which entities are processed during materialization, they could cause data corruption (wrong order) when materializing JSON collections. Also, using queryable operators on JSON collections may cause errors or data corruption - we don't propagate key values of those queries to the materializer, so those entities end up with null keys.
Adding a validator that makes sure that entities are visited in the correct order and issues exception instructing what to do, if the order is wrong. Also we disable usage of queryable operators, due to the issue mentioned above, and cases where parameters are being used to access collection element in the navigatio chain. For cases with parameters, we can't tell if the value is the same or different so can't properly validate those cases.
Two different parameters can have the same value, leading to the same entity being materialized, but when we analyze their JSON path, those paths look different.

Fixes #33073
maumar added a commit that referenced this issue Mar 5, 2024
… exception with AsNoTrackingWithIdentityResolution()

Problem was that we were treating NoTrackingWithIdentityResolution queries as regular NoTracking in the context of JSON queries. However due to how JSON is materialized (streaming + nested includes are part of the parent materialization, rather than each entity materialized separately) we have special provisions when ChangeTracker is being used.
In NoTrackingWithIdentityResolution, ChangeTracker is being used but the materializer didn't adjust to that, which lead to errors.
Fix is to generate JSON materializer code based on whether query uses Change Tracker rather than if it's a Tracking/NoTracking query.

However, this uncovered some issue with NoTrackingWithIdentityResolution - depending on the order in which entities are processed during materialization, they could cause data corruption (wrong order) when materializing JSON collections. Also, using queryable operators on JSON collections may cause errors or data corruption - we don't propagate key values of those queries to the materializer, so those entities end up with null keys.
Adding a validator that makes sure that entities are visited in the correct order and issues exception instructing what to do, if the order is wrong. Also we disable usage of queryable operators, due to the issue mentioned above, and cases where parameters are being used to access collection element in the navigatio chain. For cases with parameters, we can't tell if the value is the same or different so can't properly validate those cases.
Two different parameters can have the same value, leading to the same entity being materialized, but when we analyze their JSON path, those paths look different.

Fixes #33073
maumar added a commit that referenced this issue Mar 5, 2024
… exception with AsNoTrackingWithIdentityResolution()

Problem was that we were treating NoTrackingWithIdentityResolution queries as regular NoTracking in the context of JSON queries. However due to how JSON is materialized (streaming + nested includes are part of the parent materialization, rather than each entity materialized separately) we have special provisions when ChangeTracker is being used.
In NoTrackingWithIdentityResolution, ChangeTracker is being used but the materializer didn't adjust to that, which lead to errors.
Fix is to generate JSON materializer code based on whether query uses Change Tracker rather than if it's a Tracking/NoTracking query.

However, this uncovered some issue with NoTrackingWithIdentityResolution - depending on the order in which entities are processed during materialization, they could cause data corruption (wrong order) when materializing JSON collections. Also, using queryable operators on JSON collections may cause errors or data corruption - we don't propagate key values of those queries to the materializer, so those entities end up with null keys.
Adding a validator that makes sure that entities are visited in the correct order and issues exception instructing what to do, if the order is wrong. Also we disable usage of queryable operators, due to the issue mentioned above, and cases where parameters are being used to access collection element in the navigatio chain. For cases with parameters, we can't tell if the value is the same or different so can't properly validate those cases.
Two different parameters can have the same value, leading to the same entity being materialized, but when we analyze their JSON path, those paths look different.

Fixes #33073
maumar added a commit that referenced this issue Mar 5, 2024
… exception with AsNoTrackingWithIdentityResolution()

Problem was that we were treating NoTrackingWithIdentityResolution queries as regular NoTracking in the context of JSON queries. However due to how JSON is materialized (streaming + nested includes are part of the parent materialization, rather than each entity materialized separately) we have special provisions when ChangeTracker is being used.
In NoTrackingWithIdentityResolution, ChangeTracker is being used but the materializer didn't adjust to that, which lead to errors.
Fix is to generate JSON materializer code based on whether query uses Change Tracker rather than if it's a Tracking/NoTracking query.

However, this uncovered some issue with NoTrackingWithIdentityResolution - depending on the order in which entities are processed during materialization, they could cause data corruption (wrong order) when materializing JSON collections. Also, using queryable operators on JSON collections may cause errors or data corruption - we don't propagate key values of those queries to the materializer, so those entities end up with null keys.
Adding a validator that makes sure that entities are visited in the correct order and issues exception instructing what to do, if the order is wrong. Also we disable usage of queryable operators, due to the issue mentioned above, and cases where parameters are being used to access collection element in the navigatio chain. For cases with parameters, we can't tell if the value is the same or different so can't properly validate those cases.
Two different parameters can have the same value, leading to the same entity being materialized, but when we analyze their JSON path, those paths look different.

Fixes #33073
maumar added a commit that referenced this issue Mar 5, 2024
… exception with AsNoTrackingWithIdentityResolution()

Problem was that we were treating NoTrackingWithIdentityResolution queries as regular NoTracking in the context of JSON queries. However due to how JSON is materialized (streaming + nested includes are part of the parent materialization, rather than each entity materialized separately) we have special provisions when ChangeTracker is being used.
In NoTrackingWithIdentityResolution, ChangeTracker is being used but the materializer didn't adjust to that, which lead to errors.
Fix is to generate JSON materializer code based on whether query uses Change Tracker rather than if it's a Tracking/NoTracking query.

However, this uncovered some issue with NoTrackingWithIdentityResolution - depending on the order in which entities are processed during materialization, they could cause data corruption (wrong order) when materializing JSON collections. Also, using queryable operators on JSON collections may cause errors or data corruption - we don't propagate key values of those queries to the materializer, so those entities end up with null keys.
Adding a validator that makes sure that entities are visited in the correct order and issues exception instructing what to do, if the order is wrong. Also we disable usage of queryable operators, due to the issue mentioned above, and cases where parameters are being used to access collection element in the navigatio chain. For cases with parameters, we can't tell if the value is the same or different so can't properly validate those cases.
Two different parameters can have the same value, leading to the same entity being materialized, but when we analyze their JSON path, those paths look different.

Fixes #33073
maumar added a commit that referenced this issue Mar 6, 2024
… exception with AsNoTrackingWithIdentityResolution()

Problem was that we were treating NoTrackingWithIdentityResolution queries as regular NoTracking in the context of JSON queries. However due to how JSON is materialized (streaming + nested includes are part of the parent materialization, rather than each entity materialized separately) we have special provisions when ChangeTracker is being used.
In NoTrackingWithIdentityResolution, ChangeTracker is being used but the materializer didn't adjust to that, which lead to errors.
Fix is to generate JSON materializer code based on whether query uses Change Tracker rather than if it's a Tracking/NoTracking query.

However, this uncovered some issue with NoTrackingWithIdentityResolution - depending on the order in which entities are processed during materialization, they could cause data corruption (wrong order) when materializing JSON collections. Also, using queryable operators on JSON collections may cause errors or data corruption - we don't propagate key values of those queries to the materializer, so those entities end up with null keys.
Adding a validator that makes sure that entities are visited in the correct order and issues exception instructing what to do, if the order is wrong. Also we disable usage of queryable operators, due to the issue mentioned above, and cases where parameters are being used to access collection element in the navigatio chain. For cases with parameters, we can't tell if the value is the same or different so can't properly validate those cases.
Two different parameters can have the same value, leading to the same entity being materialized, but when we analyze their JSON path, those paths look different.

Fixes #33073
@ajcvickers ajcvickers added this to the 9.0.0 milestone May 4, 2024
@maumar maumar added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-json area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported regression type-bug
Projects
None yet
4 participants