Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support ability to select top N of each group #13805

Closed
Tracked by #24106
Jonatthu opened this issue Oct 30, 2018 · 45 comments · Fixed by #25495
Closed
Tracked by #24106

Support ability to select top N of each group #13805

Jonatthu opened this issue Oct 30, 2018 · 45 comments · Fixed by #25495
Assignees
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported ef6-parity type-enhancement
Milestone

Comments

@Jonatthu
Copy link

I am working on a batch query generator, this should be able to get an array of Ids and get first 10 of each id, and be able to skip, order and others. This query behaves like the result in the image but currently in EF Core 2.1 is resolved in memory which is extremely not performant.

The next query represents a similar result but this is resolved on memory.

var goo = this.dbSet
	.Where(x => new[]{1 , 2}.Contains(x.WaterId))
	.OrderBy(x => x.Id)
	.ThenByDescending(x => x.NumberOfDrops)
        .GroupBy(x => x.Id)
	.Skip(0)
	.Take(10)
	.Select(x => new Drops {
		Id = x.First().Id,
		WaterId = x.First().WaterId
	})
        .ToList();

I am not looking for give support to that query in specific, but need a query that is able to get results like the image making sorting, offsets and selecting top n's

image

@ajcvickers
Copy link
Member

@divega to follow up.

@divega
Copy link
Contributor

divega commented Nov 1, 2018

@Jonatthu Based on the SQL example provided and your description of the problem, I believe you intended to write a query that looks more less like this in LINQ:

                var top = db.Employees
                    .GroupBy(x => x.Occupation)
                    .Select(g => g.OrderByDescending(x => x.YearlyIncome).Take(2))
                    .SelectMany(g => g);

It is true that for this kind of query, we do part of the evaluation in memory. We bring all the rows into memory, and we discard rows of each group that didn't fit in the Take count. However we push down an ORDER BY Occupation to the database, to be able to process each group together.

We have don't some (although relatively little) thinking about translating this kind of conditions. This is closely related to the issue about window functions, tracked in the backlog at #12747.

In the meantime, I would suggest you can work around passing the actual SQL query to FromSql. For example:

                var top = db.Employees.FromSql(@"
                    SELECT [y].[Id], [y].[Name], [y].[Occupation], [y].[YearlyIncome]
                    FROM (
                        SELECT [x].*, 
                            ROW_NUMBER() OVER (
                                PARTITION BY [x].[Occupation] 
                                ORDER BY [x].[YearlyIncome] DESC) AS [ROW_NUMBER]
                        FROM [Employees] AS [x]) AS [y] 
                    WHERE [y].[ROW_NUMBER] < = 2"
                  );

@smitpatel suggested that there is probably a different workaround using OUTER APPLY.

@divega
Copy link
Contributor

divega commented Nov 1, 2018

Note for triage: Leaving this open because I think this is a reasonable scenarios for us to support in the long term. It is also related to window functions tracked in #12747.

@divega divega removed their assignment Nov 1, 2018
@Jonatthu
Copy link
Author

Jonatthu commented Nov 1, 2018

@divega The workaround would work if there would not be other ".Where" filters to apply before create the GroupBy, I would need to wait until this becomes a reality and yes you are right the query that you wrote gives the desired result but resolved on memory.

I think this is an important feature something like this:

Expression<Func<Employee, Employee>> customLamdaSelector = ...;

var top = db.Employees
.Where(x => new [] {1, 3, 5}.Contains(x.RelatedFKId))
.Where(x => x.name == "SomeName")
.Where(x => x.AddedDate < SomeDate)
.GroupBy(x => x.Occupation)
.Select(g => 
    g.OrderByDescending(x => x.YearlyIncome)
      .ThenOrderBy(x => x.Id)
      .Skip(10)
      .Take(2)
)
.SelectMany(customLamdaSelector);

@smitpatel
Copy link
Member

Any where before GroupBy would be server evaluated if they are translatable.

@divega divega added this to the Backlog milestone Nov 2, 2018
@yyjdelete
Copy link

yyjdelete commented Feb 28, 2019

As an workaround, if directly groupby an table which has primary key, the below code may can be used to get the result(but may be slow)
Test the below code works with SQLServer(2.1.8,3.0.0preview2), PostgreSQL(3.0.0preview1), but seems hit some strange limited and exception edge when try to write case2

//Take 1, simple PK
                /*
                      SELECT [outer].[PK], [outer].[GroupByKey], [outer].[OrderByKey]
                      FROM [Test001] AS [outer]
                      WHERE (
                          SELECT TOP(1) [inner].[PK]
                          FROM [Test001] AS [inner]
                          WHERE [inner].[GroupByKey] = [outer].[GroupByKey]
                          ORDER BY [inner].[OrderByKey]
                      ) = [outer].[PK]
                */
                var case1 = ctx.Test001.Where(outer => ctx.Test001
                                .Where(inner => inner.GroupByKey == outer.GroupByKey)
                                .OrderBy(inner => inner.OrderByKey)
                                .Select(inner => (int?)inner.PK)//must cast to nullable here
                                .FirstOrDefault() == outer.PK)
                                .ToList();

//Take more than one line Or composite PK used
                /*
      SELECT [outer].[PK1], [outer].[PK2], [outer].[GroupByKey], [outer].[OrderByKey]
      FROM [Test002] AS [outer]
      WHERE EXISTS (
          SELECT 1
          FROM (
              SELECT TOP(2) [inner].[PK1], [inner].[PK2]
              FROM [Test002] AS [inner]
              WHERE [inner].[GroupByKey] = [outer].[GroupByKey]
              ORDER BY [inner].[OrderByKey]
          ) AS [t]
          WHERE ([t].[PK1] = [outer].[PK1]) AND ([t].[PK2] = [outer].[PK2]))
                */
                var case2 = ctx.Test002.Where(outer => ctx.Test002
                                .Where(inner => inner.GroupByKey == outer.GroupByKey)
                                .OrderBy(inner => inner.OrderByKey)
                                .Select(inner => new { inner.PK1, inner.PK2 })//if replace `inner` to `outer` by mistake, System.InvalidOperationException:“variable 'outer' of type 'ConsoleApp15.Test002' referenced from scope '', but it is not defined”
                                .Take(2)
                                .Where(inner => inner.PK1 == outer.PK1 && inner.PK2 == outer.PK2)//try directly equals with anonymous obj when only Take(1)->FirstOrDefault() is needed like case1, but it not works(ClientEvaluation)
                                .Select(inner => 1)//use this to workaround an unknown limited(ClientEvaluation)
                                .Any()//try .Select(inner => true).FirstOrDefault() instead of `.Select(inner => 1).Any()` lead to an SQL error: `WHERE COALESCE((SELECT TOP (1) ...), 0) = 1`, and no `EXISTS` is used
                                )
                                .ToList();

                var case2Lite = ctx.Test002.Where(outer => ctx.Test002
                                .Where(inner => inner.GroupByKey == outer.GroupByKey)
                                .OrderBy(inner => inner.OrderByKey)
                                .Take(2)
                                .Where(inner => inner == outer)
                                .Any())
                                .ToList();

Also write the failed modified case2 in case somebody need it. Seems can't find any active issue for the 3 exceptions.

failed modified case2
  1. System.InvalidOperationException:“variable 'outer' of type 'ConsoleApp15.Test002' referenced from scope '', but it is not defined”
    (not sure if this will happen with normal join or cross apply)
    Looks like 2.1 regression: "variable referenced from scope '', but it is not defined" exception in GroupBy query translation #11904, but it's already fixed.
                /*var e = ctx.Test002.Where(outer => ctx.Test002
                                .Where(inner => inner.GroupByKey == outer.GroupByKey)
                                .OrderBy(inner => inner.OrderByKey)
                                .Select(inner => new { outer.PK1, outer.PK2 })//This doesn't work, should be inner here, just hit this exception by mistake
                                .FirstOrDefault() == new { outer.PK1, outer.PK2 })
                                .ToList();*/
  1. System.InvalidOperationException:“No coercion operator is defined between types 'System.Int32' and 'System.Nullable`1[System.Boolean]'.”
                ctx.Test001.Where(d1 => ctx.Test001
                    .Select(d2 => new { A = d2.PK })
                    .OrderBy(d2 => d2.A)
                    .Take(2)
                    .Where(d2 => d2.A == 1)
                    .Select(d2 => true)
                    .FirstOrDefault())
                    .ToList();
  1. Wrong SQL:
//simple `.Select(inner => true).FirstOrDefault()` works well.
                var f = ctx.Test002.Where(outer => ctx.Test002
                                .Where(inner => inner.GroupByKey == outer.GroupByKey)
                                .OrderBy(inner => inner.OrderByKey)
                                .Select(inner => new { inner.PK1, inner.PK2 })
                                .Take(2)
                                .Where(inner => inner.PK1 == outer.PK1 && inner.PK2 == outer.PK2)
                                //.Select(inner => 1)
                                .Select(inner => true)
                                .FirstOrDefault()
                                )
                                .ToList();

->

      SELECT [outer].[PK1], [outer].[PK2], [outer].[GroupByKey], [outer].[OrderByKey]
      FROM [Test002] AS [outer]
      WHERE COALESCE((
          SELECT TOP(1) [t].[PK1], [t].[PK2]
          FROM (
              SELECT TOP(2) [inner].[PK1], [inner].[PK2]
              FROM [Test002] AS [inner]
              WHERE [inner].[GroupByKey] = [outer].[GroupByKey]
              ORDER BY [inner].[OrderByKey]
          ) AS [t]
          WHERE ([t].[PK1] = [outer].[PK1]) AND ([t].[PK2] = [outer].[PK2])
      ), 0) = 1

ClientEvaluation:

//NOTE: It's really strange that simply add an `.Select(inner => 1)` before `.Any()` make it works
                var c = ctx.Test002.Where(outer => ctx.Test002
                                .Where(inner => inner.GroupByKey == outer.GroupByKey)
                                .OrderBy(inner => inner.OrderByKey)
                                .Select(inner => new { inner.PK1, inner.PK2 })
                                .Take(2)
                                .Where(inner => inner.PK1 == outer.PK1 && inner.PK2 == outer.PK2)
                                //.Select(inner => 1)
                                .Any()
                                )
                                .ToList();

//4. An simpler case of c
               var c1 =  ctx.Test001.Where(d1 => ctx.Test001
                    .Select(d2 => new { PK = d2.PK })
                    .OrderBy(d2 => d2.PK)
                    .Take(2)
                    .Where(d2 => d2.PK == 1)
                    //.Select(d2 => 1)
                    .Any())
                    .ToList();

//It's ok that it just not works, it may get the same result with case2(Take(2)->Take(1))
                var d = ctx.Test002.Where(outer => ctx.Test002
                                .Where(inner => inner.GroupByKey == outer.GroupByKey)
                                .OrderBy(inner => inner.OrderByKey)
                                .Select(inner => new { inner.PK1, inner.PK2 })
                                .FirstOrDefault() == new { outer.PK1, outer.PK2 })
                                .ToList();
dbcontext
    public partial class TestContext : DbContext
    {
        public TestContext()
        {
        }

        public TestContext(DbContextOptions<TestContext> options)
            : base(options)
        {
        }

        public virtual DbSet<Test001> Test001 { get; set; }

        public virtual DbSet<Test002> Test002 { get; set; }

        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            if (!optionsBuilder.IsConfigured)
            {
                //optionsBuilder.UseNpgsql("Host=127.0.0.1;Database=test001;Username=postgres;Password=123456");
                optionsBuilder.UseSqlServer("Server=(localdb)\\MSSQLLocalDB;Database=Test_Database001;Trusted_Connection=True;");
                //optionsBuilder.ConfigureWarnings(warnings => warnings.Throw(Microsoft.EntityFrameworkCore.Diagnostics.RelationalEventId.QueryClientEvaluationWarning));
            }
            var logger = LoggerFactory.Create(conf =>
            {
                conf.AddConsole();
            });
            optionsBuilder.UseLoggerFactory(logger);
        }

        protected override void OnModelCreating(ModelBuilder modelBuilder)
        {
            modelBuilder.Entity<Test001>(entity =>
            {
                entity.HasKey(d => d.PK);
            });

            modelBuilder.Entity<Test002>(entity =>
            {
                entity.HasKey(d => new { d.PK1, d.PK2 });
            });
        }
    }

    public partial class Test002
    {
        public int PK1 { get; set; }
        public int PK2 { get; set; }
        public int GroupByKey { get; set; }
        public int OrderByKey { get; set; }
    }

    public partial class Test001
    {
        public int PK { get; set; }
        public int GroupByKey { get; set; }
        public int OrderByKey { get; set; }
    }

@Jonatthu
Copy link
Author

I hope this can be implemented on the framework since will make some scenarios possible

@smitpatel
Copy link
Member

Also cover linked issues.

@Jonatthu
Copy link
Author

@smitpatel and @divega Is this query possible now with ef core 3?

@Jonatthu
Copy link
Author

As an example of the query I wanna generate is

SELECT *
FROM (
    SELECT Id, [Key], [Type], ROW_NUMBER() 
    OVER (
        PARTITION BY [Type]
        ORDER BY [Key] ASC
    ) AS RowNum
    FROM [AutthuServerDb].[dbo].[GenPermission]
) a
WHERE a.RowNum <= 2 AND a.[Type] IN (0, 1, 2, 11)

@smitpatel
Copy link
Member

@Jonatthu - This issue is still open and in the backlog milestone, which indicates that it is not supported yet. We plan to do some day.

@Jonatthu
Copy link
Author

@smitpatel considering this has a year old, I guess will take another year or two, Is there any workaround that EFCore team can provide, even using the function FromQuery but allowing to extend that from query with more linq?

@smitpatel
Copy link
Member

Use FromSql with keyless entity type to fetch data from the server and then change shape on the client side to desired shape.

@bdongus
Copy link

bdongus commented Oct 18, 2019

At least for us, this is a showstopper. EF core 3.0 throws an exception if something is executed in memory and using SQL would bring infrastructure responsibitilies into the business layer.

Please consider the implementation of .First() too. In many cases we just need the first row of every group.

@Jonatthu
Copy link
Author

Jonatthu commented Oct 18, 2019

@bdongus I just made the implementation!
By using IMethodCallTranslator but only for SQL Server for now
I need to do the same fo SQLite

(SQLite not sure if i can do this)

@Jonatthu
Copy link
Author

Jonatthu commented Oct 18, 2019

https://docs.microsoft.com/en-us/dotnet/api/microsoft.entityframeworkcore.query.sqlexpressions.rownumberexpression?view=efcore-3.0

By using this, helps to implement that interface, and it's only for efcore 3.0

smitpatel added a commit that referenced this issue Apr 16, 2021
- ClientProjections is alternate when ProjectionMember binding is not possible using indexes
- ClientProjections is separate from SelectExpression.Projection where former holds complex client side mapped projections and later holds only scalars
- What it enables
  - Single result subquery is not joined right away
  - pendingCollections are removed
  - Ability to bind with above subquery translations after projection has converted to index based binding (enabling pending selector future change)
  - Ability to bind client projections smoothly like ProjectionMember binding so suppose translations post-client eval where it is supported
  - Grouping.FirstOrDefault can add subquery projection which applies separately so we can combine aggregate/non-aggregate projection on grouping
- Introduce CollectionResultExpression which holds info on how to create collection for a subquery
- ApplyProjection converts projectionMember/Index based bindings to actual scalar projection and updates shaper in same pass
- Unique collectionId are assigned by shaper
- Change in flow to projection to let sqlTranslator determines translatibility before processing for client side
- Regression in collection over single result subquery
  - Earlier we applied single result right away copying over pending collection to outer which allowed single result subquery to convert apply to join
  - Now all client projections apply at same time so pending collection (equivalent subquery) is applied inside single result subquery so we fail to convert apply to join

Ground work for #20291, #12088, #13805
Resolves #23571
smitpatel added a commit that referenced this issue Apr 16, 2021
- ClientProjections is alternate when ProjectionMember binding is not possible using indexes
- ClientProjections is separate from SelectExpression.Projection where former holds complex client side mapped projections and later holds only scalars
- What it enables
  - Single result subquery is not joined right away
  - pendingCollections are removed
  - Ability to bind with above subquery translations after projection has converted to index based binding (enabling pending selector future change)
  - Ability to bind client projections smoothly like ProjectionMember binding so suppose translations post-client eval where it is supported
  - Grouping.FirstOrDefault can add subquery projection which applies separately so we can combine aggregate/non-aggregate projection on grouping
- Introduce CollectionResultExpression which holds info on how to create collection for a subquery
- ApplyProjection converts projectionMember/Index based bindings to actual scalar projection and updates shaper in same pass
- Unique collectionId are assigned by shaper
- Change in flow to projection to let sqlTranslator determines translatibility before processing for client side
- Regression in collection over single result subquery
  - Earlier we applied single result right away copying over pending collection to outer which allowed single result subquery to convert apply to join
  - Now all client projections apply at same time so pending collection (equivalent subquery) is applied inside single result subquery so we fail to convert apply to join

Ground work for #20291, #12088, #13805
Resolves #23571
smitpatel added a commit that referenced this issue Apr 20, 2021
- ClientProjections is alternate when ProjectionMember binding is not possible using indexes
- ClientProjections is separate from SelectExpression.Projection where former holds complex client side mapped projections and later holds only scalars
- What it enables
  - Single result subquery is not joined right away
  - pendingCollections are removed
  - Ability to bind with above subquery translations after projection has converted to index based binding (enabling pending selector future change)
  - Ability to bind client projections smoothly like ProjectionMember binding so suppose translations post-client eval where it is supported
  - Grouping.FirstOrDefault can add subquery projection which applies separately so we can combine aggregate/non-aggregate projection on grouping
- Introduce CollectionResultExpression which holds info on how to create collection for a subquery
- ApplyProjection converts projectionMember/Index based bindings to actual scalar projection and updates shaper in same pass
- Unique collectionId are assigned by shaper
- Change in flow to projection to let sqlTranslator determines translatibility before processing for client side
- Regression in collection over single result subquery
  - Earlier we applied single result right away copying over pending collection to outer which allowed single result subquery to convert apply to join
  - Now all client projections apply at same time so pending collection (equivalent subquery) is applied inside single result subquery so we fail to convert apply to join

Ground work for #20291, #12088, #13805
Resolves #23571
ghost pushed a commit that referenced this issue Apr 20, 2021
- ClientProjections is alternate when ProjectionMember binding is not possible using indexes
- ClientProjections is separate from SelectExpression.Projection where former holds complex client side mapped projections and later holds only scalars
- What it enables
  - Single result subquery is not joined right away
  - pendingCollections are removed
  - Ability to bind with above subquery translations after projection has converted to index based binding (enabling pending selector future change)
  - Ability to bind client projections smoothly like ProjectionMember binding so suppose translations post-client eval where it is supported
  - Grouping.FirstOrDefault can add subquery projection which applies separately so we can combine aggregate/non-aggregate projection on grouping
- Introduce CollectionResultExpression which holds info on how to create collection for a subquery
- ApplyProjection converts projectionMember/Index based bindings to actual scalar projection and updates shaper in same pass
- Unique collectionId are assigned by shaper
- Change in flow to projection to let sqlTranslator determines translatibility before processing for client side
- Regression in collection over single result subquery
  - Earlier we applied single result right away copying over pending collection to outer which allowed single result subquery to convert apply to join
  - Now all client projections apply at same time so pending collection (equivalent subquery) is applied inside single result subquery so we fail to convert apply to join

Ground work for #20291, #12088, #13805
Resolves #23571
smitpatel added a commit that referenced this issue Aug 12, 2021
- Allow expanding navigations after GroupBy operator applied before reducing it to non-grouping Fixes #22609
- Translate FirstOrDefault over grouping element Fixes #12088
- Add ability to select N element over grouping element Fixes #13805

Overall approach:
A grouping element (the range variable you get after applying GroupBy operator) is of type `IGrouping<TKey, TElement>` which implements `IEnumerable<TElement>`. Hence we treat this enumerable as if it is queryable during nav expansion phase. During translation phase we inject ShapedQueryExpression in place of the grouping element which is being enumerated. What this allows us is to expand navigation just like any other query root and translate a subquery similar to other subqueries to facilitate reusing same code for the tasks.
During translation phase in relational layer, since aggregate operation can be lifted into projection for SelectExpression containing SQL GROUP BY. This code path works in 2 ways, when translating we try to combine predicate/distinct into the aggregate operation (so in future when we support custom aggregate operators, we don't have to understand the shape of it to modify it later. When adding this scalar subquery to SelectExpression, we try to pattern match it to see if we can lift it. Further during lifting, we also lift any additional joins in the subquery (which implies there were some joins expanded on grouping element before aggregate) including the navigation expanded from owned navigations. A pending TODO is to de-dupe navigation expanded. It is not straight forward since aliases of table would have changed when previous was lifted.
Given every enumerable grouping element act as query root, every time we replace it inside a lambda expression, we need to create a copy of the root. Navigation expansion and individual queryableMethodTranslatingEV does this. So each root act and translate independently from each other.

Bug fixes:
- Fix a bug in identifying single result in InMemory to convert it to enumerable
- Null out _groupingParameter in InMemoryQueryExpression once the projection to reduce it has been applied
- Throw better error message when translating Min/Max over an entity type for InMemory
@smitpatel smitpatel added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Aug 12, 2021
smitpatel added a commit that referenced this issue Aug 12, 2021
- Allow expanding navigations after GroupBy operator applied before reducing it to non-grouping Fixes #22609
- Translate FirstOrDefault over grouping element Fixes #12088
- Add ability to select N element over grouping element Fixes #13805

Overall approach:
A grouping element (the range variable you get after applying GroupBy operator) is of type `IGrouping<TKey, TElement>` which implements `IEnumerable<TElement>`. Hence we treat this enumerable as if it is queryable during nav expansion phase. During translation phase we inject ShapedQueryExpression in place of the grouping element which is being enumerated. What this allows us is to expand navigation just like any other query root and translate a subquery similar to other subqueries to facilitate reusing same code for the tasks.
During translation phase in relational layer, since aggregate operation can be lifted into projection for SelectExpression containing SQL GROUP BY. This code path works in 2 ways, when translating we try to combine predicate/distinct into the aggregate operation (so in future when we support custom aggregate operators, we don't have to understand the shape of it to modify it later. When adding this scalar subquery to SelectExpression, we try to pattern match it to see if we can lift it. Further during lifting, we also lift any additional joins in the subquery (which implies there were some joins expanded on grouping element before aggregate) including the navigation expanded from owned navigations. A pending TODO is to de-dupe navigation expanded. It is not straight forward since aliases of table would have changed when previous was lifted.
Given every enumerable grouping element act as query root, every time we replace it inside a lambda expression, we need to create a copy of the root. Navigation expansion and individual queryableMethodTranslatingEV does this. So each root act and translate independently from each other.

Bug fixes:
- Fix a bug in identifying single result in InMemory to convert it to enumerable
- Null out _groupingParameter in InMemoryQueryExpression once the projection to reduce it has been applied
- Throw better error message when translating Min/Max over an entity type for InMemory
smitpatel added a commit that referenced this issue Aug 12, 2021
- Allow expanding navigations after GroupBy operator applied before reducing it to non-grouping Fixes #22609
- Translate FirstOrDefault over grouping element Fixes #12088
- Add ability to select N element over grouping element Fixes #13805

Overall approach:
A grouping element (the range variable you get after applying GroupBy operator) is of type `IGrouping<TKey, TElement>` which implements `IEnumerable<TElement>`. Hence we treat this enumerable as if it is queryable during nav expansion phase. During translation phase we inject ShapedQueryExpression in place of the grouping element which is being enumerated. What this allows us is to expand navigation just like any other query root and translate a subquery similar to other subqueries to facilitate reusing same code for the tasks.
During translation phase in relational layer, since aggregate operation can be lifted into projection for SelectExpression containing SQL GROUP BY. This code path works in 2 ways, when translating we try to combine predicate/distinct into the aggregate operation (so in future when we support custom aggregate operators, we don't have to understand the shape of it to modify it later. When adding this scalar subquery to SelectExpression, we try to pattern match it to see if we can lift it. Further during lifting, we also lift any additional joins in the subquery (which implies there were some joins expanded on grouping element before aggregate) including the navigation expanded from owned navigations. A pending TODO is to de-dupe navigation expanded. It is not straight forward since aliases of table would have changed when previous was lifted.
Given every enumerable grouping element act as query root, every time we replace it inside a lambda expression, we need to create a copy of the root. Navigation expansion and individual queryableMethodTranslatingEV does this. So each root act and translate independently from each other.

Bug fixes:
- Fix a bug in identifying single result in InMemory to convert it to enumerable
- Null out _groupingParameter in InMemoryQueryExpression once the projection to reduce it has been applied
- Throw better error message when translating Min/Max over an entity type for InMemory
smitpatel added a commit that referenced this issue Aug 13, 2021
- Allow expanding navigations after GroupBy operator applied before reducing it to non-grouping Fixes #22609
- Translate FirstOrDefault over grouping element Fixes #12088
- Add ability to select N element over grouping element Fixes #13805

Overall approach:
A grouping element (the range variable you get after applying GroupBy operator) is of type `IGrouping<TKey, TElement>` which implements `IEnumerable<TElement>`. Hence we treat this enumerable as if it is queryable during nav expansion phase. During translation phase we inject ShapedQueryExpression in place of the grouping element which is being enumerated. What this allows us is to expand navigation just like any other query root and translate a subquery similar to other subqueries to facilitate reusing same code for the tasks.
During translation phase in relational layer, since aggregate operation can be lifted into projection for SelectExpression containing SQL GROUP BY. This code path works in 2 ways, when translating we try to combine predicate/distinct into the aggregate operation (so in future when we support custom aggregate operators, we don't have to understand the shape of it to modify it later. When adding this scalar subquery to SelectExpression, we try to pattern match it to see if we can lift it. Further during lifting, we also lift any additional joins in the subquery (which implies there were some joins expanded on grouping element before aggregate) including the navigation expanded from owned navigations. A pending TODO is to de-dupe navigation expanded. It is not straight forward since aliases of table would have changed when previous was lifted.
Given every enumerable grouping element act as query root, every time we replace it inside a lambda expression, we need to create a copy of the root. Navigation expansion and individual queryableMethodTranslatingEV does this. So each root act and translate independently from each other.

Bug fixes:
- Fix a bug in identifying single result in InMemory to convert it to enumerable
- Null out _groupingParameter in InMemoryQueryExpression once the projection to reduce it has been applied
- Throw better error message when translating Min/Max over an entity type for InMemory
smitpatel added a commit that referenced this issue Aug 16, 2021
- Allow expanding navigations after GroupBy operator applied before reducing it to non-grouping Fixes #22609
- Translate FirstOrDefault over grouping element Fixes #12088
- Add ability to select N element over grouping element Fixes #13805

Overall approach:
A grouping element (the range variable you get after applying GroupBy operator) is of type `IGrouping<TKey, TElement>` which implements `IEnumerable<TElement>`. Hence we treat this enumerable as if it is queryable during nav expansion phase. During translation phase we inject ShapedQueryExpression in place of the grouping element which is being enumerated. What this allows us is to expand navigation just like any other query root and translate a subquery similar to other subqueries to facilitate reusing same code for the tasks.
During translation phase in relational layer, since aggregate operation can be lifted into projection for SelectExpression containing SQL GROUP BY. This code path works in 2 ways, when translating we try to combine predicate/distinct into the aggregate operation (so in future when we support custom aggregate operators, we don't have to understand the shape of it to modify it later. When adding this scalar subquery to SelectExpression, we try to pattern match it to see if we can lift it. Further during lifting, we also lift any additional joins in the subquery (which implies there were some joins expanded on grouping element before aggregate) including the navigation expanded from owned navigations. A pending TODO is to de-dupe navigation expanded. It is not straight forward since aliases of table would have changed when previous was lifted.
Given every enumerable grouping element act as query root, every time we replace it inside a lambda expression, we need to create a copy of the root. Navigation expansion and individual queryableMethodTranslatingEV does this. So each root act and translate independently from each other.

Bug fixes:
- Fix a bug in identifying single result in InMemory to convert it to enumerable
- Null out _groupingParameter in InMemoryQueryExpression once the projection to reduce it has been applied
- Throw better error message when translating Min/Max over an entity type for InMemory
smitpatel added a commit that referenced this issue Aug 16, 2021
- Allow expanding navigations after GroupBy operator applied before reducing it to non-grouping Fixes #22609
- Translate FirstOrDefault over grouping element Fixes #12088
- Add ability to select N element over grouping element Fixes #13805

Overall approach:
A grouping element (the range variable you get after applying GroupBy operator) is of type `IGrouping<TKey, TElement>` which implements `IEnumerable<TElement>`. Hence we treat this enumerable as if it is queryable during nav expansion phase. During translation phase we inject ShapedQueryExpression in place of the grouping element which is being enumerated. What this allows us is to expand navigation just like any other query root and translate a subquery similar to other subqueries to facilitate reusing same code for the tasks.
During translation phase in relational layer, since aggregate operation can be lifted into projection for SelectExpression containing SQL GROUP BY. This code path works in 2 ways, when translating we try to combine predicate/distinct into the aggregate operation (so in future when we support custom aggregate operators, we don't have to understand the shape of it to modify it later. When adding this scalar subquery to SelectExpression, we try to pattern match it to see if we can lift it. Further during lifting, we also lift any additional joins in the subquery (which implies there were some joins expanded on grouping element before aggregate) including the navigation expanded from owned navigations. A pending TODO is to de-dupe navigation expanded. It is not straight forward since aliases of table would have changed when previous was lifted.
Given every enumerable grouping element act as query root, every time we replace it inside a lambda expression, we need to create a copy of the root. Navigation expansion and individual queryableMethodTranslatingEV does this. So each root act and translate independently from each other.

Bug fixes:
- Fix a bug in identifying single result in InMemory to convert it to enumerable
- Null out _groupingParameter in InMemoryQueryExpression once the projection to reduce it has been applied
- Throw better error message when translating Min/Max over an entity type for InMemory
@ghost ghost closed this as completed in #25495 Aug 17, 2021
ghost pushed a commit that referenced this issue Aug 17, 2021
- Allow expanding navigations after GroupBy operator applied before reducing it to non-grouping Fixes #22609
- Translate FirstOrDefault over grouping element Fixes #12088
- Add ability to select N element over grouping element Fixes #13805

Overall approach:
A grouping element (the range variable you get after applying GroupBy operator) is of type `IGrouping<TKey, TElement>` which implements `IEnumerable<TElement>`. Hence we treat this enumerable as if it is queryable during nav expansion phase. During translation phase we inject ShapedQueryExpression in place of the grouping element which is being enumerated. What this allows us is to expand navigation just like any other query root and translate a subquery similar to other subqueries to facilitate reusing same code for the tasks.
During translation phase in relational layer, since aggregate operation can be lifted into projection for SelectExpression containing SQL GROUP BY. This code path works in 2 ways, when translating we try to combine predicate/distinct into the aggregate operation (so in future when we support custom aggregate operators, we don't have to understand the shape of it to modify it later. When adding this scalar subquery to SelectExpression, we try to pattern match it to see if we can lift it. Further during lifting, we also lift any additional joins in the subquery (which implies there were some joins expanded on grouping element before aggregate) including the navigation expanded from owned navigations. A pending TODO is to de-dupe navigation expanded. It is not straight forward since aliases of table would have changed when previous was lifted.
Given every enumerable grouping element act as query root, every time we replace it inside a lambda expression, we need to create a copy of the root. Navigation expansion and individual queryableMethodTranslatingEV does this. So each root act and translate independently from each other.

Bug fixes:
- Fix a bug in identifying single result in InMemory to convert it to enumerable
- Null out _groupingParameter in InMemoryQueryExpression once the projection to reduce it has been applied
- Throw better error message when translating Min/Max over an entity type for InMemory
@bhugot
Copy link

bhugot commented Aug 17, 2021

This still don't work with efcore 5.0.9

@n9
Copy link

n9 commented Aug 17, 2021

@bhugot See the milestone.

@bhugot
Copy link

bhugot commented Aug 17, 2021

@n9 I agree but I see the PR #25495 is included in 5.0.9

@n9
Copy link

n9 commented Aug 17, 2021

@bhugot Where do you see this? (https://github.com/dotnet/efcore/releases/tag/v5.0.9)

@bhugot
Copy link

bhugot commented Aug 17, 2021

forget it i'm not well awake :(

@shoooe
Copy link

shoooe commented Oct 8, 2021

@ajcvickers How is this supported atm? I can't seem to find examples in the release notes or documentation.

@mguinness
Copy link

@shoooe Take a look at Improved GroupBy support.

@ajcvickers ajcvickers modified the milestones: 6.0.0-rc1, 6.0.0 Nov 8, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported ef6-parity type-enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.