Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query: Support GroupJoin when it is final query operator #19930

Closed
Tracked by #24106
smitpatel opened this issue Feb 14, 2020 · 24 comments · Fixed by #28998
Closed
Tracked by #24106

Query: Support GroupJoin when it is final query operator #19930

smitpatel opened this issue Feb 14, 2020 · 24 comments · Fixed by #28998
Assignees
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. ef6-parity type-enhancement
Milestone

Comments

@smitpatel
Copy link
Member

This issue is about enabling following query.

context.Customers.GroupJoin(context.Orders, c => c.CustomerID, o => o.CustomerID, (c, os) => new {c, os}).ToList();
// or
from c in context.Customers
join o in context.Orders on c.CustomerID equals o.CustomerID into os
select { c, os}

We will not client eval GroupJoin in any of following cases

  • When GroupJoin operator is followed by any other query operator
  • When you apply any other operator before select in comprehension syntax (2nd query above)
  • When you compose over os in above query in any form

All above unsupported cases will still throw client evaluation exception even after this issue is fixed.

GroupJoin operator in query which is used to generated LeftJoin in SQL as described in our documentation is supported and will remains so. Even after that use case, if you still need EF Core to client eval GroupJoin operator with aforementioned constraints please upvote this issue. We are not looking to client eval any other GroupJoin which appears in the middle of the query.

@powermetal63
Copy link

Don't waste your time with this useless enhancement. This definitely is NOT what I wanted. Client eval, seriously? If it wasn't clear after all these writings in the now closed issue, I want FULL SERVER support for this LINQ operator. And btw, it has nothing in common with GroupBy. It's a correlation operator similar to Join, but producing correlated set rather than flattened set. Putting it into the same box as GroupBy is a BIG mistake. IMHO. This my last comment here, luckily I'm not using EF Core in production, hence I don't really care, except for the unlucky people which do use and I'm trying to help overcoming the unnecessary limitations put by EF Core team. We all know LINQ to Queryable is broken, but EF Core makes that a rule rather than an exception. The sad thing is that even though EF6 has no client eval concept, it translates and evaluates much more LINQ constructs than EF Core, so it's not surprising that some people actively propose EF6.4 or EF Classic as better EF Core alternatives. Which is sad because I like the simplicity of the EF Core exposed metadata, also the replaceable services architecture, but w/o proper query translation it is just not useful in real world applications. No more comments.

@powermetal63
Copy link

Really last comment :-) (hopefully constructive)

Here is quick and dirty EXTERNAL implementation of what I wanted:

public static partial class QueryableExtensions
{
	public static IQueryable<T> ConvertGroupJoins<T>(this IQueryable<T> source)
	{
		var expression = new GroupJoinConverter().Visit(source.Expression);
		if (source.Expression == expression) return source;
		return source.Provider.CreateQuery<T>(expression);
	}

	class GroupJoinConverter : ExpressionVisitor
	{
		protected override Expression VisitMethodCall(MethodCallExpression node)
		{
			if (node.Method.DeclaringType == typeof(Queryable) && node.Method.Name == nameof(Queryable.GroupJoin) && node.Arguments.Count == 5)
			{
				var outer = Visit(node.Arguments[0]);
				var inner = Visit(node.Arguments[1]);
				var outerKeySelector = Visit(node.Arguments[2]).UnwrapLambdaFromQuote();
				var innerKeySelector = Visit(node.Arguments[3]).UnwrapLambdaFromQuote();
				var resultSelector = Visit(node.Arguments[4]).UnwrapLambdaFromQuote();
				var outerKey = outerKeySelector.Body.ReplaceParameter(outerKeySelector.Parameters[0], resultSelector.Parameters[0]);
				var innerKey = innerKeySelector.Body;
				var keyMatch = MatchKeys(outerKey, innerKey);
				var innerQuery = Expression.Call(
					typeof(Queryable), nameof(Queryable.Where), new[] { innerKeySelector.Parameters[0].Type },
					inner, Expression.Lambda(keyMatch, innerKeySelector.Parameters));
				var resultTypes = resultSelector.Parameters.Select(p => p.Type).ToArray();
				var tempProjectionType = typeof(Tuple<,>).MakeGenericType(resultTypes);
				var tempProjection = Expression.New(
					tempProjectionType.GetConstructor(resultTypes),
					new Expression[] { resultSelector.Parameters[0], innerQuery },
					tempProjectionType.GetProperty("Item1"), tempProjectionType.GetProperty("Item2"));
				var tempQuery = Expression.Call(
					typeof(Queryable), nameof(Queryable.Select), new[] { outerKeySelector.Parameters[0].Type, tempProjectionType },
					outer, Expression.Lambda(tempProjection, resultSelector.Parameters[0]));
				var tempResult = Expression.Parameter(tempProjectionType, "p");
				var projection = resultSelector.Body
					.ReplaceParameter(resultSelector.Parameters[0], Expression.Property(tempResult, "Item1"))
					.ReplaceParameter(resultSelector.Parameters[1], Expression.Property(tempResult, "Item2"));
				var query = Expression.Call(
					typeof(Queryable), nameof(Queryable.Select), new[] { tempProjectionType, projection.Type },
					tempQuery, Expression.Lambda(projection, tempResult));
				return query;
			}
			return base.VisitMethodCall(node);
		}

		static Expression MatchKeys(Expression outerKey, Expression innerKey)
		{
			var newOuterKey = outerKey as NewExpression;
			if (newOuterKey == null)
				return Match(outerKey, innerKey);
			var newInnerKey = (NewExpression)innerKey;
			Expression keyMatch = null;
			for (int i = 0; i < newOuterKey.Arguments.Count; i++)
			{
				var match = Match(newOuterKey.Arguments[i], newInnerKey.Arguments[i]);
				keyMatch = keyMatch != null ? Expression.AndAlso(keyMatch, match) : match;
			}
			return keyMatch;
		}

		static Expression Match(Expression left, Expression right)
			=> Expression.Equal(left, right);
	}
}

When applied, it allows translation/processing (in 3.1.1) of all GroupJoin scenarios.

The question is, if I can do that externally, why can't you do something similar internally (using all the power of your expression processing helpers/experience) :-/ Cheers.

@ajcvickers ajcvickers changed the title Query: Client eval GroupJoin when it is final query operator Query: Support GroupJoin when it is final query operator Feb 18, 2020
@ajcvickers ajcvickers added this to the Backlog milestone Feb 18, 2020
@NetMage
Copy link

NetMage commented May 7, 2020

I think this is a good enhancement because without it, the (general) EF Core 3.x advice to manually switch to client-side evaluation does exactly what has been often commented as worth avoiding in automatic client side evaluation, i.e. pulling a lot of data from the server unnecessarily.

At least, in something like

var ans = context.Customers
                 .Where(c => c.CustomerID < 100)
                 .GroupJoin(context.Orders, c => c.CustomerID, o => o.CustomerID, (c, os) => new { c, os })
                 .ToList();

where could you put AsEnumerable that wouldn't involve pulling a lot of unnecessary Orders rows?

(I hope this case would be handled.)

It would be really good if this supports projection at least so that unneeded columns aren't pulled to the client as in:

var ans = context.Customers
                 .Where(c => c.CustomerID < 100)
                 .GroupJoin(context.Orders, c => c.CustomerID, o => o.CustomerID, (c, os) => new { c.CustomerID, OrderIDs = os.Select(o => o.OrderID) })
                 .ToList();

I am still not sure if translating to a single correlated SQL query versus two queries to minimize data transfer would be most efficient, but I suppose the answer would very depending on the database.

@smitpatel
Copy link
Member Author

context.Customers
    .Where(c => c.CustomerID < 100)
    .Select(c => new {
        c = c,
        os = context.Orders.Where(o => c.CustomerID == o.CustomerID).ToList()
    })

is equivalent query. LINQ is magical.

@powermetal63
Copy link

powermetal63 commented May 8, 2020

context.Customers
.Where(c => c.CustomerID < 100)
.Select(c => new {
c = c,
os = context.Orders.Where(o => c.CustomerID == o.CustomerID).ToList()
})
is equivalent query. LINQ is magical.

Yes, it is. That's why the EF job is to convert/transform/treat the original as the manually written equivalent.

SQL also has different valid query constructs (WHERE IN (…), WHERE EXISTS (…), JOIN`) which can produce equivalent results, but they are not forcing people to use just one of them. Modern query optimizers recognizes them and create one an the same execution plan. And EF Core is intended to be a "modern" framework, isn't it?


P.S. Sorry for repeating/linking to it couple times, but see the ConvertGroupJoins transformation from #19930 (comment) and then below applied to @NetMage example and your response:

var qA = context.Customers
    .Where(c => c.CustomerID < 100)
    .GroupJoin(context.Orders, c => c.CustomerID, o => o.CustomerID, (c, os) => new { c, os });

var qB = context.Customers
    .Where(c => c.CustomerID < 100)
    .Select(c => new {
        c = c,
        os = context.Orders.Where(o => c.CustomerID == o.CustomerID).ToList()
    });

var qC = qA.ConvertGroupJoins();
// qC is same as qB, should be done automatically by EF :)

@roji
Copy link
Member

roji commented May 8, 2020

Yes, it is. That's why the EF job is to convert/transform/treat the original as the manually written equivalent.

While in theory that's true, given that the amount of work we have is huge, we prioritize working on translations (and features in general) which don't have easy/simple workaround such as this one. If it's possible to rewrite your LINQ query so that it works - and is even simpler as in the case above - isn't that reasonable?

@powermetal63
Copy link

Yes, it is. That's why the EF job is to convert/transform/treat the original as the manually written equivalent.

While in theory that's true, given that the amount of work we have is huge, we prioritize working on translations (and features in general) which don't have easy/simple workaround such as this one. If it's possible to rewrite your LINQ query so that it works - and is even simpler as in the case above - isn't that reasonable?

@roji No, it isn't, because you want be to do something additional work which is not strongly necessary. Because my time is costly too.

Just out of curiosity, why you spent your valuable time writing the (IMHO more complicated) GroupJoinFlatteningExpressionVisitor to support the weird LINQ left join pattern? You could have asked the end users to rewrite their LINQ queries using the even simpler correlated subquery syntax, right?

Why don't you go further and remove the whole LINQ support and ask the end users to rewrite their queries using some "even simpler" but EF Core specific dialect for their queries?

Finally, it's a shame the "modern" framework translating less LINQ patterns than its "obsolete" predecessor. Just FYI :)

@roji
Copy link
Member

roji commented May 8, 2020

Once again, where we can, we add translations based on the implementation cost, on whether a different construct exists to express the same thing, and importantly, based on user feedback. This issue has received almost no votes or feedback from users, which is a main reason why we haven't prioritized it.

@NetMage
Copy link

NetMage commented May 8, 2020

That is nice, could it not be used as a GroupJoin translation?

@powermetal63
Copy link

@roji

This issue has received almost no votes or feedback from users,

Because the victims go to StackOverflow instead of here?

@powermetal63
Copy link

@NetMage This is what I'm fighting for here. But they "see no value" :(

@roji
Copy link
Member

roji commented May 9, 2020

This issue has received almost no votes or feedback from users,

Because the victims go to StackOverflow instead of here?

We do have other issues with hundreds of votes - this one does not. Note that @smitpatel - who is a team member - is the one who opened this issue in the first place, so it's not true that we see no value. But at the risk of repeating myself, this issue has very little user requests, and there is another LINQ construct that produces the same result (one could even say that it's simpler). Therefore, since we have an enormous list of things to do and our resources are limited, this issue is in the backlog (not closed).

@NetMage
Copy link

NetMage commented May 11, 2020

@powermetal63 Theoretically, if one cares enough, an enhancement could be done by someone from the community. Unfortunately having a usable EF Core is not my primary job, but I am working on a C# extension method using this translation in my spare time.

@powermetal63
Copy link

@roji From my point of view this is continuation of #17068, which was closed by @smitpatel with the following conclusion

GroupJoin operator
GroupJoin is mainly used to generate LeftJoin and there is no SQL equivalent. Hence we will not translate any composition over GroupJoin.
Filed #19930 to client eval GroupJoin when it is final query operator and grouping is not being composed over.
In all other cases, GroupJoin is not going to supported.
GroupJoin <-> GroupBy equivalence
Due to high cost of pattern matching GroupJoin operator, we are not going to add any translation in GroupJoin even if equivalent GroupBy query would be translated. Customers would be required use GroupBy syntax as that is closest representation of what is the SQL.

And here are some related issues recently closed as duplicate of it - #13887, #14490, #20660

What I'm trying to argue from the beginning is that putting GroupJoin into same box as GroupBy is wrong. And what I really want is reopening a separate issue for (or changing this to) Query: Support GroupJoin translation. With the following rationale:

As you can see from my sample code, doing that is quite easy even with external code. You already put significant effort in 3.x to support just one variation - LINQ left join pattern using SelectMany on GroupJoin.DefaultIfEmpty, and even additional effort for porting (refactoring) it into this huge TryFlattenGroupJoinSelectMany method.

But running my ConvertGroupJoins method not only solves the GroupJoin translation in general, but also totally eliminates the need of that TryFlattenGroupJoinSelectMany code (because there are no GroupJoins in the query expression tree at all!).

What I'm suggesting is replacing TryFlattenGroupJoinSelectMany with GroupJoin transformation similar to my sample code (adjusted with your internal coding standards) and voila - issue solved once forever. All that with much less code than currently.

@sungam3r
Copy link
Contributor

@powermetal63 What is ReplaceParameter in your example?

@powermetal63
Copy link

@sungam3r Nothing special, a typical expression visitor based helper for finding ParemeterExpression instances inside expression tree and replacing them with arbitrary Expression. Something like

public static partial class ExpressionUtils
{
    public static Expression ReplaceParameter(this Expression source, ParameterExpression parameter, Expression value)
        => new ParameterReplacer { Parameter = parameter, Value = value }.Visit(source);

    class ParameterReplacer : ExpressionVisitor
    {
        public ParameterExpression Parameter;
        public Expression Value;
        protected override Expression VisitParameter(ParameterExpression node)
            => node == Parameter ? Value : node;
    }
}

@NetMage
Copy link

NetMage commented Jan 5, 2021

@powermetal63 Can you provide a sample of the transformation your ConvertGroupJoins does on a query?

@smitpatel
Copy link
Member Author

There are other 3rd party libraries which integrate on top of EF Core and provide all those functionality. OData is one big consumer with EF Core which provides paging/sorting/filtering functionality in the way you mention above without users having to write whole query themselves. We work with them so the queries generated by them are translated correctly in EF Core. There also other extension on top of EF Core to provide more functionality from database side & also other 3rd party library which can be used to create queries.

@DreamU1952
Copy link

DreamU1952 commented Mar 8, 2022

OUCH! I have LINQ queries with 'into' clauses all over my Legacy code. I have just tried to upgrade that old Entity Framework v5.0 to Entity Framework core and hit this issue almost immediately. Stop removing features that used to work! I am tired of all the products being released with breaking changes and products that provide no upgrade path to get from a prior version to a new version. I am leaning towards abandoning Entity Framework as the upgrade hassle is beginning to outweigh the benefits. We just upgraded 30-year old legacy code and the SQL statements just ported straight into the new app! That's a big plus for just staying with good old SQL.

@smitpatel smitpatel added closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. and removed needs-design labels Sep 6, 2022
@ajcvickers ajcvickers modified the milestones: 7.0.0, 7.0.0-rc2 Sep 14, 2022
@ajcvickers ajcvickers modified the milestones: 7.0.0-rc2, 7.0.0 Nov 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. ef6-parity type-enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants