Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ExpressionEqualityComparer.GetHashCode not consistent with ExpressionEqualityComparer.Equals #30697

Open
aradalvand opened this issue Apr 15, 2023 · 5 comments · May be fixed by #30755
Open

Comments

@aradalvand
Copy link

aradalvand commented Apr 15, 2023

When the parameter names of two otherwise identical expressions are different, ExpressionEqualityComparer.Equals returns true whereas ExpressionEqualityComparer.GetHashCode returns different results for the two expressions:

using Microsoft.EntityFrameworkCore.Query;
using System.Linq.Expressions;

Expression<Func<User, Course, bool>> expr1 = (currentUser, c) => c.Id == currentUser.Id;
Expression<Func<User, Course, bool>> expr2 = (user, course) => course.Id == user.Id;

var exprEq = ExpressionEqualityComparer.Instance;

Console.WriteLine($"exprEq: {exprEq.Equals(expr1, expr2)}");
Console.WriteLine($"expr1 hash: {exprEq.GetHashCode(expr1)}");
Console.WriteLine($"expr2 hash: {exprEq.GetHashCode(expr2)}");

public class Course
{
    public int Id { get; set; }
}

public class User
{
    public int Id { get; set; }
}

Output:

exprEq: True
expr1 hash: -846652839
expr2 hash: -515645303

If the parameter names had been the same, GetHashCode would return the same value for both expressions.

Expected behavior: Unless there's something I'm missing, the GetHashCode algorithm shouldn't be dependent on the names of the parameters, just like the Equals algorithm isn't. The two should be consistent in this regard.

@roji
Copy link
Member

roji commented Apr 15, 2023

Note that Equals disregards parameter names only for lambda parameters. For captured variables it doesn't ignore them; the logic is that captured variable names end up in the actual SQL (as the SQL parameter name), but lambda names really are meaningless when translating to SQL etc.

It's true that GetHashCode could also track parameter scopes and take the name into account only for lambda parameters. This would allow two queries whose only different is a lambda parameter name to be considered the same in the query cache, slightly improving perf (in theory). Or are you seeing an additional problem here, such as an actual bug in EF behavior?

@aradalvand
Copy link
Author

aradalvand commented Apr 15, 2023

@roji Thank you for the quick response.

It's true that GetHashCode could also track parameter scopes and take the name into account only for lambda parameters. This would allow two queries whose only different is a lambda parameter name to be considered the same in the query cache, slightly improving perf (in theory). Or are you seeing an additional problem here, such as an actual bug in EF behavior?

I'm actually using ExpressionEqualityComparer on my own for an expression caching problem I have in my app, I happened to stumble into it and found out that it fits my use case, I then noticed that GetHashCode takes parameter names into account as part of its calculation while Equals doesn't and figured I should point this out here.

This very issue is causing the specific part of my app that's using ExpressionEqualityComparer to behave in an unexpected manner at times, and even though I'm not aware of any instances where this is actually causing a serious bug in EF itself, I believe that as a matter of principle, if you will, this could be considered a bug on its own.
Correct me if I'm wrong but the GetHashCode and Equals methods of an IEqualityComparer should always be consistent, in that when Equals returns true for two given values, you would expect GetHashCode to return identical hashes for those values as well. The current behavior violates this, and I'm not sure if that's somehow intentional?

And even though "correctness" could be thought of as the main concern here, you're also right that fixing this would also theoretically improve EF's performance in some specific cases, since a LINQ query with an identical expression to a previous one, but with different parameter names, would result in a cache hit, as it should, I would argue, because as you yourself said:

...lambda names really are meaningless when translating to SQL

And therefore they shouldn't be part of the query cache key; currently, however, they effectively are.

@roji
Copy link
Member

roji commented Apr 16, 2023

Note that I don't think ExpressionEqualityComparer was meant for use as a general expression tree comparison facility, to be used externally. For example, lambda parameter names may not be important for EF, but it may be very important in other context; so ExpressionEqualityComparer fits EF's needs.

Regardless, I agree about the principle of GetHashCode behaving like Equals here, and not taking into account lambda parameter names - though I wouldn't say it's very high priority.

Are you interested in submitting a PR to fix this?

@roji
Copy link
Member

roji commented Apr 23, 2023

@aradalvand ping

@aradalvand
Copy link
Author

Sorry I missed this.

Are you interested in submitting a PR to fix this?

Sure, will give it a try.

aradalvand added a commit to aradalvand/efcore that referenced this issue Apr 24, 2023
The `GetHashCode` now bases the hash code of each parameter expression
on its position in the containing lambda(s) parameter list, thereby
making it so that otherwise identical lambda expressions whose only
difference is the parameter names to yield the same hash code.

Fixes dotnet#30697
@ajcvickers ajcvickers added this to the Backlog milestone May 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants