[Feature request] Reordering of comparison in @EqualsAndHashCode for better performance #1543

stsypanov · 2017-12-19T20:46:33Z

Consider simple class:

@EqualsAndHashCode
class Smth {
  Long id;
  List<Long> props;
  String name;
  int count;
}

It generates this code:

public boolean equals(Object o) {
  if (o == this) return true;
  if (!(o instanceof Smth)) return false;
  final Smth other = (Smth) o;
  if (!other.canEqual((Object) this)) return false;
  final Object this$id = this.id;
  final Object other$id = other.id;
  if (this$id == null ? other$id != null : !this$id.equals(other$id)) return false;
  final Object this$props = this.props;
  final Object other$props = other.props;
  if (this$props == null ? other$props != null : !this$props.equals(other$props)) return false;
  final Object this$name = this.name;
  final Object other$name = other.name;
  if (this$name == null ? other$name != null : !this$name.equals(other$name)) return false;
  if (this.count != other.count) return false;
  return true;
}

protected boolean canEqual(Object other) {
  return other instanceof Smth;
}

Then the same code generated by IDEA (for Java 7+):

public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    Smth smth = (Smth) o;
    return count == smth.count &&
            Objects.equals(id, smth.id) &&
            Objects.equals(props, smth.props) &&
            Objects.equals(name, smth.name);
}

IDEA-generated code reorders comparison (primitives and wrappers first) yielding better performance for the case when count or id fields are mutually different and we can immediately return false skipping more complicated comparison of strings and collections.

It'd be nice if Lombok could do the same.

The text was updated successfully, but these errors were encountered:

Maaartinus · 2017-12-20T06:45:11Z

I'm not sure about the usefulness of such an optimization. Obviously, testing the string before the list is better if only the former differs. It's probably also better in the average real use case, but what if the string is megabytes long and the list differ already in their size?

I always order my fields accordingly; not as an optimization, but placing the simple things first makes sense to me. There's a proposal for generating compareTo, where the fields can be ranked via a special annotation.

stsypanov · 2017-12-20T09:21:55Z

My proposal is only about primitives and wrappers like Integer (their comparison is in fact comparison of underlying primitives) as you are never sure about other fields of type Object.

Maaartinus · 2017-12-20T09:51:34Z

@stsypanov You're right, somehow I "saw" name being placed before props. I may be misreading it again, but the primitive count was placed before the wrapper id and the wrapper wasn't moved before anything as it was already at the top.

Anyway, I guess, the primitive comparison is so cheap that moving primitives up sounds like a good idea. I'd place wrappers next because of the two memory indirections (which may be much more costly than the comparison itself). Everything else then in the last group.

Disclaimer: I'm just a random commenter here, so whatever I say, doesn't matter. ;)

stsypanov · 2017-12-21T19:07:26Z

@Maaartinus your comment is useful as it helped me to clarify what is expected from requested change.

victorwss · 2018-01-09T22:42:43Z

Strings are ubiquitous and are reasonably fast to compare if their hashCode()'s do not collide nor happens to be zero. So, I propose that we should use four groups: primitives first, wrappers second, strings third and everything else last.

rspilker · 2018-01-10T10:27:46Z

I think primitives and primitive wrappers first is a good idea.

Regarding Strings, I'm a bit on the fence here. There could be many other objects that are even faster, and with upcoming value types, if their fields are primitives, they are probably faster. And if we give Strings special treatment, there is no way for a user to optimize their code. Currently, they can do that by reordering their fields.

Maaartinus · 2018-01-10T19:54:08Z

@victorwss String are only fast to compare if they hash codes differ. When the hash codes are equal (either by collision or because of the strings being equal), then you're out of luck and need to compare their arrays.

Another problem: In order to use the hash code, you have to call String#hashCode, which possibly has to compute the hash code and that for both strings. This may take much longer than the comparison. Thereafter, they're cached (unless zero), but this may or may not be useful. You could use reflection instead, but that's possibly slow, surely hacky and complicated by compact strings since JDK 9.

Java itself uses neither String.hash nor String.hashCode() in its equals and we're in a worse position as the field is private. So I don't think, we cab do anything fast here.

@rspilker I guess, we could use a benchmark. I'd bet, primitives are the fastest with a large margin as they require no indirection and method call.

Primitive wrappers could IMHO profit most from testing

 Integer a = this.myInt, b = that.myInt;
 if ((a != b) && (a == null || b == null || a.intValue() != b.intValue())) return false;

and avoiding the call.

All objects might profit from testing

 if (a != b && (a == null || b == null || !a.equals(b))) return false;

instead of

 if (a != null ? !a.equals(b) : b != null) return false;

The added tests are redundant, but avoid the indirection and the call in some common cases. The added cost in case the call is necessary is probably negligible (it'd get eliminated completely with inlining, but I guess, this doesn't happen).

Currently, they can do that by reordering their fields.

Sure, imagine a value class starting with an UUID.

samukce · 2020-06-04T03:12:11Z

Hello Folks, an up in this thread... I do have the same needs that are to be able to order the equals intentionally. Even we know that primitive is cheaper and would be first, for my business scenario I do have two String fields, basically: { String name; String value; } . If the equals are built checking fist the name and I have the value one that is the one that really changes frequently its content, the performance is hugely impacted. Sure that I'm talking about thousands of objects been compared, that is the scenario that I have and those more ms have an impact on the business flow that I need to achieve.

I captured this test scenario in a simple example just to compare with Lombok and creating one equals manually with the "right" ordination for my scenario.

https://gist.github.com/samukce/38d1f58a43e670a3c70d361daa80ba1b

Benchmark                           Mode  Cnt    Score   Error  Units
HashCodeSimulation.compareIntelliJ  avgt   15  251.713 ± 5.123  ms/op
HashCodeSimulation.compareLombok    avgt   15  270.360 ± 6.523  ms/op

if the folks are ok I can submit a PR with some idea, How is it sounds? @Maaartinus

randakar · 2020-06-04T05:56:54Z

I think reordering the fields in the class still also affects comparison order. Is that enough for your needs?

…

On Thu, Jun 4, 2020, 05:12 Samuel P. ***@***.***> wrote: Hello Folks, an up in this thread... I do have the same needs that are to be able to order the equals intentionally. Even we know that primitive is cheaper and would be first, for my business scenario I do have two String fields, basically: { String name; String value; } . If the equals are built checking fist the name and I have the value one that is the one that really changes frequently its content, the performance is hugely impacted. Sure that I'm talking about thousands of objects been compared, that is the scenario that I have and those more ms have an impact on the business flow that I need to achieve. I captured this test scenario in a simple example just to compare with Lombok and creating one equals manually with the "right" ordination for my scenario. https://gist.github.com/samukce/38d1f58a43e670a3c70d361daa80ba1b Benchmark Mode Cnt Score Error Units HashCodeSimulation.compareIntelliJ avgt 15 251.713 ± 5.123 ms/op HashCodeSimulation.compareLombok avgt 15 270.360 ± 6.523 ms/op if the folks are ok I can submit a PR with some idea, How is it sounds? @Maaartinus <https://github.com/Maaartinus> — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#1543 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AABIERNGHCCZB5CLO5LLODDRU4GJRANCNFSM4EI6ENCA> .

samukce · 2020-06-04T06:10:24Z

Yeap, I thought this idea but this is really implicit, right? if any other person gives some maintenance in this class is almost impossible to figure out it @randakar .

I thought that we would have the same approach that we do for @ToString where we can inform the rank number for the field.

randakar · 2020-06-04T06:25:39Z

That doesn't sound like a bad idea. The best way to go about it is sane defaults with some measure of manual control as a possibility. Users should not need to fiddle with this kind of thing to get decent performance normally, but in special cases where performance is critical ... That said, maintenance burden is a thing. The maintainers said they won't accept features with a high maintenance cost to (boilerplate reduction) win ratio, and in this case developers are always free to roll their own equals methods. So there's a risk here. Don't let that stop you though, more people working on this project is good.

…

On Thu, Jun 4, 2020, 08:10 Samuel P. ***@***.***> wrote: Yeap, I thought this idea but this is really implicit, right? if any other person gives some maintenance in this class is almost impossible to figure out it @randakar <https://github.com/randakar> . I thought that we would have the same approach that we do for @tostring where we can inform the rank number for the field. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#1543 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AABIERJZQX35NIQWT4UMTQLRU43F7ANCNFSM4EI6ENCA> .

Maaartinus · 2020-06-04T07:20:00Z

@samukce I'm afraid, your benchmark is broken: You've excluded name from CategoryLombok, but it's not constant. Whatever the implications on performance might be, your aren't measuring the same things.

Additionally, the class is not final, so canEqual gets generated and used.

There are other differences like getClass() != o.getClass() vs. instanceof which itself may be relevant. In case of final classes, Lombok could switch to the former, when the performance difference matters, but this would require an own benchmark.

if the folks are ok I can submit a PR with some idea, How is it sounds? @Maaartinus

I'm not a project owner, so I can't decide anything. Anyway, before you start, have a look at #512 discussing the syntax for an annotation joining @EAHC and Comparable.

I guess, you'd like something like @EAHC.Rank. I'd start by benchmarking the difference caused by reordering same-type fields.

I could actually imagine that splitt tests for a field can be better, e.g., like for strings s1 and s2

boolean same_s1 = s1 == that.s1;
if (!same_s1 && (s1 == null || that.s1 == null || s1.length() != that.s1.length())) return false;

boolean same_s2 = s2 == that.s2;
if (!same_s2 && (s2 == null || that.s2 == null || s2.length() != that.s2.length())) return false;

if (!same_s1 && !s1.equals(that.s1)) return false;

if (!same_s2 && !s2.equals(that.s2)) return false;

However, this is most probably nothing for Lombok (too complicated, unsure gain).

Yeap, I thought this idea but this is really implicit, right? if any other person gives some maintenance in this class is almost impossible to figure out it

Sure, and probably nobody would like the order "(value, name)", especially as it's used also in @ToString. If there are just too fields, I'd surely go for a manual solution.

If I were to hack on Lombok (unfortunately no time), I'd start with

@EAHC.Rank on fields and
@EQHC.Strategy on class with values NONE (status quo) and PRIMITIVES_FIRST and a corresponding configuration key

stsypanov · 2020-06-04T09:37:36Z

FYI, IDEA does similar reordering as a result of https://youtrack.jetbrains.com/issue/IDEA-170178. The issue is still open, but in fact when you take example snippet and generate equals() with IDEA the primitive will be compared first.

P. S. however this is not relevant for wrappers, they are not reordered

samukce · 2020-06-04T15:40:00Z

@Maaartinus that was a second test that I did but, to compare with that field vs without that field.

So, yeah, I agree with the approaches that you proposed. The rank looks the most simple and the flexible option, given then for each case the order can impact differently depending on the business code flow. For instance, I can have one object with 5x int fields and one String, but for instance, I can have this String field 90% changing in the time and the ints fields almost fixed value, so for this case, even primitive firsts what will make difference, in this case, will be the String field first. I will open a PR for the rank idea, and we can see the acceptance. I would like to add this in my project and remove the manual equals method that I needed to add in many different classes.

samukce · 2020-06-05T15:55:11Z

@stsypanov it can help you. ;) PR: #2485

rspilker · 2020-06-18T20:19:38Z

I've closed this issue, because we now have two improvements:

@Include(rank=n) can be used to manually re-order the fields in the equals generation.
Primitives get a default rank of 1000, references 0.

So everyone get the added performance boost of the primitives, and customization is possible.

Regarding primitive wrappers, it's not completely off the table yet. Although we can never be sure that Long refers to java.lang.Long (it could be a class a different file in the same package), it is also not likely that that would be a problem. And primitive wrappers are likely candidates, especially now value classes are getting more attention, and your JVM probably already contains optimized code to handle equality for those.

rspilker · 2020-06-18T22:53:04Z

Update: primitives default to 1000, "primitive wrappers" to 800, references to 0.

The caveat is that the primitive wrapper detection could be off, but then:

The chances that someone does this is very low
The impact of a false detection is also low
The rank field can still be used to fix this

samukce mentioned this issue Jun 5, 2020

[Feature request] @EqualsAndHashCode Flexibility for rank properties for equal/hashcode method #2485

Merged

rspilker closed this as completed in 39d2c28 Jun 18, 2020

rspilker added a commit that referenced this issue Jun 18, 2020

#1543: First primitives, then primitive wrappers, then other references

ed8ea0d

sync-by-unito bot mentioned this issue Oct 9, 2020

Bump lombok from 1.18.12 to 1.18.14 liquibase/liquibase-mongodb#42

Merged

bigibas123 mentioned this issue Oct 13, 2020

Bump lombok from 1.18.12 to 1.18.14 bigibas123/bigidiscordbot#27

Merged

sync-by-unito bot mentioned this issue Oct 16, 2020

Bump lombok from 1.18.14 to 1.18.16 liquibase/liquibase-mongodb#46

Merged

This was referenced Mar 13, 2021

Bump lombok from 1.18.14 to 1.18.18 ImSejin/lezhin-comics-downloader#22

Merged

Bump lombok from 1.18.12 to 1.18.18 ks-no/fiks-streaming-klient#56

Closed

Bump lombok from 1.18.10 to 1.18.18 fabiojose/snip#1

Closed

theaspect mentioned this issue Jan 11, 2022

Bump lombok from 1.18.8 to 1.18.22 theaspect/sandbox#15

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature request] Reordering of comparison in @EqualsAndHashCode for better performance #1543

[Feature request] Reordering of comparison in @EqualsAndHashCode for better performance #1543

stsypanov commented Dec 19, 2017

Maaartinus commented Dec 20, 2017

stsypanov commented Dec 20, 2017

Maaartinus commented Dec 20, 2017

stsypanov commented Dec 21, 2017

victorwss commented Jan 9, 2018

rspilker commented Jan 10, 2018

Maaartinus commented Jan 10, 2018

samukce commented Jun 4, 2020

randakar commented Jun 4, 2020 via email

samukce commented Jun 4, 2020

randakar commented Jun 4, 2020 via email

Maaartinus commented Jun 4, 2020

stsypanov commented Jun 4, 2020 •

edited

samukce commented Jun 4, 2020 •

edited

samukce commented Jun 5, 2020

rspilker commented Jun 18, 2020

rspilker commented Jun 18, 2020

[Feature request] Reordering of comparison in @EqualsAndHashCode for better performance #1543

[Feature request] Reordering of comparison in @EqualsAndHashCode for better performance #1543

Comments

stsypanov commented Dec 19, 2017

Maaartinus commented Dec 20, 2017

stsypanov commented Dec 20, 2017

Maaartinus commented Dec 20, 2017

stsypanov commented Dec 21, 2017

victorwss commented Jan 9, 2018

rspilker commented Jan 10, 2018

Maaartinus commented Jan 10, 2018

samukce commented Jun 4, 2020

randakar commented Jun 4, 2020 via email

samukce commented Jun 4, 2020

randakar commented Jun 4, 2020 via email

Maaartinus commented Jun 4, 2020

stsypanov commented Jun 4, 2020 • edited

samukce commented Jun 4, 2020 • edited

samukce commented Jun 5, 2020

rspilker commented Jun 18, 2020

rspilker commented Jun 18, 2020

stsypanov commented Jun 4, 2020 •

edited

samukce commented Jun 4, 2020 •

edited