Add filter for Kotlin lateinit properties #707

Merged
merged 3 commits into from
Jul 11, 2018

Conversation

fabianishere
Contributor

This change adds a new filter for Kotlin lateinit properties by searching in the bytecode for the following sequence of instructions:

IFNONNULL ...
LDC "..."
INVOKESTATIC kotlin/jvm/internal/Intrinsics.throwUninitializedPropertyAccessException (Ljava/lang/String;)V

and ignoring this sequence if found.
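As a rough illustration of the matching described above, here is a minimal, self-contained sketch. It does not use the ASM or JaCoCo APIs: `Insn`, `LateinitFilterSketch`, and the `ignored` flag are hypothetical stand-ins invented for this example, and the real filter may be structured differently.

```java
// Simplified stand-in for a bytecode instruction node (hypothetical, not the ASM API).
final class Insn {
    final String opcode;   // e.g. "IFNONNULL", "LDC", "INVOKESTATIC"
    final String operand;  // call target for invocations, constant for LDC, label otherwise
    Insn next;             // following instruction, or null at the end of the method
    boolean ignored;       // set when the sketch excludes the instruction from coverage

    Insn(String opcode, String operand) {
        this.opcode = opcode;
        this.operand = operand;
    }
}

final class LateinitFilterSketch {
    static final String THROW_CALL =
        "kotlin/jvm/internal/Intrinsics.throwUninitializedPropertyAccessException(Ljava/lang/String;)V";

    /** Marks every IFNONNULL / LDC / INVOKESTATIC triple as ignored; returns the number of matches. */
    static int filter(Insn first) {
        int matches = 0;
        for (Insn i = first; i != null; i = i.next) {
            Insn ldc = i.next;
            Insn call = ldc == null ? null : ldc.next;
            if (call != null
                    && "IFNONNULL".equals(i.opcode)
                    && "LDC".equals(ldc.opcode)
                    && "INVOKESTATIC".equals(call.opcode)
                    && THROW_CALL.equals(call.operand)) {
                i.ignored = ldc.ignored = call.ignored = true;
                matches++;
            }
        }
        return matches;
    }

    /** Links the given instructions into a chain and returns the first one. */
    static Insn chain(Insn... insns) {
        for (int k = 0; k + 1 < insns.length; k++) {
            insns[k].next = insns[k + 1];
        }
        return insns[0];
    }

    public static void main(String[] args) {
        Insn first = chain(
            new Insn("ALOAD", "0"),
            new Insn("GETFIELD", "x"),
            new Insn("DUP", ""),
            new Insn("IFNONNULL", "L13"),
            new Insn("LDC", "x"),
            new Insn("INVOKESTATIC", THROW_CALL),
            new Insn("ARETURN", ""));
        System.out.println(filter(first)); // prints 1
    }
}
```

The sketch walks the method once and only follows `next` links, so no per-method array of instructions is needed.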

This approach scans through all instructions in every method found, so it might be worth applying the filter only to Kotlin classes. I haven't run any integration tests to measure the performance impact; I only verified the results on a small Kotlin project:

Before: [screenshot: before]

After: [screenshot: after]

@Godin Godin self-requested a review July 5, 2018 13:43
@Godin Godin added this to TODO in Filtering Jul 5, 2018
@Godin Godin moved this from TODO to IN PROGRESS in Filtering Jul 9, 2018
@Godin
Member

Godin commented Jul 9, 2018

Hi @fabianishere ,

first of all thank you for your contribution ❤️

Usually a coverage report is not created for test code (I'm not sure that would make sense in general), so I guess your screenshot doesn't demonstrate all the ways to use lateinit? And for those (like me and @marchof) who do not know much about Kotlin and the internals of the Kotlin compiler, this IMO requires a bit more explanation; below I tried to do a bit of reverse engineering. So I'm wondering if you know where in the sources of the Kotlin compiler we can find the bytecode generation for lateinit? Reading the sources might be simpler than reverse engineering.


For LateInit.kt

// https://kotlinlang.org/docs/reference/properties.html#late-initialized-properties-and-variables

class LateInit {
  lateinit var x: String  // line 4
}

Kotlin compiler generates (javap -v -p LateInit.class)

  public java.lang.String x;
    descriptor: Ljava/lang/String;
    flags: ACC_PUBLIC
    RuntimeInvisibleAnnotations:
      0: #7()

  public final java.lang.String getX();
    descriptor: ()Ljava/lang/String;
    flags: ACC_PUBLIC, ACC_FINAL
    Code:
      stack=2, locals=1, args_size=1
         0: aload_0
         1: getfield      #11                 // Field x:Ljava/lang/String;
         4: dup
         5: ifnonnull     13
         8: ldc           #12                 // String x
        10: invokestatic  #18                 // Method kotlin/jvm/internal/Intrinsics.throwUninitializedPropertyAccessException:(Ljava/lang/String;)V
        13: areturn
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      14     0  this   LLateInit;
      LineNumberTable:
        line 4: 0
    RuntimeInvisibleAnnotations:
      0: #7()

  public final void setX(java.lang.String);
    descriptor: (Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_FINAL
    Code:
      stack=2, locals=2, args_size=2
         0: aload_1
         1: ldc           #25                 // String <set-?>
         3: invokestatic  #29                 // Method kotlin/jvm/internal/Intrinsics.checkParameterIsNotNull:(Ljava/lang/Object;Ljava/lang/String;)V
         6: aload_0
         7: aload_1
         8: putfield      #11                 // Field x:Ljava/lang/String;
        11: return
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      12     0  this   LLateInit;
            0      12     1 <set-?>   Ljava/lang/String;
      LineNumberTable:
        line 4: 6
    RuntimeInvisibleParameterAnnotations:
      0:
        0: #7()

One could assume from this that it is possible to cover all branches just by invoking getX before setX.
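To make the two branches of the getter easier to see, the javap output above can be read roughly as the following Java. This is only a sketch: `LateInitJava` is a hypothetical class, `IllegalStateException` stands in for the exception raised by `Intrinsics.throwUninitializedPropertyAccessException`, and the parameter null check in the setter is left out.

```java
// Rough Java reading of the generated getX/setX; names and exception type are illustrative only.
final class LateInitJava {
    private String x; // stays null until the setter runs

    String getX() {
        String value = x;              // getfield + dup
        if (value == null) {           // ifnonnull 13 jumps over this block when x is set
            // stand-in for Intrinsics.throwUninitializedPropertyAccessException("x")
            throw new IllegalStateException("lateinit property x has not been initialized");
        }
        return value;                  // areturn
    }

    void setX(String value) {
        x = value;                     // putfield
    }

    public static void main(String[] args) {
        LateInitJava o = new LateInitJava();
        try {
            o.getX();                  // null branch: called before setX
        } catch (IllegalStateException e) {
            System.out.println("uninitialized");
        }
        o.setX("value");
        System.out.println(o.getX()); // non-null branch
    }
}
```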

However, similarly to your example, for

// https://kotlinlang.org/docs/reference/properties.html#late-initialized-properties-and-variables

class LateInit {
  lateinit var x: String  // line 4

  fun write() {
    x = ""  // line 7
  }

  fun read() : String {
    return x  // line 11
  }
}

in addition to getX and setX, the Kotlin compiler generates methods with direct access to the field:

  public final void write();
    descriptor: ()V
    flags: ACC_PUBLIC, ACC_FINAL
    Code:
      stack=2, locals=1, args_size=1
         0: aload_0
         1: ldc           #33                 // String
         3: putfield      #11                 // Field x:Ljava/lang/String;
         6: return
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       7     0  this   LLateInit;
      LineNumberTable:
        line 7: 0
        line 8: 6

  public final java.lang.String read();
    descriptor: ()Ljava/lang/String;
    flags: ACC_PUBLIC, ACC_FINAL
    Code:
      stack=2, locals=1, args_size=1
         0: aload_0
         1: getfield      #11                 // Field x:Ljava/lang/String;
         4: dup
         5: ifnonnull     13
         8: ldc           #12                 // String x
        10: invokestatic  #18                 // Method kotlin/jvm/internal/Intrinsics.throwUninitializedPropertyAccessException:(Ljava/lang/String;)V
        13: areturn
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      14     0  this   LLateInit;
      LineNumberTable:
        line 11: 0
    RuntimeInvisibleAnnotations:
      0: #7()

Furthermore, exactly as in your example, for a private property the methods getX and setX are not generated, and for an internal property their names start with getX$ and setX$ followed by the name of the module.

The setter can be made private, in which case the field also becomes private:

class LateInit {
  internal lateinit var x : String
    private set

  fun write() {
    x = ""
  }
}

In this case I'm wondering why the Kotlin compiler generates a setter at all: it looks like dead code, since it can't be called from Java without reflection and Kotlin code doesn't use it. Maybe the guys from JetBrains can give us a hint? 👋 @goodwinnk 😉

A local variable can also be lateinit:

class LateInit {
  fun f() : String {
    lateinit var x : String
    if (b())
      x = ""
    return x  // line 6
  }
}

  public final java.lang.String f();
    descriptor: ()Ljava/lang/String;
    flags: ACC_PUBLIC, ACC_FINAL
    Code:
      stack=2, locals=2, args_size=1
         0: aconst_null
         1: astore_1
         2: aload_0
         3: invokevirtual #11                 // Method b:()Z
         6: ifeq          12
         9: ldc           #13                 // String
        11: astore_1
        12: aload_1
        13: dup
        14: ifnonnull     22
        17: ldc           #15                 // String x
        19: invokestatic  #21                 // Method kotlin/jvm/internal/Intrinsics.throwUninitializedPropertyAccessException:(Ljava/lang/String;)V
        22: areturn
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            2      21     1     x   Ljava/lang/String;
            0      23     0  this   LLateInit;
      LineNumberTable:
        line 3: 0
        line 4: 2
        line 5: 9
        line 6: 12
      StackMapTable: number_of_entries = 2
        frame_type = 252 /* append */
          offset_delta = 12
          locals = [ class java/lang/String ]
        frame_type = 73 /* same_locals_1_stack_item */
          stack = [ class java/lang/String ]
    RuntimeInvisibleAnnotations:
      0: #7()

From the above it is clear that the null check and the invocation of kotlin/jvm/internal/Intrinsics.throwUninitializedPropertyAccessException should indeed be ignored to handle all cases of direct field access.

However, this alone does not seem to be enough to claim completeness of the filter for lateinit: it seems that the only way to cover the setter and getter is to call them explicitly.

Before this change: [screenshots: before1, before2]

After this change: [screenshots: after1, after2]

To me it seems that the only ways to handle the setter are either reading Kotlin metadata or detecting it based on the assumption that it always comes strictly after the getter, which we can detect.

Or we postpone this case as a known limitation. @marchof WDYT?


BTW, and just for fun, here are the reports produced by the coverage engine built into IntelliJ IDEA:

[screenshots: intellij-tracing, intellij-sampling]

😆 😉


With the increasing number of filters dedicated to Kotlin, it might be valuable to introduce validation tests similar to the ones we have for javac and ecj, but this can be done separately.


Last but not least

This approach scans through all instructions in every method found, so it might be worth applying the filter only to Kotlin classes. I haven't run any integration tests to measure the performance impact

If there is a performance problem, then restricting the filter to classes generated by the Kotlin compiler will not solve it; it will only hide it from projects that do not use Kotlin.

IMO iterating over all instructions is not in itself a big issue: we already do this in StringSwitchJavacFilter. However, there we avoid construction of an array by not using access by index (instructions.get) and using getNext instead.

We can question the allocation of Matcher, but maybe this allocation is completely eliminated thanks to inlining and escape analysis in the JVM, or is already fast thanks to TLAB. One can imagine an optimization that avoids unnecessary allocations by matching backward from invokestatic. Another possible optimization is to use hashCode to make the negative comparison of owner and name faster, since AFAIR hashCode is cached in String and Strings are interned thanks to the constant pool.
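The backward-matching idea can be sketched as follows, purely as an illustration and not as a measured optimization. `Node` and `BackwardMatchSketch` are hypothetical stand-ins for a doubly linked instruction list; the real filter works on ASM instruction nodes and may look different.

```java
// Hypothetical doubly linked instruction node; not the ASM API.
final class Node {
    final String opcode;
    final String operand;
    Node prev, next;

    Node(String opcode, String operand) {
        this.opcode = opcode;
        this.operand = operand;
    }
}

final class BackwardMatchSketch {
    static final String TARGET =
        "kotlin/jvm/internal/Intrinsics.throwUninitializedPropertyAccessException";

    /** Returns true if the given node is the INVOKESTATIC that ends the lateinit null check. */
    static boolean endsLateinitCheck(Node call) {
        // Cheap rejection first: most instructions are not INVOKESTATIC at all.
        if (!"INVOKESTATIC".equals(call.opcode) || !TARGET.equals(call.operand)) {
            return false;
        }
        Node ldc = call.prev;
        if (ldc == null || !"LDC".equals(ldc.opcode)) {
            return false;
        }
        Node branch = ldc.prev;
        return branch != null && "IFNONNULL".equals(branch.opcode);
    }

    /** Counts matches without allocating any helper object per instruction. */
    static int count(Node first) {
        int matches = 0;
        for (Node n = first; n != null; n = n.next) {
            if (endsLateinitCheck(n)) {
                matches++;
            }
        }
        return matches;
    }

    /** Links nodes in both directions and returns the first one. */
    static Node link(Node... nodes) {
        for (int k = 1; k < nodes.length; k++) {
            nodes[k - 1].next = nodes[k];
            nodes[k].prev = nodes[k - 1];
        }
        return nodes[0];
    }

    public static void main(String[] args) {
        Node first = link(
            new Node("DUP", ""),
            new Node("IFNONNULL", "L13"),
            new Node("LDC", "x"),
            new Node("INVOKESTATIC", TARGET),
            new Node("ARETURN", ""));
        System.out.println(count(first)); // prints 1
    }
}
```

Whichever end the match starts from, all three instructions are still checked; starting at the invocation only changes which comparison rejects non-matching instructions first.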

But without measurements, I would call all of this unnecessary premature micro-optimization. And here is our usual quick sanity check, an analysis of rt.jar from JDK 8u172:

$ time java -jar jacoco-0.8.2-before/lib/jacococli.jar report --classfiles ~/.java-select/versions/jdk-8u172-macosx-x64/jre/lib/rt.jar
[WARN] No execution data files provided.
[INFO] Analyzing 17742 classes.
        3.81 real         8.33 user         0.56 sys

$ time java -jar jacoco-0.8.2-after/lib/jacococli.jar report --classfiles ~/.java-select/versions/jdk-8u172-macosx-x64/jre/lib/rt.jar
[WARN] No execution data files provided.
[INFO] Analyzing 17742 classes.
        3.98 real         8.90 user         0.58 sys

from which it seems doubtful that this change brings big performance problems.

@fabianishere
Contributor Author

fabianishere commented Jul 10, 2018

Hi @Godin ,

Thanks for the detailed review! I see I forgot to cover all cases. I guess I cheered a bit too soon :p

  1. Translation of lateinit properties in Kotlin currently happens at the IR level (see https://github.com/JetBrains/kotlin/blob/master/compiler/ir/backend.common/src/org/jetbrains/kotlin/backend/common/lower/LateinitLowering.kt), so it does not map directly to bytecode.
  2. Regarding the generated getters/setters and direct field access: I have tested this, and the behavior is actually inherent to all class properties in Kotlin. For every class property, when the property is accessed from within the class and does not have a custom setter, direct field access is used, while other classes access the property through its getter method.
    Since it is not specific to lateinit properties, I think it is best to solve it in another filter.
  3. I am willing to look into adding validation tests for Kotlin. I assume adding a build-time dependency on the Kotlin compiler for the org.jacoco.core.test package shouldn't be an issue.
  4. Okay, I will refactor the filter to use getNext() instead. Since the matching code is so small, I think we can remove the Matcher class and just try to find an INVOKESTATIC node that has an LDC predecessor node.
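Point 2 above can be pictured in Java terms with the sketch below. It is only an analogy: `PropertyOwner`, `PropertyClient`, and the `accessorCalls` counter are hypothetical names added for this illustration, and the Kotlin compiler does not generate such a counter.

```java
// Java analogue of a Kotlin class with a plain property `var p: String = ""`.
final class PropertyOwner {
    private String p = "";
    int accessorCalls; // counts getter/setter invocations, for demonstration only

    String getP() { accessorCalls++; return p; }
    void setP(String value) { accessorCalls++; p = value; }

    // Inside the class the field is touched directly, so the accessors are never executed.
    void write() { p = "inside"; }
    String read() { return p; }
}

final class PropertyClient {
    // Another class goes through the accessors.
    static String roundTrip(PropertyOwner owner) {
        owner.setP("outside");
        return owner.getP();
    }
}

final class AccessDemo {
    public static void main(String[] args) {
        PropertyOwner owner = new PropertyOwner();
        owner.write();
        owner.read();
        System.out.println(owner.accessorCalls); // prints 0: accessors stay uncovered
        PropertyClient.roundTrip(owner);
        System.out.println(owner.accessorCalls); // prints 2: accessors are now executed
    }
}
```

If a property is only ever used from inside its own class, the generated accessors are never executed, which is why they show up as uncovered regardless of lateinit.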

@Godin
Member

Godin commented Jul 10, 2018

@fabianishere

Translation of lateinit properties in Kotlin currently happens at the IR level

Thanks for the information and the link!


This means that for all class properties in Kotlin, when the property is accessed from within the class and does not have a custom setter, a direct field access is used, while other classes access the property by its getter method.

This is a very valuable addition to my reverse engineering. Indeed, in the case of LateInit.kt

class LateInit {
  lateinit var x: String
  var p: String = ""

  fun write() {
    x = ""
    p = ""
  }

  fun read(): String {
    return x + p
  }
}

class LateInitAccess {
  fun write(t: LateInit) {
    t.x = ""
    t.p = ""
  }

  fun read(t: LateInit): String {
    return t.x + t.p
  }
}

LateInitAccess uses setter and getter - javap -v -p LateInitAccess.class:

  public final void write(LateInit);
    descriptor: (LLateInit;)V
    flags: ACC_PUBLIC, ACC_FINAL
    Code:
      stack=2, locals=2, args_size=2
         0: aload_1
         1: ldc           #9                  // String t
         3: invokestatic  #15                 // Method kotlin/jvm/internal/Intrinsics.checkParameterIsNotNull:(Ljava/lang/Object;Ljava/lang/String;)V
         6: aload_1
         7: ldc           #17                 // String
         9: invokevirtual #23                 // Method LateInit.setX:(Ljava/lang/String;)V
        12: aload_1
        13: ldc           #17                 // String
        15: invokevirtual #26                 // Method LateInit.setP:(Ljava/lang/String;)V
        18: return
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      19     0  this   LLateInitAccess;
            0      19     1     t   LLateInit;
      LineNumberTable:
        line 17: 6
        line 18: 12
        line 19: 18
    RuntimeInvisibleParameterAnnotations:
      0:
        0: #7()

  public final java.lang.String read(LateInit);
    descriptor: (LLateInit;)Ljava/lang/String;
    flags: ACC_PUBLIC, ACC_FINAL
    Code:
      stack=2, locals=2, args_size=2
         0: aload_1
         1: ldc           #9                  // String t
         3: invokestatic  #15                 // Method kotlin/jvm/internal/Intrinsics.checkParameterIsNotNull:(Ljava/lang/Object;Ljava/lang/String;)V
         6: new           #33                 // class java/lang/StringBuilder
         9: dup
        10: invokespecial #37                 // Method java/lang/StringBuilder."<init>":()V
        13: aload_1
        14: invokevirtual #41                 // Method LateInit.getX:()Ljava/lang/String;
        17: invokevirtual #45                 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        20: aload_1
        21: invokevirtual #48                 // Method LateInit.getP:()Ljava/lang/String;
        24: invokevirtual #45                 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        27: invokevirtual #51                 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        30: areturn
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      31     0  this   LLateInitAccess;
            0      31     1     t   LLateInit;
      LineNumberTable:
        line 22: 6
    RuntimeInvisibleAnnotations:
      0: #7()
    RuntimeInvisibleParameterAnnotations:
      0:
        0: #7()

and LateInit indeed uses direct field access even for the non-lateinit property:

  public final void write();
    descriptor: ()V
    flags: ACC_PUBLIC, ACC_FINAL
    Code:
      stack=2, locals=1, args_size=1
         0: aload_0
         1: ldc           #38                 // String
         3: putfield      #11                 // Field x:Ljava/lang/String;
         6: aload_0
         7: ldc           #38                 // String
         9: putfield      #33                 // Field p:Ljava/lang/String;
        12: return
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      13     0  this   LLateInit;
      LineNumberTable:
        line 6: 0
        line 7: 6
        line 8: 12

  public final java.lang.String read();
    descriptor: ()Ljava/lang/String;
    flags: ACC_PUBLIC, ACC_FINAL
    Code:
      stack=3, locals=1, args_size=1
         0: new           #41                 // class java/lang/StringBuilder
         3: dup
         4: invokespecial #44                 // Method java/lang/StringBuilder."<init>":()V
         7: aload_0
         8: getfield      #11                 // Field x:Ljava/lang/String;
        11: dup
        12: ifnonnull     20
        15: ldc           #12                 // String x
        17: invokestatic  #18                 // Method kotlin/jvm/internal/Intrinsics.throwUninitializedPropertyAccessException:(Ljava/lang/String;)V
        20: invokevirtual #48                 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        23: aload_0
        24: getfield      #33                 // Field p:Ljava/lang/String;
        27: invokevirtual #48                 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        30: invokevirtual #51                 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        33: areturn
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      34     0  this   LLateInit;
      LineNumberTable:
        line 11: 0
    RuntimeInvisibleAnnotations:
      0: #7()

Since it is not specific to lateinit properties, I think it is best to solve it in another filter.

From the above I'm not even sure that public setters/getters require any filtering: if the property is not meant to be accessed from outside, make it private so that the setter/getter is not generated; otherwise access is most likely performed via the setter/getter, and so they will be covered. This is similar to what we said about var vs val in #689:

Noticed that val's in data classes require read test and vars both read and write.
This seems correct to me - don't read val = remove it, don't write var = make it val. And this seems to be consistent with Groovy.

The case of a generated private setter that is never accessed still looks weird to me. It would be really valuable to get comments from the developers of the Kotlin compiler: maybe this is simply something they can improve, so that we wouldn't even need a filter 😉


I am willing to look into adding validation tests for Kotlin. I assume adding a build-time dependency on the Kotlin compiler for the org.jacoco.core.test package shouldn't be an issue.

This won't be as simple as just adding a dependency, because AFAIK the Kotlin compiler requires Java 1.6, while our tests should be runnable under Java 1.5, since this is our minimum supported version; see https://www.jacoco.org/jacoco/trunk/doc/build.html and https://travis-ci.org/jacoco/jacoco/

So this might require some dancing around Maven, and I'm not even sure the final result could be as nice as what one could achieve with, for example, Gradle. If you're willing to give it a try anyway, good luck 😉 but please do this separately from this PR.


Since the matching code is so small, I think we can remove the Matcher class

My point was that except

I will refactor the filter to use getNext() instead

the rest doesn't seem to require change.

and just try to find a INVOKESTATIC node that has a LDC predecessor node.

In any case (no matter where you start, from ifnonnull or from invokestatic) you need to match all 3 instructions.

Member

@Godin Godin left a comment

@fabianishere I left a few minor comments and can even address them myself prior to merge if you prefer.

@Godin Godin self-assigned this Jul 10, 2018
@Godin Godin added this to the 0.8.2 milestone Jul 10, 2018
@fabianishere
Contributor Author

Oh that is not a problem. Will look into it tomorrow.

This change adds a new filter for Kotlin lateinit properties by
searching in the bytecode for the following sequence of instructions:

    IFNONNULL ...
    LDC "..."
    INVOKESTATIC kotlin/jvm/internal/Intrinsics.throwUninitializedPropertyAccessException (Ljava/lang/String;)V

and ignoring this sequence if found. The downside of this approach is
that every instruction in every method must be scanned. Perhaps this
filter could only be limited to Kotlin classes.
@fabianishere
Contributor Author

fabianishere commented Jul 11, 2018

I have addressed the feedback in 221b926. Do you prefer to have the commit rebased onto master?

@Godin Godin merged commit ce7de98 into jacoco:master Jul 11, 2018
@Godin
Member

Godin commented Jul 11, 2018

LGTM! Thanks a lot @fabianishere for your contribution 👍 which is now recorded in the changelog ❤️ and thanks for your patience with our pedantry 😉

For fun, for future readers of this thread, and for possible future inclusion in validation tests for Kotlin: one might expect partial coverage when an uninitialized lateinit property is accessed, whereas the getter will actually be shown as not covered, both before and after this change. One might think that this relates to #321. However, this is actually due to the tradeoff between performance and precise coverage in the case of "implicit" exceptions: we insert a probe before a method invocation only if the invocation is on a new line (#310), which is not the case here for kotlin/jvm/internal/Intrinsics.throwUninitializedPropertyAccessException. It is unlikely that handwritten code invokes this internal method; however, if needed in the future, the absence of a line number can be used to harden the filter so that it does not filter such handwritten code.

@Godin Godin moved this from IN PROGRESS to DONE in Filtering Jul 11, 2018
@fabianishere fabianishere deleted the kotlin-lateinit-filter branch July 11, 2018 19:41
@fabianishere
Contributor Author

Regarding public properties with private setters having an unused generated setter: this issue has apparently already been reported. See https://youtrack.jetbrains.com/issue/KT-20344. However, it has not received a lot of attention since it was reported.

@jacoco jacoco locked as resolved and limited conversation to collaborators Oct 8, 2018