
COMPLEXITIES: Nullity in the type system

Reinier Zwitserloot edited this page May 1, 2020 · 1 revision

Nullity via annotations is rather common; lombok supports generating many flavours of it.

Unfortunately, lombok cannot really recommend any of these. Annotations feel like the right answer to better null support in java, but the community first needs to coalesce on a library that does it right. The checker framework is probably closest; the most commonly used variants are not very good.

This wiki page describes the complications in adding nullity to the type system.

Nullity: Is it a property of params/methods/fields, or.. types?

There are 2 approaches. Eclipse and checkerframework have set up their nullity annotations to apply to ElementType.TYPE_USE. All others have decided that their annotations apply to a batch of declaration targets (methods, fields, params), but not to TYPE_USE.

Eclipse and checkerframework got it right.

The reason is generics. All of the following notions are different and meaningful:

Given a list of string arrays:

  1. @NonNull List<@NonNull String @NonNull []> - The list cannot be null; it cannot contain null arrays, and the arrays themselves must not contain null string refs.
  2. @NonNull List<@NonNull String[]> - The list cannot be null, but it may contain null arrays. Any non-null array in it may not contain null refs, only non-null string refs. (An annotation in front of the element type applies to the elements; an annotation in front of the [] applies to the array itself.)
  3. @NonNull List<String @NonNull []> - The list cannot be null and cannot contain null arrays. However, any given array in it may contain null refs.
  4. @NonNull List<String[]> - The list cannot be null, but it may contain null refs amongst its arrays. A non-null array contained in it may contain null refs amongst its strings.
  5. List<@NonNull String @NonNull []> - The list may itself be null. If it isn't, like #1.
  6. List<@NonNull String[]> - The list may itself be null. If it isn't, like #2.
  7. List<String @NonNull []> - The list may itself be null. If it isn't, like #3.
  8. List<String[]> - Everything may be null at every level.
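
Which part of the type each annotation position applies to can be checked with core reflection. The sketch below is only an illustration: it declares its own stand-in @NonNull (hypothetical, not the Eclipse or checkerframework one) with ElementType.TYPE_USE and runtime retention, and reports whether the list, the array and the string carry it for the first four variants.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.AnnotatedArrayType;
import java.lang.reflect.AnnotatedParameterizedType;
import java.lang.reflect.AnnotatedType;
import java.util.List;

class TypeUseDemo {
    // Stand-in annotation for illustration only; not any real library's @NonNull.
    @Target(ElementType.TYPE_USE)
    @Retention(RetentionPolicy.RUNTIME)
    @interface NonNull {}

    @NonNull List<@NonNull String @NonNull []> variant1;
    @NonNull List<@NonNull String[]> variant2;
    @NonNull List<String @NonNull []> variant3;
    @NonNull List<String[]> variant4;

    /** Describes which parts of the named field's type carry @NonNull. */
    static String describe(String fieldName) {
        try {
            AnnotatedType list =
                TypeUseDemo.class.getDeclaredField(fieldName).getAnnotatedType();
            // The single type argument of List<...> is the array type.
            AnnotatedArrayType array = (AnnotatedArrayType)
                ((AnnotatedParameterizedType) list).getAnnotatedActualTypeArguments()[0];
            // The component type of the array is String.
            AnnotatedType string = array.getAnnotatedGenericComponentType();
            return "list=" + list.isAnnotationPresent(NonNull.class)
                + " array=" + array.isAnnotationPresent(NonNull.class)
                + " string=" + string.isAnnotationPresent(NonNull.class);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        for (String name : List.of("variant1", "variant2", "variant3", "variant4")) {
            System.out.println(name + ": " + describe(name));
        }
    }
}
```

A declaration-only annotation (methods, fields, params) could be read only at the position reported as "list" here; the other two positions do not exist for it.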

An annotation that isn't focused on TYPE_USE can only tell you whether the list can be null or not and it ends there. This makes linting incomplete:

@NotNullByDefault class Example {
    @Nullable List<String> getNames() {
        return Arrays.asList("a", null, "b"); // [1] (List.of would throw on the null element)
    }

    public void foo(String arg) {}

    public static void main(String[] args) {
        var ex = new Example();
        var list = ex.getNames();
        ex.foo(list.get(1)); // [2]
    }
}

Should a linter warning be generated on [1] for putting a null into the list? Should a linter warning be generated on [2] for passing a possibly-null element to foo? It should generate a warning on exactly one of these. But which one? Without TYPE_USE, you'd have to pick arbitrary positions and document this (that doesn't feel right, for an API), and it is not possible to express combinations such as 'a non-nullable list of nullable arrays of non-null strings'.
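
With TYPE_USE annotations the element type itself says who is at fault. A minimal sketch, assuming locally declared stand-in annotations (@Nullable and @NonNull here are hypothetical, not a specific library's): once the return type reads List<@Nullable String>, storing the null is fine and the unguarded use of an element is the thing to flag.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;
import java.util.Arrays;
import java.util.List;

class ElementNullityDemo {
    // Stand-in annotations for illustration; a real checker ships its own.
    @Target(ElementType.TYPE_USE) @interface Nullable {}
    @Target(ElementType.TYPE_USE) @interface NonNull {}

    // The list itself is never null, but its elements may be.
    static @NonNull List<@Nullable String> getNames() {
        return Arrays.asList("a", null, "b"); // allowed: elements are declared nullable
    }

    static int length(@NonNull String arg) {
        return arg.length();
    }

    public static void main(String[] args) {
        List<@Nullable String> names = getNames();
        String second = names.get(1);
        // A TYPE_USE-aware checker would be expected to flag length(second)
        // without the guard, since second comes from a List<@Nullable String>.
        System.out.println(second == null ? -1 : length(second));
    }
}
```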

Polynull

Generics complicate matters. Imagine this method:

/**
  * Finds the first element in the list that matches
  * the predicate, and adds it to the end of the list.
  */
public <T> void duplicateFirst(
    @Nullable List<@UhOhWhatGoesHere T> list,
    @NonNull Predicate<@WhatGoesHereThen T> predicate) {

    if (list == null) return;
    for (T t : list) {
        if (predicate.test(t)) {
            list.add(t);
            return;
        }
    }
}

If you replace the placeholders with @NonNull, this works; it would pass the linter.

If you replace them with @Nullable, it also works.

But no nullity framework in existence for java, except the checker framework, lets you write a single signature that accepts both.

The problem is that generics are invariant. When you have a method: public @NonNull String foo(){}, and I write: @Nullable String x = foo();, you get a covariance situation: @Nullable String and @NonNull String are not the same type, but essentially @NonNull String is a subtype of @Nullable String; just like how Object o = ""; is valid java, so is assigning a never-null string to a potentially null string variable.

But generics do not work that way.

This is not valid java: List<Number> n = new ArrayList<Integer>(); - does not compile.
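
The contrast can be sketched in plain Java: reference assignment is covariant, generic type arguments are not, and a wildcard only helps for reading. The class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

class InvarianceDemo {
    /** Sums any list whose elements are some subtype of Number. */
    static double sum(List<? extends Number> numbers) {
        double total = 0;
        for (Number n : numbers) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        Number n = Integer.valueOf(1);            // fine: Integer is a subtype of Number
        List<Integer> ints = new ArrayList<>(List.of(1, 2, 3));
        // List<Number> bad = ints;               // does not compile: generics are invariant
        List<? extends Number> readOnly = ints;   // compiles, but nothing can be added through it
        // readOnly.add(n);                       // does not compile either
        System.out.println(sum(readOnly));
    }
}
```

The same trade-off shows up for nullity: a list of never-null strings cannot simply stand in for a list of possibly-null strings once the method also writes to the list.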

Thus, writing a method that both reads from and writes to a list, and that works regardless of the 'nullity' of the list's elements, requires something fancy. With checker framework, you can actually do this:

import java.util.List;
import java.util.function.Predicate;
import org.checkerframework.checker.nullness.qual.*;

public <T> void duplicateFirst(
    @Nullable List<@PolyNull T> list,
    @NonNull Predicate<? super @PolyNull T> predicate) {

    if (list == null) return;
    for (T t : list) {
        if (predicate.test(t)) {
            list.add(t);
            return;
        }
    }
}

Also see the javadoc of PolyNull.

Note that if I were to add list.add(null); at the end, in case no element matches the predicate, checkerframework would correctly mark that line as not acceptable. The other annotation libraries cannot do this. Kotlin's nullity support can't do this. Only checkerframework and ceylon (via double-bounds on the generics) can do this, as far as I know.

Generics overrides

Generics: The gift that keeps on giving!

Consider this method we all know: java.util.Map's get(Object) method. This is its signature:

public V get(Object key);

Now imagine this code:

    Map<String, @NonNull String> map = ...;
    String value = map.get("foo");
    if (value == null) throw new NoSuchElementException();

For quite a while, Eclipse would mark this code as erroneous, telling you that there is no point checking value, as clearly it cannot be null (because .get returns V, and V is @NonNull String). But get returns null whenever the key is not in the map, whatever the nullity of V, so the check is needed. The linter is actively making you write bugs instead of solving them! Clearly an unacceptable state of affairs.
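
The runtime behaviour is easy to confirm. In the sketch below every value stored in the map is non-null, yet get still returns null for a missing key. The helper name lookup is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

class MapGetDemo {
    /** Returns the value for key, or throws if the key is absent. */
    static String lookup(Map<String, String> map, String key) {
        String value = map.get(key); // null when the key is absent, whatever V's nullity
        if (value == null) throw new NoSuchElementException(key);
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("bar", "baz"); // only non-null values are ever stored

        System.out.println(map.get("foo") == null); // the key is missing
        System.out.println(lookup(map, "bar"));
    }
}
```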

Any annotation-based nullity marking library needs to be able to let you express all 4 variants when a method returns a type parameter:

  1. This method returns the same nullity-V as whatever V is.
  2. Even if V is nullable, this will never return null.
  3. Even if V is not nullable, this may return null.
  4. It is not known what the nullity of the returned value might be.

Obviously, any annotation system that has only 2 to 3 annotations ('nullable', 'nonnull', 'nonnullbydefault') cannot express all 4 different ways this can go, which makes them problematic to use. For example: Presumably you'd do #4 by not annotating at all, you can cover #2 and #3 by annotating explicitly with @NonNull or @Nullable, so.. how do I express #1 then?
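
To make the four variants concrete, here is a sketch of what a richer vocabulary might look like. Everything in it is an assumption: the three annotations are declared locally just to give the variants names, no checker interprets them, and Registry is a made-up class.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

class ReturnNullityDemo {
    // Hypothetical annotations, declared here only to label each variant.
    @Target(ElementType.TYPE_USE) @interface NonNull {}
    @Target(ElementType.TYPE_USE) @interface Nullable {}
    @Target(ElementType.TYPE_USE) @interface NullityUnknown {}

    static class Registry<V> {
        private final Map<String, V> values = new HashMap<>();

        // Variant 1: returns exactly what was passed in, so V's own nullity applies.
        V put(String key, V value) {
            values.put(key, value);
            return value;
        }

        // Variant 2: never returns null, even if V is nullable.
        @NonNull V require(String key) {
            V value = values.get(key);
            if (value == null) throw new NoSuchElementException(key);
            return value;
        }

        // Variant 3: may return null (absent key), even if V is non-null.
        @Nullable V find(String key) {
            return values.get(key);
        }

        // Variant 4: nullity is not known, e.g. the value comes from un-annotated code.
        @NullityUnknown V fromLegacy(Supplier<V> legacySource) {
            return legacySource.get();
        }
    }

    public static void main(String[] args) {
        Registry<String> registry = new Registry<>();
        registry.put("k", "x");
        System.out.println(registry.require("k"));
        System.out.println(registry.find("missing") == null);
    }
}
```

With only 'nullable' and 'nonnull' available, put and fromLegacy would both end up un-annotated, and a checker could not tell them apart.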

Support the 30 years of history java has

The core libraries (java.*) as well as heaps upon heaps of existing third party libraries in maven central and elsewhere do not actually have such annotations on them. This is a very large problem, and both obvious ways of handling it fail. If you take things literally and presume that un-annotated things can be null, your linter tool is going to toss many thousands of violations at your face. If you instead ignore any API without nullity info, the same way generics are ignored if you have raw types, you get nullity errors that the linter won't catch, where a nullable value ends up being returned as a marked-never-null one. Just like how generics has 'raw types', any nullity annotation system needs a similar 'legacy' flag to deal with APIs for which nullity is simply not known.

But, even if legacy is properly supported, it is still quite annoying to deal with legacy, again analogous to generics: Java CAN express the concept of legacy (pre-introduction-of-generics) code via raw types, but working with such things is a pain.

The solution is an 'external annotations' system: A way to supply, as a standalone file, a list of 'fixed' signatures. For example, the ability to put in a file someplace that Map.get()'s signature is more properly: @PromoteToNullable V get(@Nullable Object k).
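
As a rough model of the idea (not how any particular tool stores its data), an external-annotations list can be thought of as a lookup table from signatures to nullity, with 'legacy' as the answer for anything not listed. The key format and all names below are made up for this sketch.

```java
import java.util.Map;

class ExternalAnnotationsDemo {
    /** Nullity a checker might assume for a method's return value. */
    enum Nullity { NON_NULL, NULLABLE, SAME_AS_TYPE_ARG, LEGACY_UNKNOWN }

    // Hypothetical overrides, standing in for a file shipped alongside the checker.
    static final Map<String, Nullity> OVERRIDES = Map.of(
        "java.util.Map#get(java.lang.Object)", Nullity.NULLABLE,
        "java.util.Optional#get()", Nullity.NON_NULL
    );

    /** Unlisted signatures fall back to the 'legacy' state instead of a guess. */
    static Nullity returnNullity(String signature) {
        return OVERRIDES.getOrDefault(signature, Nullity.LEGACY_UNKNOWN);
    }

    public static void main(String[] args) {
        System.out.println(returnNullity("java.util.Map#get(java.lang.Object)"));
        System.out.println(returnNullity("com.example.Old#call()")); // not listed
    }
}
```

The longer the shipped list, the less often the fallback is hit.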

Then, I expect a library that gathers widespread community support to ship with a large carefully assembled list with such external annotations for lots of the java.* core as well as many popular libraries, from guava to junit to log frameworks.

Without such a list, actually trying to use nullity annotations to make writing java code easier won't go particularly well: It'll just bother me with endless pointless warnings because the linter/checker I use doesn't recognize the nullity properties of the libraries I use, or it doesn't 'work', because the sheer amount of 'legacy nullity' flowing out of the method calls infects most of the code and disables the checker/linter's ability to catch problems.