Variable checked non-null outside a `forEach` or `Stream` lambda triggers an error #941

sdeleuze · 2024-03-25T13:55:15Z

The most common source of irrelevant errors we see in Spring Framework codebase with NullAway is this use case where the static analysis is not able to consider this.resolvedDataSources non-nullable in code found in lambdas:

public void initialize() {
	if (this.targetDataSources == null) {
		throw new IllegalArgumentException("Property 'targetDataSources' is required");
	}
	this.resolvedDataSources = CollectionUtils.newHashMap(this.targetDataSources.size());
	this.targetDataSources.forEach((key, value) -> {
		Object lookupKey = resolveSpecifiedLookupKey(key);
		DataSource dataSource = resolveSpecifiedDataSource(value);
		this.resolvedDataSources.put(lookupKey, dataSource);
	});
}

Such code generates the following error:

/Users/sdeleuze/workspace/spring-framework/spring-jdbc/src/main/java/org/springframework/jdbc/datasource/lookup/AbstractRoutingDataSource.java:145: error: [NullAway] dereferenced expression this.resolvedDataSources is @Nullable
                        this.resolvedDataSources.put(lookupKey, dataSource);

It is true that some lambda could be executed at a later point when the value has changes, but the fact that by default this is not consider as a valid use case generates a lot of irrelevant errors, and I would like to avoid to add an artificial check inside the lambda.

Could that be refined?

The text was updated successfully, but these errors were encountered:

msridhar · 2024-03-25T16:06:18Z

@sdeleuze thanks for the report. In general we want to be careful with assumptions regarding when lambdas will execute. We already bake in knowledge regarding certain stream operators, and I think forEach is another example where we could build in knowledge that the lambda runs "soon enough" to not require additional defensive null checks in the common case.

Is forEach the main case in Spring where this kind of issue occurs? Or are there other similar and commonly-used operators where the same issue occurs?

sdeleuze · 2024-03-25T16:35:58Z

I understand there is nuance here, so maybe let's focus on forEach for the most common types of collection as part of this issue. I have not been able to find obvious other common patterns, but I will create other issues if I find some.

sdeleuze · 2024-03-26T09:17:10Z

Would you be open to also make Stream methods like filter, map or flatMap more lenient as well?

msridhar · 2024-03-26T17:23:13Z

Hi @sdeleuze yes those other cases you mention make sense as well. As an FYI, I have an upcoming holiday and some other deadlines, so I will likely not get to looking at supporting these scenarios until after April 17 or so.

msridhar · 2024-04-10T20:48:13Z

Collection.removeIf is another good candidate for special treatment https://docs.oracle.com/javase/8/docs/api/java/util/Collection.html#removeIf-java.util.function.Predicate-

msridhar · 2024-04-12T13:53:44Z

@sdeleuze one concern I have with Streams is whether there are cases where a call like filter occurs, returning a Stream object, and then the Stream gets stored away somewhere like a field, and only later does the terminal operation get invoked on the stream. This happens with RxJava Observables, but they are meant to model processing of asynchronous inputs and don't really have an equivalent notion of terminal operations I think. What I'm not sure about is whether java Streams can be used like Observables, or whether that is an anti-pattern. Do you have any insights here?

sdeleuze · 2024-04-22T09:59:42Z

Java Stream are lazy so if returned from a method value or stored in a field, this is an argument for not taking in account the null check done outside the lambda. But a big difference compared to reactive API is that they are not reusable, so in practice, I am not sure this use case of usage outside the current method is popular. I suspect it is not.

msridhar · 2024-04-24T21:42:25Z

@sdeleuze I put up #952 as an initial implementation to fix this issue. I'm still a bit unsure about the Stream case, though I think in almost all cases it should be safe to treat the methods similarly to forEach. Let me see if I can do some code searches to try to gather evidence that, almost always, streams do not get passed around but instead have a terminal operation invoked in the same method where they are created.

msridhar · 2024-04-25T01:02:45Z

Thinking more here, and this is all fuzzy: the bad case for a Stream would be when it gets stored in a field. Returning a Stream or passing it as a parameter is not a problem in and of itself, as long as it gets consumed without other intervening computations that change nullability facts. Storing a Stream in a field to me would a clear indication that it is intended to be used "later." But, the Javadoc explicitly states that "A stream is not a data structure that stores elements," so storing it in a field seems strange, along with the fact that it can be consumed only once. I did some initial searching and didn't find any Stream-type fields, though the search was hardly exhaustive.

Using extra nullability information inside Stream lambdas is not sound, of course. But I really doubt it would lead to missed issues in practice, at least for reasonably-structured code. So, I am leaning toward propagating the extra nullability info to lambdas passed to all Stream methods.

@yuxincs @lazaroclapp if you have further thoughts here or disagree with the above reasoning please comment.

lazaroclapp · 2024-04-26T20:54:54Z

I recall (@yuxincs can correct me) that for NilAway we had some analysis of when lambdas "escaped" the current function. Could we do something similar for streams here? Like, as far as I can tell there are four cases, from most to least conservative:

The chain of stream operations must all happen within a single statement of the containing method/lambda (that's what our current stream propagation is most designed to handle, correct? We'd just need to extend it so fields are propagated as well if they aren't, right?)
Stream operations happen in different statements in the containing method/lambda, but the Stream object is never passed to another method/lambda, returned or assigned to a field.
Passed to a method/lambda, but neither caller nor callee(s) persist the Stream in a field anywhere.
Incomplete Streams are stored in fields.

Even for (2) there are some caveats for lambda's and fields, but they are similar to our caveats about aliasing:

E.g.

this.foo = "abcd";
someStream = someStream.filter(c -> this.foo.contains(c));
this.foo = null;
return someStream.collect(Collectors.toList()); // null deref as filter gets evaluated here, right?

For (3), the space for potential errors grows just based on how far the lambda expression might be from the point it gets executed.

How many of the cases we care about would be handled by just (1)? (The one in the example for the issue certainly would be, it's just that we currently don't uproot access paths for fields of this afaik)

msridhar · 2024-04-27T23:51:49Z

@lazaroclapp I think ideally, I'd like to implement your case 1, and see if there are still annoying false positives in practice. You're right that this is the case our current stream handlers are designed to handle. I guess the easiest implementation of this might be to have a list T of stream terminal operations, and then check if there is an appropriate parent node of the lambda in the AST that is a call to a method in T? We'd have to create the T list but otherwise this may not be too hard to implement, though it will add some complexity and runtime overhead.

I did some searching around, and I did find several cases where Stream objects are returned from methods. However, I could not really need if propagating more nullability facts would be relevant to such cases; I suspect it might not be.

Anyway, for simplicity maybe let's just continue the discussion in #952 🙂

…onously (#952) Fixes #941 We propagate full nullability info from the enclosing context to callbacks passed to `Map.forEach`, `Iterable.forEach`, `List.removeIf`, and all methods on `java.util.stream.Stream`

sdeleuze · 2024-05-03T09:28:46Z

Thanks for working on this, looking forward leveraging it in Spring with the upcoming NullAway related release.

msridhar · 2024-05-03T14:55:26Z

@sdeleuze the fix for this issue is part of NullAway 0.10.26 which was just released

sdeleuze changed the title ~~Variable checked non-null outside a lambda triggers an error~~ Variable checked non-null outside a forEach lambda triggers an error Mar 25, 2024

sdeleuze changed the title ~~Variable checked non-null outside a forEach lambda triggers an error~~ Variable checked non-null outside a forEach or Stream lambda triggers an error Mar 27, 2024

sdeleuze mentioned this issue Mar 27, 2024

Remove unneeded @SuppressWarnings("NullAway") spring-projects/spring-framework#32542

Closed

msridhar mentioned this issue Apr 24, 2024

Propagate more nullability info to lambdas known to be invoked synchronously #952

Merged

msridhar closed this as completed in #952 Apr 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Variable checked non-null outside a `forEach` or `Stream` lambda triggers an error #941

Variable checked non-null outside a `forEach` or `Stream` lambda triggers an error #941

sdeleuze commented Mar 25, 2024

msridhar commented Mar 25, 2024

sdeleuze commented Mar 25, 2024

sdeleuze commented Mar 26, 2024

msridhar commented Mar 26, 2024

msridhar commented Apr 10, 2024

msridhar commented Apr 12, 2024

sdeleuze commented Apr 22, 2024 •

edited

msridhar commented Apr 24, 2024

msridhar commented Apr 25, 2024

lazaroclapp commented Apr 26, 2024

msridhar commented Apr 27, 2024 •

edited

sdeleuze commented May 3, 2024

msridhar commented May 3, 2024

Variable checked non-null outside a forEach or Stream lambda triggers an error #941

Variable checked non-null outside a forEach or Stream lambda triggers an error #941

Comments

sdeleuze commented Mar 25, 2024

msridhar commented Mar 25, 2024

sdeleuze commented Mar 25, 2024

sdeleuze commented Mar 26, 2024

msridhar commented Mar 26, 2024

msridhar commented Apr 10, 2024

msridhar commented Apr 12, 2024

sdeleuze commented Apr 22, 2024 • edited

msridhar commented Apr 24, 2024

msridhar commented Apr 25, 2024

lazaroclapp commented Apr 26, 2024

msridhar commented Apr 27, 2024 • edited

sdeleuze commented May 3, 2024

msridhar commented May 3, 2024

Variable checked non-null outside a `forEach` or `Stream` lambda triggers an error #941

Variable checked non-null outside a `forEach` or `Stream` lambda triggers an error #941

sdeleuze commented Apr 22, 2024 •

edited

msridhar commented Apr 27, 2024 •

edited