Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Method put of JCollectionWrappers returns None if the old value is null #11894

Closed
ansvonwa opened this issue Feb 23, 2020 · 7 comments · Fixed by scala/scala#9344
Closed

Method put of JCollectionWrappers returns None if the old value is null #11894

ansvonwa opened this issue Feb 23, 2020 · 7 comments · Fixed by scala/scala#9344

Comments

@ansvonwa
Copy link

ansvonwa commented Feb 23, 2020

Short example:

val m = new mutable.WeakHashMap[Int, String]()
m.put(1, null)
println(m.put(1, "foo")) // prints None instead of Some(null)

Explanation

mutable.WeakHashMap is implemented by a JMapWrapper and inherits its put method from JMapWrapperLike, which states as follows:

override def put(k: K, v: V): Option[V] = Option(underlying.put(k, v))

If a key is not present, put shall return None as it does, but if it is present, the documentation (in mutable.Map) says that the result should be a Some containing the previous value, so Some(null).
In fact, mutable.HashMap behaves like this.
The problem is, that the implementation relies only on the put method of the underlying Java-map, which returns null in both cases: when the key is not present and when its associated value is null.

Solution Idea

A straight-forward fix would be

override def put(k: K, v: V): Option[V] =
  if (underlying.containsKey(k)) Some(underlying.put(k, v))
  else { underlying.put(k, v); None }

But this requires two accesses to the underlying map (contains and put both have to lookup the relevant entry) so the cost doubles. (Neglecting caching)
To (nearly) keep the current performance, one could take something like:

override def put(k: K, v: V): Option[V] = {
  val sz = underlying.size
  val oldV = underlying.put(k, v)
  if (underlying.size == sz) Some(oldV) // key was already present
  else None // key was new
}

Notes

  • Is the usage of size really safe here? (Or may this lead to conflicts with the GC in WeakHashMap? Edit: It does!)
  • Find all similar occurrences. As far as I see, they are all in s.c.c.JavaCollectionWrappers.
  • Determine if putIfAbsent is also somehow affected
  • We should add tests for that case for all mutable.Maps
  • This bug/behavior is at least eight years old. Is there a relevant chance that people rely on Some's content not being null when returned by put?
@som-snytt
Copy link

This is a bit tricky, in that get does respect null value as existing, but putIfAbsent in the Java API does not, i.e., an existing null value is treated as absent.

I'm not an expert, but the Java API is uniform in treating null as absent.

I like the size check trick, if size is cheap.

I think it's necessarily true that some code somewhere relies on the current behavior.

@ansvonwa
Copy link
Author

ansvonwa commented Feb 25, 2020

the Java API is uniform in treating null as absent.

This is only true for the concrete concurrent implementations in Java. The non-concurrent ones can hold null-values and there is nothing that prevents you from implementing a juc.ConcurrentMap that also can hold nulls (it is even mentioned explicitly in the doc).

I like the size check trick, if size is cheap.

It is. I am just a little worried that concurrent modifications (for example by the GC in WeakHashMap) between the size-calls may modify the size simultaneously.

Tests

I ran some tests for most Scala- and most Java-maps:

Full Table
Map Type \ Test Name put-null-contains-get-put empty-putIfAbsent put-null-putIfAbsent put-null-getOrElseUpdate put-b1-contains-get-put put-b1-putIfAbsent put-b1-getOrElseUpdate
Code m.put(a1, null); m.contains(a1)+","+ m.get(a1)+","+ m.put(a1,b2) m.putIfAbsent(a1,b2)+","+ m.get(a1) m.put(a1,null); m.putIfAbsent(a1, b2)+","+ m.get(a1) m.put(a1,null); m.getOrElseUpdate(a1,b2)+","+ m.get(a1) m.put(a1,b1); m.contains(a1)+","+ m.get(a1)+","+ m.put(a1,b2) m.put(a1,b1); m.putIfAbsent(a1,b2)+","+ m.get(a1) m.put(a1, b1); m.getOrElseUpdate(a1,b2)+","+ m.get(a1)
Expected Result Scala: true,Some(null),Some(null) Java: true,null,null Scala: None,Some(b2) Java: null,b2 Scala: Some(null),Some(null) Java (concurrent): null,null Java (other): null,b2 Scala: null,Some(null) Scala: true,Some(b1),Some(b1) Java: true,b1,b1 Scala: Some(b1),Some(b1) Java: b1,b1 Scala: b1,Some(b1)
mutable.Map() true,Some(null),Some(null) unsupported unsupported null,Some(null) true,Some(b1),Some(b1) unsupported b1,Some(b1)
mutable.HashMap() true,Some(null),Some(null) unsupported unsupported null,Some(null) true,Some(b1),Some(b1) unsupported b1,Some(b1)
mutable.WeakHashMap() true,Some(null),None unsupported unsupported null,Some(null) true,Some(b1),Some(b1) unsupported b1,Some(b1)
mutable.OpenHashMap() true,Some(null),Some(null) unsupported unsupported null,Some(null) true,Some(b1),Some(b1) unsupported b1,Some(b1)
concurrent.TrieMap() true,Some(null),Some(null) None,Some(b2) Some(null),Some(null) null,Some(null) true,Some(b1),Some(b1) Some(b1),Some(b1) b1,Some(b1)
asScala(new ju.HashMap()) true,Some(null),None unsupported unsupported null,Some(null) true,Some(b1),Some(b1) unsupported b1,Some(b1)
asScala(asJava(mutable.HashMap())) true,Some(null),Some(null) unsupported unsupported null,Some(null) true,Some(b1),Some(b1) unsupported b1,Some(b1)
asScala(new ju.concurrent.ConcurrentHashMap()) NullPointer None,Some(b2) NullPointer NullPointer true,Some(b1),Some(b1) Some(b1),Some(b1) b1,Some(b1)
asScala(new ju.concurrent.ConcurrentSkipListMap()) NullPointer None,Some(b2) NullPointer NullPointer true,Some(b1),Some(b1) Some(b1),Some(b1) b1,Some(b1)
asScala(new ju.LinkedHashMap()) true,Some(null),None unsupported unsupported null,Some(null) true,Some(b1),Some(b1) unsupported b1,Some(b1)
asScala(new ju.TreeMap()) true,Some(null),None unsupported unsupported null,Some(null) true,Some(b1),Some(b1) unsupported b1,Some(b1)
asScala(new ju.WeakHashMap()) true,Some(null),None unsupported unsupported null,Some(null) true,Some(b1),Some(b1) unsupported b1,Some(b1)
asScala(wrapped(asJava(new concurrent.TrieMap()))) false,None,None None,Some(b2) None,None b2,None true,Some(b1),Some(b1) Some(b1),Some(b1) b1,Some(b1)
asJava(mutable.Map()) true,null,null null,b2 null,b2 unsupported true,b1,b1 b1,b1 unsupported
asJava(asScala(new ju.HashMap())) true,null,null null,b2 null,b2 unsupported true,b1,b1 b1,b1 unsupported
asJava(mutable.HashMap()) true,null,null null,b2 null,b2 unsupported true,b1,b1 b1,b1 unsupported
asJava(mutable.WeakHashMap()) true,null,null null,b2 null,b2 unsupported true,b1,b1 b1,b1 unsupported
asJava(mutable.OpenHashMap()) true,null,null null,b2 null,b2 unsupported true,b1,b1 b1,b1 unsupported
asJava(concurrent.TrieMap()) true,null,null null,b2 null,null unsupported true,b1,b1 b1,b1 unsupported
new ju.HashMap() true,null,null null,b2 null,b2 unsupported true,b1,b1 b1,b1 unsupported
new ju.Hashtable() NullPointer null,b2 NullPointer NullPointer true,b1,b1 b1,b1 unsupported
new ju.concurrent.ConcurrentHashMap() NullPointer null,b2 NullPointer NullPointer true,b1,b1 b1,b1 unsupported
new ju.concurrent.ConcurrentSkipListMap() NullPointer null,b2 NullPointer NullPointer true,b1,b1 b1,b1 unsupported
new ju.LinkedHashMap() true,null,null null,b2 null,b2 unsupported true,b1,b1 b1,b1 unsupported
new ju.TreeMap() true,null,null null,b2 null,b2 unsupported true,b1,b1 b1,b1 unsupported
new ju.WeakHashMap() true,null,null null,b2 null,b2 unsupported true,b1,b1 b1,b1 unsupported
  • The only unexpected results are in put-null-contains-get-put, and the None,None in put-null-putIfAbsent. They are highlighted.
  • For java: s/contains/containsKey
  • "NullPointer" means the put method threw a NullPointerException (as some java-maps do not allow null-values)
  • wrapped just used to prevent that asScala unwraps a map wrapped with asJava. Alternatively, one could implement a ju.concurrent.ConcurrentMap on thier own. It is defined as
  def wrapped(underlying: ju.concurrent.ConcurrentMap[A, B]) = new ju.concurrent.ConcurrentMap[A, B] {
    override def size(): Int = underlying.size()
    ...
    override def putIfAbsent(key: A, value: B): B = underlying.putIfAbsent(key, value)
  }

More methods are affected:

Assume mJava to be a java.util.concurrent.ConcurrentMap[A, B] containing a1 -> null, then, we get the following very surprising result:

val mJava = wrapped(asJava(new concurrent.TrieMap[A, B]()))
mJava.put(a1, null)

val m = asScala(mJava)
m.putIfAbsent(a1, b1) // does nothing
m.getOrElseUpdate(a1, b1) // does nothing
println(mJava.containsKey(a1)) // true
println(m.contains(a1)) // false, expected: true
println(mJava.get(a1)) // null
println(m.get(a1)) // None, expected: Some(null)

So, why does m.contains(a1) return false even though we just called putIfAbsent?
putIfAbsent and getOrElseUpdate both do nothing, because get in JConcurrentMapWrapper has the same problem: It returns None if the underlying map contains as value null.

@som-snytt
Copy link

I meant to say, for example, Map#computeIfAbsent says, "If the specified key is not already associated with a value (or is mapped to null), attempts to compute its value using the given mapping function and enters it into this map unless null." So, mapped to null means the same as absent.

ansvonwa added a commit to ansvonwa/scala that referenced this issue Mar 2, 2020
In JMapWrapperLike, treat null-values correctly:
Return Some(null) instead of None, if the value is present, but null.
@dwijnand
Copy link
Member

So mutable.HashMap and mutable.WeakHashMap disagree on whether Some(null) or None should be returned (due to implementation details). @som-snytt, I see your point about Java's APIs, but in which direction do you think we should fix our map APIs?

@som-snytt
Copy link

som-snytt commented Nov 23, 2020

Wow, I just unfurled the impressive table of operations.

Feb 25 was just after my febrile period, so I don't know what I was thinking then.

On WeakHashMap: containsKey check is not sufficient because if a mapping foo -> bar is discarded between the check and the put, then Some(null) would be the wrong result.

So my conclusion is that a user wanting null sensitivity will have to do something laborious. (Inspect entrySet?)

Some doc would help. The Javadoc says "this class behaves somewhat differently from other Map implementations."

Is it worth doing containsKey when wrapping java.util.HashMap? Now I also understand my point about the Java API, mapped to null means mapped to no value.

Is it worth changing the Scala semantics to return None rather than Some(null)? I'll be bold and say yes. put returns the previously bound value and null is not a bound value. That doesn't mean putting a null value is useless, as it sticks the key in the map.

I can use applyOrElse to distinguish binding to null from absence, as well as contains or isDefinedAt.

Edit: another idea is to provide mymap.nn to enforce not nullable values.

And mymap.nno (non-null out) for mymap.view.filter(_._2 != null).

Edit: submitted the less eccentric patch.

@SethTisue
Copy link
Member

SethTisue commented Jan 6, 2021

There was discussion on scala/scala#9344, now abandoned. I don't think we have consensus on whether the current behavior is wrong or not. (I lean, gently, towards the status quo unless someone appears and steps forward with a clear, strong case for the change. Perhaps there is one, there in the back-and-forth.)

@som-snytt
Copy link

Java also has trouble keeping this straight. A current bug "TreeMap.computeIfAbsent Mishandles Existing Entries Whose Values Are null"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants