Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improvement: Add atomic update on TrieMap #5938

Merged
merged 1 commit into from Dec 29, 2023
Merged

Conversation

jkciesluk
Copy link
Member

connected to #5072

@jkciesluk jkciesluk marked this pull request as draft December 13, 2023 09:42
@jkciesluk jkciesluk changed the title improvement: Add atomix update on TrieMap improvement: Add atomic update on TrieMap Dec 13, 2023
def updateWith(key: K)(remappingFunc: Option[V] => Option[V]): Unit = {
val computeFunction = new java.util.function.BiFunction[K, V, V] {
override def apply(k: K, v: V): V = {
trieMap.get(key) match {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, but maybe if inside this function we referenced k instead of key (so input of computeFunction rather than the updateWith method) we could then extract it into a constant defined on the AtomicTrieMap class level so that there's no need to allocate a new one each time an updateWith is called? 🤔

Copy link
Member Author

@jkciesluk jkciesluk Dec 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see its possible, because we need to pass it remappingFunc.
We could change trieMap type to TrieMap[K, Set[V]], since we only use it like this and add a append function, that would look the way you suggested 🤔

Edit: I think it won't work either, since we need to pass the value to append

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, you're absolutely right, somehow I missed the remappingFunc. 👍

null.asInstanceOf[V]
}
}
concurrentMap.compute(key, computeFunction)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this duplicate the amount of data we hold? Should we just use ConcurrentMap instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Size of concurrentMap is always 0, its only used to provide atomic updateWith for trieMap.
I don't think we should simply replace TrieMap with ConcurrentMap, because get on TrieMap should be faster

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Size of concurrentMap is always 0, its only used to provide atomic updateWith for trieMap.

Can you ELI5 this? :D

get on TrieMap should be faster

Can you share some source about that, benchmark maybe? I did some reading, found https://medium.com/@igabaydulin/java-concurrenthashmap-vs-scala-concurrent-map-e185e8a0b798 which then led me to scala/scala#10027 and I need to read everything once again because I'm already lost. Best part - those are not even about TrieMap 😅

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Size of concurrentMap is always 0, its only used to provide atomic updateWith for trieMap.

Can you ELI5 this? :D

Sure :D
We need an atomic way to update a TrieMap with a remapping function, but its not provided in its API.
ConcurrentHashMap has the ability to do that, with compute function.
We don't to store any data in ConcurrentHashMap and only use it as a sort of lock on the updated value.
So our compute function atomically calculates new value to be stored, but updates it only in TrieMap - it inserts null to ConcurrentHashMap, which means no mapping.

Can you share some source about that, benchmark maybe? I did some reading, found https://medium.com/@igabaydulin/java-concurrenthashmap-vs-scala-concurrent-map-e185e8a0b798 which then led me to scala/scala#10027 and I need to read everything once again because I'm already lost. Best part - those are not even about TrieMap 😅

To be honest I'm not sure anymore 😅 . I thought using a trie would result in a better performance, but from what I've read it could be similar or slightly worse than ConcurrentHashMap (TrieMap biggest advantage are snapshots, but we are not using them).

Copy link
Contributor

@tgodzik tgodzik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaning!

@jkciesluk jkciesluk merged commit 276a02c into scalameta:main Dec 29, 2023
24 of 25 checks passed
@jkciesluk jkciesluk deleted the i5072 branch December 29, 2023 13:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants