New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unexpected removalListener call after upgrade from 2.7.0 to 3.0.3 #593
Comments
I will investigate over the weekend (a bit late my time). The refresh implementation was rewritten in 3.0 and it has always been very complex to support. It may be a bug as I agree with your expectations. Thanks for he test and detailed report. |
An offhand glance and I think it’s because we now use the LSB of the timestamp as a refresh lock flag. That condition did not apply a mask to ignore it, so it’s incorrectly assumed the difference to be invalidation as if a manual write was performed. Since we suppress notifications for replacements with the same reference in all other cases, I think that timestamp check is wrong anyway and we can rely only on the reference check. wdyt? |
That makes sense! I don't know Caffeine that well but if we suppress analogous replacements in all other places it is worth it to be consistent here. Can't find the place where this mask ignoring LSB is applied though. Could you link some place where it's applied? |
The suppression is one minor difference from Guava and exactly for your case where the resource would be closed. An idempotent write is shouldn’t trigger a removal notification, but Guava overlooked that case. When setting the write time it’s zero’d caffeine/caffeine/src/main/java/com/github/benmanes/caffeine/cache/BoundedLocalCache.java Lines 1421 to 1425 in 4f6ec75
Before refreshing we prepare the lock with caffeine/caffeine/src/main/java/com/github/benmanes/caffeine/cache/BoundedLocalCache.java Line 1242 in 4f6ec75
Then try to acquire it caffeine/caffeine/src/main/java/com/github/benmanes/caffeine/cache/BoundedLocalCache.java Line 1247 in 4f6ec75
And release it caffeine/caffeine/src/main/java/com/github/benmanes/caffeine/cache/BoundedLocalCache.java Line 1282 in 4f6ec75
This way multiple readers do not block on the computeIfAbsent, but skip thanks to this soft lock. It seems that optimization has not been released yet, though. |
I guess it’s fine as it’s unlocked before the callback and unreleased code. I’ll investigate your test later and try to release a fix for you by Monday. |
It seems this condition exists in v2.x and was introduced in 2016. The test passes in 2.x and 3.0.4-SNAPSHOT because both use the writeTime as a soft lock to avoid redundant refresh attempts. That seems to be enough to mask the problem because it avoids a redundant reload attempt from being scheduled with a stale timestamp read. The condition as written tries to check that the cache still holds the same reference as when the reload started and that no other writes to that entry occurred (see ABA problem). This would match Guava which fails your test, as it discards the refresh if any entry modification occurs. We would also want to discard the refresh if an explicit I believe then the fix will look something like, compute(key, (k, currentValue) -> {
if (currentValue == null) {
if (value == null) {
return null;
} else if (refreshes().get(key) == refreshFuture[0]) {
return value;
}
} else if (currentValue == value) {
return value;
} else if ((currentValue == oldValue) && (node.getWriteTime() == writeTime)) {
return value;
}
discard[0] = true;
return currentValue;
} I'll take a fresh look at this tomorrow, add tests, and some helpful comments to explain the code rational. |
Released in 3.0.4 |
Thank you for identifying and fixing this so fast. |
In part of our codebase we assume that removal listener is called only when value no longer sits in cache. Here is how we configure the cache:
In 2.7.0 it's worked for us for a long time. Once we updated to 3.0.3 we observed that after some time some machines end up with invalidated resource - loader returns a reference to the same object so value does not change but removalListener is called anyway resulting in closing a resource. I've tried to debug caffeine a bit and found out that we can observe such behavior if in BoundedLocalCache#refreshIfNeeded
I believe we might observe a problem because of
node.getWriteTime() != writeTime
part. I've wrote a test:https://github.com/hejcz/caffeine-removal-listener-inspect/blob/main/src/test/java/io/github/hejcz/CaffeineTest.java
illustrating the problem. It doesn't fail every time but usually after 3-5 reloads it prints that a resource received from cache is invalidated.
Could you confirm that what we see is not a bug and this behavior changed between 2.7.0 and 3.0.3? Is there some alternative mechanism we can use? What we want is to clean up a value once it doesn't sits in the cache anymore. Values are unique in cache (in test they're not but in test we always load the same reference).
The text was updated successfully, but these errors were encountered: