
asMap() compute and the Weigher #513


@boschb

boschb commented Mar 10, 2021

Hi Ben,

Using Caffeine 2.8.5 currently.

I'm trying to use the compute variants (specifically computeIfPresent()) of the ConcurrentMap from asMap() so that they update the weight for Weigher-based eviction. I'm writing tests to verify the functionality, but it's hard to know whether eviction works correctly when entries are updated this way. Here is one strange behavior I found already:

  @Test
  public void asMap_computeEvictionIssue() {
    LoadingCache<Long, Integer> cache =
        Caffeine.newBuilder()
            .maximumWeight(3)
            .<Long, Integer>weigher((k, v) -> v)
            .recordStats()
            .executor(MoreExecutors.directExecutor())
            .build(key -> 1);

    assertThat(cache.get(1L)).isNotNull();
    assertThat(cache.get(2L)).isNotNull();
    assertThat(cache.get(3L)).isNotNull();
    assertThat(cache.stats().evictionWeight()).isEqualTo(0); // nothing evicted yet

    // Real Test
    cache
        .asMap()
        .computeIfPresent(1L, (id, integer) -> 2); // Update id-1 with weight 2
    assertThat(cache.stats().evictionWeight()).isEqualTo(2); // the just-computed entry is gone!
    assertThat(cache.getIfPresent(1L)).isEqualTo(2); // FIXME: isNull() currently!!
  }

  @Test
  public void putEvictionIssue() {
    LoadingCache<Long, Integer> cache =
        Caffeine.newBuilder()
            .maximumWeight(3)
            .<Long, Integer>weigher((k, v) -> v)
            .recordStats()
            .executor(MoreExecutors.directExecutor())
            .build(key -> 1);

    assertThat(cache.get(1L)).isNotNull();
    assertThat(cache.get(2L)).isNotNull();
    assertThat(cache.get(3L)).isNotNull();
    assertThat(cache.stats().evictionWeight()).isEqualTo(0); // nothing evicted yet

    // Real Test
    cache.put(1L, 2); // Update id-1 with weight 2
    assertThat(cache.stats().evictionWeight()).isEqualTo(2); // the just-put entry is gone!
    assertThat(cache.getIfPresent(1L)).isEqualTo(2); // FIXME: isNull() currently!!
  }

Essentially I want to re-weigh a value, and it appears the safest way to do that is to update the value with a new one. Ideally in a compute() context, but I suppose I could settle for put() in this case and just accept the race condition. put() appears no better, however.

Any pointers on getting this update to values and weight correct?

Also, I noticed that calling cleanUp() allows eviction to happen after compute() changes, but I have no idea how heavy a hammer that is. Does cleanUp() internally re-compute all weights?

@ben-manes
Owner

I have to jump into a meeting, so here is an immediate answer before I dig in for more details.


I believe you are assuming LRU eviction semantics, but Caffeine may evict entries early. With a very tiny cache of 3, we may more aggressively evict from the window region, causing the last insert/update to be dropped. In most cases the size is large enough that we can use this window region to retain very recent items, but at a tiny size the rejections are more noticeable. This is also non-deterministic, as we introduce a small amount of jitter into the eviction decisions to protect against a HashDoS attack.

My immediate guess is that while the cache is honoring its bound, it is not choosing the entry you expected for eviction. You can debug this by adding a removal listener and printing the current size:

.removalListener((k, v, cause) -> {
  System.err.printf("%s: %s=%s%n", cause, k, v);
})

// before and after the write
System.out.println("Weighted Size: "
    + cache.policy().eviction().get().weightedSize().getAsLong());

For re-weighing, a compute or conditional replace should be used to avoid a race, as you suggested.
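The atomicity difference can be sketched with a plain ConcurrentHashMap (illustrative only, not Caffeine-specific): compute-style methods run the remapping function atomically for that entry, while a get-then-put sequence would allow another writer to interleave. The class and method names below are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

public class ReweighSketch {
    static int reweigh() {
        ConcurrentHashMap<Long, Integer> map = new ConcurrentHashMap<>();
        map.put(1L, 2);

        // Atomic re-weigh: the remapping function runs under the entry's lock,
        // so no other writer can interleave between the read and the write.
        map.computeIfPresent(1L, (id, weight) -> weight + 1);

        // The equivalent conditional replace retries if another thread raced.
        Integer old;
        do {
            old = map.get(1L);
        } while (old != null && !map.replace(1L, old, old + 1));

        return map.get(1L);
    }

    public static void main(String[] args) {
        System.out.println(reweigh()); // prints 4 after both updates
    }
}
```

Caffeine's asMap() view offers the same compute and replace contracts, so either pattern avoids the lost-update race of a plain get followed by put.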

@boschb
Author

boschb commented Mar 10, 2021

Ahh, thanks! That makes sense. I tried with more entries this time, and things are better behaved.

  @Test
  public void asMap_computeEvictionIssue() {
    LoadingCache<Long, Integer> cache =
        Caffeine.newBuilder()
            .maximumWeight(2 * MAX_SIZE)
            .<Long, Integer>weigher((k, v) -> v)
            .recordStats()
            .executor(MoreExecutors.directExecutor())
            .build(key -> 2);

    // Seed the cache
    for (long i = 0; i < MAX_SIZE; i++) {
      assertThat(cache.get(i)).isNotNull();
    }
    assertThat(cache.stats().evictionWeight()).isEqualTo(0); // nothing evicted yet
    assertThat(cache.estimatedSize()).isEqualTo(MAX_SIZE);
    assertThat(cache.asMap().size()).isEqualTo(MAX_SIZE);

    // Real Test
    cache.asMap().computeIfPresent(1L, (id, integer) -> 3); // Update id-1 with weight 3

    assertThat(cache.stats().evictionWeight()).isEqualTo(2);
    assertThat(cache.getIfPresent(1L)).isEqualTo(3);
    // One entry got evicted to enforce the weight, but the original value for 1L appears not to be counted here.
    // I suppose this is ok; maybe a product of compute() not knowing enough about the operation.

    assertThat(cache.estimatedSize()).isEqualTo(MAX_SIZE - 1);
    assertThat(cache.asMap().size()).isEqualTo(MAX_SIZE - 1);
  }

@ben-manes
Owner

What does this refer to?

// 1 got evicted to ensure weight, but the original value for 1L appears not be counted here.
// I suppose this is ok, maybe a product of compute() not knowing enough about the operation.

@ben-manes
Owner

Sorry, I'm still only able to partially focus.

Any pointers on getting this update to values and weight correct?

Is there any case where you observe the value/weight to be incorrect, or only that the cache evicted an entry that you did not expect?

Also I noticed that calling cleanUp() allows eviction to happen when utilizing compute() changes, but I have no idea how heavy a hammer this is? Does cleanUp() internally re-compute all weights?

cleanUp() shouldn't do anything special, and a compute() would trigger it as a write. There is no recalculation of weights. If you are using a same-thread executor, did you observe eviction not happening? Otherwise the maintenance will be asynchronous; cleanUp() runs on the calling thread, so there are no racy observations.
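For reference, the same-thread executor behavior can be sketched with the standard library alone. Guava's MoreExecutors.directExecutor(), used in the tests above, behaves like `Runnable::run` here: the task completes on the calling thread before execute() returns, which is why maintenance becomes deterministic in tests. The class name below is hypothetical.

```java
import java.util.concurrent.Executor;

public class DirectExecutorSketch {
    static String run() {
        // A same-thread executor: the task runs before execute() returns,
        // so any state it mutates is visible immediately to the caller.
        Executor direct = Runnable::run;

        StringBuilder log = new StringBuilder();
        direct.execute(() -> log.append("maintenance ran"));
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "maintenance ran"
    }
}
```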

@boschb
Author

boschb commented Mar 10, 2021

Regarding the comment: when doing a compute(), the original entry (1L:2) is replaced with another entry (1L:3). In addition, it would appear one other entry is removed as well (I don't check which, but the final size is 99 and the updated entry is preserved). The total eviction weight stat is 2, however. A case could be made that the total should be 4. It doesn't matter much for my use case, and I suspect there is some internal reason for this with compute(), but FYI in case you didn't know.

Other than that, it would appear things are good to go for me. I just needed a bigger test size to see the behavior. For reference, here is my final @Test confirming that when computeIfPresent() is used to replace an entry, eviction takes place as expected.

  @Test
  public void asMapCompute_weightedEviction() {
    int maxSize = 100;
    int defaultWeight = 2;
    LoadingCache<Long, Integer> cache =
        Caffeine.newBuilder()
            .maximumWeight(2 * maxSize)
            .<Long, Integer>weigher((k, v) -> v)
            .recordStats()
            .executor(MoreExecutors.directExecutor())
            .build(key -> defaultWeight);

    // Seed the cache
    for (long i = 0; i < maxSize; i++) {
      assertThat(cache.get(i)).isNotNull();
    }
    assertThat(cache.stats().evictionWeight()).isEqualTo(0); // nothing evicted yet
    assertThat(cache.estimatedSize()).isEqualTo(maxSize);
    assertThat(cache.asMap().size()).isEqualTo(maxSize);
    assertThat(Iterables.size(cache.asMap().entrySet())).isEqualTo(maxSize);

    // Real Test
    cache.asMap().computeIfPresent(1L, (id, integer) -> 3); // Update id-1 with weight 3

    assertThat(cache.stats().evictionWeight()).isEqualTo(defaultWeight);
    assertThat(cache.getIfPresent(1L)).isEqualTo(3);
    // One entry got evicted for weight, but the weight of the replaced value for 1L is not counted.
    // I suppose this is ok; maybe a product of compute() not knowing enough about the operation.

    assertThat(cache.estimatedSize()).isEqualTo(maxSize - 1);
    assertThat(cache.asMap().size()).isEqualTo(maxSize - 1);
    assertThat(Iterables.size(cache.asMap().entrySet())).isEqualTo(maxSize - 1);
  }

@ben-manes
Owner

A few findings:

  1. The cache is partitioned into 3 regions: window, probation, and protected. Due to the tiny cache size of 3, each region is allocated 1 unit of capacity. These regions are elastic and may fill the entire cache, but they classify entries when finding the best candidate to discard.
  2. When filling the cache, the inserts cause K1 to be pushed from the window into probation. It is the oldest entry and therefore the victim.
  3. Updating the weight of K1 causes it to be reordered. Typically this would promote it from probation to protected. However, since the entry's size exceeds the protected region's size, no reordering is performed. This avoids flushing the protected region (as 1 large entry would cause many small ones to be evicted).
     /** Promotes the node from probation to protected on an access. */
     @GuardedBy("evictionLock")
     void reorderProbation(Node<K, V> node) {
       if (!accessOrderProbationDeque().contains(node)) {
         // Ignore stale accesses for an entry that is no longer present
         return;
       } else if (node.getPolicyWeight() > mainProtectedMaximum()) {
         return;
       }

       // If the protected space exceeds its maximum, the LRU items are demoted to the
       // probation space. This is deferred to the adaption phase at the end of the
       // maintenance cycle.
       setMainProtectedWeightedSize(mainProtectedWeightedSize() + node.getPolicyWeight());
       accessOrderProbationDeque().remove(node);
       accessOrderProtectedDeque().add(node);
       node.makeMainProtected();
     }
  4. A possible fallback to (3) would be to reorder the entry in the probation region. This wasn't done since SLRU treats probation as a FIFO and protected as an LRU, so it didn't occur to me as beneficial. Do you think this should be done? I have no strong opinion.
  5. When evicting on the size change, no entries from the window were evicted so only the probation's victim is considered. Since that is K1, this entry is chosen for removal.
  6. Guava defines CacheStats.evictionCount() as, "the number of times an entry has been evicted. This count does not include manual invalidations". Therefore when adding evictionWeight, this similarly does not include manual invalidations. A replacement is not an eviction, so neither metric is incremented. While the total weight of the removed entries is 4, only 2 units were removed by automatic eviction. I believe this is correct given the definition, even if it is not what might be expected.
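The promotion guard in (3) can be sketched in miniature. This is illustrative only: the names and data structures below are simplified stand-ins, not Caffeine's internals, where the deques are intrusive and the check runs under the eviction lock.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PromotionSketch {
    // Toy SLRU promotion check mirroring the guard in reorderProbation().
    record Node(long key, int weight) {}

    static final int PROTECTED_MAX = 1; // protected region capacity in a cache of 3
    static final Deque<Node> probation = new ArrayDeque<>();
    static final Deque<Node> protectedQueue = new ArrayDeque<>();

    static boolean promote(Node node) {
        if (!probation.contains(node)) {
            return false; // stale access; the entry is no longer present
        } else if (node.weight() > PROTECTED_MAX) {
            return false; // too large: promoting it would flush the protected region
        }
        probation.remove(node);
        protectedQueue.add(node);
        return true;
    }

    public static void main(String[] args) {
        Node heavy = new Node(1L, 2);
        probation.add(heavy);
        System.out.println(promote(heavy)); // false: weight 2 exceeds the protected max of 1
    }
}
```

Because the promotion is skipped, the re-weighed K1 stays at the head of probation and remains the eviction victim, which is what the original test observed.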

Any thoughts?

@boschb
Author

boschb commented Mar 10, 2021

I don't really have much input, as this is pretty deep in the cache's inner workings for me. The 4 vs 2 makes sense; I'm fine with it as defined. The only part that was a little strange is the effect of maximumWeight=3: add 3 elements of weight 1, then update an element to weight 2, and that actually removes the element and leaves the cache with the 2 original elements. I realize now why this happens, but it was a little gotcha that occurs with low weight values and entry counts. Probably not worth fixing unless there is a larger issue at play here.

@ben-manes
Owner

If we reorder in probation, then it would select K2 instead of K1, which would mask the problem a little better for you. It would pass your original test, except that the eviction weight would be 1. That's a benign change, so while it doesn't have much real-world impact, I agree that having things behave less surprisingly when debugging through a test case is worthwhile.

@ben-manes
Owner

In v3 your original test passes (ignoring the eviction-weight misunderstanding) because we now search the admission window for additional candidates. This change was made to improve the handling of excessively large entries (464bc19), to avoid flushing a region because the newly added entry is in the MRU position. In your case, the candidate (3) and the victim (1) are compared by TinyLFU, where the victim wins due to already being promoted, so it is discarded only if the candidate has a higher frequency.
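That TinyLFU admission decision can be sketched with a toy frequency table. This is illustrative only: Caffeine actually estimates frequencies with a CountMin-style sketch that ages over time, and the names below are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class AdmissionSketch {
    // Toy TinyLFU-style admission using an exact frequency map instead of
    // Caffeine's compact CountMin sketch.
    static final Map<Long, Integer> freq = new HashMap<>();

    static void access(long key) {
        freq.merge(key, 1, Integer::sum);
    }

    // The candidate replaces the victim only if it is used strictly more often;
    // ties favor the victim, which has already earned its place.
    static boolean admit(long candidate, long victim) {
        return freq.getOrDefault(candidate, 0) > freq.getOrDefault(victim, 0);
    }

    public static void main(String[] args) {
        access(1L); access(1L); // victim K1 was accessed twice
        access(3L);             // candidate K3 was accessed once
        System.out.println(admit(3L, 1L)); // false: the victim wins
    }
}
```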

Regardless, I ported over your test case and added the reordering, as it seems like a nice change that might avoid this type of confusion.

@boschb
Author

boschb commented Mar 11, 2021 via email

ben-manes added a commit that referenced this issue Mar 11, 2021
For weighted caches, if an entry in the probation queue is too large to
be promoted to the protected queue, it was skipped. This could leave it
as the victim, making it eligible for eviction. In small caches, often
used by test code, the premature eviction can be surprising if one still
assumes LRU. To reduce this confusion, we can reorder within the
probation queue and treat it as an LRU instead of FIFO.

This test case already passes in v3 due to changes in how it selects
candidates in order to handle excessively large entries. However,
codifying the intent seems worthwhile and, if it helps reduce unit
test confusion, then it's a nice benefit.