CacheLongKeyLIRS concurrency improvements #3069
base: master
Conversation
Thank you for your contribution!
First of all, please provide a standalone test case that illustrates the described scalability problem. CacheLongKeyLIRS isn't perfect, but even if you see a problem with it in the profiler, that doesn't mean everything else is OK; maybe there are other problems that make the situation with the cache this bad.
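A standalone test for the reported contention could look something like the following sketch. It is only an illustration of the scenario described later in this thread (many threads hitting a tiny hot key set behind a synchronized segment); the class and method names here are made up, and a real reproduction would exercise `org.h2.mvstore.cache.CacheLongKeyLIRS` itself rather than this stand-in:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

class CacheContentionSketch {
    // Stand-in for a cache segment whose get() is synchronized; a real
    // test would hammer CacheLongKeyLIRS directly instead of this class.
    static final class SyncSegment {
        private final Map<Long, Object> map = new HashMap<>();
        synchronized Object get(long key) { return map.get(key); } // contended hot path
        synchronized void put(long key, Object v) { map.put(key, v); }
    }

    // Runs `threads` readers over a tiny hot key set (models a freshly
    // emptied database) and returns the total number of get() calls.
    static long run(int threads, int hotKeys, long durationMs) throws InterruptedException {
        SyncSegment segment = new SyncSegment();
        for (long k = 0; k < hotKeys; k++) {
            segment.put(k, "page-" + k);
        }
        LongAdder ops = new LongAdder();
        long deadline = System.nanoTime() + durationMs * 1_000_000L;
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                long i = 0;
                while (System.nanoTime() < deadline) {
                    segment.get(i++ % hotKeys);
                    ops.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return ops.sum();
    }

    public static void main(String[] args) throws InterruptedException {
        // With many threads (the report mentions 60), per-thread throughput
        // collapses because every reader serializes on the same monitor.
        System.out.println("1 thread : " + run(1, 4, 200) + " gets");
        System.out.println("8 threads: " + run(8, 4, 200) + " gets");
    }
}
```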
/**
 * Database setting <code>CACHE_CONCURRENCY</code>
 * (default: 16).<br />
 * Set the read cache concurrency.
 */
public final int cacheConcurrency = get("CACHE_CONCURRENCY", 16);
/**
 * Database setting <code>AUTO_COMMIT_BUFFER_SIZE_KB</code>
 * (default: depends on max heap).<br />
 * Set the size of the write buffer, in KB disk space (for file-based
 * stores). Unless auto-commit is disabled, changes are automatically
 * saved if there are more than this amount of changes.
 *
 * When the value is set to 0 or lower, data is not automatically
 * stored.
 */
public final int autoCommitBufferSize = get("AUTO_COMMIT_BUFFER_SIZE_KB",
        Math.max(1, Math.min(19, Utils.scaleForAvailableMemory(64))) * 1024);
Please explain why you need these settings.
We use H2 as a cache for binary chunks in a high-concurrency environment (currently 60 threads). After we increased cacheConcurrency to 1024, H2 performs really well most of the time (on par with or even better than PostgreSQL). The only problem arises after we empty the database content: we observe a significant degradation in throughput and CPU usage.
Here's what we run into:
- CPU contention in Segment.get (the optimization in this PR). It looks like when the primary index is small and all 60 threads want to use it, they compete for exactly the same segments. The only way to fix this is to improve LIRS concurrency.
- After fix #1, the write rate increased, and the next bottleneck is the 'back pressure' feature:
// if unsaved memory creation rate is to high,
// some back pressure need to be applied
// to slow things down and avoid OOME
It looks like 2 MB is too small for 150 MB per second. This is why we need to introduce AUTO_COMMIT_BUFFER_SIZE_KB.
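For illustration, both settings would be passed the way H2 database settings usually are, appended to the JDBC URL. The path and values below are hypothetical (1024 matches the concurrency figure quoted above, and AUTO_COMMIT_BUFFER_SIZE_KB is the new setting proposed in this PR):

```java
class H2UrlExample {
    // Hypothetical example URL; database settings follow the file path.
    static String url() {
        return "jdbc:h2:./data/test"
                + ";CACHE_CONCURRENCY=1024"
                + ";AUTO_COMMIT_BUFFER_SIZE_KB=32768";
    }

    public static void main(String[] args) {
        System.out.println(url());
    }
}
```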
/*
 * Used as null value for ConcurrentSkipListSet
 */
private final Entry<V> ENTRY_NULL = new Entry<>();
I think it would be better to convert misses to AtomicLong and avoid a lock for a null value completely. In that case this dummy value and the related tricks will not be required.
Makes sense. Will do.
and avoid a lock for a null value completely
Actually, there is no separate lock for this null value; most of the time the processing is done inside a lock that is already held. But I agree that it looks a bit 'artificial'.
public int compareTo(Entry<V> tgt) {
    return key == tgt.key ? 0 : key < tgt.key ? -1 : 1;
Add @Override, and use Long.compare() instead of the manual three-way comparison.
Agreed.
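Applying both suggestions, the snippet would become something like the following (only the key field is shown; the rest of the Entry class is elided):

```java
// Corrected form of the comparison above: @Override added and the
// manual three-way comparison replaced by Long.compare().
class Entry<V> implements Comparable<Entry<V>> {
    final long key;

    Entry(long key) {
        this.key = key;
    }

    @Override
    public int compareTo(Entry<V> tgt) {
        return Long.compare(key, tgt.key);
    }
}
```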
I think we still need a standalone test case for your problem for further investigation.
The whole implementation of this class, with its segments, looks like something from the Java 1.4–5 era, when atomic operations were only taking their first steps in Java.
Most likely this class can be reimplemented without segments and exclusive locks with modern APIs.
if (!l.tryLock()) {
    if (value == null) {
        misses.incrementAndGet();
    } else {
        concAccess.add(e);
    }
    return value;
}
When value == null there is no need to call tryLock().
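A get() reordered along those lines could look like this sketch (names and structure are hypothetical; the real segment maintains LIRS hot/cold queues, which are elided here): on a miss there is no LIRS state to adjust, so the miss counter is bumped and null returned without ever touching the lock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: handle the miss path before any tryLock() attempt.
class LirsSegmentSketch<V> {
    private final ConcurrentHashMap<Long, V> map = new ConcurrentHashMap<>();
    private final ReentrantLock l = new ReentrantLock();
    final AtomicLong misses = new AtomicLong();

    V get(long key) {
        V value = map.get(key);
        if (value == null) {           // miss: no tryLock() needed at all
            misses.incrementAndGet();
            return null;
        }
        if (l.tryLock()) {             // hit: update LIRS state if uncontended
            try {
                touch(key);
            } finally {
                l.unlock();
            }
        }
        return value;                  // lock busy: skip bookkeeping, don't block
    }

    private void touch(long key) {
        // hot/cold queue maintenance elided in this sketch
    }

    void put(long key, V value) {
        map.put(key, value);
    }
}
```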
<T> T withLock(Supplier<T> c) {
    l.lock();
    try {
        return c.get();
    }
    finally {
        l.unlock();
    }
}
From my point of view this method doesn't make the code more readable, and the introduction of lambda functions can make it slower. Usually that isn't critical, but it looks strange when you're trying to improve performance.
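The plain form this comment argues for would look like the sketch below (a made-up example, not code from this PR): an explicit lock()/try/finally at each call site instead of routing through withLock(Supplier), which avoids allocating a capturing lambda on the hot path.

```java
import java.util.concurrent.locks.ReentrantLock;

// Direct locking instead of a withLock(Supplier) helper.
class DirectLocking {
    private final ReentrantLock l = new ReentrantLock();
    private long counter;

    long incrementAndGet() {
        l.lock();
        try {
            return ++counter;   // critical section written inline
        } finally {
            l.unlock();
        }
    }
}
```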
Forgive my ignorance, but I fail to understand how ReentrantLock.lock()/unlock() will improve concurrency compared to plain old synchronized. Also, what about the unused field Segment.concAccess of type ConcurrentSkipListSet<Entry>? Why is eager creation of a WeakReference better than a lazy one? Even if every entry gets one at some point in its life cycle (just a speculation), the total number of those WeakReference objects alive at any given time will surely increase. It seems like the most significant optimization (the eliminated synchronized) is on the code path of a missed get(), which will be followed by a much more expensive page read anyway.
Guys, I've put this pull request into draft mode until we are done with its tests.
We've run into a significant concurrency limitation under a large number of threads (60) combined with an empty database.
It looks like this happens because all threads read the same index, which is small, and most of the time threads were blocked in the CacheLongKeyLIRS.Segment.get() method (since it is synchronized).
This degradation persists while the index is small and slowly improves as the database grows (several hours in our case).