
Maintenance taking too much resources since 2.2.222 #3909

Closed
V0174 opened this issue Oct 11, 2023 · 39 comments

Comments

@V0174 commented Oct 11, 2023

Hello,
the daily usage of my H2 instance (custom Java application with H2 library) can be divided into four phases:

  1. About 3-6 hours in the morning, crunch new data and CRUD them in the DB.
  2. Then, most of the day, nothing happens, only the occasional DB query (usually a few per day).
  3. During the night, close the database with SHUTDOWN COMPACT to avoid DB bloat (it gets to 20x the former size in a few days otherwise), then reopen it.
  4. Don't do anything until phase 1 starts again.

Up to 2.1.214, this worked mostly well, with H2 taking only negligible CPU when not queried. But when I updated to 2.2.224, suddenly after phase 1 is finished, H2 continues to use a lot of resources in phase 2: about 20-30% of the available CPU (8 cores), plus disk and RAM. It is mostly the H2 thread and GC (see the screenshot below). After the DB is shut down and reopened, it does not consume much resources anymore. That means c. 12-16 hours of several cores utilized to the max, with a lot of disk writes and memory consumed.

The virtual machine has 8 cores and 16 GB of RAM. The DB is opened with the following parameters:
WRITE_DELAY=10000;DB_CLOSE_ON_EXIT=FALSE;CACHE_SIZE=2097152;MAX_COMPACT_TIME=60
The database size is typically around 30-40GB before daily compacting (5GB afterwards) and it has 10 tables with just a few columns (but a lot of records) with some hash indexes.
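For context, the lifecycle above would look roughly like this in application code (a minimal sketch: the file path and credentials are illustrative, while the settings string and the SHUTDOWN COMPACT statement are the ones described above):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class DailyCycle {
    // Illustrative path and credentials; the settings mirror the ones quoted above.
    private static final String URL = "jdbc:h2:/data/mydb"
            + ";WRITE_DELAY=10000;DB_CLOSE_ON_EXIT=FALSE"
            + ";CACHE_SIZE=2097152;MAX_COMPACT_TIME=60";

    public static void main(String[] args) throws Exception {
        // Phases 1-2: daily CRUD work, then mostly idle with occasional queries.
        try (Connection conn = DriverManager.getConnection(URL, "sa", "")) {
            // ... application work ...
        }
        // Phase 3: reopen and issue SHUTDOWN COMPACT so the file is rewritten compactly.
        try (Connection conn = DriverManager.getConnection(URL, "sa", "");
             Statement stmt = conn.createStatement()) {
            stmt.execute("SHUTDOWN COMPACT");
        }
    }
}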

When experimenting with different versions, I discovered that this started with 2.2.222. I didn't find anything in the changelog regarding this so I suppose it is a side effect of some other change?

Does anyone else experience this? Could this be solved by some parameter tweaking? I understand that this is difficult to troubleshoot, but I don't even know where I would start with creating a test case. I'll try to experiment further, but it takes time.

[screenshots: vSphere CPU, RAM, and disk usage graphs; profiler view showing the H2 thread and GC]

@andreitokar (Contributor)

This change in behaviour is most likely due to the following commit.

The database size is typically around 30-40GB before daily compacting (5GB afterwards)

I am very curious to see if that number (30-40GB) is for your 2.1.214 case and, if so, how it changed (if at all) with the latest version 2.2.224.
If you want to reduce resource consumption while idle, try to decrease AUTO_COMPACT_FILL_RATE from the default of 90 to 50, 30 or even 0, assuming you can tolerate the resulting wild database file growth, since you have a daily maintenance shutdown.
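For illustration, applying that suggestion is just a matter of appending the setting to the connection URL (a sketch; the path is a placeholder and the value is one of those suggested above):

class CompactionTuning {
    // Lower values make the idle-time compaction less aggressive; 0 disables it entirely.
    static final String URL = "jdbc:h2:/data/mydb"
            + ";WRITE_DELAY=10000;DB_CLOSE_ON_EXIT=FALSE;CACHE_SIZE=2097152;MAX_COMPACT_TIME=60"
            + ";AUTO_COMPACT_FILL_RATE=50"; // try 50, 30 or even 0, as suggested above
}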

@grandinj (Contributor)

@andreitokar that sounds like the DB is never managing to finish a compaction properly, and is just periodically re-doing the compaction without making any actual progress.

@V0174 changed the title from "Maintenace taking too much resources since 2.2.222" to "Maintenance taking too much resources since 2.2.222" on Oct 13, 2023
@V0174 (Author) commented Oct 13, 2023

@andreitokar

how it changed (if at all) with the latest version 2.2.224.

I have a direct comparison only with 2.2.220. With 2.2.220 the DB was around 15 GB before SHUTDOWN COMPACT; with 2.2.222 and 2.2.224 it is ~35-45 GB. That really sounds like what @grandinj says: the auto compaction doesn't work properly in my case.

I will experiment with the AUTO_COMPACT_FILL_RATE and post the results.

@andreitokar (Contributor)

oops

@andreitokar reopened this on Oct 14, 2023
@wburzyns

@andreitokar On the other hand, 13740e1 is working wonders for me. In my scenario, with 2.2.200 and earlier, my DB grew by several GBs every day. With 2.2.222 it grows by no more than several hundred MBs per day. That's a 5x-10x improvement, which is a lot.

@vreuland (Contributor)

Hello,

I was also eager to get this housekeeping improvement/fix in order to have a database in 2.X that does not grow wildly (cf. #3848).
I can see that, thanks to the fix in 2.2.222, H2 is now able to shrink the database more effectively.
Here I am running a set of CRUD operations every 15 minutes (CRUD is performed for ~5 minutes, then there is a 10-minute period without activity), and we can see the compaction happening:
[screenshot: database file size over the CRUD/idle cycles]

But unfortunately, I have the same issue as the one reported by @V0174 😢
For my database, after some of those CRUD operations, my application starts consuming CPU constantly (~20%) even if all activity on the database is stopped for hours. It keeps consuming CPU even though no operations are performed on the database, and the JVM is constantly garbage-collecting.

It looks like the H2 housekeeping keeps writing all the time. A quick profiling session (60 sec.) shows this:
[screenshot: profiler output]

Let me know if you need anything from me to help the investigation of this issue.

Kind regards,
Vincent

@andreitokar (Contributor) commented Oct 16, 2023

@vreuland If I understand correctly, the problem is that H2, while still "idle" (no client requests), keeps consuming resources without any perceivable benefit, and this goes on forever, right?

While the db is in that state, can you periodically query the INFORMATION_SCHEMA.SETTINGS table and see what the values are for:

  • info.FILE_SIZE
  • info.CHUNK_COUNT
  • info.PAGE_COUNT
  • info.PAGE_COUNT_LIVE
  • info.FILL_RATE
  • info.CHUNKS_FILL_RATE
  • info.CHUNKS_FILL_RATE_RW

This might give some insights as to what went wrong with housekeeping.
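For reference, a minimal sketch of how those values could be polled from application code (the helper below is hypothetical; it assumes a connection to the same database, for example from within the embedded application):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class SettingsSampler {
    /** Prints the "info.*" rows of INFORMATION_SCHEMA.SETTINGS using an already-open connection. */
    static void printStoreInfo(Connection conn) throws Exception {
        String sql = "SELECT SETTING_NAME, SETTING_VALUE FROM INFORMATION_SCHEMA.SETTINGS "
                + "WHERE SETTING_NAME LIKE 'info.%' ORDER BY SETTING_NAME";
        try (PreparedStatement ps = conn.prepareStatement(sql);
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + " = " + rs.getString(2));
            }
        }
    }
}

As noted further down in the thread, such a query does not do any I/O, so sampling it periodically should not disturb the idle state.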

Another question: if you shut down the database, then restart it and keep it idle, will it continue its vicious cycle? And if so, can you share such a database file?

@V0174 (Author) commented Oct 16, 2023

@andreitokar

try to decrease AUTO_COMPACT_FILL_RATE

This helped a lot, thanks. It was the weekend so there was a bit less data, but the difference is clear. With AUTO_COMPACT_FILL_RATE set to 50, the average CPU went down 93% (!) and counterintuitively even the DB size before compacting went down about half. Still, "manual" compacting on shutdown reduces the size to about one fourth of the original size, but much better than one eighth before.

I will experiment some more, but it will take time.

@vreuland (Contributor)

@vreuland If I understand correctly, the problem is that H2, while still "idle" (no client requests), keeps consuming resources without any perceivable benefit, and this goes on forever, right?

While the db is in that state, can you periodically query the INFORMATION_SCHEMA.SETTINGS table and see what the values are for:

  • info.FILE_SIZE
  • info.CHUNK_COUNT
  • info.PAGE_COUNT
  • info.PAGE_COUNT_LIVE
  • info.FILL_RATE
  • info.CHUNKS_FILL_RATE
  • info.CHUNKS_FILL_RATE_RW

This might give some insights as to what went wrong with housekeeping.

Another question: if you shut down the database, then restart it and keep it idle, will it continue its vicious cycle? And if so, can you share such a database file?

Hi @andreitokar,

Yes, correct. Even though there are no requests, it keeps consuming resources for no benefit, and that goes on forever.
I have quickly taken the requested values manually (sampled randomly over ~5 minutes). It looks like it is "oscillating", uselessly rewriting chunks/pages:
[screenshot: sampled values of the requested INFORMATION_SCHEMA settings]
I can gather more values if needed.

I will try the database shutdown/restart and keep you posted.
Regarding the database file, I cannot share it unfortunately, but I can try to build one with fake data and simulate the same load, hopefully reproducing the issue so that you can have the file.

Regards,
Vincent

@vreuland (Contributor)

Hello @andreitokar

I finally had a bit of time to look further into this issue.
I was not able to recreate a database with fake data that reproduces the issue, though.
What I did instead was take an online backup of the database where the problem occurs: when I re-create (restore) the database from the backup and just open it, the problem is still there (endless housekeeping consuming CPU while no operation is performed).

If I simply shut down the database and reopen it, the issue is still there.
Only if I do a "shutdown compact" is the problem gone when the database is reopened afterwards.

In order to understand more about what is happening, you can find attached a dump (using MVStoreTool -dump) of the chunks metadata: mvstore-dump.zip
On top of that, I have also added some "debug/tracing" statements; you can find the resulting log attached: tracing.txt
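For reference, the same kind of dump can also be produced programmatically; a minimal sketch (the file name is a placeholder, the -dump flag is the one used above):

import org.h2.mvstore.MVStoreTool;

public class DumpChunks {
    public static void main(String[] args) throws Exception {
        // Equivalent to running the tool from the command line with "-dump <file>";
        // it prints the file and chunk metadata to stdout.
        MVStoreTool.main(new String[] { "-dump", "mydb.mv.db" });
    }
}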

My observations are:

  • the last chunks are rewritten endlessly
  • from time to time, compaction stops for a few seconds, but this is only due to the retention time (45 secs). Once the last written chunks (written by the housekeeping) are older than the 45 secs, they are picked up by the housekeeping and rewritten (endlessly)
  • a weird behavior can be observed: the same chunk can be taken/compacted twice in two successive doHousekeeping iterations; see for example chunk:279871 in the logs. Is that normal? It looks like there is a race between the housekeeping thread (MVStore background writer) and the thread doing the serialization of the rewritten chunk (H2-serialization)

Please have a look.
I will continue my investigation but your input/help would be very valuable.

Thank you.
Kind regards,
Vincent

@wburzyns

Only if I do a "shutdown compact" is the problem gone when the database is reopened afterwards.

This happened to me also, but with 2.2.220. I had a database that was opening/closing fine, but every attempt to add a new row to a table failed. SHUTDOWN COMPACT somehow repaired the DB.

@andreitokar (Contributor)

@wburzyns, it looks like a totally different case. Of course, it is possible that SHUTDOWN COMPACT helped in both cases. SHUTDOWN COMPACT recreates the whole database file, so it is no wonder that it fixes (or may break) something along the way, although the original problem might be quite different.

@vreuland (Contributor)

Please disregard the mvstore-dump.zip attached above.
Actually, the dump is partial: "MVStoreTool -dump" crashed with a NullPointerException on one of the blocks of the database:

Exception in thread "main" java.lang.NullPointerException
	at org.h2.mvstore.DataUtils.parseMap(DataUtils.java:804)
	at org.h2.mvstore.Chunk.<init>(Chunk.java:171)
	at org.h2.mvstore.SFChunk.<init>(SFChunk.java:26)
	at org.h2.mvstore.MVStoreTool.dump(MVStoreTool.java:151)
	at org.h2.mvstore.MVStoreTool.main(MVStoreTool.java:57)

Could the fact that some blocks are "corrupted" explain the problem of the endless housekeeping?

@andreitokar please let me know if I can extract anything else that could be helpful to investigate this issue.

Kind regards,
Vincent

@vreuland (Contributor)

Quick note:
I have checked, and I don't think my DB is actually corrupted (or that corruption is the cause of the housekeeping issue).
There are instead 2 small bugs in the MVStoreTool (-dump):

  1. If one of the read blocks starts with a 'c', the tool will try to parse it as a chunk header. It can happen that this is not the start of a chunk (i.e. a leaf page can spread over multiple blocks and contain data with 'c'...). In previous versions (2.1.214), the parsing of the chunk header would throw an MVStoreException in that case, and that exception is caught by the tool (which simply moves to the next block). In the latest version, the exception is not there anymore and a null is returned, leading to a NullPointerException further down.
  2. A similar issue is observed with blocks starting with an 'H'. The tool thinks that this is a file header (even though the two file headers have already been read at the beginning of the file) and starts outputting binary content. This bug was already present in previous versions, I think.

In both cases, these are easy to fix and I'll try to create an issue + PR soon.

For the housekeeping issue, I am still clueless though.
Kr,
Vincent

@andreitokar (Contributor)

@vreuland I tend to agree with you that it's most likely an MVStoreTool problem. That code was neglected for a long time and is probably not up to date with the latest MVStore changes.
The housekeeping issue may be caused by the fact that chunk rewriting may actually not improve the file layout, but make it worse. It is possible, and I think your log proves it, that rewriting of a meta table (because rewriting involves not just the leaf page itself, but all pages above it) may end up with more chunks eligible for rewriting and therefore an endless loop. A specific update pattern, where new chunks are always inserted at the end, may lead to an extremely unbalanced tree, and updates in that deep part will be amplified by the depth of the tree.
As a workaround, have you tried decreasing AUTO_COMPACT_FILL_RATE from its default value of 90 to, let's say, 70 or even 50? I think it may stop that vicious cycle.

@manticore-projects (Contributor) commented Dec 19, 2023

Greetings!

Coming from #3948.

  1. I have reduced AUTO_COMPACT_FILL_RATE=45 but still see massive write activity on an idle DB.
    The DB size is 200 MB, and there is NO session or user activity, just an open connection after running 1 empty select. Within less than 5 minutes, more than 2 GB of data were written:
  66175 be/4 are           0.00 B      2.43 G  ?unavailable?  java -jar data/src/h2database/h2/target/h2-2.2.229-SNAPSHOT-3fbbb2cc5-dirty.jar [H2-save]

Although the second MVStore process is no longer showing up as it did before.

  2. Sample Database: [removed]

You can reproduce the behaviour by (a JDBC sketch of steps 1-3 follows after this list):

  1. open this DB in server mode

  2. execute an empty SELECT: select * from mcmon.EXECUTION

  3. do nothing afterwards and start sudo iotop -a to monitor the IO

  4. this sample database is not corrupted but can be rebuilt from scratch (by our software):
    a) create a new DB
    b) populate it with DDL statements (schemas, tables, indexes)
    c) run a batch of DML statements (always exactly the same data, based on static business logic)
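A rough sketch of steps 1-3 from JDBC, in case it helps anyone script the reproduction (host, port, database name and credentials are placeholders; only the table name and the iotop command come from the steps above):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ReproduceIdleWrites {
    public static void main(String[] args) throws Exception {
        // Assumes an H2 server is already running and serving the sample database.
        String url = "jdbc:h2:tcp://localhost:9092/sample";
        try (Connection conn = DriverManager.getConnection(url, "sa", "");
             Statement stmt = conn.createStatement()) {
            // Step 2: run one (empty) select, then keep the connection open and do nothing.
            stmt.executeQuery("select * from mcmon.EXECUTION").close();
            // Step 3: monitor IO externally, e.g. with `sudo iotop -a`, while this just sleeps.
            Thread.sleep(Long.MAX_VALUE);
        }
    }
}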

For me this is a big problem because:

  1. I experience micro freezes on my laptop
  2. at this write rate my SSD is at risk
  3. recent H2 versions fixed various other issues so going back to 212 is also not an option

@manticore-projects (Contributor) commented Dec 19, 2023

Setting ;AUTO_COMPACT_FILL_RATE=5 finally stopped that IO on idle, thank you for saving my day (and my SSD).

Can I recommend introducing a kind of cadence, so that this housekeeping will occur when idle only within a certain period (e.g. every 20 mins when idle) -- but not continuously.

@manticore-projects (Contributor)

A specific update pattern, where new chunks are always inserted at the end, may lead to an extremely unbalanced tree, and updates in that deep part will be amplified by the depth of the tree.

FYI, this describes our particular use case as applied to the provided database (which is a ledger, where new debit/credit entries are appended continuously at the end).
Although I don't think there is anything wrong or unusual with that.

@manticore-projects (Contributor) commented Dec 19, 2023

It's getting weird:

  1. I reduced AUTO_COMPACT_FILL_RATE from 90 to 45 --> did not solve the problem
  2. I further reduced AUTO_COMPACT_FILL_RATE from 45 to 5 --> after writing 800 MB at start, the problem went away
  3. after I increased AUTO_COMPACT_FILL_RATE to 25, 45, 90 again (to find the "sweet spot") --> problem still did not re-occur

It almost looks like the lowest AUTO_COMPACT_FILL_RATE allowed the store structure to be repaired.
Maybe 90 itself is not the problem; maybe with 90 the "housekeeping" interferes with itself and so can never finish (while with 5 it can finish and is "clean" afterwards).

I will repeat more tests later, starting from scratch with an empty/new database.

Update: after a while, the IO problem returns with AUTO_COMPACT_FILL_RATE=90, so this "repair" does not seem to be permanent, or indeed 90 is actually causing the problem during the housekeeping attempt.

@manticore-projects (Contributor) commented Dec 19, 2023

Another question: if you shut down the database, then restart it and keep it idle, will it continue its vicious cycle? And if so, can you share such a database file?

Yes and yes ( [removed] ).

As written above, closing the DB, setting AUTO_COMPACT_FILL_RATE=5, and opening the DB again seems to "repair" it (after a massive initial WRITE activity). After that, closing it and setting AUTO_COMPACT_FILL_RATE=90 does no harm anymore.

andreitokar added a commit to andreitokar/h2database that referenced this issue Dec 22, 2023
@andreitokar (Contributor)

@manticore-projects Thank you for the test case, it really helped.
The problem is that when an idle database tries to improve the data file layout too much, the cure becomes worse than the disease. While it is trying to rewrite sparsely populated chunks, it creates more sparse chunks than it started with. This is always a possibility with the current implementation, where old-page evacuation from a sparse chunk is just a regular "update" operation, which allows for concurrent reads and even updates, but creates dead non-leaf pages along the way. The fix will check the results after the fact, and if it turns out that there were no improvements, housekeeping will stop until some further activity in the database.
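A schematic restatement of that guard, for readers following along (illustrative only, not the actual MVStore implementation; every identifier below is invented):

/**
 * Schematic sketch of the "stop when rewriting makes no progress" idea described above.
 * This is NOT MVStore code; class and member names are invented for illustration.
 */
class IdleHousekeepingGuard {
    private boolean stopped;

    /** Called after one idle-time rewrite pass, with the chunks fill rate measured before and after. */
    void afterRewritePass(int fillRateBefore, int fillRateAfter, boolean idle) {
        // If rewriting sparse chunks did not actually improve the file layout,
        // stop housekeeping until some real database activity happens.
        if (idle && fillRateAfter <= fillRateBefore) {
            stopped = true;
        }
    }

    boolean isStopped() {
        return stopped;
    }
}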

@manticore-projects (Contributor)

Thank you for your work and effort @andreitokar and team! We all do appreciate it. Merry Christmas.

@manticore-projects (Contributor)

bring it.

catull pushed a commit to catull/h2database that referenced this issue Jan 6, 2024
@grandinj (Contributor)

@manticore-projects you have admin access now to the H2 repo

@vreuland (Contributor) commented Feb 4, 2024

@manticore-projects Thank you for the test case, it really helped. The problem is that when an idle database tries to improve the data file layout too much, the cure becomes worse than the disease. While it is trying to rewrite sparsely populated chunks, it creates more sparse chunks than it started with. This is always a possibility with the current implementation, where old-page evacuation from a sparse chunk is just a regular "update" operation, which allows for concurrent reads and even updates, but creates dead non-leaf pages along the way. The fix will check the results after the fact, and if it turns out that there were no improvements, housekeeping will stop until some further activity in the database.

Hello @andreitokar

Thank you for the fix you have made.
Unfortunately, that does not fix the issue in some of the databases I have 😢 : I can still experience endless chunk rewriting when the database is idle (even though it is less aggressive than before).

Looking further into this, I could see that:

  • the new "stop condition" is using the rewritable chunk fill rate and does not seem to cope well with a retention time > 0 (I am using the default of 45 sec.): The "stop condition" does not take (immediately) into account the lastly rewritten chunks
  • the isIdle value does not discriminate the writes performed by the housekeeping itself from the rest of the writes. Even if no operation is performed on the database by any user, once the housekeeping rewrites some chunks, the database is then "flagged" as non-idle and that resumes the useless chunk rewriting.

I thought at first that the best approach would be to find a way to determine, before doing the chunk rewrite, whether the rewrite will help or not. But I am not sure that this is possible using only the chunks metadata we have today (live/dead pages) or without scanning all the maps...
Therefore I liked your idea to simply do a chunk rewrite and stop if there is no progress/improvement, and I thought I could adapt the "stop condition" to make it work with my databases.
After a lot (lot) of tests, I could find a slightly adapted version of the "stop condition" that works quite well: vreuland@c83b88d
It simply:

  • uses the chunks fill rate instead of the rewritable chunks fill rate, to have less dependency on the retention time
  • avoids using isIdle (for the reason stated above) and instead records whether the last housekeeping pass rewrote chunks, and what the previous chunks fill rate was, to decide whether to continue/restart the chunk-rewriting housekeeping

That was the best version I could come up with in regard to the different test scenarios I have performed:

  • no operation performed at all on the database
  • big bunch of operations performed every 30 minutes and idle database in between
  • constant small amount of operations on the database

I would be very glad to have your view on this and wonder if this could be officially included if I draft a proper PR.

Thank you in advance.
Vincent

andreitokar added a commit to andreitokar/h2database that referenced this issue Feb 19, 2024
andreitokar added a commit to andreitokar/h2database that referenced this issue Feb 19, 2024
@andreitokar (Contributor) commented Feb 19, 2024

Hi @vreuland,
First of all, thank you for your testing efforts and contribution, and sorry about the delay, I've been somewhat busy lately.

I agree with you on both points:

  1. It's probably better to use the full chunks fill rate instead of the "rewritable" one, because it better represents the goal of housekeeping.
  2. isIdle() may yield flaky results, because the comparison base taken after rewriting will not reflect async changes resulting from the rewriting itself.

On the other hand, the "stop condition" in your PR also seems questionable (see my comment in the PR). Of course, there is no single or "right" answer here, and everyone's mileage may vary; it just seems not logical to me.
I've created another PR, which should address the two items above. I stick with the definition of "idle" mode, just use a cached value everywhere within a single pass, and relaxed the way it is determined, so that minimal activity does not count. It seems to work nicely at distinguishing between user activity and idle state.
Could you please give it a try with your tests and see which change yields better results?

@andreitokar reopened this on Feb 19, 2024
@vreuland (Contributor)

Hello @andreitokar ,
Thank you for your response and your new proposal. This is really appreciated (and no worries for the delay!)

I have directly replied in the PR regarding the rationale behind the stopping condition I used (#4000 (comment)), but it is indeed probably not flawless.

Let me try your new proposal with the different tests I have. I will probably need a couple of days but can return with a clear report.

Thanks again.
Vincent

@vreuland (Contributor)

Hi,
I have tried it, but unfortunately it does not work well with one of the databases I have.
I am simply restoring an online backup of that database and then leaving the database completely idle.
I could quickly see the housekeeping consuming CPU (due to chunk rewriting) periodically, with in-between pauses of ~20 seconds (~25 seconds of active housekeeping, followed by ~20 seconds where housekeeping is paused, then housekeeping resumes for 25 seconds... obviously linked to the retention time of 45 seconds).

I have very quickly added some "traces" (the main variables used in the housekeeping dumped to stdout) to see in more depth what was going on (cf. vreuland@ba42400).

Here is the log: housekeeping-refinement-traces-3.txt

We can actually see that the stopIdleHousekeeping never switches to "true":

  • The chunksFillRate starts low, then increases with the first housekeeping rewrites, but then gets stuck at 55% (forever). It never goes down and therefore never triggers the "stopIdleHousekeeping"
  • the rewritable chunk fill rate / adjusted chunk fill rate moves up due to the rewriting and takes ~25 seconds to reach the target. Then, 20 seconds later, the retention time is over and the rewritable rate goes down again

I can maybe replace the condition
stopIdleHousekeeping = idle && getChunksFillRate() < chunksFillRate;
by
stopIdleHousekeeping = idle && getChunksFillRate() <= chunksFillRate;
and see what it brings (?)
To be honest, I also don't fully grasp the rationale behind the remaining condition on the rewritable chunks fill rate / restoreHousekeepingAtRate (and this "- 2").

I still have the feeling that we "simply" need 2 clear conditions:

  1. A condition to decide if the housekeeping should be stopped/paused: I can only think of stopping when the last iteration has rewritten some chunks and that rewriting did not improve the chunksFillRate (and it would not involve the rewritable chunks fill rate at all, which is too unreliable due to the retention time).
  2. A condition to decide if the housekeeping must be resumed (if it is currently stopped): somehow when some significant user operations have happened (?)

The equilibrium in those conditions is definitely hard to find, but it is an interesting challenge 😄
Looking forward to reading your thoughts on this.

Kr,
Vincent

@andreitokar (Contributor)

@vreuland

stopIdleHousekeeping = idle && getChunksFillRate() <= chunksFillRate;

indeed makes more sense - if the rewritten amount was so small that the overall fill rate stays (almost) the same, what is the point of continuing?

the rationale behind the remaining condition on the rewritable chunks fill rate / restoreHousekeepingAtRate (and this "- 2")

It simply says that housekeeping needs to be resumed if rewritableChunksFillRate drops by more than 2% from its value at the time we stopped. The idea is to resume when more chunks become eligible for rewriting while the database is still idle and housekeeping is stopped - otherwise it seems like a lost opportunity.
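Put schematically (again with invented names rather than the real MVStore members), that resume rule amounts to something like:

/** Schematic restatement of the resume rule described above; not actual MVStore code. */
class ResumeRule {
    /** Rewritable chunks fill rate recorded at the moment housekeeping was stopped. */
    private final int restoreHousekeepingAtRate;

    ResumeRule(int rewritableFillRateAtStop) {
        this.restoreHousekeepingAtRate = rewritableFillRateAtStop;
    }

    /** Resume once the rewritable chunks fill rate has dropped by more than 2% while still idle. */
    boolean shouldResume(boolean stillIdle, int currentRewritableFillRate) {
        return stillIdle && currentRewritableFillRate < restoreHousekeepingAtRate - 2;
    }
}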

On a somewhat unrelated note: I wonder if we should decrease the default for RETENTION_TIME from 45000 to, let's say, 1000?
IMHO, 45 sec is way too conservative. It was chosen long ago, when we had no reliable way to determine whether a chunk was still in use.
I bet that if you decrease it in your case, the housekeeping behaviour would be better.
In my tests, when I reverted RETENTION_TIME from 1000 back to the default and let the database sit idle after some data-write activity, I indeed saw two or three housekeeping on/off cycles of decreasing intensity, but then it stopped for good with both chunksFillRate=86-88 and rewritableChunksFillRate=87-89 (AUTO_COMPACT_FILL_RATE is 90).
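For anyone who wants to try a lower retention time on their own database, a minimal sketch (the path is a placeholder; SET RETENTION_TIME is an existing H2 command, and 1000 ms is just the value mentioned above):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class LowerRetentionTime {
    public static void main(String[] args) throws Exception {
        // The setting can, to my knowledge, also be appended to the URL as ;RETENTION_TIME=1000
        try (Connection conn = DriverManager.getConnection("jdbc:h2:/data/mydb", "sa", "");
             Statement stmt = conn.createStatement()) {
            stmt.execute("SET RETENTION_TIME 1000"); // default is 45000 ms (45 seconds)
        }
    }
}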

BTW, you do not have to do any surgery on H2; you can instead run a query like

SELECT * FROM INFORMATION_SCHEMA.SETTINGS
WHERE SETTING_NAME IN (
    'info.FILL_RATE',
    'info.CHUNKS_FILL_RATE',
    'info.CHUNKS_FILL_RATE_RW',
    'info.FILE_READ',
    'info.FILE_WRITE',
    'info.CHUNK_COUNT'
);

Such a query does not do any I/O and therefore won't disturb the "idle" status.

@vreuland (Contributor)

@andreitokar

I must admit I am still not fully convinced that the rewritableChunksFillRate is the right candidate for deciding whether to resume the housekeeping, mainly because of this effect of the retention time.
Now for sure, we can probably reach better results by tweaking the retention_time parameter, but is this the best way?

At the same time, we have the chunksFillRate, which gives, from my understanding, an "instantaneous" view of the current fill rate (no effect of retention or asynchronous operations) and therefore looks more suited to drive the pausing/resuming of the housekeeping. And that does not mean the rewritableChunksFillRate is not used at all: it is used, but a step later, after the stop/resume housekeeping condition, simply to decide whether a rewrite should be attempted in the current housekeeping iteration.

I will continue to test your PR (with getChunksFillRate() <= chunksFillRate) on my different scenarios, but at the same time I have also updated my PR based on the remarks I have made here (and in my previous comment), which I am testing as well: I do hope it is clearer and more logical/less questionable than before. Don't hesitate to have a quick look and please share any feedback 😄

@manticore-projects

Would it be possible to get your sample database again (the one you shared originally in #3909 (comment))? It would be great if I could also test the latest fix proposals on it 😃

@andreitokar (Contributor)

@vreuland

Now for sure, we can probably reach better results by tweaking the retention_time parameter, but is this the best way?

I am not asking you to tweak retention_time, just saying that it seems unreasonably high, has a huge effect on the database file size, and will change the pattern of housekeeping work.

and chunksFillRate therefore looks more suited to drive the pausing/resuming of the housekeeping

I think we agree here on pausing - chunksFillRate directly reflects our goal, and failure to improve it is the reason to stop.
As far as the condition to resume goes - it's a different story. We would resume after any client activity, but what if there is none? If your condition were based on chunksFillRate only, it would never fire, simply because chunksFillRate would remain constant in that scenario. On the other hand, if retention_time is significant and the db stays idle, more and more chunks would become eligible for rewrite, and rewritableChunksFillRate would move close to chunksFillRate (decrease, I assume, but who knows). We may just drop that case and not resume until some client activity, and this would make the whole problem disappear. It just seems like a lost opportunity to improve the file layout while idle, in preparation for a possible future storm of user activity.

@manticore-projects (Contributor) commented Feb 23, 2024

We may just drop that case and not resume until some client activity, and this would make the whole problem disappear. It just seems like a lost opportunity to improve the file layout while idle, in preparation for a possible future storm of user activity.

How about making it an option or switch?
So people who want to maximize maintenance (on dedicated servers) can run it and people who want to minimize resources (on shared servers) won't run it?

@manticore-projects (Contributor)

@manticore-projects

Would it be possible to get your sample database again (the one you shared originally in #3909 (comment))? It would be great if I could also test the latest fix proposals on it 😃

I am really sorry, I don't have it anymore, since @andreitokar's improvement solved that problem reliably for us on all servers and we always run with the latest Git.

andreitokar added a commit to andreitokar/h2database that referenced this issue Feb 24, 2024
@vreuland (Contributor)

Hello @andreitokar

If your condition were based on chunksFillRate only, it would never fire, simply because chunksFillRate would remain constant in that scenario. On the other hand, if retention_time is significant and the db stays idle, more and more chunks would become eligible for rewrite, and rewritableChunksFillRate would move close to chunksFillRate (decrease, I assume, but who knows). We may just drop that case and not resume until some client activity, and this would make the whole problem disappear. It just seems like a lost opportunity to improve the file layout while idle, in preparation for a possible future storm of user activity.

OK, I understand your point. It is clearly an optimization to resume the housekeeping a bit sooner in certain cases (even though, to be honest, I still cannot clearly see what scenario would lead to those cases).
But why not, indeed. The point of attention is to make sure the "resume" is not triggered by chunk rewrites performed by the housekeeping itself (while those same rewrites put the housekeeping on stop a few seconds before), or too often for nothing.
I see you have also made additional refinements to the "resume condition" this weekend.
Please let me quickly test them (I was not able to reproduce my exact scenarios in unit tests, so what I am doing is integrating the new version into my application and then running scenarios/tests, which takes ~2 days to confidently see the evolution of the database file).

Thank you for all your effort on this. Again, it is greatly appreciated.

@vreuland (Contributor)

Hi @andreitokar
I have run my tests. These use my application (which embeds an H2 database), and I am simply monitoring the database file size and the CPU usage of the app, making sure that they both behave (not too much CPU usage, especially when idle, and a DB size that is not exploding). I know that there is more than just chunk rewriting involved in driving down the database file size (file truncation must happen as well, and can only be triggered if the last chunks of the file become dead, ...), but this still gives a good idea (especially over several days of testing) of whether housekeeping/compaction is working.

The main scenario ran for more than 2 days and consists of triggering a high level of requests towards the app every 15 minutes (lasting 5 min.) and staying idle in between. There is also a housekeeping job in the app that every hour cleans up the data older than 2 days (meaning the database size should remain almost constant after 2 days), and on top of that an online database backup is also taken every hour.

Note that this scenario makes H2 behave badly with 2.1.214 (the database size quickly explodes) or 2.2.224 (high CPU usage in the idle periods).
I have compared your PR (andreitokar:issue-3909) with mine (vreuland:fix-endless-chunk-rewriting-when-db-is-idle), as I knew from my previous tests that H2 behaves OK using my PR.
I am running 4 instances of the app (2 using andreitokar:issue-3909 and 2 using vreuland:fix-endless-chunk-rewriting-when-db-is-idle), and the graphs show the max DB file size and max CPU usage across each pair of instances.

In terms of DB file size, the results are OK. The database size is kept more or less at the same level (even if the graph pattern is slightly different):
[graph: max database file size over time for both PRs]

In terms of CPU, it is also OK. CPU remains very low in the idle periods:
[graph: max CPU usage over time for both PRs]

There is just one thing I have observed: during the high-request periods, the CPU usage with andreitokar:issue-3909 is twice as high... This is confirmed by profiling the instances: H2 is serializing (writing) twice as much in those high-request periods as when using the other PR...

I have run other scenarios (a constant low request rate, for instance) and H2 behaved OK, and similarly, with both PRs.

In the end, I believe your proposal/PR is fine. In my specific scenario I still think my PR behaves a bit better, but it is obviously difficult to assess whether that would be the case in other scenarios. As for the higher CPU usage pointed out above, I think it is still OK and not too worrying.

So please, go ahead with merging one of those PRs. 😄
Thank you again for your support.

@vreuland (Contributor)

Hello @andreitokar

I hope you are doing great.
Shall we move forward on this issue as we have a proper fix?
I am really looking forward to getting a release with this fix. 😄

Best regards,
Vincent

andreitokar added a commit to andreitokar/h2database that referenced this issue Mar 24, 2024
@andreitokar (Contributor)

I've dropped housekeeping resumption, because it clearly does nothing useful in my tests. Merged.

catull pushed a commit to catull/h2database that referenced this issue Mar 29, 2024
catull pushed a commit to catull/h2database that referenced this issue Mar 29, 2024
catull pushed a commit to catull/h2database that referenced this issue Mar 29, 2024