raft sync times out with default timers when boltdb is bigger than 5-6GB #11983

Open

write0nly opened this issue Jul 2, 2021 · 4 comments

@write0nly

1- create a brand new Vault cluster using the Raft integrated storage backend

2- enable the AppRole auth method, then create a role_id and secret_id

3- from a large set of clients, run infinite loops that do nothing but AppRole logins, under very harsh conditions, and leave them running for days (a sketch of such a loop follows these steps)

4- watch the database grow; the vault.db boltdb file goes above 10GB

5- stop Vault on one of the secondary nodes and move vault.db aside to vault.db.backup

6- restart and unseal the secondary

7- the expectation is that vault.db will be copied over, as it normally is with smaller databases

8- instead, the node cannot copy the database and crash-loops with errors such as:

...
Jun 25 18:49:13 vault-3 vault-1.8.0-dev[17280]: 2021-06-25T18:49:13.247+0100 [INFO]  core: security barrier not initialized
Jun 25 18:49:16 vault-3 vault-1.8.0-dev[17280]: 2021-06-25T18:49:16.401+0100 [WARN]  core: join attempt failed: error="error during raft bootstrap init call: context deadline exceeded"
Jun 25 18:49:16 vault-3 vault-1.8.0-dev[17280]: 2021-06-25T18:49:16.401+0100 [ERROR] core: failed to retry join raft cluster: retry=2s
...
Jun 25 18:49:18 vault-3 vault-1.8.0-dev[17280]: 2021-06-25T18:49:18.402+0100 [INFO]  core: security barrier not initialized
Jun 25 18:49:18 vault-3 vault-1.8.0-dev[17280]: 2021-06-25T18:49:18.402+0100 [INFO]  core: attempting to join possible raft leader node: leader_addr=https://vault-1:8200
Jun 25 18:49:18 vault-3 vault-1.8.0-dev[17280]: 2021-06-25T18:49:18.470+0100 [WARN]  core: join attempt failed: error="failed to send answer to raft leader node: error bootstrapping cluster: cluster already has state"
...

9- the secondary never comes up

10- stop vault and move vault.db.backup back to vault.db, restart and unseal vault

11- vault starts working properly

FYI this was tested on 1.8.0-dev from #11072 (comment)
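
For reference, a minimal sketch of the kind of login loop used in step 3 (the address, credential file paths, and loop shape are placeholders, not the exact harness used in this test):

#!/usr/bin/env bash
# Hammer the cluster with AppRole logins and nothing else.
# The address and credential file locations below are placeholders.
export VAULT_ADDR="https://vault-1:8200"
ROLE_ID="$(cat /tmp/role_id)"
SECRET_ID="$(cat /tmp/secret_id)"

while true; do
  vault write -format=json auth/approle/login \
    role_id="$ROLE_ID" secret_id="$SECRET_ID" > /dev/null
done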

@write0nly (Author)

testing with higher settings for the raft transport timeout
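
(For context: one documented knob for the integrated storage backend's Raft timing is performance_multiplier, which scales the election/heartbeat timers. Whether that covers the transport timeout being tested here is an assumption; on a 1.8.0-dev build the value may just as well have been changed in source. A config sketch:)

storage "raft" {
  path    = "/opt/vault/raft"
  node_id = "vault-3"

  # Documented knob that scales Raft timing parameters; the default multiplier
  # is 5, and larger values lengthen the timers. It is an assumption that this
  # helps with the timeouts seen during the large snapshot copy.
  performance_multiplier = 10
}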

@raskchanky raskchanky self-assigned this Jul 7, 2021
@raskchanky raskchanky added bug Used to indicate a potential bug core/raft labels Jul 7, 2021
@raskchanky (Contributor)

@write0nly In order to re-join the node, you should first clear out the entire raft data directory, so that both raft.db and vault.db are re-created when the node re-joins the cluster. Can you give that a try and see if it makes a difference?
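
For anyone following along, a rough sketch of that procedure, assuming a systemd unit named vault and the /opt/vault/raft data path that shows up in the logs further down (adjust both to your installation):

sudo systemctl stop vault
# Clear the whole raft data directory so both raft.db and vault.db get recreated
# when the node rejoins the cluster.
sudo rm -rf /opt/vault/raft/*
sudo systemctl start vault
vault operator unseal    # repeat until the unseal threshold is reached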

@write0nly (Author)

@raskchanky Thank you for the info. Indeed you are right: after doing the full rm, then restarting and unsealing, Vault did recover (with the caveats mentioned at the bottom):

Jul 08 20:15:18 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T20:15:18.433+0100 [DEBUG] core: cannot unseal, not enough keys: keys=2 threshold=3 nonce=0397f2fd-3901-65b6-cddf-4598414ca313
Jul 08 20:15:18 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T20:15:18.549+0100 [DEBUG] core: unseal key supplied: migrate=false
Jul 08 20:20:11 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T20:20:11.589+0100 [INFO]  storage.raft.snapshot: creating new snapshot: path=/opt/vault/raft/raft/snapshots/12-46328811-1625772011589.tmp
Jul 08 20:20:12 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T20:20:12.103+0100 [DEBUG] core.cluster-listener: performing server cert lookup
Jul 08 20:20:12 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T20:20:12.212+0100 [DEBUG] storage.raft.raft-net: accepted connection: local-address=vault-3:8201 remote-address=10.XXX.XXX.59:50060
Jul 08 20:20:36 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T20:20:36.976+0100 [TRACE] storage.raft.snapshot: snapshot write: writing keys: num_written=50000
Jul 08 20:21:01 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T20:21:01.045+0100 [TRACE] storage.raft.snapshot: snapshot write: writing keys: num_written=100000
...
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.083+0100 [TRACE] storage.raft.snapshot: snapshot write: writing keys: num_written=12633378
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.084+0100 [INFO]  storage.raft: copied to local snapshot: bytes=8966483775
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.085+0100 [INFO]  storage.raft.fsm: installing snapshot to FSM
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.092+0100 [INFO]  storage.raft.fsm: snapshot installed
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.092+0100 [DEBUG] storage.raft.fsm: time to open database: elapsed=155.782µs path=/opt/vault/raft/vault.db
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.096+0100 [INFO]  storage.raft: Installed remote snapshot
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.548+0100 [WARN]  core: cluster listener is already started
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.549+0100 [TRACE] storage.raft: setting up raft cluster
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.549+0100 [DEBUG] storage.raft: raft already started, not setting up cluster
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.549+0100 [INFO]  core: vault is unsealed
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.160+0100 [TRACE] core: found new active node information, refreshing
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.160+0100 [DEBUG] core: parsing information for new active node: active_cluster_addr=https://vault-1:8201 active_redirect_addr=https://vault-dev/
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.160+0100 [DEBUG] core: refreshing forwarding connection
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.160+0100 [DEBUG] core: clearing forwarding clients
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.160+0100 [DEBUG] core: done clearing forwarding clients
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.161+0100 [DEBUG] core: done refreshing forwarding connection
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.162+0100 [DEBUG] core.cluster-listener: creating rpc dialer: alpn=req_fw_sb-act_v1 host=fw-de7d2022-280f-3908-fdaa-7d0a2c071af9
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.263+0100 [DEBUG] core.cluster-listener: performing client cert lookup

Caveats:

  • it took about 45 minutes to resync
  • the unseal itself went fine, but this behaviour where the progress goes back to 0/3 in the UI and in the CLI is very awkward. I have a script that shows the status in a pretty format (a simplified sketch follows the output below), and the progress drops back to 0/3 and stays that way until the full sync has finished:
https://vault-1:8200/: version: 1.8.0-dev unsealed, raft: 52145174/52145049, leader: https://vault-1:8201
https://vault-2:8200/: version: 1.8.0-dev unsealed, raft: 52145193/52145187, leader: https://vault-1:8201
https://vault-3:8200/: version: 1.8.0-dev   sealed, progress: 0/3

It's unclear to me whether this 1/3 -> 2/3 -> 3/3 -> 0/3 -> copy... -> sudden-unseal behaviour is actually working as designed; I have not found any documentation that describes it. Normally the sync is fast enough that people do not really notice it.
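
A simplified sketch of that kind of status check (the node list and formatting are illustrative, and it omits the raft index / leader columns shown above):

#!/usr/bin/env bash
# Print a one-line status summary per node; the node list is a placeholder.
for addr in https://vault-1:8200 https://vault-2:8200 https://vault-3:8200; do
  VAULT_ADDR="$addr" vault status -format=json 2>/dev/null |
    jq -r --arg a "$addr" '
      if .sealed then
        "\($a): version: \(.version)   sealed, progress: \(.progress)/\(.t)"
      else
        "\($a): version: \(.version) unsealed"
      end'
done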

Regarding the rm of vault.db + raft.db, could this become a documentation point?

Many thanks again!

@ncabatoff ncabatoff added docs and removed bug Used to indicate a potential bug labels Jul 14, 2021
@ncabatoff (Collaborator)

> Regarding the rm of vault.db + raft.db, could this become a documentation point?

I agree we should write something up about defragmenting raft/boltdb.
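
Until that exists, one offline workaround for reclaiming space in a Bolt file (a sketch under the assumption that Vault is stopped, the data lives in /opt/vault/raft, and Vault runs as the vault user; not an official recommendation) is the bbolt CLI's compact command:

sudo systemctl stop vault
# Compact vault.db with the bbolt CLI (go.etcd.io/bbolt), then swap it into place.
sudo bbolt compact -o /opt/vault/raft/vault.db.compacted /opt/vault/raft/vault.db
sudo mv /opt/vault/raft/vault.db /opt/vault/raft/vault.db.old
sudo mv /opt/vault/raft/vault.db.compacted /opt/vault/raft/vault.db
sudo chown vault:vault /opt/vault/raft/vault.db
sudo systemctl start vault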
