raft sync times out with default timers when boltdb is bigger than 5-6GB #11983

Open

write0nly opened this issue Jul 2, 2021 · 4 comments

@write0nly

1- create a brand new Vault cluster using the Raft integrated storage backend

2- enable the AppRole auth method, then create a role_id and secret_id

3- from a large set of clients, run infinite loops that do nothing but AppRole logins, under very harsh conditions, and leave them running for days (a sketch of such a loop follows these steps)

4- watch the database grow; the vault.db boltdb file goes above 10GB

5- stop Vault on one of the secondary nodes and move vault.db aside to vault.db.backup

6- restart and unseal the secondary

7- the expectation is that vault.db will be copied over, as it normally is with smaller databases

8- instead, the node cannot copy the database and crash-loops with errors such as:

...
Jun 25 18:49:13 vault-3 vault-1.8.0-dev[17280]: 2021-06-25T18:49:13.247+0100 [INFO]  core: security barrier not initialized
Jun 25 18:49:16 vault-3 vault-1.8.0-dev[17280]: 2021-06-25T18:49:16.401+0100 [WARN]  core: join attempt failed: error="error during raft bootstrap init call: context deadline exceeded"
Jun 25 18:49:16 vault-3 vault-1.8.0-dev[17280]: 2021-06-25T18:49:16.401+0100 [ERROR] core: failed to retry join raft cluster: retry=2s
...
Jun 25 18:49:18 vault-3 vault-1.8.0-dev[17280]: 2021-06-25T18:49:18.402+0100 [INFO]  core: security barrier not initialized
Jun 25 18:49:18 vault-3 vault-1.8.0-dev[17280]: 2021-06-25T18:49:18.402+0100 [INFO]  core: attempting to join possible raft leader node: leader_addr=https://vault-1:8200
Jun 25 18:49:18 vault-3 vault-1.8.0-dev[17280]: 2021-06-25T18:49:18.470+0100 [WARN]  core: join attempt failed: error="failed to send answer to raft leader node: error bootstrapping cluster: cluster already has state"
...

9- the secondary never comes up

10- stop vault and move vault.db.backup back to vault.db, restart and unseal vault

11- vault starts working properly

FYI this was tested on 1.8.0-dev from #11072 (comment)
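
For reference, a minimal sketch of the kind of login loop used in step 3 (the address, credential file paths, and loop shape are placeholders, not the exact harness used in this test):

#!/usr/bin/env bash
# Hammer the cluster with AppRole logins and nothing else.
# The address and credential file locations below are placeholders.
export VAULT_ADDR="https://vault-1:8200"
ROLE_ID="$(cat /tmp/role_id)"
SECRET_ID="$(cat /tmp/secret_id)"

while true; do
  vault write -format=json auth/approle/login \
    role_id="$ROLE_ID" secret_id="$SECRET_ID" > /dev/null
done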

@write0nly (Author)

testing with higher settings for the raft transport timeout
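
(For context: one documented knob for the integrated storage backend's Raft timing is performance_multiplier, which scales the election/heartbeat timers. Whether that covers the transport timeout being tested here is an assumption; on a 1.8.0-dev build the value may just as well have been changed in source. A config sketch:)

storage "raft" {
  path    = "/opt/vault/raft"
  node_id = "vault-3"

  # Documented knob that scales Raft timing parameters; the default multiplier
  # is 5, and larger values lengthen the timers. It is an assumption that this
  # helps with the timeouts seen during the large snapshot copy.
  performance_multiplier = 10
}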

@raskchanky raskchanky self-assigned this Jul 7, 2021
@raskchanky raskchanky added bug Used to indicate a potential bug core/raft labels Jul 7, 2021
@raskchanky (Contributor)

@write0nly In order to re-join the node, you should first clear out the entire raft data directory, so that both raft.db and vault.db are re-created when the node re-joins the cluster. Can you give that a try and see if it makes a difference?
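
For anyone following along, a rough sketch of that procedure, assuming a systemd unit named vault and the /opt/vault/raft data path that shows up in the logs further down (adjust both to your installation):

sudo systemctl stop vault
# Clear the whole raft data directory so both raft.db and vault.db get recreated
# when the node rejoins the cluster.
sudo rm -rf /opt/vault/raft/*
sudo systemctl start vault
vault operator unseal    # repeat until the unseal threshold is reached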

@write0nly (Author)

@raskchanky Thank you for the info. Indeed you are right: after doing the full rm, then restarting and unsealing, Vault did recover (with the caveats mentioned at the bottom):

Jul 08 20:15:18 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T20:15:18.433+0100 [DEBUG] core: cannot unseal, not enough keys: keys=2 threshold=3 nonce=0397f2fd-3901-65b6-cddf-4598414ca313
Jul 08 20:15:18 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T20:15:18.549+0100 [DEBUG] core: unseal key supplied: migrate=false
Jul 08 20:20:11 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T20:20:11.589+0100 [INFO]  storage.raft.snapshot: creating new snapshot: path=/opt/vault/raft/raft/snapshots/12-46328811-1625772011589.tmp
Jul 08 20:20:12 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T20:20:12.103+0100 [DEBUG] core.cluster-listener: performing server cert lookup
Jul 08 20:20:12 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T20:20:12.212+0100 [DEBUG] storage.raft.raft-net: accepted connection: local-address=vault-3:8201 remote-address=10.XXX.XXX.59:50060
Jul 08 20:20:36 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T20:20:36.976+0100 [TRACE] storage.raft.snapshot: snapshot write: writing keys: num_written=50000
Jul 08 20:21:01 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T20:21:01.045+0100 [TRACE] storage.raft.snapshot: snapshot write: writing keys: num_written=100000
...
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.083+0100 [TRACE] storage.raft.snapshot: snapshot write: writing keys: num_written=12633378
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.084+0100 [INFO]  storage.raft: copied to local snapshot: bytes=8966483775
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.085+0100 [INFO]  storage.raft.fsm: installing snapshot to FSM
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.092+0100 [INFO]  storage.raft.fsm: snapshot installed
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.092+0100 [DEBUG] storage.raft.fsm: time to open database: elapsed=155.782µs path=/opt/vault/raft/vault.db
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.096+0100 [INFO]  storage.raft: Installed remote snapshot
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.548+0100 [WARN]  core: cluster listener is already started
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.549+0100 [TRACE] storage.raft: setting up raft cluster
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.549+0100 [DEBUG] storage.raft: raft already started, not setting up cluster
Jul 08 21:07:15 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:15.549+0100 [INFO]  core: vault is unsealed
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.160+0100 [TRACE] core: found new active node information, refreshing
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.160+0100 [DEBUG] core: parsing information for new active node: active_cluster_addr=https://vault-1:8201 active_redirect_addr=https://vault-dev/
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.160+0100 [DEBUG] core: refreshing forwarding connection
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.160+0100 [DEBUG] core: clearing forwarding clients
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.160+0100 [DEBUG] core: done clearing forwarding clients
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.161+0100 [DEBUG] core: done refreshing forwarding connection
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.162+0100 [DEBUG] core.cluster-listener: creating rpc dialer: alpn=req_fw_sb-act_v1 host=fw-de7d2022-280f-3908-fdaa-7d0a2c071af9
Jul 08 21:07:16 vault-3 vault-1.8.0-dev[27066]: 2021-07-08T21:07:16.263+0100 [DEBUG] core.cluster-listener: performing client cert lookup

Caveats:

  • it took about 45 minutes to resync
  • the unseal itself went fine, but this behaviour where the progress goes back to 0/3 in the UI and in the CLI is very awkward. I have a script that shows the status in a pretty format (a simplified sketch follows the output below), and the progress drops back to 0/3 and stays that way until the full sync has finished:
https://vault-1:8200/: version: 1.8.0-dev unsealed, raft: 52145174/52145049, leader: https://vault-1:8201
https://vault-2:8200/: version: 1.8.0-dev unsealed, raft: 52145193/52145187, leader: https://vault-1:8201
https://vault-3:8200/: version: 1.8.0-dev   sealed, progress: 0/3

It's unclear to me whether this 1/3 -> 2/3 -> 3/3 -> 0/3 -> copy... -> sudden-unseal behaviour is actually working as designed; I have not found any documentation that describes it. Normally the sync is fast enough that people do not really notice it.
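
A simplified sketch of that kind of status check (the node list and formatting are illustrative, and it omits the raft index / leader columns shown above):

#!/usr/bin/env bash
# Print a one-line status summary per node; the node list is a placeholder.
for addr in https://vault-1:8200 https://vault-2:8200 https://vault-3:8200; do
  VAULT_ADDR="$addr" vault status -format=json 2>/dev/null |
    jq -r --arg a "$addr" '
      if .sealed then
        "\($a): version: \(.version)   sealed, progress: \(.progress)/\(.t)"
      else
        "\($a): version: \(.version) unsealed"
      end'
done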

Regarding the rm of vault.db + raft.db, could this become a documentation point?

Many thanks again!

@ncabatoff ncabatoff added docs and removed bug Used to indicate a potential bug labels Jul 14, 2021
@ncabatoff (Collaborator)

> Regarding the rm of vault.db + raft.db, could this become a documentation point?

I agree we should write something up about defragmenting raft/boltdb.
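
Until that exists, one offline workaround for reclaiming space in a Bolt file (a sketch under the assumption that Vault is stopped, the data lives in /opt/vault/raft, and Vault runs as the vault user; not an official recommendation) is the bbolt CLI's compact command:

sudo systemctl stop vault
# Compact vault.db with the bbolt CLI (go.etcd.io/bbolt), then swap it into place.
sudo bbolt compact -o /opt/vault/raft/vault.db.compacted /opt/vault/raft/vault.db
sudo mv /opt/vault/raft/vault.db /opt/vault/raft/vault.db.old
sudo mv /opt/vault/raft/vault.db.compacted /opt/vault/raft/vault.db
sudo chown vault:vault /opt/vault/raft/vault.db
sudo systemctl start vault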
