
[Persistence] Don't persist ALL channel_monitors on every bitcoin block connection. #2647

Open
G8XSU opened this issue Oct 4, 2023 · 4 comments · May be fixed by #2966
@G8XSU
Contributor

G8XSU commented Oct 4, 2023

Currently, on every Bitcoin block update we persist all channel_monitors with the updated best_block.

This can be troublesome for large node operators with thousands of channels.

It also causes a thundering herd problem (ref), hammering the storage with many requests all at once.

@G8XSU
Contributor Author

G8XSU commented Oct 4, 2023

Assigning this to myself; will see if it is doable.

@G8XSU
Contributor Author

G8XSU commented Oct 4, 2023

Adding more detail:
Currently, on every Bitcoin block update we persist all channel_monitors with the updated best_block.

This can be troublesome for large node operators with thousands of channels.

It also causes a thundering herd problem (ref), hammering the storage with many requests all at once.

@TheBlueMatt TheBlueMatt added this to the 0.1.1 milestone Oct 15, 2023
@benthecarman
Contributor

Probably the easiest way to do this would be to add a config option and do the writes in batches.
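A minimal sketch of what batched writes could look like. The `persist_in_batches` function and the `(id, data)` shape are illustrative assumptions for this sketch, not LDK's actual persistence API:

```rust
// Hypothetical sketch of batching monitor writes instead of issuing one
// storage call per monitor all at once; names are illustrative, not LDK API.
fn persist_in_batches(monitors: &[(u32, Vec<u8>)], batch_size: usize) -> usize {
    let mut batches = 0;
    for chunk in monitors.chunks(batch_size) {
        // In a real node, each chunk would be handed to the storage backend
        // as one grouped write, or rate-limited between chunks.
        for (_id, _data) in chunk {
            // storage.persist_monitor(id, data) would go here.
        }
        batches += 1;
    }
    batches
}

fn main() {
    let monitors: Vec<(u32, Vec<u8>)> = (0..10).map(|i| (i, vec![0u8; 4])).collect();
    // 10 monitors in batches of 4 -> 3 storage round-trips instead of 10.
    assert_eq!(persist_in_batches(&monitors, 4), 3);
}
```

A `batch_size` exposed as a config option, as suggested above, would let operators trade write latency against storage request volume.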

@G8XSU
Contributor Author

G8XSU commented Dec 12, 2023

Approach:
We want to persist monitors at some cadence. The easiest thing would be to stop persisting on every block and instead persist on every 10th/50th block.

This cuts down IO by a factor of 10/50, but it doesn't solve the thundering herd problem: all monitors would still rush to be persisted after the same block.

So the idea is to introduce a somewhat random yet deterministic distribution scheme for monitor persists.
The partition key will be a function of (monitor, block_height).

This partitioning strategy alleviates the thundering herd issue and the hot-partition problem for monitor persists, and lets us spread the IO load roughly evenly.

For a node with 500 channels, this should cut IO from 250k monitor persist calls to ~5-6k persists in an 8-hour interval.

Note this also means that after a node restart, a monitor can be at most 50 blocks out of date, and we will need to sync it.
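The distribution scheme above could be sketched as follows. The `should_persist_on_block` function and the simple byte hash are illustrative assumptions for this sketch, not the actual implementation:

```rust
// Hypothetical sketch of the deterministic partitioning idea described above:
// each monitor is persisted once per `interval` blocks, at an offset derived
// from its own id, so persists spread across blocks instead of all landing on
// the same one. The hash and names here are illustrative only.
fn should_persist_on_block(monitor_id: &[u8], block_height: u32, interval: u32) -> bool {
    // Cheap deterministic hash of the monitor id (stand-in for a real hash).
    let hash = monitor_id
        .iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u32));
    hash.wrapping_add(block_height) % interval == 0
}

fn main() {
    let interval: u32 = 50;
    // Each monitor is persisted exactly once in any window of `interval`
    // blocks, and different ids generally land on different offsets.
    for id in [b"monitor_a".as_slice(), b"monitor_b".as_slice(), b"channel_42".as_slice()] {
        let persists = (0..interval)
            .filter(|h| should_persist_on_block(id, *h, interval))
            .count();
        assert_eq!(persists, 1);
    }
}
```

Because the offset depends only on the monitor id and the block height, the schedule is reproducible after a restart, which is what bounds how far out of date any monitor can be (at most `interval` blocks).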
