Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

slow sync and big script_pubkeys table (sqlite) #801

Closed
nicbus opened this issue Nov 24, 2022 · 2 comments · Fixed by #806
Closed

slow sync and big script_pubkeys table (sqlite) #801

nicbus opened this issue Nov 24, 2022 · 2 comments · Fixed by #806
Assignees
Labels
bug Something isn't working

Comments

@nicbus
Copy link

nicbus commented Nov 24, 2022

Describe the bug
wallet sync gets slow and the script_pubkeys (sqlite) db table grows in size

To Reproduce
looks like the situation is reproducible from scratch in a regtest environment by creating a new wallet and performing a series of send operations. I was able to see a progressive slowdown by sending 20 times to the wallet itself, with 5 recipients each time

Expected behavior
wallet sync should complete in relatively short amounts of time and db size should stay relatively small

Build environment

  • BDK tag/commit: v0.24
  • OS+version: debian 11
  • Rust/Cargo version: 1.65
  • Rust/Cargo target: x86_64-unknown-linux-gnu

Additional context
a project using bdk became very slow after running for some time. the bdk sync time went up to ~5h (on a VPS with a fast Internet connection, remote electrum server also on a fast VPS) and the sqlite db file had grown to ~500MB. the service had operated for a few weeks and had performed a relatively small number of wallet operations, having sent ~130 transactions and currently having ~60 UTXOs

almost all the space in the db is taken by the scripts_pubkeys table, as deleting all records and vacuuming gives a ~200kb db file

checking the entries in the script_pubkeys db table gives:

  • record count per keychain:
    select keychain,count(*) from script_pubkeys group by keychain;
    "External"|2471200
    "Internal"|2460200
  • max child value by keychain:
    select * from script_pubkeys where keychain='"External"' order by child desc limit 1;
    "External"|2471102|
    select * from script_pubkeys where keychain='"Internal"' order by child desc limit 1;
    "Internal"|1235499|
  • unique value count per keychain:
    select keychain,count(*) from (select distinct keychain,child,script from script_pubkeys) group by keychain;
    "External"|1235800
    "Internal"|1235500
  • repeated (keychain,child) values count per keychain:
    select keychain,kc_count,count(kc_count) from (select keychain,child,count(child) as kc_count from script_pubkeys group by keychain,child) group by keychain,kc_count;
    "External"|1|617900
    "External"|2|308900
    "External"|3|154600
    "External"|4|76900
    "External"|5|38900
    "External"|6|19300
    "External"|7|7300
    "External"|8|7400
    "External"|9|3400
    "External"|10|600
    "External"|11|400
    "External"|12|100
    "External"|13|100
    "Internal"|1|617900
    "Internal"|2|309000
    "Internal"|3|154400
    "Internal"|4|77500
    "Internal"|5|38600
    "Internal"|6|19300
    "Internal"|7|12000
    "Internal"|8|4600
    "Internal"|9|1200
    "Internal"|10|600
    "Internal"|11|200
    "Internal"|12|100
    "Internal"|13|100

trying to reproduce the issue (20x send with 5 recipients) I got the following db situation (same queries as above):

  • record count per keychain:
    "External"|900
    "Internal"|900
  • max child value by keychain:
    "External"|499|
    "Internal"|499|
  • unique value count per keychain:
    "External"|500
    "Internal"|500
  • repeated (keychain,child) values count per keychain:
    "External"|1|200
    "External"|2|200
    "External"|3|100
    "Internal"|1|200
    "Internal"|2|200
    "Internal"|3|100
@nicbus nicbus added the bug Something isn't working label Nov 24, 2022
afilini added a commit to afilini/bdk that referenced this issue Nov 26, 2022
@afilini
Copy link
Member

afilini commented Nov 26, 2022

I think I found the issue, can you test with the linked PR and see if you can still reproduce?

The PR is still missing the code to clean up existing databases, so please start fresh and see if the slowdown is still there.

@nicbus
Copy link
Author

nicbus commented Nov 28, 2022

running the same test (20 sends with 5 recipients) on the PR branch yields a small constant slowdown, consistent with the first send operations, and no issue on the sqlite db. also tried sending more times with more recipients and there was no sign of the reported issue

looks fixed to me, thanks for the quick response

notmandatory added a commit that referenced this issue Nov 30, 2022
b5fcddc Add sqlite migration to drop duplicated script_pubkeys rows (Steve Myers)
21c96c9 Add test for issue #801 (Alekos Filini)
c51d544 [wip] Ensure there are no duplicated script_pubkeys in sqlite (Alekos Filini)

Pull request description:

  ### Description

  Add a `UNIQUE` constraint on the script_pubkeys table so that it doesn't grow constantly when caching new addresses.

  Fixes #801

  ### Notes to the reviewers

  Adding it to the 0.25 milestone since it's just a bugfix.

  Still in draft because I need to add extra migration queries to clean up existing dbs.

  ### Checklists

  #### All Submissions:

  * [x] I've signed all my commits
  * [x] I followed the [contribution guidelines](https://github.com/bitcoindevkit/bdk/blob/master/CONTRIBUTING.md)
  * [x] I ran `cargo fmt` and `cargo clippy` before committing

  #### Bugfixes:

  * [ ] This pull request breaks the existing API
  * [x] I've added tests to reproduce the issue which are now passing
  * [x] I'm linking the issue being fixed by this PR

ACKs for top commit:
  notmandatory:
    ACK b5fcddc

Tree-SHA512: 7b10e453bb38af5c4f80f77692a56e37259680e50f9c2c9e554a0e5f04fb9cab897da6476c6c9930f1c501b455472984a1c92c4f137cff49acdc390d2e705107
icota pushed a commit to icota/bdk that referenced this issue May 2, 2023
Add a `UNIQUE` constraint on the script_pubkeys table so that it doesn't
grow constantly when caching new addresses.

Fixes bitcoindevkit#801
icota pushed a commit to icota/bdk that referenced this issue May 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants