Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pre-allocate batch buffer in IAVL #31

Closed
p0mvn opened this issue Feb 24, 2022 · 1 comment
Closed

Pre-allocate batch buffer in IAVL #31

p0mvn opened this issue Feb 24, 2022 · 1 comment
Assignees
Labels
enhancement New feature or request refined

Comments

@p0mvn
Copy link
Member

p0mvn commented Feb 24, 2022

Background

Ethereum recently discovered that growing the buffer on goleveldb is slow and implemented a fix:
ethereum/go-ethereum#24392

We need to implement and benchmark a similar fix

Acceptance Criteria

  • introduce the new constructor method in tm-db to pre-allocate buffer
  • use that method in IAVL's SaveVersion
  • add unit tests
  • benchmark and document the change
  • test on the node
@p0mvn p0mvn self-assigned this Feb 24, 2022
@p0mvn p0mvn added enhancement New feature or request refined labels Feb 24, 2022
@p0mvn
Copy link
Member Author

p0mvn commented Feb 27, 2022

I have a naive implementation of the pre-allocated batch buffer. It is naive because we do not account for the fact that for the height > 1, some leaf nodes may already be persisted.

I attempted to pre-calculate the batch size, unit tested, and bench-tested when only committing first height.

I don't see a good way to "estimate" a pre-allocation number without actually calculating it because we have various types of nodes - branch, leaf, fast, orphan. So 1 insert in IAVL does not directly translate to a certain number of bytes that are going to be allocated to batch since there might be rotations, some leaves are already persisted, etc...

Unfortunately, bench tests already show a decrease in performance. It seems that the performance decrease needed stemming from batch size pre-calculations offset the performance gain from the pre-allocation.

Draft PR: #29

Benchstat:

roman@ubuntu:~/projects/iavl (dev/iavl_data_locality)$ benchstat save_version_batch_not_with_size.log save_version_batch_with_size.log
name                         old time/op    new time/op    delta
MutableTree_SetWithBatch-16     1.30s ± 7%     1.34s ± 8%  +3.35%  (p=0.026 n=15+15)

name                         old alloc/op   new alloc/op   delta
MutableTree_SetWithBatch-16     311MB ± 0%     340MB ± 0%  +9.34%  (p=0.000 n=15+14)

name                         old allocs/op  new allocs/op  delta
MutableTree_SetWithBatch-16     3.91M ± 0%     4.01M ± 0%  +2.54%  (p=0.000 n=13+14)

I think with the current design, there are too many constraints one needs to work around to pre-calculate the batch size. As a result, we can't gain from the pre-allocations unless we significantly redesign IAVL.

In addition, making this change with the buffer requires changing the DBInterface in tm-db. There is a concern that other db implementations might not have the same constraint as goleveldb so they don't really need the NewBatchWithSize implementation.

I think that it is not worth further pursuing this feature due to all of the above constraints.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request refined
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

1 participant