Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster multitree bench #238

Merged

Conversation

MattHalpinParity
Copy link
Contributor

@MattHalpinParity MattHalpinParity commented Apr 20, 2024

Much improved multitree stress test. It can now generate and submit over 1000 tree commits a second.
When in archive mode the database can keep up with this. When doing full reference counting etc it can't and commit processing is the bottleneck. In this situation it gets around 130 tree commits a second.

I fixed an issue where tree commits weren't added to the commit queue size. This caused huge numbers of commits to get queued up without the database keeping up. Now it throttles commits so it doesn't get left behind.

@MattHalpinParity MattHalpinParity marked this pull request as ready for review April 22, 2024 10:19
/// be empty and non direct children will be NodeSpec::UnresolvedPath.
fn generate_node(
fn generate_node_data(&self, rng: &mut rand::rngs::SmallRng, leaf: bool) -> Vec<u8> {
// Polkadot doesn't store actual values in branch nodes, only in leaf nodes. Hence the value
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess it is a result of the content being well addressed in polkadot, not a rule in the absolute. Maybe:

Suggested change
// Polkadot doesn't store actual values in branch nodes, only in leaf nodes. Hence the value
// Polkadot avoid storing actual values in branch nodes, mainly in leaf nodes. Hence the value

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll change the comment.

if let Some(existing_index) = existing_index {
let previous_child_node = &previous_node.unwrap().children[*existing_index];
let previous_child = previous_child_node.get_child_as_ref();
if *removals > 0 || *additions > 0 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not too sure about the logic of these removals/additions counters.
The way it should work is : on changed node:

  • if removed (eg replacement but there can be removal with no new node in some trie related corner case) then only do Rc minus one, if rc is < 0 then do rc minus one on all children.
  • if new add new node.
  • if updated, do removed as in first line, and for all children do rc + 1.

The logic I want to assert here is that we got for every changed node this rc + 1 of every of its children and for every pruned node we do the rc - 1 of every of its children.

(seen some io issue running this on a binary trie, if we end up running in similar io issue there is an alternate design, but no use looking at it unless we see such issue)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those values aren’t refcounts, they’re targets for how many values (basically leaf nodes) to add and remove.
This is in the part where it changes the tree for each new commit. It starts with a target number of additions and removals to make for the whole tree then distributes them through child nodes as best it can. Sometimes it adds/removes leaf nodes otherwise it passes the numbers down to the children for them to deal with. They also aren’t guaranteed to be able to make the changes but on average each commit should end up with an appropriate number of changes.
The actual node refcounting happens within the Db and it’ll deal with whether nodes need to be removed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I was wondering abot the number of NodeRef::Existing(NodeAddress) in the commit set. I feel like it may have been underestimated (not directly related with this pr though).

@MattHalpinParity MattHalpinParity merged commit 2b6820e into paritytech:master May 3, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants