Faster multitree bench #238

MattHalpinParity · 2024-04-20T18:28:13Z

Much improved multitree stress test. It can now generate and submit over 1000 tree commits a second.
When in archive mode the database can keep up with this. When doing full reference counting etc it can't and commit processing is the bottleneck. In this situation it gets around 130 tree commits a second.

I fixed an issue where tree commits weren't added to the commit queue size. This caused huge numbers of commits to get queued up without the database keeping up. Now it throttles commits so it doesn't get left behind.

cheme · 2024-04-29T08:20:23Z

admin/src/multitree_bench/mod.rs

-	/// be empty and non direct children will be NodeSpec::UnresolvedPath.
-	fn generate_node(
+	fn generate_node_data(&self, rng: &mut rand::rngs::SmallRng, leaf: bool) -> Vec<u8> {
+		// Polkadot doesn't store actual values in branch nodes, only in leaf nodes. Hence the value


Guess it is a result of the content being well addressed in polkadot, not a rule in the absolute. Maybe:

Suggested change

// Polkadot doesn't store actual values in branch nodes, only in leaf nodes. Hence the value

// Polkadot avoid storing actual values in branch nodes, mainly in leaf nodes. Hence the value

Good point. I'll change the comment.

cheme · 2024-04-29T08:42:11Z

admin/src/multitree_bench/mod.rs

+			if let Some(existing_index) = existing_index {
+				let previous_child_node = &previous_node.unwrap().children[*existing_index];
+				let previous_child = previous_child_node.get_child_as_ref();
+				if *removals > 0 || *additions > 0 {


I am not too sure about the logic of these removals/additions counters.
The way it should work is : on changed node:

if removed (eg replacement but there can be removal with no new node in some trie related corner case) then only do Rc minus one, if rc is < 0 then do rc minus one on all children.

if new add new node.

if updated, do removed as in first line, and for all children do rc + 1.

The logic I want to assert here is that we got for every changed node this rc + 1 of every of its children and for every pruned node we do the rc - 1 of every of its children.

(seen some io issue running this on a binary trie, if we end up running in similar io issue there is an alternate design, but no use looking at it unless we see such issue)

Those values aren’t refcounts, they’re targets for how many values (basically leaf nodes) to add and remove.
This is in the part where it changes the tree for each new commit. It starts with a target number of additions and removals to make for the whole tree then distributes them through child nodes as best it can. Sometimes it adds/removes leaf nodes otherwise it passes the numbers down to the children for them to deal with. They also aren’t guaranteed to be able to make the changes but on average each commit should end up with an appropriate number of changes.
The actual node refcounting happens within the Db and it’ll deal with whether nodes need to be removed.

ok, I was wondering abot the number of NodeRef::Existing(NodeAddress) in the commit set. I feel like it may have been underestimated (not directly related with this pr though).

MattHalpinParity added 4 commits April 18, 2024 13:21

Fix for recording dereferences

6e0e4b3

Added get_node_children. Also increase min refcount bits.

c479217

Faster multitree bench

8d1bd9e

Add tree commits to commit queue size

e0f7878

MattHalpinParity marked this pull request as ready for review April 22, 2024 10:19

MattHalpinParity requested review from arkpar and cheme April 22, 2024 10:20

Better default parameters for multitree stress

a84d077

cheme reviewed Apr 29, 2024

View reviewed changes

cheme approved these changes Apr 30, 2024

View reviewed changes

Clarified comment

af5443f

MattHalpinParity merged commit 2b6820e into paritytech:master May 3, 2024
9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Faster multitree bench #238

Faster multitree bench #238

MattHalpinParity commented Apr 20, 2024 •

edited

cheme Apr 29, 2024

MattHalpinParity Apr 29, 2024

cheme Apr 29, 2024

MattHalpinParity Apr 29, 2024

cheme Apr 30, 2024

	// Polkadot doesn't store actual values in branch nodes, only in leaf nodes. Hence the value
	// Polkadot avoid storing actual values in branch nodes, mainly in leaf nodes. Hence the value

Faster multitree bench #238

Faster multitree bench #238

Conversation

MattHalpinParity commented Apr 20, 2024 • edited

cheme Apr 29, 2024

Choose a reason for hiding this comment

MattHalpinParity Apr 29, 2024

Choose a reason for hiding this comment

cheme Apr 29, 2024

Choose a reason for hiding this comment

MattHalpinParity Apr 29, 2024

Choose a reason for hiding this comment

cheme Apr 30, 2024

Choose a reason for hiding this comment

MattHalpinParity commented Apr 20, 2024 •

edited