Router Optimizations #1799

TheBlueMatt · 2022-10-25T04:21:51Z

After discussion in #1722 we realized the A* stuff these days is entirely useless (thanks ZFR!), so it's best to remove it. While we're at it, this also swaps our stupid BTree lookups for a HashMap, but keeps a sorted keys list for outbound gossip sync.

This gets us halfway to #1473, with a TODO to investigate swapping the BTreeSet for a sorted vec, which I have a strong feeling will be faster (and way more space-efficient!).

~~This is totally up-for-grabs - it needs documentation, real commit messages, benchmarks, etc. If no one else does it I'll pick it up eventually but this should be a nice improvement.~~

Supersedes #1722.

TheBlueMatt · 2022-10-25T04:45:07Z

Oh, would also be cool to add a fuzzer that demonstrates equivalence between the custom map and a btreemap.
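A minimal sketch of what such an equivalence check could look like. `SketchMap` is a hypothetical stand-in for the custom map (a `HashMap` plus a `BTreeSet` of keys), and the byte-to-operation encoding in `do_test` is invented for illustration; neither is the code from this PR.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

/// Hypothetical stand-in for the custom map: values in a HashMap, keys mirrored in a BTreeSet.
struct SketchMap {
    map: HashMap<u8, u8>,
    keys: BTreeSet<u8>,
}

impl SketchMap {
    fn new() -> Self {
        SketchMap { map: HashMap::new(), keys: BTreeSet::new() }
    }
    fn insert(&mut self, k: u8, v: u8) -> Option<u8> {
        self.keys.insert(k);
        self.map.insert(k, v)
    }
    fn remove(&mut self, k: &u8) -> Option<u8> {
        self.keys.remove(k);
        self.map.remove(k)
    }
    fn get(&self, k: &u8) -> Option<&u8> {
        self.map.get(k)
    }
    /// Collects entries in key order by walking the sorted key set.
    fn ordered(&self) -> Vec<(u8, u8)> {
        self.keys.iter().map(|k| (*k, self.map[k])).collect()
    }
}

/// Asserts that both maps hold the same entries, in the same order.
fn check_eq(btree: &BTreeMap<u8, u8>, sketch: &SketchMap) {
    assert_eq!(btree.len(), sketch.map.len());
    assert_eq!(btree.len(), sketch.keys.len());
    let expected: Vec<(u8, u8)> = btree.iter().map(|(k, v)| (*k, *v)).collect();
    assert_eq!(expected, sketch.ordered());
    for k in 0..=u8::MAX {
        assert_eq!(btree.get(&k), sketch.get(&k));
    }
}

/// Interprets the input as (op, key, value) triples and applies each one to both maps.
fn do_test(data: &[u8]) {
    let mut btree = BTreeMap::new();
    let mut sketch = SketchMap::new();
    for chunk in data.chunks_exact(3) {
        let (op, k, v) = (chunk[0], chunk[1], chunk[2]);
        if op % 2 == 0 {
            assert_eq!(btree.insert(k, v), sketch.insert(k, v));
        } else {
            assert_eq!(btree.remove(&k), sketch.remove(&k));
        }
        check_eq(&btree, &sketch);
    }
}
```

A fuzz target would feed arbitrary bytes to `do_test`; any divergence between the two maps shows up as a failed assertion.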

The previous copy was more than one and a half years old, and the lightning network has changed a lot since!

As of this commit, performance on my Xeon W-10885M with a SK hynix Gold P31 storing a BTRFS volume is as follows:

```
test ln::channelmanager::bench::bench_sends ... bench: 5,896,492 ns/iter (+/- 512,421)
test routing::gossip::benches::read_network_graph ... bench: 1,645,740,604 ns/iter (+/- 47,611,514)
test routing::gossip::benches::write_network_graph ... bench: 234,870,775 ns/iter (+/- 8,301,775)
test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench: 166,155,032 ns/iter (+/- 30,206,162)
test routing::router::benches::generate_mpp_routes_with_zero_penalty_scorer ... bench: 136,843,661 ns/iter (+/- 67,111,218)
test routing::router::benches::generate_routes_with_probabilistic_scorer ... bench: 52,954,598 ns/iter (+/- 11,360,547)
test routing::router::benches::generate_routes_with_zero_penalty_scorer ... bench: 37,598,126 ns/iter (+/- 17,262,519)
test bench::bench_sends ... bench: 37,760,922 ns/iter (+/- 5,179,123)
test bench::bench_reading_full_graph_from_file ... bench: 25,615 ns/iter (+/- 1,149)
```

Historically we've had various bugs in keeping the `lowest_inbound_channel_fees` field in `NodeInfo` up-to-date as we go. This leaves the A* routing less efficient as it can't prune hops as aggressively.

In order to get accurate benchmarks, this commit updates the minimum-inbound-fees field on load. This is not the most efficient way of doing so, but suffices for fetching benchmarks and will be removed in the coming commits.

Note that this is *slower* than the non-updating version in the previous commit. While I haven't dug into this incredibly deeply, the graph snapshot in use has min-fee info for only 9,618 of 20,818 nodes. Thus, it is my guess that with the graph snapshot as-is the branch predictor is able to largely remove the A* heuristic lookups, but with this change it is forced to wait for A* heuristic map lookups to complete, causing a performance regression.

```
test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench: 182,980,059 ns/iter (+/- 32,662,047)
test routing::router::benches::generate_mpp_routes_with_zero_penalty_scorer ... bench: 151,170,457 ns/iter (+/- 75,351,011)
test routing::router::benches::generate_routes_with_probabilistic_scorer ... bench: 58,187,277 ns/iter (+/- 11,606,440)
test routing::router::benches::generate_routes_with_zero_penalty_scorer ... bench: 41,210,193 ns/iter (+/- 18,103,320)
```

codecov-commenter · 2023-01-19T21:32:33Z

Codecov Report

Base: 90.71% // Head: 90.77% // Increases project coverage by +0.06% 🎉

Coverage data is based on head (bde841e) compared to base (153b048).
Patch coverage: 89.43% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1799      +/-   ##
==========================================
+ Coverage   90.71%   90.77%   +0.06%     
==========================================
  Files          97       99       +2     
  Lines       50677    51701    +1024     
  Branches    50677    51701    +1024     
==========================================
+ Hits        45971    46933     +962     
- Misses       4706     4768      +62

Impacted Files	Coverage Δ
lightning/src/routing/router.rs	`91.15% <52.38%> (+0.23%)`	⬆️
lightning/src/util/indexed_map.rs	`96.29% <96.29%> (ø)`
lightning/src/routing/gossip.rs	`92.05% <100.00%> (-0.11%)`	⬇️
lightning/src/ln/inbound_payment.rs	`92.00% <0.00%> (-1.50%)`	⬇️
lightning/src/chain/onchaintx.rs	`94.56% <0.00%> (-0.84%)`	⬇️
lightning/src/ln/functional_tests.rs	`96.69% <0.00%> (-0.44%)`	⬇️
lightning/src/util/ser.rs	`91.41% <0.00%> (-0.30%)`	⬇️
lightning/src/util/ser_macros.rs	`86.73% <0.00%> (-0.30%)`	⬇️
lightning-invoice/src/utils.rs	`97.62% <0.00%> (-0.15%)`	⬇️
lightning-invoice/src/lib.rs	`87.37% <0.00%> (-0.11%)`	⬇️
... and 15 more

TheBlueMatt · 2023-01-19T21:44:00Z

Cleaned up the commit messages, added benchmarks that demonstrate the performance advantages, and added a fuzzer that should catch any issues in the new map implementation.

arik-so · 2023-01-20T02:28:00Z

lightning/src/util/indexed_map.rs

+	}
+
+	/// Returns an iterator which iterates over the `key`/`value` pairs in a random order.
+	pub fn unordered_iter(&self) -> impl Iterator<Item = (&K, &V)> {


what's the benefit of a random order? I thought the entire point of this data structure was that the iteration order be deterministic? Should the doc comment be updated to reflect that this actually returns a deterministically sorted list?

If you don't need the ordering it's much more efficient. Most uses don't care about the order.

But won't it technically always return an ordered version? Considering that, at least in this commit, the underlying structure is a BTreeMap?

Not in the next commit :)

And that's… a good thing?

I'm not sure I get your question? The first commit changes callsites to make explicit whether they're relying on getting things in-order or not, the second commit actually changes the backing datastructure. Just because something happens to be in-order doesn't mean a caller can rely on it if the API contract clearly indicates they can't.

arik-so · 2023-01-20T02:35:29Z

lightning/src/util/indexed_map.rs

@@ -18,15 +20,18 @@ use core::ops::RangeBounds;
 /// actually backed by a `HashMap`, with some additional tracking to ensure we can iterate over
 /// keys in the order defined by [`Ord`].
 #[derive(Clone, PartialEq, Eq)]
-pub struct IndexedMap<K: Ord, V> {
-	map: BTreeMap<K, V>,
+pub struct IndexedMap<K: Hash + Ord, V> {


should this be a separate commit?

I think it might make more sense to introduce IndexedMap as the desired type from the beginning.

It breaks it up a bit to be a tiny bit easier to review? Makes the commit that adds the new data structure implementation a freestanding commit.

arik-so · 2023-01-20T02:41:02Z

fuzz/src/indexedmap.rs

+use crate::utils::test_logger;
+
+// Note that while we take the trees by &mut here
+fn check_eq(btree: &BTreeMap<u8, u8>, indexed: &IndexedMap<u8, u8>) {


this looks super useful. You may wanna add that to an IndexedMap test_util perhaps?

There is no IndexedMap test_util? Are you suggesting/requesting additional tests?

wpaulino

Nice! Saw similar improvements on my hardware.

lightning/src/util/indexed_map.rs

fuzz/src/indexedmap.rs

lightning/src/util/indexed_map.rs

lightning/src/routing/gossip.rs

lightning/src/util/indexed_map.rs

wpaulino

LGTM, feel free to squash

fuzz/src/bin/msg_channel_details_target.rs

fuzz/src/bin/indexedmap_target.rs

As evidenced by the previous commit, it appears our A* router does worse than a more naive approach. This isn't super surprising, as the A* heuristic calculation requires a map lookup, which is relatively expensive.

```
test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench: 169,991,943 ns/iter (+/- 30,838,048)
test routing::router::benches::generate_mpp_routes_with_zero_penalty_scorer ... bench: 122,144,987 ns/iter (+/- 61,708,911)
test routing::router::benches::generate_routes_with_probabilistic_scorer ... bench: 48,546,068 ns/iter (+/- 10,379,642)
test routing::router::benches::generate_routes_with_zero_penalty_scorer ... bench: 32,898,557 ns/iter (+/- 14,157,641)
```

tnull

Basically LGTM.

Some questions/suggestions, feel free to squash if you decide not to tackle them.

lightning/src/util/indexed_map.rs

lightning/src/routing/router.rs

TheBlueMatt · 2023-01-25T17:44:24Z

Squashed, updated the docs trivially, and added a commit to clean up a few more things in the router:

$ git diff-tree -U3 158a3f1 2173280f
diff --git a/lightning/src/routing/router.rs b/lightning/src/routing/router.rs
index eb6eede0e..8a18c44ba 100644
--- a/lightning/src/routing/router.rs
+++ b/lightning/src/routing/router.rs
@@ -885,18 +885,11 @@ impl<'a> PaymentPath<'a> {
 	}
 }
 
+#[inline(always)]
+/// Calculate the fees required to route the given amount over a channel with the given fees.
 fn compute_fees(amount_msat: u64, channel_fees: RoutingFees) -> Option<u64> {
-	let proportional_fee_millions =
-		amount_msat.checked_mul(channel_fees.proportional_millionths as u64);
-	if let Some(new_fee) = proportional_fee_millions.and_then(|part| {
-			(channel_fees.base_msat as u64).checked_add(part / 1_000_000) }) {
-
-		Some(new_fee)
-	} else {
-		// This function may be (indirectly) called without any verification,
-		// with channel_fees provided by a caller. We should handle it gracefully.
-		None
-	}
+	amount_msat.checked_mul(channel_fees.proportional_millionths as u64)
+		.and_then(|part| (channel_fees.base_msat as u64).checked_add(part / 1_000_000))
 }
 
 /// The default `features` we assume for a node in a route, when no `features` are known about that
@@ -1289,7 +1282,7 @@ where L::Target: Logger {
 							if !should_process { should_process = true; }
 						}
 
-						if should_process {
+						'processing_node: while should_process {
 							let mut hop_use_fee_msat = 0;
 							let mut total_fee_msat = $next_hops_fee_msat;
 
@@ -1299,7 +1292,7 @@ where L::Target: Logger {
 								match compute_fees(amount_to_transfer_over_msat, $candidate.fees()) {
 									// max_value means we'll always fail
 									// the old_entry.total_fee_msat > total_fee_msat check
-									None => total_fee_msat = u64::max_value(),
+									None => break 'processing_node,
 									Some(fee_msat) => {
 										hop_use_fee_msat = fee_msat;
 										total_fee_msat += hop_use_fee_msat;
@@ -1392,6 +1385,7 @@ where L::Target: Logger {
 									);
 								}
 							}
+							break 'processing_node;
 						}
 					}
 				}
diff --git a/lightning/src/util/indexed_map.rs b/lightning/src/util/indexed_map.rs
index 12c9c9dcd..cccbfe7bc 100644
--- a/lightning/src/util/indexed_map.rs
+++ b/lightning/src/util/indexed_map.rs
@@ -8,7 +8,7 @@ use core::ops::RangeBounds;
 
 /// A map which can be iterated in a deterministic order.
 ///
-/// This would traditionally be accomplished by simply using a `BTreeMap`, however B-Trees
+/// This would traditionally be accomplished by simply using a [`BTreeMap`], however B-Trees
 /// generally have very slow lookups. Because we use a nodes+channels map while finding routes
 /// across the network graph, our network graph backing map must be as performant as possible.
 /// However, because peers expect to sync the network graph from us (and we need to support that
@@ -16,9 +16,11 @@ use core::ops::RangeBounds;
 /// into our outbound message queue), we need an iterable map with a consistent iteration order we
 /// can jump to a starting point on.
 ///
-/// Thus, we have a custom data structure here - its API mimics that of Rust's `BTreeMap`, but is
+/// Thus, we have a custom data structure here - its API mimics that of Rust's [`BTreeMap`], but is
 /// actually backed by a [`HashMap`], with some additional tracking to ensure we can iterate over
 /// keys in the order defined by [`Ord`].
+///
+/// [`BTreeMap`]: alloc::collections::BTreeMap
 #[derive(Clone, Debug, PartialEq, Eq)]
 pub struct IndexedMap<K: Hash + Ord, V> {
 	map: HashMap<K, V>,
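
The `'processing_node` change in the `router.rs` hunk above turns an `if` into a labeled `while` that runs at most once, so that a fee overflow can abandon the hop early instead of carrying `u64::max_value()` forward. A minimal sketch of that control-flow pattern, with hypothetical helper names (`fee_for`, `process_hop`) that are not part of the router:

```rust
/// Hypothetical fee helper: returns None if the multiplication overflows.
fn fee_for(amount_msat: u64, proportional_millionths: u32) -> Option<u64> {
    amount_msat.checked_mul(proportional_millionths as u64).map(|v| v / 1_000_000)
}

/// Returns the amount plus fee for a hop, or None if the hop is skipped.
fn process_hop(should_process: bool, amount_msat: u64, proportional_millionths: u32) -> Option<u64> {
    let mut result = None;
    // The `while` body runs at most once: every path through it ends in `break`.
    'processing_node: while should_process {
        let fee_msat = match fee_for(amount_msat, proportional_millionths) {
            // On overflow, abandon this hop rather than propagating a sentinel value.
            None => break 'processing_node,
            Some(fee) => fee,
        };
        result = amount_msat.checked_add(fee_msat);
        break 'processing_node;
    }
    result
}
```

The trailing `break` is what keeps the `while` from looping; forgetting it would spin forever whenever `should_process` is true.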

lightning/src/routing/router.rs

TheBlueMatt · 2023-01-25T18:03:01Z

Rewrote the last commit to do more like what @tnull suggested, taking advantage of the CMOVs that saturating_* compile down to rather than explicit branching.

tnull

LGTM from my side.

lightning/src/routing/router.rs

Our network graph has to be iterable in a deterministic order and with the ability to iterate over a specific range. Thus, historically, we've used a `BTreeMap` to do the iteration. This is fine, except our map needs to also provide high performance lookups in order to make route-finding fast. Sadly, `BTreeMap`s are quite slow due to the branching penalty. Here we replace the `BTreeMap`s in the scorer with a dummy wrapper. In the next commit the internals thereof will be replaced with a `HashMap`-based implementation.

Our network graph has to be iterable in a deterministic order and with the ability to iterate over a specific range. Thus, historically, we've used a `BTreeMap` to do the iteration. This is fine, except our map needs to also provide high performance lookups in order to make route-finding fast. Sadly, `BTreeMap`s are quite slow due to the branching penalty.

Here we replace the implementation of our `IndexedMap` with a `HashMap` to store the elements itself and a `BTreeSet` to store the keys set in sorted order for iteration.

As of this commit on the same hardware as the above few commits, the benchmark results are:

```
test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench: 109,544,993 ns/iter (+/- 27,553,574)
test routing::router::benches::generate_mpp_routes_with_zero_penalty_scorer ... bench: 81,164,590 ns/iter (+/- 55,422,930)
test routing::router::benches::generate_routes_with_probabilistic_scorer ... bench: 34,726,569 ns/iter (+/- 9,646,345)
test routing::router::benches::generate_routes_with_zero_penalty_scorer ... bench: 22,772,355 ns/iter (+/- 9,574,418)
```
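
The layout described in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the merged `IndexedMap`: the name `SketchIndexedMap`, the `Clone` bound on keys, and the exact method set are chosen here for brevity.

```rust
use std::collections::{BTreeSet, HashMap};
use std::hash::Hash;
use std::ops::RangeBounds;

/// Sketch of a map with hash lookups and deterministic, range-capable iteration.
pub struct SketchIndexedMap<K: Hash + Ord + Clone, V> {
    map: HashMap<K, V>,
    keys: BTreeSet<K>,
}

impl<K: Hash + Ord + Clone, V> SketchIndexedMap<K, V> {
    pub fn new() -> Self {
        Self { map: HashMap::new(), keys: BTreeSet::new() }
    }

    /// Hot path for route-finding: a plain hash lookup, no tree traversal.
    pub fn get(&self, key: &K) -> Option<&V> {
        self.map.get(key)
    }

    /// Inserts into the HashMap and records the key in the sorted set if it is new.
    pub fn insert(&mut self, key: K, value: V) -> Option<V> {
        let old = self.map.insert(key.clone(), value);
        if old.is_none() {
            self.keys.insert(key);
        }
        old
    }

    /// Removes from both structures so they stay in sync.
    pub fn remove(&mut self, key: &K) -> Option<V> {
        let old = self.map.remove(key);
        if old.is_some() {
            self.keys.remove(key);
        }
        old
    }

    /// Iterates in arbitrary order; callers must not rely on any ordering.
    pub fn unordered_iter(&self) -> impl Iterator<Item = (&K, &V)> {
        self.map.iter()
    }

    /// Iterates over a key range in `Ord` order by walking the sorted key set.
    pub fn range<R: RangeBounds<K>>(&self, range: R) -> impl Iterator<Item = (&K, &V)> {
        self.keys.range(range).map(move |k| (k, self.map.get(k).expect("keys and map are in sync")))
    }
}
```

The trade-off is that ordered iteration pays for a second hash lookup per element and every key is stored twice, while the lookups that dominate route-finding avoid the B-tree entirely.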

Often when we call `compute_fees` we really just want it to saturate and we deal with `u64::max_value` later. In that case, we're much better off doing the saturating in the `compute_fees` as it can use CMOVs rather than branching at each step and then `unwrap_or`ing at the callsite.
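
For illustration, the two styles contrasted here might look like the sketch below. `Fees` is a hypothetical stand-in for the router's fee type and `compute_fees_saturating` is a name assumed for this example; whether the saturating form actually compiles to CMOVs depends on the compiler and target.

```rust
/// Hypothetical stand-in for the channel fee parameters used by the router.
#[derive(Clone, Copy)]
pub struct Fees {
    pub base_msat: u32,
    pub proportional_millionths: u32,
}

/// Checked variant: overflow yields `None`, which the caller then has to branch on.
pub fn compute_fees(amount_msat: u64, fees: Fees) -> Option<u64> {
    amount_msat.checked_mul(fees.proportional_millionths as u64)
        .and_then(|part| (fees.base_msat as u64).checked_add(part / 1_000_000))
}

/// Saturating variant: overflow collapses to `u64::MAX` inside the function itself.
pub fn compute_fees_saturating(amount_msat: u64, fees: Fees) -> u64 {
    amount_msat.checked_mul(fees.proportional_millionths as u64)
        .map(|part| part / 1_000_000)
        .unwrap_or(u64::MAX)
        .saturating_add(fees.base_msat as u64)
}
```

For any input, the saturating variant returns the same value as `compute_fees(..).unwrap_or(u64::MAX)`, so callers that only compare against a maximum can switch without changing behavior.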

TheBlueMatt · 2023-01-25T18:59:11Z

Fixed the doc comment in an intermediary commit without changing the full diff.

TheBlueMatt added 2 commits January 19, 2023 05:06

TheBlueMatt force-pushed the 2022-10-heap-nerdsnipe branch 2 times, most recently from 87bc732 to e572ae7 Compare January 19, 2023 21:31

TheBlueMatt force-pushed the 2022-10-heap-nerdsnipe branch 2 times, most recently from 8682115 to 20bedad Compare January 19, 2023 21:41

TheBlueMatt marked this pull request as ready for review January 19, 2023 21:44

TheBlueMatt force-pushed the 2022-10-heap-nerdsnipe branch 3 times, most recently from c1efa29 to 3f91255 Compare January 19, 2023 21:51

tnull self-requested a review January 19, 2023 22:03

TheBlueMatt force-pushed the 2022-10-heap-nerdsnipe branch from 3f91255 to f77ad1b Compare January 19, 2023 22:20

arik-so reviewed Jan 20, 2023

wpaulino reviewed Jan 20, 2023

tnull reviewed Jan 21, 2023

TheBlueMatt force-pushed the 2022-10-heap-nerdsnipe branch from f77ad1b to 158a3f1 Compare January 24, 2023 04:34

wpaulino reviewed Jan 25, 2023

tnull reviewed Jan 25, 2023

TheBlueMatt force-pushed the 2022-10-heap-nerdsnipe branch 2 times, most recently from 592d3ee to 78ac11e Compare January 25, 2023 17:23

tnull reviewed Jan 25, 2023

TheBlueMatt force-pushed the 2022-10-heap-nerdsnipe branch from 2173280 to af8510f Compare January 25, 2023 18:02

tnull previously approved these changes Jan 25, 2023

TheBlueMatt dismissed tnull’s stale review via a1e465f January 25, 2023 18:09

TheBlueMatt force-pushed the 2022-10-heap-nerdsnipe branch 2 times, most recently from a1e465f to dca4b77 Compare January 25, 2023 18:10

TheBlueMatt added 4 commits January 25, 2023 18:58

Add a fuzzer to check that IndexedMap is equivalent to BTreeMap

e64b5d9

TheBlueMatt force-pushed the 2022-10-heap-nerdsnipe branch from dca4b77 to bde841e Compare January 25, 2023 18:58

tnull approved these changes Jan 25, 2023

wpaulino approved these changes Jan 25, 2023

TheBlueMatt merged commit ca5b108 into lightningdevkit:main Jan 25, 2023

This was referenced Jan 26, 2023

Explore replacing BTreeSet in IndexedMap with a sorted Vec #1473

Closed

Explore replacing BTreeSet in IndexedMap with a sorted Vec #1992

Closed
