Always pick the best paths, rather than (poorly) attempting to randomize #1610

TheBlueMatt · 2022-07-12T21:44:51Z

Based on #1605, this removes the half-assed attempt at randomization in the router, instead simply picking the paths which have the lowest score-per-value-transferred. This ends up being basically the same thing anyway, but more direct and simpler. Further, it fixes some edge-case issues around not selecting enough balance and, notably, spuriously failing to route if we ended up needing >= 50 paths to route (which is now a configurable number, not that users should be using 50 paths, really).

This then switches us to a saturation limit of 1/4, which I think is much cleaner, but the router cleanups are needed to make tests pass with that.

Tagging 0.0.110 because I want the 1/4 switch in it.

codecov-commenter · 2022-07-12T21:55:02Z

Codecov Report

Merging #1610 (2b57097) into main (f75b6cb) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head 2b57097 differs from pull request most recent head ff8d3f7. Consider uploading reports for the commit ff8d3f7 to get more accurate results

@@           Coverage Diff            @@
##             main    #1610    +/-   ##
========================================
  Coverage   90.84%   90.84%            
========================================
  Files          80       80            
  Lines       44675    44841   +166     
  Branches    44675    44841   +166     
========================================
+ Hits        40583    40735   +152     
- Misses       4092     4106    +14

Impacted Files	Coverage Δ
lightning/src/ln/functional_tests.rs	`97.11% <ø> (+0.17%)`	⬆️
lightning/src/routing/router.rs	`92.44% <100.00%> (-0.09%)`	⬇️
lightning/src/ln/functional_test_utils.rs	`93.51% <0.00%> (-1.74%)`	⬇️
lightning-net-tokio/src/lib.rs	`76.85% <0.00%> (-0.31%)`	⬇️
lightning/src/util/events.rs	`39.54% <0.00%> (+0.28%)`	⬆️
lightning-background-processor/src/lib.rs	`95.81% <0.00%> (+0.61%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f75b6cb...ff8d3f7. Read the comment docs.

tnull

Hum, I agree that the path randomization was not that great to begin with (especially since we keep sorting the candidates before/after). However, as this reverts what little indeterminism was introduced with #1359, I wonder what generally the path forward would be? Do we for example want to introduce a 'privacy budget' that can be spent either on overpaying fees or on choosing non-optimal routes?

lightning/src/routing/router.rs

tnull · 2022-07-13T08:52:16Z

lightning/src/routing/router.rs

-		let mut cur_route = Vec::<PaymentPath>::new();
-		let mut aggregate_route_value_msat = 0;
+	// First, sort by the cost-per-value of the path, selecting only the paths which
+	// contribute the most per cost.


"cost-per-value" vs. "contribute the most per cost": Which one is it / should it be? cost-per-value or value-per-cost? I'd argue probably the latter, since we still want to drop paths that contribute less to improve success?

I'm confused - they're the same thing, at least as far as ordering goes? I tried to reword it some.

Well, mainly I wanted to point out the discrepancy between the comment and code. But yeah, they can of course be used somewhat analogously. That said, I assume you introduced the conversion to u128 and the shift by 64 to compare them without the need for floats? Since value >> cost, could you get around that by simply sorting by value/cost in ascending order, rather than cost/value in descending order?

Also, we're now only selecting by cost efficiency, but not by overall amount as before. Why is this better than preferring a smaller number of larger value paths, which potentially reduced the failure probability?

Well, mainly I wanted to point out the discrepancy between the comment and code. But yeah, they can of course be used somewhat analogously.

Right, I did update the comment.

That said, I assume you introduced the conversion to u128 and the shift by 64 to compare them without the need for floats? Since value >> cost, could you get around that by simply sorting by value/cost in ascending order, rather than cost/value in descending order?

Hmm, that's a good point. For the values our scorer uses that should work alright. That said, #1600 proposes using score values of a full bitcoin, so then we'd be liable to hit issues again. To avoid any scorer-specific logic in the router, just shifting up into a u128 seemed the simplest, and while it's gonna be really slow on most systems, it shouldn't matter much here: it's at the end of the router and only across a handful of paths.
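To illustrate the fixed-point comparison discussed here, a minimal sketch follows. The helper name `cost_per_value_fixed` is hypothetical and not taken from the router; it only shows why shifting the cost up by 64 bits before dividing keeps the fractional part of cost/value that plain integer division would discard, so no floats are needed.

```rust
/// Hypothetical helper (not the router's code): cost per unit of value,
/// expressed as a 64.64 fixed-point number stored in a u128.
///
/// Both inputs are u64, so `cost_msat << 64` always fits in a u128.
fn cost_per_value_fixed(cost_msat: u64, value_msat: u64) -> u128 {
    // Assumption for this sketch: a path carrying no value is treated as
    // infinitely expensive rather than dividing by zero.
    if value_msat == 0 {
        return u128::MAX;
    }
    ((cost_msat as u128) << 64) / (value_msat as u128)
}
```

With plain `u64` division, `1 / 3` and `1 / 4` both truncate to 0 and the two paths would tie; in the shifted form the first compares as more expensive. The same holds when the cost is larger than the value, which is the case raised above for very large score values.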

Also, we're now only selecting by cost efficiency, but not by overall amount as before. Why is this better than preferring a smaller number of larger value paths, which potentially reduced the failure probability?

Hmm, yea, that's a valid point - we may prefer to send a few large parts rather than a number of small parts. Still, that seems like something that should be up to the scorer - the router should seek to reduce the total cost in scorer+fee terms of the paths it selects, if the scorer wants to prefer larger parts, then great, if not, that's its problem.

Hmm, that's a good point. For the values our scorer uses that should work alright. That said, #1600 proposes using score values of a full bitcoin, so then we'd be liable to hit issues again. To avoid any scorer-specific logic in the router, just shifting up into a u128 seemed the simplest, and while it's gonna be really slow on most systems, it shouldn't matter much here: it's at the end of the router and only across a handful of paths.

Ah I see. Yeah, it works as it is.

Hmm, yea, that's a valid point - we may prefer to send a few large parts rather than a number of small parts. Still, that seems like something that should be up to the scorer - the router should seek to reduce the total cost in scorer+fee terms of the paths it selects, if the scorer wants to prefer larger parts, then great, if not, that's its problem.

Hum, I agree it's probably best to keep the routing logic as simple as possible and let the Scorer handle it going forward. And hopefully it works as intended and optimizes for failure probability.

lightning/src/routing/router.rs

TheBlueMatt · 2022-07-13T18:19:33Z

Hum, I agree that the path randomization was not that great to begin with (especially since we keep sorting the candidates before/after). However, as this reverts what little indeterminism was introduced with #1359, I wonder what generally the path forward would be? Do we for example want to introduce a 'privacy budget' that can be spent either on overpaying fees or on choosing non-optimal routes?

Yea, I'm not super happy with it, but at the same time I'm relatively confident our "privacy gain" there was so close to zero it's not really worth discussing at length. I'm somewhat thinking our current routing approach is just incompatible with a privacy design - we should consider doing some kind of "private routing", but I'm not sure it can be trivially accomplished with Dijkstra's. There's maybe a world where we do something like a "privacy budget" but I'd think we'd want to implement that by taking all channel fees below X and replacing them with random(0-X) while doing the graph walking, thus making any path of the same length equally likely, or something like that (I believe this is kinda-sorta what CLN does, not sure). In any case, anything I can think of that would be a reasonably private approach simply doesn't start with "Dijkstra's strictly over the fee + cost, without randomization on the fee/cost function".
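The "privacy budget" idea floated above could look roughly like the sketch below. Everything in it is an assumption made for illustration: the names `XorShift64` and `randomized_fee_msat` are hypothetical, the tiny xorshift generator is only a dependency-free stand-in (it is not cryptographically secure, and the modulo introduces a small bias), and nothing here reflects what LDK or CLN actually implement.

```rust
/// Hypothetical, non-cryptographic PRNG used only to keep the sketch
/// dependency-free. The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical fee transform applied while walking the graph: any fee
/// below `budget_msat` is replaced by a random value in `[0, budget_msat)`,
/// so cheap channels become indistinguishable to the path search.
/// Fees at or above the budget are left untouched.
fn randomized_fee_msat(fee_msat: u64, budget_msat: u64, rng: &mut XorShift64) -> u64 {
    if budget_msat == 0 || fee_msat >= budget_msat {
        return fee_msat;
    }
    rng.next_u64() % budget_msat
}
```

A real implementation would draw from a proper random source and would have to decide whether the randomized value is used only for path ranking or also for the fee actually paid.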

lightning/src/routing/router.rs

tnull · 2022-07-14T12:32:10Z

lightning/src/routing/router.rs

-		let mut cur_route = Vec::<PaymentPath>::new();
-		let mut aggregate_route_value_msat = 0;
+	// First, sort by the cost-per-value of the path, selecting only the paths which
+	// contribute the most per cost.


Well, mainly I wanted to point out the discrepancy between the comment and code. But yeah, they can of course be used somewhat analogously. That said, I assume you introduced the conversion to u128 and the shift by 64 to compare them without the need for floats? Since value >> cost, could you get around that by simply sorting by value/cost in ascending order, rather than cost/value in descending order?

lightning/src/routing/router.rs

tnull · 2022-07-14T12:50:37Z

lightning/src/routing/router.rs

-		let mut cur_route = Vec::<PaymentPath>::new();
-		let mut aggregate_route_value_msat = 0;
+	// First, sort by the cost-per-value of the path, selecting only the paths which
+	// contribute the most per cost.


Also, we're now only selecting by cost efficiency, but not by overall amount as before. Why is this better than preferring a smaller number of larger value paths, which potentially reduced the failure probability?

lightning/src/routing/router.rs

tnull · 2022-07-14T13:03:29Z

Yea, I'm not super happy with it, but at the same time I'm relatively confident our "privacy gain" there was so close to zero it's not really worth discussing at length. I'm somewhat thinking our current routing approach is just incompatible with a privacy design - we should consider doing some kind of "private routing", but I'm not sure it can be trivially accomplished with Dijkstra's.

I don't think privacy is a binary space. Since there are always a lot of assumptions to be made on the adversary's end, introducing even mediocre amounts of uncertainty might go a long way in practice. Even randomly choosing to forgo the most optimal route and recomputing another one from time to time could break the 'shortest path' assumption any MITM needs to make when trying to break end-to-end privacy.

There's maybe a world where we do something like a "privacy budget" but I'd think we'd want to implement that by taking all channel fees below X and replacing them with random(0-X) while doing the graph walking, thus making any path of the same length equally likely, or something like that (I believe this is kinda-sorta what CLN does, not sure). In any case, anything I can think of that would be a reasonably private approach simply doesn't start with "Dijkstra's strictly over the fee + cost, without randomization on the fee/cost function".

Absolutely, there are a lot of things that could be done to enable more private routing, what you sketched might be a good start. That said, I'll have a look what the state-of-the-art in CLN currently is.

TheBlueMatt · 2022-07-14T19:22:08Z

I don't think privacy is a binary space. Since there are always a lot of assumptions to be made on the adversary's end, introducing even mediocre amounts of uncertainty might go a long way in practice. Even randomly choosing to forgo the most optimal route and recomputing another one from time to time could break the 'shortest path' assumption any MITM needs to make when trying to break end-to-end privacy.

Yea, I think that's a good point. There's possibly more of a privacy cost here than I suggested in the description. Still, I'd much rather address that by changing the scorer to give it more control over the ultimate cost (ie not always adding the fee on the router side and have the scorer return that) so that the scorer can do things like what I suggested above and we don't have to touch the router for it.

TheBlueMatt · 2022-07-14T19:22:20Z

Rebased after merge of the dependent PR.

tnull · 2022-07-15T09:06:15Z

Alright, so far looks good. I'll let someone else have a go and then will do another round of review.

TheBlueMatt · 2022-07-15T15:33:04Z

Rebased.

jkczyz

Read through the discussion on the rationales. All seems reasonable to me.

lightning/src/routing/router.rs

jkczyz · 2022-07-19T01:22:19Z

lightning/src/routing/router.rs

 				payment_path.update_value_and_recompute_fees(cmp::min(value_contribution_msat, final_value_msat));
+				value_contribution_msat = cmp::min(value_contribution_msat, final_value_msat);


Would it be possible to formulate a test that catches the incorrect behavior?

Hmm, so with the current code we end up relying on it being correct a good bit more, so reverting the first patch here causes existing tests to fail. As for building a fresh test for it... it's pretty convoluted and we'd be testing side-effects - we can construct a path that requires up to 2x overpayment, but we're trying to gather 3x total paths, so if we try to build a test that fails to find a path it won't work. We could maybe get away with something where it finds a path but not the optimal one because an optimal one was found later, but that's... complicated and kinda a strange test? I dunno.

Should be alright if other tests are catching it now.

tnull

LGTM, but one comment you may consider optional. Feel free to squash.

tnull · 2022-07-19T14:33:47Z

lightning/src/routing/router.rs

@@ -754,7 +754,7 @@ where L::Target: Logger, GL::Target: Logger {
 pub(crate) fn get_route<L: Deref, S: Score>(
 	our_node_pubkey: &PublicKey, payment_params: &PaymentParameters, network_graph: &ReadOnlyNetworkGraph,
 	first_hops: Option<&[&ChannelDetails]>, final_value_msat: u64, final_cltv_expiry_delta: u32,
-	logger: L, scorer: &S, random_seed_bytes: &[u8; 32]
+	logger: L, scorer: &S, _random_seed_bytes: &[u8; 32]


So, now that we don't use it anymore and probably won't going forward, should we remove random_seed_bytes from the signature of get_route entirely? As discussed, #495 will likely happen in the Scorer and #1482 can be done 'on top' in a similar vein as add_random_cltv_offset.

Yea, good question. The first version of this patch did, and while it didn't change the public interface, it touched a ton of test code. I dropped it cause I wasn't sure if we would end up using it again and it doesn't change the public interface so I don't care too much about an unused bit of code. I don't feel super strongly but until we're in a place we're confident in with the router I kinda feel like leaving it. One thing, at least, that we may do, is add the random seed as a "per route" input to the scorer.

Hm, but I assume if we needed randomization in the Scorer, we'd pass it directly to it before giving it to find_route? That said, I also don't think it's very pressing to remove it right now. Especially since I still like the idea of eventually introducing a proper Rand interface, which may require a larger refactoring anyways. So feel free to leave as is.

Maybe? I dunno, I haven't really thought about it, it's just not clear to me how we'd do it, so I figured I wouldn't worry too much about removing the field right now.

If we end up "paying" for an `htlc_minimum_msat` with fees, we increment `already_collected_value_msat` by more than the amount of the path that we collected (whose `value_contribution_msat` is higher than the total payment amount, despite having been reduced down to the payment amount). This throws off our total value collection target, though in the coming commit(s) it would also throw off our path selection calculations.
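The accounting problem described in this commit message can be sketched as follows. The `Collector` type and `add_path` method are hypothetical stand-ins, not the router's actual structures; the point is only that the running total must be incremented by the capped contribution, mirroring the `cmp::min(value_contribution_msat, final_value_msat)` line in the diff.

```rust
use std::cmp;

/// Hypothetical stand-in for the router's running total of collected value.
struct Collector {
    already_collected_value_msat: u64,
}

impl Collector {
    /// Records a path's contribution, capped at the payment amount, and
    /// returns the value that was actually counted.
    fn add_path(&mut self, value_contribution_msat: u64, final_value_msat: u64) -> u64 {
        // A path that overpays to satisfy an htlc_minimum_msat must not be
        // counted for more than the payment itself.
        let capped_msat = cmp::min(value_contribution_msat, final_value_msat);
        self.already_collected_value_msat += capped_msat;
        capped_msat
    }
}
```

Without the cap, a single path contributing 1,500 msat toward a 1,000 msat payment would push the total to 1,500 and distort how much more value the search believes it has gathered.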

Currently, after we've selected a number of candidate paths, we construct a route from a random set of paths repeatedly, and then select the route with the lowest total cost. In the vast majority of cases this ends up doing a bunch of additional work in order to select the path(s) with the total lowest cost, with some vague attempt at randomization that doesn't actually work. Instead, here, we simply sort available paths by `cost / amount` and select the top paths. This ends up in practice having the same end result with substantially less complexity. In some rare cases it gets a better result, which also would have been achieved through more random trials. This implies there may in such cases be a potential privacy loss, but not a substantial one, given our path selection is ultimately mostly deterministic in many cases (or, if it is not, then privacy is achieved through randomization at the scorer level).
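As a rough illustration of "sort available paths by `cost / amount` and select the top paths", here is a simplified greedy sketch. The `CandidatePath` struct, the `select_paths` function, and the `max_paths` parameter are hypothetical names for this example, and the real router's selection logic may differ in detail (for instance, it recomputes fees when a path's value changes, which this sketch does not do).

```rust
/// Hypothetical candidate path: the value it can carry and its total
/// cost (fees plus scorer penalty), both in millisatoshis.
#[derive(Clone, Debug, PartialEq)]
struct CandidatePath {
    value_msat: u64,
    cost_msat: u64,
}

/// Cost per unit of value as 64.64 fixed point, avoiding floats.
fn cost_per_value(path: &CandidatePath) -> u128 {
    if path.value_msat == 0 {
        return u128::MAX;
    }
    ((path.cost_msat as u128) << 64) / (path.value_msat as u128)
}

/// Picks the cheapest-per-value paths until `final_value_msat` is covered,
/// using at most `max_paths` paths. Returns `None` if that is not possible.
fn select_paths(
    mut candidates: Vec<CandidatePath>,
    final_value_msat: u64,
    max_paths: usize,
) -> Option<Vec<CandidatePath>> {
    // Cheapest cost-per-value first.
    candidates.sort_by_key(|p| cost_per_value(p));

    let mut selected = Vec::new();
    let mut collected_msat: u64 = 0;
    for mut path in candidates.into_iter().take(max_paths) {
        if collected_msat >= final_value_msat {
            break;
        }
        // Trim the last path so the total never exceeds the payment amount.
        // Simplification: the cost is left unchanged instead of recomputed.
        let remaining_msat = final_value_msat - collected_msat;
        if path.value_msat > remaining_msat {
            path.value_msat = remaining_msat;
        }
        collected_msat += path.value_msat;
        selected.push(path);
    }

    if collected_msat >= final_value_msat {
        Some(selected)
    } else {
        None
    }
}
```

Because the ordering is fully determined by the inputs, the same candidates always yield the same route, which is the determinism the privacy discussion in this thread is concerned with.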

Saturating a channel beyond 1/4 of its capacity seems like a more reasonable threshold for avoiding a path than 1/2, especially given we should still be willing to send a payment with a lower saturation limit if it comes to that. This requires an (obvious) change to some router tests, but also requires a change to the `fake_network_test`, opting to simply remove some over-limit test code there - `fake_network_test` was our first ever functional test, and while it worked great to ensure LDK worked at all on day one, we now have a rather large breadth of functional tests, and a broad "does it work at all" test is no longer all that useful.
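A saturation limit expressed as a fraction of capacity can be sketched like this. The function names and the `saturation_power_of_half` parameter are assumptions for this example (a value of 1 means 1/2 of capacity, 2 means 1/4); the sketch does not claim to match how the router stores or applies the limit.

```rust
/// Largest amount a channel may carry under the limit, where the limit is
/// capacity / 2^saturation_power_of_half (hypothetical parameter).
fn max_usable_msat(capacity_msat: u64, saturation_power_of_half: u8) -> u64 {
    // checked_shr returns None for shifts of 64 or more; treat that as
    // "no usable capacity" instead of panicking.
    capacity_msat
        .checked_shr(saturation_power_of_half as u32)
        .unwrap_or(0)
}

/// True if routing `amount_msat` over the channel would exceed the limit.
fn exceeds_saturation_limit(
    amount_msat: u64,
    capacity_msat: u64,
    saturation_power_of_half: u8,
) -> bool {
    amount_msat > max_usable_msat(capacity_msat, saturation_power_of_half)
}
```

For a 1,000,000 msat channel, moving the limit from 1/2 to 1/4 lowers the usable amount from 500,000 msat to 250,000 msat, so a 300,000 msat part that was previously acceptable would now be avoided on that channel.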

TheBlueMatt · 2022-07-19T15:16:43Z

Squashed without further changes.

TheBlueMatt added this to the 0.0.110 milestone Jul 12, 2022

TheBlueMatt added the blocked on dependent pr label Jul 12, 2022

tnull self-requested a review July 13, 2022 06:18

tnull reviewed Jul 13, 2022


TheBlueMatt force-pushed the 2022-07-no-rand-path-selection branch from b455d49 to aad65c8 Compare July 13, 2022 18:27

TheBlueMatt mentioned this pull request Jul 14, 2022

Add a per-amount base penalty in the ProbabilisticScorer #1617

Merged

tnull reviewed Jul 14, 2022


TheBlueMatt force-pushed the 2022-07-no-rand-path-selection branch from aad65c8 to 9687183 Compare July 14, 2022 18:58

TheBlueMatt added Seeking Code Review and removed blocked on dependent pr labels Jul 14, 2022

TheBlueMatt force-pushed the 2022-07-no-rand-path-selection branch from 9687183 to fa1ba8a Compare July 15, 2022 15:33

jkczyz reviewed Jul 19, 2022


TheBlueMatt force-pushed the 2022-07-no-rand-path-selection branch from fa1ba8a to 2b57097 Compare July 19, 2022 01:41

TheBlueMatt removed the Seeking Code Review label Jul 19, 2022

tnull previously approved these changes Jul 19, 2022


TheBlueMatt added 3 commits July 19, 2022 15:16

TheBlueMatt dismissed tnull’s stale review via ff8d3f7 July 19, 2022 15:16

TheBlueMatt force-pushed the 2022-07-no-rand-path-selection branch from 2b57097 to ff8d3f7 Compare July 19, 2022 15:16

TheBlueMatt assigned tnull and jkczyz Jul 19, 2022

tnull approved these changes Jul 19, 2022


jkczyz approved these changes Jul 19, 2022


TheBlueMatt merged commit 5023ff0 into lightningdevkit:main Jul 19, 2022

tnull mentioned this pull request Jan 24, 2024

Misc Tweaks for bindings #2847

Merged
