Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always pick the best paths, rather than (poorly) attempting to randomize #1610

Merged

Conversation

TheBlueMatt
Copy link
Collaborator

Based on #1605, this removes the half-assed attempt at randomization in the router, instead simply picking the paths which have the lowest score-per-value-transferred. This ends up being basically the same thing anyway, but more directly, and simply. Further, it fixes some edge-case issues around not selecing enough balance, and, notable, spuriously failing to route if we ended up needing >= 50 paths to route (which is now a configurable number, not that users should be using 50 paths, really).

This then switches us to a saturation limit of 1/4, which I think is much cleaner, but the router cleanups are needed to make tests pass with that.

Tagging 0.0.110 cause I want it the 1/4 switch for it.

@TheBlueMatt TheBlueMatt added this to the 0.0.110 milestone Jul 12, 2022
@codecov-commenter
Copy link

codecov-commenter commented Jul 12, 2022

Codecov Report

Merging #1610 (2b57097) into main (f75b6cb) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head 2b57097 differs from pull request most recent head ff8d3f7. Consider uploading reports for the commit ff8d3f7 to get more accurate results

@@           Coverage Diff            @@
##             main    #1610    +/-   ##
========================================
  Coverage   90.84%   90.84%            
========================================
  Files          80       80            
  Lines       44675    44841   +166     
  Branches    44675    44841   +166     
========================================
+ Hits        40583    40735   +152     
- Misses       4092     4106    +14     
Impacted Files Coverage Δ
lightning/src/ln/functional_tests.rs 97.11% <ø> (+0.17%) ⬆️
lightning/src/routing/router.rs 92.44% <100.00%> (-0.09%) ⬇️
lightning/src/ln/functional_test_utils.rs 93.51% <0.00%> (-1.74%) ⬇️
lightning-net-tokio/src/lib.rs 76.85% <0.00%> (-0.31%) ⬇️
lightning/src/util/events.rs 39.54% <0.00%> (+0.28%) ⬆️
lightning-background-processor/src/lib.rs 95.81% <0.00%> (+0.61%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f75b6cb...ff8d3f7. Read the comment docs.

Copy link
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, I agree that the path randomization was not that great to begin with (especially since we keep sorting the candidates before/after). However, as this reverts what little indeterminism was introduced with #1359, I wonder what generally the path forward would be? Do we for example want to introduce a 'privacy budget' that can be spent either on overpaying fees or on choosing non-optimal routes?

lightning/src/routing/router.rs Show resolved Hide resolved
let mut cur_route = Vec::<PaymentPath>::new();
let mut aggregate_route_value_msat = 0;
// First, sort by the cost-per-value of the path, selecting only the paths which
// contribute the most per cost.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"cost-per-value" vs. "contribute the most per cost": Which one is it / should it be? cost-per-value or value-per-cost? I'd argue probably the latter, since we still want to drop paths that contribute less to improve success?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused - they're the same thing, at least as far as ordering goes? I tried to reword it some.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well mainly I wanted to point out the discrepancy between the comment and code. But yeah, they can of course be used somewhat analogous. That said, I assume you introduced the conversion to u128 and the shift by 64 to compare them without the need for floats? Since value >> cost, could you get around that by simply sorting by value/cost in ascending order, rather than cost/value in descending order?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we're now only selecting by cost efficiency, but not by overall amount as before. Why is this better than preferring a smaller number of larger value paths, which potentially reduced the failure probability?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well mainly I wanted to point out the discrepancy between the comment and code. But yeah, they can of course be used somewhat analogous.

Right, I did update the comment.

That said, I assume you introduced the conversion to u128 and the shift by 64 to compare them without the need for floats? Since value >> cost, could you get around that by simply sorting by value/cost in ascending order, rather than cost/value in descending order?

Hmm, that's a good point. For the values our scorer uses that should work alright. That said, #1600 proposes using score values of a full bitcoin, so then we'd be liable to hit issues again. To avoid any scorer-specific logic in the router just shifting up into a u128 seemed the simplest, and while its gonna be really slow on most systems, it shouldn't matter much here, its at the end of the router and only across a handful of paths.

Also, we're now only selecting by cost efficiency, but not by overall amount as before. Why is this better than preferring a smaller number of larger value paths, which potentially reduced the failure probability?

Hmm, yea, that's a valid point - we may prefer to send a few large parts rather than a number of small parts. Still, that seems like something that should be up to the scorer - the router should seek to reduce the total cost in scorer+fee terms of the paths it selects, if the scorer wants to prefer larger parts, then great, if not, that's its problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that's a good point. For the values our scorer uses that should work alright. That said, #1600 proposes using score values of a full bitcoin, so then we'd be liable to hit issues again. To avoid any scorer-specific logic in the router just shifting up into a u128 seemed the simplest, and while its gonna be really slow on most systems, it shouldn't matter much here, its at the end of the router and only across a handful of paths.

Ah I see. Yeah, it works as it is.

Hmm, yea, that's a valid point - we may prefer to send a few large parts rather than a number of small parts. Still, that seems like something that should be up to the scorer - the router should seek to reduce the total cost in scorer+fee terms of the paths it selects, if the scorer wants to prefer larger parts, then great, if not, that's its problem.

Hum, I agree it's probably best to keep the routing logic as simple as possible and let the Scorer handle it going forward. And hopefully it works as intended and optimizes for failure probability.

lightning/src/routing/router.rs Show resolved Hide resolved
lightning/src/routing/router.rs Show resolved Hide resolved
@TheBlueMatt
Copy link
Collaborator Author

Hum, I agree that the path randomization was not that great to begin with (especially since we keep sorting the candidates before/after). However, as this reverts what little indeterminism was introduced with #1359, I wonder what generally the path forward would be? Do we for example want to introduce a 'privacy budget' that can be spent either on overpaying fees or on choosing non-optimal routes?

Yea, I'm not super happy with it, but at the same time I'm relatively confident our "privacy gain" there was so close to zero its not really worth discussing at length. I'm somewhat thinking our current routing approach is just incompatible with a privacy design - we should consider doing some kind of "private routing", but I'm not sure it can be trivially accomplished with dijkstras. There's maybe a world where we do something like a "privacy budget" but I'd think we'd want to implement that by taking all channel fees below X and replace them with random(0-X) while doing the graph walking, thus making any path of the same length equally likely, or something like that (I believe this is kinda-sorta what CLN does, not sure). In any case, anything I can think of that would be a reasonably private approach simply doesn't start with "dijkstras strictly over the fee + cost, without randomization on the fee/cost function".

lightning/src/routing/router.rs Outdated Show resolved Hide resolved
let mut cur_route = Vec::<PaymentPath>::new();
let mut aggregate_route_value_msat = 0;
// First, sort by the cost-per-value of the path, selecting only the paths which
// contribute the most per cost.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well mainly I wanted to point out the discrepancy between the comment and code. But yeah, they can of course be used somewhat analogous. That said, I assume you introduced the conversion to u128 and the shift by 64 to compare them without the need for floats? Since value >> cost, could you get around that by simply sorting by value/cost in ascending order, rather than cost/value in descending order?

lightning/src/routing/router.rs Outdated Show resolved Hide resolved
let mut cur_route = Vec::<PaymentPath>::new();
let mut aggregate_route_value_msat = 0;
// First, sort by the cost-per-value of the path, selecting only the paths which
// contribute the most per cost.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we're now only selecting by cost efficiency, but not by overall amount as before. Why is this better than preferring a smaller number of larger value paths, which potentially reduced the failure probability?

lightning/src/routing/router.rs Outdated Show resolved Hide resolved
@tnull
Copy link
Contributor

tnull commented Jul 14, 2022

Yea, I'm not super happy with it, but at the same time I'm relatively confident our "privacy gain" there was so close to zero its not really worth discussing at length. I'm somewhat thinking our current routing approach is just incompatible with a privacy design - we should consider doing some kind of "private routing", but I'm not sure it can be trivially accomplished with dijkstras.

I don't think privacy is a binary space. Since there are always a lot of assumptions to be made on the adversary's end, introducing even mediocre amounts of uncertainty might go a long way in practice. Even randomly choosing to forgo the most optimal route and recomputing another one from time to time could break the 'shortest path' assumption any MITM needs to make when trying to break end-to-end privacy.

There's maybe a world where we do something like a "privacy budget" but I'd think we'd want to implement that by taking all channel fees below X and replace them with random(0-X) while doing the graph walking, thus making any path of the same length equally likely, or something like that (I believe this is kinda-sorta what CLN does, not sure). In any case, anything I can think of that would be a reasonably private approach simply doesn't start with "dijkstras strictly over the fee + cost, without randomization on the fee/cost function".

Absolutely, there are a lot of things that could be done to enable more private routing, what you sketched might be a good start. That said, I'll have a look what the state-of-the-art in CLN currently is.

@TheBlueMatt TheBlueMatt force-pushed the 2022-07-no-rand-path-selection branch from aad65c8 to 9687183 Compare July 14, 2022 18:58
@TheBlueMatt
Copy link
Collaborator Author

I don't think privacy is a binary space. Since there are always a lot of assumptions to be made on the adversary's end, introducing even mediocre amounts of uncertainty might go a long way in practice. Even randomly choosing to forgo the most optimal route and recomputing another one from time to time could break the 'shortest path' assumption any MITM needs to make when trying to break end-to-end privacy.

Yea, I think that's a good point. There's possibly more of a privacy cost here than I suggested in the description. Still, I'd much rather address that by changing the scorer to give it more control over the ultimate cost (ie not always adding the fee on the router side and have the scorer return that) so that the scorer can do things like what I suggested above and we don't have to touch the router for it.

@TheBlueMatt
Copy link
Collaborator Author

Rebased after merge of the dependent PR.

@tnull
Copy link
Contributor

tnull commented Jul 15, 2022

Alright, so far looks good. I'll let someone else have a go and then will do another round of review.

@TheBlueMatt TheBlueMatt force-pushed the 2022-07-no-rand-path-selection branch from 9687183 to fa1ba8a Compare July 15, 2022 15:33
@TheBlueMatt
Copy link
Collaborator Author

Rebased.

Copy link
Contributor

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read through the discussion on the rationales. All seems reasonable to me.

lightning/src/routing/router.rs Outdated Show resolved Hide resolved
lightning/src/routing/router.rs Outdated Show resolved Hide resolved
lightning/src/routing/router.rs Outdated Show resolved Hide resolved
lightning/src/routing/router.rs Outdated Show resolved Hide resolved
payment_path.update_value_and_recompute_fees(cmp::min(value_contribution_msat, final_value_msat));
value_contribution_msat = cmp::min(value_contribution_msat, final_value_msat);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to formulate a test that catches the incorrect behavior?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so with the current code we end up relying on it being correct a good bit more, so reverting the first patch here causes existing tests to fail. As for building a fresh test for it...its pretty convoluted and we'd be testing side-effects - we can construct a path that requires up to 2x overpayment, but we're trying to gather 3x total paths, so if we try to build a test that fails to find a path it won't work. We could maybe get away with something where it finds a path but not the optimal one because an optimal one was found later, but that's...complicated and kinda a strange test? I dunno.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be alright if other tests are catching it now.

tnull
tnull previously approved these changes Jul 19, 2022
Copy link
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but one comment you may consider optional. Feel free to squash.

@@ -754,7 +754,7 @@ where L::Target: Logger, GL::Target: Logger {
pub(crate) fn get_route<L: Deref, S: Score>(
our_node_pubkey: &PublicKey, payment_params: &PaymentParameters, network_graph: &ReadOnlyNetworkGraph,
first_hops: Option<&[&ChannelDetails]>, final_value_msat: u64, final_cltv_expiry_delta: u32,
logger: L, scorer: &S, random_seed_bytes: &[u8; 32]
logger: L, scorer: &S, _random_seed_bytes: &[u8; 32]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, now that we don't use it anymore and probably won't going forward, should we remove random_seed_bytes from the signature of get_route entirely? As discussed, #495 will likely happen in the Scorer and #1482 can be done 'on top' in a similar vein as add_random_cltv_offset.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, good question. The first version of this patch did, and while it didn't change the public interface, it touched a ton of test code. I dropped it cause I wasn't sure if we would end up using it again and it doesn't change the public interface so I don't care too much about an unused bit of code. I don't feel super strongly but until we're in a place we're confident in with the router I kinda feel like leaving it. One thing, at least, that we may do, is add the random seed as a "per route" input to the scorer.

Copy link
Contributor

@tnull tnull Jul 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, but I assume if we needed randomization in the Scorer, we'd pass it directly to it before giving it to find_route? That said, I also don't think it's very pressing to remove it right now. Especially since I still like the idea of eventually introducing a proper Rand interface, which may require a larger refactoring anyways. So feel free to leave as is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe? I dunno, I haven't really thought about it, its just not clear to me how we'd do it, so I figured not worry too much about removing the field right now.

If we end up "paying" for an `htlc_minimum_msat` with fees, we
increment `already_collected_value_msat` by more than the amount
of the path that we collected (who's `value_contribution_msat` is
higher than the total payment amount, despite having been reduced
down to the payment amount).

This throws off our total value collection target, though in the
coming commit(s) it would also throw off our path selection
calculations.
Currently, after we've selected a number of candidate paths, we
construct a route from a random set of paths repeatedly, and then
select the route with the lowest total cost. In the vast majority
of cases this ends up doing a bunch of additional work in order to
select the path(s) with the total lowest cost, with some vague
attempt at randomization that doesn't actually work.

Instead, here, we simply sort available paths by `cost / amount`
and select the top paths. This ends up in practice having the same
end result with substantially less complexity. In some rare cases
it gets a better result, which also would have been achieved
through more random trials. This implies there may in such cases be
a potential privacy loss, but not a substantial one, given our path
selection is ultimately mostly deterministic in many cases (or, if
it is not, then privacy is achieved through randomization at the
scorer level).
Saturating a channel beyond 1/4 of its capacity seems like a more
reasonable threshold for avoiding a path than 1/2, especially given
we should still be willing to send a payment with a lower
saturation limit if it comes to that.

This requires an (obvious) change to some router tests, but also
requires a change to the `fake_network_test`, opting to simply
remove some over-limit test code there - `fake_network_test` was
our first ever functional test, and while it worked great to ensure
LDK worked at all on day one, we now have a rather large breadth
of functional tests, and a broad "does it work at all" test is no
longer all that useful.
@TheBlueMatt
Copy link
Collaborator Author

Squashed without further changes.

@TheBlueMatt TheBlueMatt merged commit 5023ff0 into lightningdevkit:main Jul 19, 2022
@tnull tnull mentioned this pull request Jan 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants