Avoid reusing just-failed channels in the router, making the impossibility penalty configurable #1600

TheBlueMatt · 2022-07-06T21:20:26Z

Some users want to keep a very long scorer data half-life to maintain knowledge for a longer period of time. It's somewhat unclear whether that's optimal, but it is clear that it can cause the scorer to refuse to build a route when the only available channel last failed within the half-life. This is rather unexpected behavior, and the fact that the scorer must behave this way to avoid #1241 is very annoying in that it prevents fixing this.

Here we move the avoidance of just-failed channels into the router itself, allowing us to make the impossibility penalty configurable, which we do as well.

Closes #1241, superseding #1252.

codecov-commenter · 2022-07-06T21:42:52Z

Codecov Report

Merging #1600 (a863778) into main (4e5f74a) will increase coverage by 0.14%.
The diff coverage is 87.50%.

❗ Current head a863778 differs from pull request most recent head 5bff5f9. Consider uploading reports for the commit 5bff5f9 to get more accurate results

@@            Coverage Diff             @@
##             main    #1600      +/-   ##
==========================================
+ Coverage   90.86%   91.00%   +0.14%     
==========================================
  Files          80       80              
  Lines       44437    45502    +1065     
  Branches    44437    45502    +1065     
==========================================
+ Hits        40377    41409    +1032     
- Misses       4060     4093      +33

Impacted Files	Coverage Δ
lightning/src/routing/router.rs	`92.45% <77.77%> (+0.06%)`	⬆️
lightning/src/routing/scoring.rs	`97.57% <95.00%> (+1.50%)`	⬆️
lightning/src/ln/channelmanager.rs	`84.90% <100.00%> (+0.02%)`	⬆️
lightning/src/ln/functional_test_utils.rs	`95.24% <100.00%> (+<0.01%)`	⬆️
lightning/src/chain/onchaintx.rs	`93.98% <0.00%> (-0.93%)`	⬇️
lightning/src/util/events.rs	`39.25% <0.00%> (-0.29%)`	⬇️
lightning/src/ln/functional_tests.rs	`96.95% <0.00%> (-0.17%)`	⬇️
lightning/src/ln/channel.rs	`88.77% <0.00%> (+0.02%)`	⬆️
... and 2 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4e5f74a...5bff5f9.

tnull

Thanks, generally looks good, just some comments.

TheBlueMatt · 2022-07-07T16:06:10Z

Rebased and addressed feedback.

tnull

LGTM

jkczyz · 2022-07-12T21:51:31Z

lightning/src/routing/router.rs

 					if contributes_sufficient_value && doesnt_exceed_max_path_length &&
-						doesnt_exceed_cltv_delta_limit && may_overpay_to_meet_path_minimum_msat {
+						doesnt_exceed_cltv_delta_limit && !payment_failed_on_this_channel &&
+						may_overpay_to_meet_path_minimum_msat
+					{
 						hit_minimum_limit = true;
 					} else if contributes_sufficient_value && doesnt_exceed_max_path_length &&
-						doesnt_exceed_cltv_delta_limit && over_path_minimum_msat {
+						doesnt_exceed_cltv_delta_limit && over_path_minimum_msat &&
+						!payment_failed_on_this_channel
+					{
 						// Note that low contribution here (limited by available_liquidity_msat)


Instead of checking payment_failed_on_this_channel twice, consider having a leading if expression.

if payment_failed_on_this_channel { } else if /* ... */ {

Oh, good idea, I moved all the existing if conditions to that!
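The suggested leading `if` can be sketched roughly as below. This is a simplified stand-in rather than the router's actual code: the `classify` function and the `Outcome` enum are hypothetical, and only the boolean names are taken from the diff above.

```rust
/// Hypothetical result of evaluating one candidate hop.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// The channel already failed for this payment part, so it is never considered.
    Skipped,
    /// The hop is only usable by overpaying to meet the path minimum.
    HitMinimumLimit,
    /// The hop passed every check.
    Usable,
    /// Some other check failed.
    Rejected,
}

/// Checks `payment_failed_on_this_channel` once, up front, instead of
/// repeating it in every later branch.
fn classify(
    payment_failed_on_this_channel: bool,
    contributes_sufficient_value: bool,
    doesnt_exceed_max_path_length: bool,
    doesnt_exceed_cltv_delta_limit: bool,
    may_overpay_to_meet_path_minimum_msat: bool,
    over_path_minimum_msat: bool,
) -> Outcome {
    let passes_common_checks = contributes_sufficient_value
        && doesnt_exceed_max_path_length
        && doesnt_exceed_cltv_delta_limit;

    if payment_failed_on_this_channel {
        Outcome::Skipped
    } else if passes_common_checks && may_overpay_to_meet_path_minimum_msat {
        Outcome::HitMinimumLimit
    } else if passes_common_checks && over_path_minimum_msat {
        Outcome::Usable
    } else {
        Outcome::Rejected
    }
}
```

With this shape, adding another condition later cannot accidentally forget the just-failed check, since no later branch is reachable for a failed channel.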

jkczyz · 2022-07-13T04:19:56Z

lightning/src/routing/scoring.rs

+			// Equivalent to hitting the else clause below with the amount equal to the effective
+			// capacity and without any certainty on the liquidity upper bound, plus the
+			// impossibility penalty.


This comment needs to be updated now, since it's not only when the amount is equal to the effective capacity. It seems like the earlier change that added this now-removed logic is what caused calculates_log10_without_overflowing_u64_max_value to not exercise the correct code path, as mentioned earlier on that test (i.e., it doesn't hit the negative_log10_times_2048 case below).

I read the comment differently - I read the comment to say "the calculation we're doing here is equivalent to..." which is still true, no?

Ah, yeah, you're right!

jkczyz

Largely looks good but one comment.

jkczyz · 2022-07-13T17:12:23Z

lightning/src/routing/scoring.rs

+			let negative_log10_times_2048 = NEGATIVE_LOG10_UPPER_BOUND * 2048;
+			self.combined_penalty_msat(amount_msat, negative_log10_times_2048, params)
+				.saturating_add(params.considered_impossible_penalty_msat)


Would it make sense to do the max of these two rather than adding? Is the idea that we want this to be >= anything given in the else clause?

Yea, that's the idea. I suppose we could do a max. Originally I had it not adding the combined_penalty call at all, but that seemed too brittle to deal with, so I switched to adding. I don't honestly have a strong opinion between adding and max; either way the docs can tell users what's going on, but for max I kinda worry users will accidentally set it too low and get no penalty here, which seems strange?

They would also need to set the other params lower, though, since max would select the combined penalty over the considered_impossible_penalty_msat if the latter were too low. So it's all kinda relative. I don't feel too strongly either, though note that max may make debugging a little easier, as otherwise you may need to mentally subtract some combined penalty if it is not obvious.

Right, sure, I just meant it'd be easy for a user to look at the field and think "okay, let me pick something a bit higher than the liquidity offset", forget to multiply by 2 or whatever, and end up with a penalty equal to the liquidity penalty, which seems wrong? I dunno, I'm happy to mentally convert when we're debugging. Unless you feel strongly I'd suggest we leave it.

> Right, sure, I just meant it'd be easy for a user to look at the field and think "okay, let me pick something a bit higher than the liquidity offset", forget to multiply by 2 or whatever, and end up with a penalty equal to the liquidity penalty, which seems wrong?

Plus base and amount penalty, FWIW.

> I dunno, I'm happy to mentally convert when we're debugging. Unless you feel strongly I'd suggest we leave it.

Sure, we can leave it. Though one thing I just realized is that either way the penalty will now vary across channels depending on the amount. Maybe less so when using max, but maybe that's an argument in favor of adding. That way, if left to choose only among channels exceeding the maximum liquidity, we'd prefer the ones that would otherwise be penalized less.
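To make the trade-off in this thread concrete, here is a small sketch of the two options. The helper names are hypothetical and the penalty values are made up; only `saturating_add` on `considered_impossible_penalty_msat` mirrors the diff above.

```rust
/// Adding: the impossibility penalty is stacked on top of the combined
/// (base, liquidity and amount) penalty, saturating at `u64::MAX`.
fn impossible_penalty_add(combined_penalty_msat: u64, considered_impossible_penalty_msat: u64) -> u64 {
    combined_penalty_msat.saturating_add(considered_impossible_penalty_msat)
}

/// Max: whichever of the two penalties is larger wins.
fn impossible_penalty_max(combined_penalty_msat: u64, considered_impossible_penalty_msat: u64) -> u64 {
    combined_penalty_msat.max(considered_impossible_penalty_msat)
}
```

If a user sets the impossibility penalty below the combined penalty (say 3,000 against 5,000), max yields 5,000, so the channel gets no extra penalty at all, whereas adding yields 8,000. Conversely, for two "impossible" channels with combined penalties of 5,000 and 7,000 and a large impossibility penalty, adding keeps them distinguishable while max scores them identically.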

TheBlueMatt · 2022-07-13T19:41:23Z

Squashed without further changes.

TheBlueMatt · 2022-07-13T21:04:16Z

Squashed yet again. Change since yesterday was:

diff --git a/lightning/src/routing/scoring.rs b/lightning/src/routing/scoring.rs
index 6aa19abaa..9fa62d83f 100644
--- a/lightning/src/routing/scoring.rs
+++ b/lightning/src/routing/scoring.rs
@@ -397,7 +397,8 @@ pub struct ProbabilisticScoringParameters {
 	/// current estimate of the channel's available liquidity.
 	///
-	/// Note that in this case the [`liquidity_penalty_multiplier_msat`] and
-	/// [`amount_penalty_multiplier_msat`]-based penalties are still included in the overall
-	/// penalty.
+	/// Note that in this case all other penalties, including the
+	/// [`liquidity_penalty_multiplier_msat`] and [`amount_penalty_multiplier_msat`]-based
+	/// penalties, as well as the [`base_penalty_msat`] and the [`anti_probing_penalty_msat`], if
+	/// applicable, are still included in the overall penalty.
 	///
 	/// If you wish to avoid creating paths with such channels entirely, setting this to a value of
@@ -408,4 +409,6 @@ pub struct ProbabilisticScoringParameters {
 	/// [`liquidity_penalty_multiplier_msat`]: Self::liquidity_penalty_multiplier_msat
 	/// [`amount_penalty_multiplier_msat`]: Self::amount_penalty_multiplier_msat
+	/// [`base_penalty_msat`]: Self::base_penalty_msat
+	/// [`anti_probing_penalty_msat`]: Self::anti_probing_penalty_msat
 	pub considered_impossible_penalty_msat: u64,
 }

jkczyz · 2022-07-14T15:02:16Z

66ca68a mentions failures on a payment-level, but isn't it really on a path-level? i.e., if two parts of an MPP fail on different channels, retrying one part could retry over the channel failed on the other part?

TheBlueMatt · 2022-07-14T16:00:23Z

Oops, right, sorry. I rewrote that commit message without changes to the diff, and added a note that this does have the drawback of potentially retrying different parts along the same path, but hopefully the scorer doesn't let that happen unless the payment is gonna fail anyway.

jkczyz · 2022-07-14T16:32:07Z

> Oops, right, sorry. I rewrote that commit message without changes to the diff, and added a note that this does have the drawback of potentially retrying different parts along the same path, but hopefully the scorer doesn't let that happen unless the payment is gonna fail anyway.

Yeah, same across payments, but as you said hopefully the scorer will learn quickly enough.

When an HTLC fails, we currently rely on the scorer learning the failed channel and assigning an infinite (`u64::max_value()`) penalty to the channel so as to avoid retrying over the exact same path (if there's only one available path). This is common when trying to pay a mobile client behind an LSP if the mobile client is currently offline. This leads to the scorer being overly conservative in some cases - returning `u64::max_value()` when a given path hasn't been tried for a given payment may not be the best decision, even if that channel failed 50 minutes ago. By tracking channels which failed at the payment-part level and explicitly refusing to route over them, we can relax the requirements on the scorer, allowing it to make different decisions on how to treat channels that failed relatively recently without causing payments to retry the same path forever. This does have the drawback that it could allow two separate parts of a payment to traverse the same path even though that path just failed; however, this should only occur if the payment is going to fail anyway, at least as long as the scorer is properly learning. Closes lightningdevkit#1241, superseding lightningdevkit#1252.
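As a rough illustration of the idea in this commit message, the sketch below filters out channels that already failed for the current payment part before comparing penalties. Everything here is hypothetical (the `Candidate` struct, `pick_channel`, and the use of a `HashSet` of short channel ids); it is not the router's data model, only a minimal model of "refuse to route over just-failed channels regardless of score".

```rust
use std::collections::HashSet;

/// Hypothetical candidate channel with the penalty the scorer assigned to it.
struct Candidate {
    short_channel_id: u64,
    penalty_msat: u64,
}

/// Returns the cheapest channel that has not already failed for this payment
/// part, or `None` if every candidate has failed.
fn pick_channel(candidates: &[Candidate], previously_failed: &HashSet<u64>) -> Option<u64> {
    candidates
        .iter()
        // The exclusion happens here, independently of the scorer's penalty,
        // so the scorer no longer has to return an infinite penalty.
        .filter(|c| !previously_failed.contains(&c.short_channel_id))
        .min_by_key(|c| c.penalty_msat)
        .map(|c| c.short_channel_id)
}
```

Because the exclusion is tracked per payment part, a second part of the same payment would start with its own empty set in this model, which is the drawback the commit message notes.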

When we consider sending an HTLC over a given channel impossible due to our current knowledge of the channel's liquidity, we currently always assign a penalty of `u64::max_value()`. However, because we now refuse to retry a payment along the same path in the router itself, we can now make this value configurable. This allows users to have a relatively high knowledge decay interval without the side-effect of refusing to try the only available path in cases where a channel is intermittently available.

In general we should avoid taking paths that we are confident will not work as much as possible, but we should be willing to try each payment at least once, even if it's over a channel that failed recently. A full Bitcoin penalty for such a channel seems reasonable - lightning fees are unlikely to ever reach that point, so such channels will be scored much worse than any other potential path, while still being below `u64::max_value()`.
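For a sense of scale, the arithmetic behind "a full Bitcoin penalty" can be written out as below. The constant and function names are hypothetical and chosen for illustration; the only facts assumed are that one bitcoin is 100,000,000 satoshis and one satoshi is 1,000 millisatoshis.

```rust
const MSAT_PER_SAT: u64 = 1_000;
const SAT_PER_BTC: u64 = 100_000_000;

/// One full bitcoin expressed in millisatoshis.
const ONE_BTC_MSAT: u64 = SAT_PER_BTC * MSAT_PER_SAT;

/// How many one-bitcoin penalties fit into a `u64` before it would overflow.
fn penalties_before_overflow() -> u64 {
    u64::MAX / ONE_BTC_MSAT
}
```

One bitcoin is 100,000,000,000 msat, so roughly 184 million such penalties could be summed before reaching `u64::MAX`. That leaves a one-bitcoin penalty far above any plausible fee while staying well clear of the "infinite" value.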

TheBlueMatt · 2022-07-14T18:37:33Z

Rebased to address merge conflict.

TheBlueMatt mentioned this pull request Jul 6, 2022

Ensure we don't ever retry a payment along a just-failed path #1252

Closed

TheBlueMatt force-pushed the 2022-07-explicit-avoid-retries branch from 149fc6e to 848afc3 Compare July 6, 2022 21:24

jkczyz self-requested a review July 6, 2022 21:31

tnull self-requested a review July 7, 2022 07:07

tnull reviewed Jul 7, 2022

TheBlueMatt force-pushed the 2022-07-explicit-avoid-retries branch from 848afc3 to e6a114c Compare July 7, 2022 16:06

TheBlueMatt force-pushed the 2022-07-explicit-avoid-retries branch from e6a114c to 846291f Compare July 7, 2022 17:07

TheBlueMatt added the Seeking Code Review label Jul 8, 2022

TheBlueMatt force-pushed the 2022-07-explicit-avoid-retries branch from e3396ef to cc1a045 Compare July 8, 2022 14:25

tnull previously approved these changes Jul 11, 2022

jkczyz reviewed Jul 12, 2022

TheBlueMatt removed the Seeking Code Review label Jul 12, 2022

TheBlueMatt dismissed tnull’s stale review via a863778 July 12, 2022 23:13

TheBlueMatt force-pushed the 2022-07-explicit-avoid-retries branch from cc1a045 to a863778 Compare July 12, 2022 23:13

TheBlueMatt assigned tnull and jkczyz Jul 12, 2022

jkczyz reviewed Jul 13, 2022

jkczyz previously approved these changes Jul 13, 2022

TheBlueMatt dismissed jkczyz’s stale review via 85b842c July 13, 2022 19:41

TheBlueMatt force-pushed the 2022-07-explicit-avoid-retries branch from a863778 to 85b842c Compare July 13, 2022 19:41

jkczyz previously approved these changes Jul 13, 2022

TheBlueMatt dismissed jkczyz’s stale review via eea6311 July 13, 2022 20:31

TheBlueMatt force-pushed the 2022-07-explicit-avoid-retries branch from 85b842c to eea6311 Compare July 13, 2022 20:31

jkczyz previously approved these changes Jul 13, 2022

TheBlueMatt dismissed jkczyz’s stale review via 8a1b418 July 13, 2022 21:03

TheBlueMatt force-pushed the 2022-07-explicit-avoid-retries branch from eea6311 to 8a1b418 Compare July 13, 2022 21:03

jkczyz previously approved these changes Jul 13, 2022

TheBlueMatt unassigned jkczyz Jul 13, 2022

tnull previously approved these changes Jul 14, 2022

TheBlueMatt dismissed stale reviews from tnull and jkczyz via 5bff5f9 July 14, 2022 15:59

TheBlueMatt force-pushed the 2022-07-explicit-avoid-retries branch from 8a1b418 to 5bff5f9 Compare July 14, 2022 15:59

TheBlueMatt assigned jkczyz Jul 14, 2022

jkczyz previously approved these changes Jul 14, 2022

TheBlueMatt added 3 commits July 14, 2022 18:37

TheBlueMatt dismissed jkczyz’s stale review via a3547e2 July 14, 2022 18:37

TheBlueMatt force-pushed the 2022-07-explicit-avoid-retries branch from 5bff5f9 to a3547e2 Compare July 14, 2022 18:37

TheBlueMatt mentioned this pull request Jul 14, 2022

Always pick the best paths, rather than (poorly) attempting to randomize #1610

Merged

jkczyz approved these changes Jul 14, 2022

TheBlueMatt unassigned jkczyz Jul 14, 2022

tnull approved these changes Jul 15, 2022

TheBlueMatt merged commit f75b6cb into lightningdevkit:main Jul 15, 2022
