
Track Channel Liquidity History in ProbabilisticScorer #1625

Merged

Conversation

TheBlueMatt (Collaborator)

Based on #1617, this adds two new commits which track each channel's liquidity history in 8 min and 8 max (kinda) exponentially decaying buckets. I'm really curious to see how this performs, and anticipate it will be a useful part of the overall scorer, but:

  • I haven't benchmarked this yet, it may be really slow, and too painful for "normal" operation, dunno yet,
  • This is just a draft, should probably write another test or two.

There are a number of constants in here, which I thought about, but not that hard. Possible conversation topics around them include:

  • is 8000 historical datapoints (per channel) tracked enough?
  • Is decaying at a rate of 2047/2048 per datapoint enough?
  • Does it need some kind of time component or are we okay with that history stretching really far back if we haven't updated a channel in a while?
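
For intuition, the scheme described above could be sketched roughly as follows. This is a simplified illustration, not LDK's actual code; the type and method names (`HistoricalBucketTracker`, `track_datapoint`) are hypothetical, and only the "+32 credit, 2047/2048 decay" rule from the description is modeled:

```rust
/// Simplified sketch of one side (min or max) of the liquidity history:
/// 8 u16 buckets, each covering 1/8th of the channel's liquidity range.
pub struct HistoricalBucketTracker {
    pub buckets: [u16; 8],
}

impl HistoricalBucketTracker {
    pub fn new() -> Self { Self { buckets: [0; 8] } }

    /// Record one observation: decay every bucket by 2047/2048 (integer
    /// arithmetic), then add 32 to the bucket the current estimate falls
    /// in (`offset_8ths` in 0..8).
    pub fn track_datapoint(&mut self, offset_8ths: usize) {
        for b in self.buckets.iter_mut() {
            *b = ((*b as u32) * 2047 / 2048) as u16;
        }
        self.buckets[offset_8ths] = self.buckets[offset_8ths].saturating_add(32);
    }
}

fn main() {
    let mut min_history = HistoricalBucketTracker::new();
    // Observe the min-liquidity offset in the lowest 1/8th a hundred times:
    for _ in 0..100 { min_history.track_datapoint(0); }
    assert!(min_history.buckets[0] > 0);
    // Buckets we never hit stay empty.
    assert!(min_history.buckets[1..].iter().all(|&b| b == 0));
    println!("{:?}", min_history.buckets);
}
```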

tnull (Contributor) left a comment

Tracking of historical data seems like a great idea. IMO we probably need a time component so as not to make our scoring results too dependent on the current/past throughput. Also, while the approach taken seems to be very efficient, it could get tricky to reason about and could be limiting in the future. I'd therefore argue we should plan ahead and take measures to make updating the parameters (e.g., number of buckets and bucket size) easy. So far I mainly had a look at the first commit, so here are some comments regarding it.

// Each time we update our liquidity estimate, we add 32 to the buckets for the current
// min and max liquidity offset positions.
//
// We then decay each bucket by multiplying by 2047/2048. This ensures we can't actually
Contributor

You're describing here what you are doing with these magic numbers, but could you also go into more detail regarding why they are a good choice? E.g., how did you choose the 2047/2048 rate?

Collaborator Author

Yea, I added a note, but they're just kinda picked by playing with different options and seeing what works...

@@ -750,6 +768,41 @@ impl<L: DerefMut<Target = u64>, T: Time, U: DerefMut<Target = T>> DirectedChanne
self.set_max_liquidity_msat(max_liquidity_msat);
}

fn update_history_buckets(&mut self) {
// We have 8 leaky buckets for min and max liquidity. Each bucket tracks the amount of time
Contributor

Since this is called on every failed/successful payment or probe attempt, it actually doesn't seem to track the amount of time spent in each bucket, but simply a (skewed) average over the last X datapoints? Maybe this method could/should be called on a regular (real-time) basis to provide i.i.d. samples?

Collaborator Author

Right, I'm a bit torn on it - I imagine there's quite a large distribution of amount of times we test a channel. If we've only ever tried to send one HTLC over a channel, do we want to do something time-based where we think that this channel is always in a given state, or do we want to leave it as a single datapoint?

tnull (Contributor) commented Jul 26, 2022

Ah, needs rebase now that #1617 was merged.

ddiekroeger

is 8000 historical datapoints (per channel) tracked enough?

8000? I thought it was 8?

TheBlueMatt (Collaborator Author) commented Jul 26, 2022

8000? I thought it was 8?

We use 8 buckets, but the exponential decay is currently set so that all current data is only completely forgotten after ~8,000 further datapoints (though a single individual datapoint is forgotten pretty quickly). I.e., if we see the liquidity of a channel in the same place 10,000 times in a row it will take ~8,000 datapoints to forget that, but if we see it only once, we will decay down to zero after only a few datapoints.
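
To make that asymmetry concrete, here is a toy simulation (assumed integer arithmetic; the real code differs in details) of how many 2047/2048 decays a bucket value survives, for a single +32 credit versus a fully saturated u16 bucket:

```rust
/// Count how many 2047/2048 integer decays it takes for a bucket value to
/// reach zero (toy model of the per-datapoint decay described above).
pub fn steps_until_zero(mut v: u32) -> u32 {
    let mut steps = 0;
    while v > 0 {
        v = v * 2047 / 2048;
        steps += 1;
    }
    steps
}

fn main() {
    // A single +32 credit is fully forgotten after just 32 decays
    // (for values <= 2048, each integer decay subtracts exactly 1)...
    assert_eq!(steps_until_zero(32), 32);
    // ...while a bucket saturated at u16::MAX survives thousands of decays,
    // on the order of the ~8,000-datapoint figure discussed above.
    let slow = steps_until_zero(u16::MAX as u32);
    assert!(slow > 5_000 && slow < 12_000);
    println!("32 -> 0 in 32 decays; 65535 -> 0 in {} decays", slow);
}
```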

ariard left a comment

I wonder if the historical buckets will adapt to rapid changes in network congestion, quickly downgrading the reliability of channel liquidity balances instead of reflecting a stale view. There is also the question of whether we could clear the buckets based on received gossip or BOLT4 onion message receptions, i.e., can we capture network anomalies falsifying the historical tracking? We might also discount new channels displaying compelling routing conditions, so there might be a bonus to introduce based on channel age.

Overall, ACK on the direction, just want to think through the bounds of this approach.

// In total, this allows us to track data for the last 8,000 or so payments across a given
// channel.
//
// These constants are a balance - we try to fit in 2 bytes per bucket to reduce overhead,

So that's 32 new bytes of storage per channel (min_liquidity_offset_history / max_liquidity_offset_history); if we account for the ~85k public channels, that's ~2.7 MB of storage required if the liquidity history feature is enabled.

Generally, do we have a storage performance budget for low-grade mobile nodes, or are you thinking this feature only makes sense for high-payment-volume nodes with pretty generous resources?

Collaborator Author

I don't think we currently have a storage budget, no, but 3MB is much less than the full graph in-memory anyway. Ultimately I think we can probably assume we can safely use 200-300MB on most systems, even most phones these days ship with 1-2GB of RAM, more for higher-end devices.
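
As a sanity check of the numbers in this exchange (assuming ~85,000 public channels and two 8-entry u16 arrays per channel, per the discussion):

```rust
/// Bytes of history storage per channel: two arrays (min + max) of eight u16s.
pub fn history_bytes_per_channel() -> u64 {
    2 * 8 * core::mem::size_of::<u16>() as u64
}

fn main() {
    assert_eq!(history_bytes_per_channel(), 32);
    let public_channels: u64 = 85_000; // rough figure from the discussion
    let total = public_channels * history_bytes_per_channel();
    // 2,720,000 bytes, i.e. the ~2.7 MB mentioned above.
    assert_eq!(total, 2_720_000);
    println!("~{:.1} MB", total as f64 / 1_000_000.0);
}
```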


Yeah, sounds like we have a large margin. Though it would be nice to improve our user documentation about features that significantly affect resource usage (maybe this one doesn't qualify). Tracked in an issue: #1641

Also, it would guide developers to get a clear view of a Lightning node's resource usage before proposing a new PR/feature.

TheBlueMatt (Collaborator Author)

I wonder if the historical buckets will adapt to rapid changes in network congestion, quickly downgrading the reliability of channel liquidity balances instead of reflecting a stale view.

Yea, probably need to run some numbers here and debate the constants selected, but ultimately I hope it generally doesn't. The point of having historical tracking is to avoid looking at the current state of affairs, we already have the explicit min/max tracking for that. The big question is gonna be how do we balance the penalties between the two.

There is also the question of whether we could clear the buckets based on received gossip or BOLT4 onion message receptions, i.e., can we capture network anomalies falsifying the historical tracking?

Hmm, not quite sure what your question is here? Currently we only update scoring data based on payment/probe success/failure.

We might also discount new channels displaying compelling routing conditions, so there might be a bonus to introduce based on channel age.

Oh, I like that. Feels like maybe it could be a separate PR/scoring penalty, though, no?

ariard commented Jul 28, 2022

The point of having historical tracking is to avoid looking at the current state of affairs, we already have the explicit min/max tracking for that.

Okay, so it's more the min/max tracking itself we should monitor to see whether it stays accurate in the face of network anomalies. It might still be interesting in the future to have a heuristic module to capture events and apply a decaying malus on the liquidity balances.

Hmm, not quite sure what your question is here? Currently we only update scoring data based on payment/probe success/failure.

So I've reviewed the probabilistic scorer with the idea that it incentivizes node operators to maintain an equilibrium of their inbound/outbound liquidity by being proactive in their liquidity management. I wonder (broadly) whether new data received on our side, like gossip, could be a hint of a change in their liquidity policy that we should react to accordingly.

Oh, i like that. Feels like maybe it could be a separate PR/scoring penalty, though, no?

Yep, better to advance piece by piece, once we're good with the historical tracking model.

@ariard ariard removed the Review club label Aug 9, 2022
@TheBlueMatt TheBlueMatt marked this pull request as ready for review August 22, 2022 23:10
TheBlueMatt (Collaborator Author)

Added a no-learning decay with a default half-life of two weeks. I believe this is now ready for review, modulo the commit message that needs filling out.

codecov-commenter commented Aug 22, 2022

Codecov Report

Base: 90.77% // Head: 90.72% // Decreases project coverage by -0.05% ⚠️

Coverage data is based on head (55eb934) compared to base (b84b53a).
Patch coverage: 97.45% of modified lines in pull request are covered.

❗ Current head 55eb934 differs from pull request most recent head 3dce6f8. Consider uploading reports for the commit 3dce6f8 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1625      +/-   ##
==========================================
- Coverage   90.77%   90.72%   -0.06%     
==========================================
  Files          80       87       +7     
  Lines       44763    46843    +2080     
  Branches    44763    46843    +2080     
==========================================
+ Hits        40632    42496    +1864     
- Misses       4131     4347     +216     
Impacted Files Coverage Δ
lightning/src/routing/scoring.rs 96.56% <97.29%> (+0.43%) ⬆️
lightning/src/util/ser.rs 91.66% <100.00%> (+0.25%) ⬆️
lightning-block-sync/src/init.rs 90.90% <0.00%> (-2.66%) ⬇️
lightning-invoice/src/utils.rs 94.85% <0.00%> (-1.92%) ⬇️
lightning/src/util/events.rs 38.13% <0.00%> (-1.37%) ⬇️
lightning/src/ln/peer_handler.rs 55.96% <0.00%> (-1.30%) ⬇️
lightning/src/util/test_utils.rs 76.90% <0.00%> (-1.21%) ⬇️
lightning/src/ln/peer_channel_encryptor.rs 93.62% <0.00%> (-1.11%) ⬇️
lightning/src/chain/keysinterface.rs 92.54% <0.00%> (-1.06%) ⬇️
lightning/src/ln/chan_utils.rs 94.55% <0.00%> (-0.98%) ⬇️
... and 54 more


tnull (Contributor) left a comment

First pass with mostly a lot of suggestions trying to give more explanation/rationale in the comments. Will do another pass focusing on the bucket walking tomorrow.

///
/// Note that 16 or more half lives guarantees that all historical data will be wiped.
///
/// Default value: 14 days
Contributor

Waiting two weeks before considering data to be out-of-date seems a long time to me, given how fast network dynamics might change.

Collaborator Author

I disagree, the point of the historical tracker is to track history, and to function well when we have limited data for a channel. Data from a month ago is useless in some cases, but in many cases a node that was a sink six months ago is still a sink today. Having some data is better than none. We could consider a different decay model where we decrease the penalty as the data ages rather than decaying the data itself, but I'm not sure that's really cleaner.

Contributor

Yes, all I was saying is that starting to decay data after two weeks seems like a long time to me, as we may not adapt fast enough if something changes. But, as we don't have any ground truth here, it's hard to argue either way.

Just as a note: What would probably work better than the current model of decaying all at the same rate would be to account for the velocity of behavior changes in the past by, e.g., keeping track of how often our expectation changes on a per-node basis and decaying faster the higher the deviation is. However, this would probably require keeping more data and further introduces the problem of how to translate deviation into a corresponding delay. So, we're currently probably better off with the chosen approach.

Collaborator Author

Hmm, right, so the thinking on the decay here is that we really don't want to ever care about/use the decay here at all, but instead should rely on new data to decay out previous data. Of course if we haven't seen anything in six months we want to forget everything, but if it's not been too long, we should rely on new data to slowly decay the old data, not the timer.

Just as a note: What would probably work better than the current model of decaying all at the same rate would be to account for the velocity of behavior changes in the past

Hmm, clever. I think we can do that pretty well with this bucket tracker, actually, though of course it seems like something for a followup. We can determine how much the channel has been changing by calculating what % of our (min, max) bucket pairs are "impossible", i.e. how many of our datapoints are in bucket pairs where the min capacity is larger than the max capacity.

// there are none, treat it as if we had no data.
let mut is_fully_decayed = true;
for (min_idx, min_bucket) in self.min_liquidity_offset_history.buckets.iter().enumerate() {
if min_bucket.checked_shr(decays).unwrap_or(0) > 0 { is_fully_decayed = false; }
Contributor

Even though it's basically just this checked_shr().unwrap_or(0) call, could we DRY the decay logic everywhere into a dedicated function, just to make clear what is happening?

Collaborator Author

Into what? It's more code to write a function that does .checked_shr().unwrap_or()? I'm not quite sure I'm clear what the ask is here.

Contributor

Yes, it's more code, but then every instance of checked_shr().unwrap_or(0) could be replaced by something much more readable. It could even be worth making a newtype providing decay() and is_decayed() methods to increase readability?

Collaborator Author

I don't think moving code away from where it's being used makes anything more readable. Quite the opposite: obscuring what's going on by moving one line behind a newtype barrier makes things substantially more confusing IMO, since now you have to jump to an unrelated part of the file to figure out what's being done when trying to read a single line. I did move the check into a local lambda, though, since it's in the same place.
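
For reference, the time-based decay being discussed is a right shift by the number of elapsed half-lives; a local closure along these lines (a hypothetical shape mirroring the quoted `checked_shr(...).unwrap_or(0)` snippet, not the exact PR code) captures both the decay and the "fully decayed" check:

```rust
fn main() {
    // e.g. elapsed_time / historical_no_updates_half_life
    let required_decays: u32 = 3;

    // checked_shr returns None when the shift is >= the bit width (16 for
    // u16), which is exactly the "fully decayed" case, hence unwrap_or(0).
    let decayed = |bucket: u16| -> u16 {
        bucket.checked_shr(required_decays).unwrap_or(0)
    };

    assert_eq!(decayed(32), 4); // 32 >> 3
    assert_eq!(decayed(7), 0);  // small values shift away entirely

    // After 16 or more half-lives, every u16 bucket is guaranteed to be
    // zero, matching the doc comment quoted earlier in the conversation.
    assert_eq!(u16::MAX.checked_shr(16).unwrap_or(0), 0);
}
```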


let payment_amt_64th_bucket = amount_msat * 64 / self.capacity_msat;
debug_assert!(payment_amt_64th_bucket <= 64);
if payment_amt_64th_bucket > 64 { return res; }
Contributor

Why the combination of assert <= and return if >?

Collaborator Author

The first is a debug assertion, the second is what happens in prod.

tnull (Contributor) commented Sep 1, 2022

Yes, I just wasn't clear on what we can expect the value to be? Will it always be <= 64, or won't it? What would be the condition that would make us return here in the prod case?

Sorry if this is obvious, but I currently still have a hard time wrapping my head around this 64th_bucket concept. Is it just a translation of the individual buckets to a kind of continuous u64 space in order to conduct calculations without loss of precision?

Collaborator Author

Yea, it should always be <= 64, I just didn't want to assert that/rely on it in production, because I'm not 100% confident we have no cases where a channel's value didn't change out from under us and we end up trying to forward an amount greater than the capacity. I'm probably wrong but didn't want to assert it anyway.

As for the 64th_bucket thing, yea, it's definitely confusing, I don't disagree. It's less about loss of precision and more that it allows us to move where the bucket "is": we have these buckets we store points in, but a bucket spans 1/8th of the total continuous space. As the top comment describes, we really want to compare the amount to different points within each bucket depending on which bucket we're in, so we map buckets into 1/64th of the continuous space and do that comparison against a point within the 64th space. We could instead compare strictly with the edges of the buckets (i.e. treat "min" as the bottom edge of the bucket and "max" as the top edge), but it's not clear to me that that's any better, and it probably overestimates probability at least somewhat within each bucket.
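
A toy version of that mapping (illustrative only; the per-bucket comparison point below is an assumed midpoint, not LDK's actual formula, and `amt_to_64ths` is a hypothetical helper):

```rust
/// Map a payment amount into 64ths of the channel capacity (0..=64 when
/// the amount does not exceed the capacity).
pub fn amt_to_64ths(amount_msat: u64, capacity_msat: u64) -> u64 {
    amount_msat * 64 / capacity_msat
}

fn main() {
    let capacity_msat = 1_000_000;
    let amt_64ths = amt_to_64ths(200_000, capacity_msat);
    assert!(amt_64ths <= 64);
    assert_eq!(amt_64ths, 12);

    // Each 1/8th liquidity bucket `idx` spans 64ths [idx * 8, (idx + 1) * 8).
    // Mapping into 64th-space lets us compare the amount against a point
    // *inside* each bucket rather than only against its edges.
    for bucket_idx in 0..8u64 {
        let comparison_point = bucket_idx * 8 + 4; // assumed midpoint
        let _amount_below_point = amt_64ths < comparison_point;
    }
    println!("amount falls at {}/64ths of capacity", amt_64ths);
}
```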

Contributor

Ah, thank you for the clarification. The added comment regarding each bucket moving the comparison point forward finally made me understand the 64th bucket approach. I understand the issue regarding the interval borders, but wow, this is kinda unintuitive.

Collaborator Author

Yea, it was unintuitive to me until I wrote tests that weren't working right 😂

tnull (Contributor) left a comment

Some more minor comments. I think this would be ready for a round with a second reviewer (@jkczyz).


@TheBlueMatt TheBlueMatt force-pushed the 2022-07-history-tracking branch 2 times, most recently from 400faf3 to 0c00089 Compare September 15, 2022 14:47
TheBlueMatt (Collaborator Author)

Squashed the old fixups because there were way too many, and added one new one to rename some vars.

TheBlueMatt (Collaborator Author)

Addressed all feedback and (finally) wrote a full commit message for the new penalty commit. I think this is ready to land, pending squash.

valentinewallace (Contributor) left a comment

I think I almost get the penalty calculation lol

@@ -750,6 +810,21 @@ impl<L: DerefMut<Target = u64>, T: Time, U: DerefMut<Target = T>> DirectedChanne
self.set_max_liquidity_msat(max_liquidity_msat);
}

fn update_history_buckets(&mut self) {
debug_assert!(*self.min_liquidity_offset_msat <= self.capacity_msat);
self.min_liquidity_offset_history.track_datapoint(
Contributor

It would seem cleaner to me to pass in the min_liquidity_offset raw and do the bucket calculation inside track_datapoint (similarly for max_liquidity).

Collaborator Author

The bucket tracker doesn't know what the capacity is, either, though. I guess we could pass both the offset and the capacity in, but that, too, feels like a layer violation?

Contributor

That doesn't feel like as much of a layer violation to me (since the name of the var is min_liquidity_offset_history) but don't feel strongly

// If we used the middle of each bucket we'd never assign any penalty at all when
// sending less than 1/16th of a channel's capacity, or 1/8th if we used the top of the
// bucket.
let mut total_valid_points_tracked = 0;
Contributor

Could invalid points be tracked? Trying to understand the use of "valid" here; it looks like we sum all points without verifying them.

Collaborator Author

There can be points in, for example, both the max-0 and min-1 buckets, but obviously we shouldn't consider those sets of points together, as it implies a max below the min, which is nonsense. It's mentioned at the top of the comment; let me know how it could be made more clear.

let mut total_valid_points_tracked = 0;
for (min_idx, min_bucket) in self.min_liquidity_offset_history.buckets.iter().enumerate() {
for max_bucket in self.max_liquidity_offset_history.buckets.iter().take(8 - min_idx) {
total_valid_points_tracked += (*min_bucket as u64) * (*max_bucket as u64);
Contributor

Why are we multiplying instead of summing? I thought each bucket was a total amount of points (mod decay)

Collaborator Author

We have to look at (min, max) bucket pairs, not individual buckets. For each pair, we calculate the probability of being in that state by multiplying the buckets together, then look at the buckets and the amount to determine the success probability if we're in that bucket pair.
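
The pair-weighting can be sketched as follows, simplified from the quoted loop (the example bucket values are made up; `total_valid_points` is an illustrative function name, not the PR's):

```rust
/// Sum the weight of all valid (min, max) bucket pairs, mirroring the quoted
/// loop: a pair's weight is the product of its two bucket values, and
/// `take(8 - min_idx)` skips pairs that would imply a max below the min.
pub fn total_valid_points(min_buckets: &[u16; 8], max_buckets: &[u16; 8]) -> u64 {
    let mut total = 0u64;
    for (min_idx, min_bucket) in min_buckets.iter().enumerate() {
        for max_bucket in max_buckets.iter().take(8 - min_idx) {
            total += (*min_bucket as u64) * (*max_bucket as u64);
        }
    }
    total
}

fn main() {
    // Made-up history: min offset mostly in the lowest 1/8th, max offset
    // spread over its two lowest buckets.
    let min_buckets = [40, 8, 0, 0, 0, 0, 0, 0];
    let max_buckets = [32, 16, 0, 0, 0, 0, 0, 0];
    // 40 * (32 + 16) + 8 * (32 + 16) = 1920 + 384 = 2304
    assert_eq!(total_valid_points(&min_buckets, &max_buckets), 2304);
    println!("{}", total_valid_points(&min_buckets, &max_buckets));
}
```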

Comment on lines +832 to +886
let required_decays = self.now.duration_since(*self.last_updated).as_secs()
.checked_div(params.historical_no_updates_half_life.as_secs())
Contributor

I'm not sure how no-std works here, will it never decay?

Collaborator Author

Correct. We'll probably change that in #1752

valentinewallace (Contributor) left a comment

LGTM, feel free to squash.

This introduces two new fields to the `ChannelLiquidity` struct -
`min_liquidity_offset_history` and `max_liquidity_offset_history`,
both an array of 8 `u16`s. Each entry represents the proportion of
time that we spent with the min or max liquidity offset in the
given 1/8th of the channel's liquidity range. I.e., the first bucket
in `min_liquidity_offset_history` represents the proportion of time
we've thought the channel's minimum liquidity is lower than 1/8th of
the channel's capacity.

Each bucket is stored, effectively, as a fixed-point number with
5 bits for the fractional part, which is incremented by one (i.e.,
by 32 in the raw representation) each time we update our liquidity
estimates and decide our estimates are in that bucket. We then
decay each bucket by 2047/2048.

Thus, memory of a payment sticks around for more than 8,000
data points, though the majority of that memory decays after 1,387
data points.

Our current `ProbabilisticScorer` attempts to build a model of the
current liquidity across the payment channel network. This works
fine to ignore channels we *just* tried to pay through, but it
fails to remember patterns over longer time horizons.

Specifically, there are *many* channels within the network that are
very often either fully saturated in one direction, or are
regularly rebalanced and rarely saturated. While our model may
discover that, when it decays its offsets or if there is a
temporary change in liquidity, it effectively forgets the "normal"
state of the channel.

This causes substantially suboptimal routing in practice, and
avoiding discarding older knowledge when new datapoints come in is
a potential solution to this.

Here, we implement one such design, using the decaying buckets
added in the previous commit to calculate a probability of payment
success based on a weighted average of recent liquidity estimates
for a channel.

For each min/max liquidity bucket pair (where the min liquidity is
less than the max liquidity), we can calculate the probability that
a payment succeeds using our traditional `amount / capacity`
formula. From there, we weigh the probability by the number of
points in each bucket pair, calculating a total probability for the
payment, and assigning a penalty using the same log-probability
calculation used for the non-historical penalties.

This simplifies adding additional half lives in
DirectedChannelLiquidity by simply storing a reference to the full
parameter set rather than only the single current half-life.

To avoid scoring based on incredibly old historical liquidity data,
we add a new half-life here which is used to (very slowly) decay
historical liquidity tracking buckets.
TheBlueMatt (Collaborator Author)

Squashed without further changes.

// Ensure the bucket index we pass is in the range [0, 7], even if the liquidity offset
// is zero or the channel's capacity, though the second should generally never happen.
(self.min_liquidity_offset_msat.saturating_sub(1) * 8 / self.capacity_msat)
.try_into().unwrap_or(32)); // 32 is bogus for 8 buckets, and will be ignored
Contributor

// Ensure the bucket index we pass is in the range [0, 7], even if the liquidity offset
// is zero or the channel's capacity, though the second should generally never happen.
(self.max_liquidity_offset_msat.saturating_sub(1) * 8 / self.capacity_msat)
.try_into().unwrap_or(32)); // 32 is bogus for 8 buckets, and will be ignored
Contributor
