Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid panicking on wallclock time going backwards across restart #1603

Merged

Conversation

TheBlueMatt
Copy link
Collaborator

Because we serialize Instants using wallclock time in
ProbabilisticScorer, if time goes backwards across restarts we
may end up with Instants in the future, which causes rustc prior
to 1.60 to panic when calculating durations. Here we simply avoid
this by setting the time to now if we get a time in the future.

Comment on lines 1180 to 1184
// On rust prior to 1.60 `Instant::duration_since` will panic if time goes backwards.
// Because we calculate "now" against wallclock time when we were written are reload it
// here, we may cause time to go backwards if wallclock time is before when we were
// written. Thus, we check if we'd end up with a time in the future and juse use `now`
// instead to avoid panicing later.
Copy link
Contributor

@G8XSU G8XSU Jul 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if there's some actual bug and we end up hiding it because of this? (if diff is alot and weird)
shouldn't the solution be to use saturating_duration_since for now ?
(that will also mean that we will clean it up when duration_since is not panicking)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we could? That would hide system time going backwards at runtime, too, though. I don't really feel strongly either way, I'm happy to do either, both would hide the same error, though, I believe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

main intention was for cleanup, but i guess it could be safer to do it at one place at read time.
first question was about if we should do a tolerance check on difference.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right, okay, yea, I didn't this its critical enough to overthink it and add tolerance checks or anything like that - worst case we don't decay the scorer data quick enough, as long as time is moving forward we'll still decay it, though.

@codecov-commenter
Copy link

codecov-commenter commented Jul 8, 2022

Codecov Report

Merging #1603 (55831d0) into main (f3d5b94) will increase coverage by 0.21%.
The diff coverage is 100.00%.

❗ Current head 55831d0 differs from pull request most recent head 497fd65. Consider uploading reports for the commit 497fd65 to get more accurate results

@@            Coverage Diff             @@
##             main    #1603      +/-   ##
==========================================
+ Coverage   91.07%   91.29%   +0.21%     
==========================================
  Files          80       80              
  Lines       44128    48731    +4603     
  Branches    44128    48731    +4603     
==========================================
+ Hits        40190    44488    +4298     
- Misses       3938     4243     +305     
Impacted Files Coverage Δ
lightning/src/routing/scoring.rs 96.98% <100.00%> (+0.17%) ⬆️
lightning/src/chain/onchaintx.rs 93.98% <0.00%> (-0.93%) ⬇️
lightning/src/ln/payment_tests.rs 98.88% <0.00%> (-0.37%) ⬇️
lightning/src/ln/functional_tests.rs 96.95% <0.00%> (-0.17%) ⬇️
lightning/src/ln/channel.rs 88.75% <0.00%> (+0.02%) ⬆️
lightning-net-tokio/src/lib.rs 77.77% <0.00%> (+0.61%) ⬆️
lightning-invoice/src/payment.rs 93.81% <0.00%> (+0.90%) ⬆️
lightning/src/routing/router.rs 93.59% <0.00%> (+0.97%) ⬆️
lightning/src/util/events.rs 43.03% <0.00%> (+1.04%) ⬆️
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f3d5b94...497fd65. Read the comment docs.

wpaulino
wpaulino previously approved these changes Jul 8, 2022
jkczyz
jkczyz previously approved these changes Jul 8, 2022
lightning/src/routing/scoring.rs Outdated Show resolved Hide resolved
lightning/src/routing/scoring.rs Outdated Show resolved Hide resolved
@TheBlueMatt TheBlueMatt dismissed stale reviews from jkczyz and wpaulino via fc5f68f July 8, 2022 19:39
@TheBlueMatt
Copy link
Collaborator Author

Pushed a squshed fixup to address spelling:

$ git diff-tree -U2 70bc165c 5ceeb2788
diff --git a/lightning/src/routing/scoring.rs b/lightning/src/routing/scoring.rs
index cd6eb62ce..b4dba7d0f 100644
--- a/lightning/src/routing/scoring.rs
+++ b/lightning/src/routing/scoring.rs
@@ -1179,7 +1179,7 @@ impl<T: Time> Readable for ChannelLiquidity<T> {
 		});
 		// On rust prior to 1.60 `Instant::duration_since` will panic if time goes backwards.
-		// Because we calculate "now" against wallclock time when we were written are reload it
+		// Because we calculate "now" against wallclock time when we were written are reloading
 		// here, we may cause time to go backwards if wallclock time is before when we were
-		// written. Thus, we check if we'd end up with a time in the future and juse use `now`
+		// written. Thus, we check if we'd end up with a time in the future and just use `now`
 		// instead to avoid panicing later.
 		let wall_clock_now = T::duration_since_epoch();

G8XSU
G8XSU previously approved these changes Jul 8, 2022
wpaulino
wpaulino previously approved these changes Jul 8, 2022
Comment on lines 1181 to 1182
// Because we calculate "now" against wallclock time when we were written are reloading
// here, we may cause time to go backwards if wallclock time is before when we were
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you meant "when we were written and reload it here"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the "are reloading here" part doesn't make sense grammatically to me. Also, the second "we" is referring to something other than the first "we". Maybe s/when we were written/when written?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, sorry, that whole sentence just really made absolutely no sense. I rewrote it.

@TheBlueMatt TheBlueMatt dismissed stale reviews from wpaulino and G8XSU via 33cebe0 July 9, 2022 04:27
dunxen
dunxen previously approved these changes Jul 9, 2022
lightning/src/routing/scoring.rs Outdated Show resolved Hide resolved
jkczyz
jkczyz previously approved these changes Jul 11, 2022
@@ -1177,10 +1177,20 @@ impl<T: Time> Readable for ChannelLiquidity<T> {
(2, max_liquidity_offset_msat, required),
(4, duration_since_epoch, required),
});
// On rust prior to 1.60 `Instant::duration_since` will panic if time goes backwards.
// Because we write `last_updated` as wallclock time when we were written and create an
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend re-writing the second sentence as:

Because last_updated is written as a wallclock time but when read needs to be constructed as an Instant representing that wallclock time, we may hit that panic if the current wallclock time is before the wallclock time when written.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that doesn't tell me clearly what the panic is. I ended up writing a much more verbose version that explains in detail what's going on, let me know which you prefer.

jkczyz
jkczyz previously approved these changes Jul 11, 2022
Because we serialize `Instant`s using wallclock time in
`ProbabilisticScorer`, if time goes backwards across restarts we
may end up with `Instant`s in the future, which causes rustc prior
to 1.60 to panic when calculating durations. Here we simply avoid
this by setting the time to `now` if we get a time in the future.
@TheBlueMatt
Copy link
Collaborator Author

Squashed without further changes, note that all the changes from the first PR version were in the comment anyway.

@TheBlueMatt TheBlueMatt merged commit 5c06d1d into lightningdevkit:main Jul 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants