
Make htlc_maximum_msat a required field. #1519

Merged
3 commits merged into lightningdevkit:main from 2022-06-require-htlc-max on Jul 25, 2022

Conversation

@tnull (Contributor) commented Jun 6, 2022

As of lightning/bolts#996, htlc_maximum_msat will soon be a required field.

This PR implements this change, i.e., it removes the special de-/serialisation logic needed before.
Not sure if we want to do this right away though, since it breaks serialisation backwards compat. 🤷‍♂️

@TheBlueMatt (Collaborator)

Given we can use this to clean up the scoring as well (we'd no longer need a "default" channel capacity), I think we should do this sooner rather than later. Will do a real review this week.

@tnull force-pushed the 2022-06-require-htlc-max branch from 3fd4fdc to b24f081 on June 9, 2022 at 14:53
@tnull (Contributor, Author) commented Jun 9, 2022

AFAICT, the benchmark CI check is failing due to said compat breakage.

@TheBlueMatt (Collaborator)

Right, so I think compat is actually fine here; the issue is that we have some channels in our kinda-checked-in gossip store that are missing the field. So we should update the NetworkGraph decoder to do something similar to what we do for P2P, where we just silently ignore channels with missing htlc-max values and drop them from the graph. We'll re-add them if our peer has an updated copy with an htlc-max.

@TheBlueMatt (Collaborator)

I think, in this PR or a follow-up, we can drop EffectiveCapacity::Unknown, which will be nice.

@tnull (Contributor, Author) commented Jun 13, 2022

Right, so I think compat is actually fine here; the issue is that we have some channels in our kinda-checked-in gossip store that are missing the field. So we should update the NetworkGraph decoder to do something similar to what we do for P2P, where we just silently ignore channels with missing htlc-max values and drop them from the graph. We'll re-add them if our peer has an updated copy with an htlc-max.

Grr, after playing around with it a bit more, it seems just skipping on decoding errors isn't a very good option here, since this messes with the alignment in the reader. Also, I think that we would discard a lot of updates received via gossip and/or during deserialization when we just switch to the new layout.

Maybe the best way forward would be to introduce a LegacyUnsignedChannelUpdate (somewhat in analogy to #1529) that uses the old decoding and can be converted to UnsignedChannelUpdate by setting htlc_maximum_msat to the channel capacity when it was Absent. This way, in case of a failure, we could retry decoding and converting from the legacy format, instead of just skipping the update?
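
For illustration, a minimal sketch of that fallback idea, with the caveat that the struct and field names below are simplified placeholders rather than LDK's actual types, and that this is not the approach the PR ultimately took: decode with the legacy optional field, then fill in the channel capacity wherever it was absent.

```rust
// Hypothetical sketch only: the `*Sketch` types are illustrative placeholders,
// not LDK's actual message structs.
struct LegacyUnsignedChannelUpdateSketch {
    short_channel_id: u64,
    htlc_maximum_msat: Option<u64>, // optional under the legacy encoding
    // ...remaining update fields elided
}

struct UnsignedChannelUpdateSketch {
    short_channel_id: u64,
    htlc_maximum_msat: u64, // required under the new encoding
    // ...remaining update fields elided
}

impl LegacyUnsignedChannelUpdateSketch {
    // `capacity_msat` would be the channel capacity known from the funding output.
    fn into_required(self, capacity_msat: u64) -> UnsignedChannelUpdateSketch {
        UnsignedChannelUpdateSketch {
            short_channel_id: self.short_channel_id,
            htlc_maximum_msat: self.htlc_maximum_msat.unwrap_or(capacity_msat),
        }
    }
}
```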

@TheBlueMatt (Collaborator)

Grr, after playing around with it a bit more, it seems just skipping on decoding errors isn't a very good option here, since this messes with the alignment in the reader.

Oh? I'm not sure why this is an issue? The last_update_message field we should be fine with, it's already length-prefixed. The htlc_maximum_msat field directly in ChannelUpdateInfo implies we'll need to break out of the impl_writeable_tlv_based macro and write it by hand, making it MaybeReadable instead of Readable, but I don't see why that can't work.

@tnull (Contributor, Author) commented Jun 13, 2022

Oh? I'm not sure why this is an issue? The last_update_message field we should be fine with, it's already length-prefixed. The htlc_maximum_msat field directly in ChannelUpdateInfo implies we'll need to break out of the impl_writeable_tlv_based macro and write it by hand, making it MaybeReadable instead of Readable, but I don't see why that can't work.

Ah, I believe this could work to solve the alignment issue: we could ensure that we don't error out mid-decoding, i.e., the required number of bytes is always read, even if the ChannelUpdateInfo has no htlc_maximum_msat present and hence would result in a None.
However, I'm still not fully sure I can follow: wouldn't making ChannelUpdateInfo a MaybeReadable kind of chain up to ChannelInfo and so on? A slew of consequences seems to come with that; I'm not sure we want those?

Also, are we positive we simply want to drop legacy channel updates when there is no htlc_maximum_msat present?

@TheBlueMatt (Collaborator)

Ah, I believe this could work to solve the alignment issue: we could ensure that we don't error out mid-decoding, i.e., the required number of bytes is always read, even if the ChannelUpdateInfo has no htlc_maximum_msat present and hence would result in a None.

Yep, exactly, read the whole thing either way, but set ChannelInfo::one_to_two/ChannelInfo::two_to_one to None instead of the inner field being None.

However, I'm still not fully sure I can follow: wouldn't making ChannelUpdateInfo a MaybeReadable kind of chain up to ChannelInfo and so on? A slew of consequences seems to come with that; I'm not sure we want those?

Yes, but one_to_two and two_to_one are already Optional, so we can just bubble up the None to there and leave it.

Also, are we positive we simply want to drop legacy channel updates when there is no htlc_maximum_msat present?

I kinda assume so? Like, the theory of making it required is that it's ~fully deployed everywhere and that we don't need to accept gossip where it's not set. If that's true, we should be happy to not use gossip when it's not set :)
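
For context, the MaybeReadable pattern discussed here looks roughly like the sketch below: the read consumes the record's full encoding either way, so the outer reader stays aligned, and it returns Ok(None) rather than an error when a legacy value should simply be dropped. The trait signature mirrors LDK's, but the wrapper type, the simplified DecodeError, and the length-prefixed encoding are illustrative assumptions, not the actual gossip.rs implementation.

```rust
use std::convert::TryInto;
use std::io::{self, Read};

// Simplified stand-in for LDK's DecodeError.
#[derive(Debug)]
enum DecodeError { Io(io::ErrorKind) }

// Mirrors the shape of LDK's MaybeReadable: Ok(None) means "the stream is
// fine, but drop this value", as opposed to a hard decode error.
trait MaybeReadable: Sized {
    fn read<R: Read>(reader: &mut R) -> Result<Option<Self>, DecodeError>;
}

struct ChannelUpdateInfoSketch {
    htlc_maximum_msat: u64,
    // ...other per-direction fields elided
}

impl MaybeReadable for ChannelUpdateInfoSketch {
    fn read<R: Read>(reader: &mut R) -> Result<Option<Self>, DecodeError> {
        // Assume a length-prefixed record: read the full payload up front so a
        // missing field never leaves the outer reader misaligned.
        let mut len_bytes = [0u8; 2];
        reader.read_exact(&mut len_bytes).map_err(|e| DecodeError::Io(e.kind()))?;
        let len = u16::from_be_bytes(len_bytes) as usize;
        let mut payload = vec![0u8; len];
        reader.read_exact(&mut payload).map_err(|e| DecodeError::Io(e.kind()))?;

        if payload.len() < 8 {
            // Legacy record without htlc_maximum_msat: signal "drop it" so the
            // caller can set one_to_two/two_to_one to None and keep reading.
            return Ok(None);
        }
        let htlc_maximum_msat = u64::from_be_bytes(payload[0..8].try_into().unwrap());
        Ok(Some(ChannelUpdateInfoSketch { htlc_maximum_msat }))
    }
}
```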

@tnull (Contributor, Author) commented Jun 15, 2022

Rebased and just pushed an intermediary step towards implementing MaybeReadable for ChannelUpdateInfo. Currently, a lot of tests fail.

I played around with utilizing pre-existing macros, but most of them are full of ?s, which would end decoding and therefore lead to the alignment issue. However, just writing everything from scratch seems like a lot of redundant code. I'm currently kind of stuck and have to think about this some more. I would also appreciate input on which direction to take this.

FWIW, I'm considering implementing _noerror variants of read_tlv_fields! and everything below to make this a bit easier. But this also may be a bit overblown...

@TheBlueMatt (Collaborator) left a comment

Yea, it does become a good bit more verbose...not a lot we can do about it, at least not without adding more support to the macros.

Review threads on lightning/src/routing/gossip.rs (resolved)
@tnull (Contributor, Author) commented Jun 15, 2022

Yea, it does become a good bit more verbose...not a lot we can do about it, at least not without adding more support to the macros.

Alright, thanks for the feedback, will proceed.

Another conundrum to be solved is that even when we make sure we do not stop decoding, we will do so according to the new layout, i.e., will probably still hit an alignment issue. Will try to read and discard the appropriate number of bytes after we fail decoding the htlc_maximum_msat.

Btw, wouldn't such a change of the serialization format warrant increasing SERIALIZATION_VERSION? Or, if not, what would be a case in which we want to do that?

@TheBlueMatt (Collaborator)

Another conundrum to be solved is that even when we make sure we do not stop decoding, we will do so according to the new layout, i.e., will probably still hit an alignment issue. Will try to read and discard the appropriate number of bytes after we fail decoding the htlc_maximum_msat.

It shouldn't be an issue as long as you call through the tlv stream read macro. It should read all available TLVs.

Btw, wouldn't such a change of the serialization format warrant increasing SERIALIZATION_VERSION? Or, if not, what would be a case in which we want to do that?

No, I don't think so, I think that's basically just when we want to break TLV compat, but this should still be fully backwards and forwards compatible.

@tnull (Contributor, Author) commented Jun 15, 2022

It shouldn't be an issue as long as you call through the tlv stream read macro. It should read all available TLVs.

Ah, right, one more reason why it could pay off to go the macro route. 👍

No, I don't think so, I think that's basically just when we want to break TLV compat, but this should still be fully backwards and forwards compatible.

Thanks, makes sense.

@TheBlueMatt (Collaborator)

Ah, right, one more reason why it could pay off to go the macro route. 👍

I believe as written currently it works just fine and reads all available bytes.

@TheBlueMatt (Collaborator)

Looks like this needs rebase as well now. Feel free to squash fixups when you do so.

@TheBlueMatt added this to the 0.0.110 milestone on Jul 5, 2022
@TheBlueMatt (Collaborator)

As of #1553 we now must ensure we ignore announcements that fail to deserialize, as we've "soft-forked" serialization: if a node has an existing, but invalid, hostname field, we'll refuse to read our entire network graph. Thus, we'll want to do that here and will need to land this in the next release.

@tnull (Contributor, Author) commented Jul 5, 2022

As of #1553 we now must ensure we ignore announcements that fail to deserialize, as we've "soft-forked" serialization: if a node has an existing, but invalid, hostname field, we'll refuse to read our entire network graph. Thus, we'll want to do that here and will need to land this in the next release.

Yes, sorry for the delay here. I made some progress and hope to push some updates by the end of this week, early next week latest.

@codecov-commenter commented Jul 12, 2022

Codecov Report

Merging #1519 (8b86ed7) into main (5023ff0) will increase coverage by 0.30%.
The diff coverage is 96.29%.

@@            Coverage Diff             @@
##             main    #1519      +/-   ##
==========================================
+ Coverage   90.82%   91.13%   +0.30%     
==========================================
  Files          80       80              
  Lines       44643    46470    +1827     
  Branches    44643    46470    +1827     
==========================================
+ Hits        40547    42350    +1803     
- Misses       4096     4120      +24     
Impacted Files Coverage Δ
lightning/src/ln/channel.rs 88.78% <ø> (-0.01%) ⬇️
lightning/src/ln/onion_route_tests.rs 97.68% <ø> (-0.01%) ⬇️
lightning/src/ln/priv_short_conf_tests.rs 96.59% <ø> (-0.01%) ⬇️
lightning/src/routing/router.rs 92.06% <ø> (-0.39%) ⬇️
lightning/src/routing/scoring.rs 96.09% <ø> (-0.01%) ⬇️
lightning/src/util/test_utils.rs 78.11% <ø> (-0.05%) ⬇️
lightning/src/routing/gossip.rs 91.80% <95.83%> (+0.47%) ⬆️
lightning-rapid-gossip-sync/src/lib.rs 90.90% <100.00%> (ø)
lightning-rapid-gossip-sync/src/processing.rs 91.40% <100.00%> (+0.96%) ⬆️
lightning/src/ln/channelmanager.rs 85.13% <100.00%> (+0.02%) ⬆️
... and 15 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@tnull (Contributor, Author) commented Jul 12, 2022

Rebased and just pushed some progress. ChannelInfo and ChannelUpdateInfo are decoded and encoded just fine independently, but the MaybeReadable Option is still not behaving as it should when the two are combined. Currently figuring out a workaround, feels like I'm getting close(r) though.

@TheBlueMatt (Collaborator) left a comment

Not your bug, but due to #1553, whatever we do for ChannelUpdate we have to also do for NodeAnnouncement.

Review threads on lightning/src/ln/msgs.rs and lightning/src/routing/gossip.rs (resolved)
@@ -715,16 +797,55 @@ impl fmt::Display for ChannelInfo {
}
}

impl_writeable_tlv_based!(ChannelInfo, {
@TheBlueMatt (Collaborator)

Can we not just make ignorable work for impl_writeable_tlv_based?

@tnull (Contributor, Author)

Hum, it may have been possible before, but I think now that we handle reading ChannelUpdateInfos via ChannelUpdateInfoDeserWrap, we need a custom implementation of Readable for ChannelInfo anyway?

@tnull (Contributor, Author) commented Jul 14, 2022

I think, in this PR or a follow-up, we can drop EffectiveCapacity::Unknown, which will be nice.

@TheBlueMatt If you don't mind, I'll do this in a follow-up PR.

@TheBlueMatt (Collaborator) left a comment

This basically LGTM, can you clean up the git history and I'll give it a proper review?

Review thread on lightning/src/routing/gossip.rs (resolved)
@dunxen previously approved these changes on Jul 19, 2022

@dunxen (Contributor) left a comment

ACK 7b637a3

This LGTM. Only did a light review of gossip.rs though.

@TheBlueMatt (Collaborator)

Oops needs rebase now as well.

@tnull (Contributor, Author) commented Jul 20, 2022

Oops needs rebase now as well.

Rebased on main.

@tnull (Contributor, Author) commented Jul 20, 2022

Can you look into fixing the benchmark CI failure?

Yes, I have a fix ready that overrides with the default if htlc_maximum_msat == u64::max_value(). However, I took away from our discussion above that we do not want this and rather want to use a fresh snapshot where the few remaining updates not setting the field are already left out on generation? I'll check in again with @arik-so on this.
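
A rough sketch of that sentinel-based workaround, which is not the approach taken in the end (the default constant below is a made-up placeholder, not a value from the codebase): treat u64::max_value() as "no stated maximum" and substitute an assumed capacity.

```rust
// Hypothetical placeholder default, for illustration only: 1 BTC in msat.
const ASSUMED_CAPACITY_MSAT: u64 = 100_000_000_000;

// If a snapshot encodes "no htlc_maximum_msat" as u64::max_value(), fall back
// to the assumed capacity instead of treating the sentinel as a real limit.
fn effective_htlc_maximum_msat(encoded: u64) -> u64 {
    if encoded == u64::max_value() { ASSUMED_CAPACITY_MSAT } else { encoded }
}
```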

@TheBlueMatt (Collaborator)

Ah, right. Yea, let's just move forward with this and merge it with the benchmark CI failing, IMO. We can fix it pre-release (or not, even, really).

@TheBlueMatt (Collaborator)

In any case, assigning @arik-so as a "please generate a new snapshot"

@arik-so (Contributor) commented Jul 20, 2022

Did my comment regarding the discussion with Rusty not send? I'm not seeing it in this thread, when I could have sworn I sent it. Either way, I sent @tnull a new snapshot for testing this morning.

Review threads on lightning/src/ln/msgs.rs and lightning/src/routing/gossip.rs (resolved)
Comment on lines 3025 to 3026
// Check we can decode legacy ChannelInfo, even if the `two_to_one`/`one_to_two` fields
// fail to decode.
A reviewer (Contributor) commented

Let's add a similar test for NodeInfo missing announcement_info?

@tnull (Contributor, Author)

Ah, yes, good catch, was coming back to that. I now added the test in 04b677c, but it's indeed currently still failing, as not all bytes of the invalid NodeAnnouncementInfo are directly read when it fails; they are only cleaned up by s.eat_remaining(), which however still leads to the read_tlv_fields! macro returning an InvalidValue I can't catch (see here).
Currently not entirely sure how I'll go about it. As it's in fact the NetAddress that is failing to decode, I probably could implement MaybeReadable for Vec<NetAddress>, which only adds the successful decodes to the resulting vector.

@tnull (Contributor, Author)

I now worked around this by introducing another NetAddressVecDeserWrapper in 285cb27, but this is really not pretty. It might be easier to switch NetAddress directly to MaybeReadable, since it is the part that may fail to decode. However, I think this would mess with serialization in other parts of the code, e.g., the Init message...

@tnull (Contributor, Author) commented Jul 22, 2022

After some offline discussion with @TheBlueMatt I now went to ignore all errors arising from trying to decode NodeAnnouncementInfo in c1bfa3f, which eliminates the need for an additional NetAddressVecDeserWrapper.

@tnull (Contributor, Author) commented Jul 21, 2022

Did my comment regarding the discussion with Rusty not send? I'm not seeing it in this thread, when I could have sworn I sent it. Either way, I sent @tnull a new snapshot for testing this morning.

Can confirm that the benchmark passes with the new snapshot, at least locally. Can we upload it? I'd then update the URL in benchmark CI and here before we merge this.

@TheBlueMatt (Collaborator)

Oh, oops, right, can you also add a test (and fix, I think it's broken) reading a NodeAnnouncementInfo that contains Some() NodeAnnouncement that itself contains an invalid Hostname NetAddress (i.e., too-short encoding and also an invalid string)? We have to handle such graphs as they can be saved today even if we won't write them in 0.0.110.

@tnull (Contributor, Author) commented Jul 22, 2022

Can confirm that the benchmark passes with the new snapshot, at least locally. Can we upload it? I'd then update the URL in benchmark CI and here before we merge this.

Updated snapshot URL with c6ab815, benchmark now passes.

@tnull (Contributor, Author) commented Jul 22, 2022

Oh, oops, right, can you also add a test (and fix, I think it's broken) reading a NodeAnnouncementInfo that contains Some() NodeAnnouncement that itself contains an invalid Hostname NetAddress (i.e., too-short encoding and also an invalid string)? We have to handle such graphs as they can be saved today even if we won't write them in 0.0.110.

As discussed offline, we now eat all errors arising from trying to read NodeAnnouncementInfo with c1bfa3f.


impl MaybeReadable for NodeAnnouncementInfoDeserWrapper {
	fn read<R: io::Read>(reader: &mut R) -> Result<Option<Self>, DecodeError> {
		loop {
@TheBlueMatt (Collaborator)

Oh lol, uhhh, hmm, I don't think this quite works, but it could. Calling read on a "normal" object may try to read several bytes at a time, which technically the Read implementation may refuse to do if it doesn't have enough bytes. Instead, if we fail to read once, we should fall back to just calling reader.read() on an, e.g., 4KB buffer repeatedly until it returns zero.

@tnull (Contributor, Author) commented Jul 25, 2022

Ah, I had thought the loop would always make at least a u8's worth of progress, since it would try to read the tlv_len: BigSize fields on every iteration. That said, I now adopted the copy approach from eat_remaining, which seems like the cleaner way to do it anyway.
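
A minimal sketch of that copy approach using only std's io::Read (the helper name is illustrative, not the actual eat_remaining code): drain whatever bytes remain through a fixed-size buffer until read reports zero, rather than relying on a typed read that might refuse a partial record.

```rust
use std::io::{self, Read};

// Drain the rest of a reader through a fixed-size buffer; `read` returning
// zero means no more bytes are available.
fn drain_remaining<R: Read>(reader: &mut R) -> io::Result<()> {
    let mut buf = [0u8; 4096];
    while reader.read(&mut buf)? != 0 {}
    Ok(())
}
```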

@TheBlueMatt (Collaborator) left a comment

Patch otherwise LGTM.

@wpaulino (Contributor) left a comment

Looks good, feel free to squash.

@tnull (Contributor, Author) commented Jul 25, 2022

Looks good, feel free to squash.

Squashed!

@TheBlueMatt merged commit 1988cb2 into lightningdevkit:main on Jul 25, 2022
6 participants