
Use delayed power table to validate or drop messages from future instances #151

Open
1 of 5 tasks
Tracked by #246 ...
anorth opened this issue Apr 3, 2024 · 7 comments
Assignees
Labels
gossipbft Relates to core GossipPBFT protocol

Comments

@anorth
Member

anorth commented Apr 3, 2024

We have agreed that the voting weights for an instance should come from the power table of an instance some distance further back (10 instances / 5 minutes? perhaps longer), rather than from the immediately previous instance. This allows us to do proper message validation and avoid retransmitting bad messages.

Since the power table is provided by the node, there may be little or no code change required in GPBFT, but we should (1) confirm this, (2) ensure that multi-instance tests exercise this properly, and (3) use #125 to help us gain confidence in the lookback distance.

@anorth anorth added the gossipbft Relates to core GossipPBFT protocol label Apr 3, 2024
@anorth
Member Author

anorth commented Apr 4, 2024

The point of this is to be able to validate messages from near-future instances (from a node's point of view). So there will be work here to implement that validation and a queue of validated messages. This may involve changes to the host API so the participant can reliably keep track of N power tables and have the right one on hand for any message.
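A minimal sketch of that validation-and-queue idea, using hypothetical simplified types (`PowerTable`, `Host`, `futureQueue` and its bounds are all illustrative, not the real go-f3 API): a message for a near-future instance is checked against the power table for that instance and held until the participant catches up.

```go
package main

import "fmt"

// PowerTable maps participant IDs to voting power (hypothetical simplified type).
type PowerTable map[string]int64

// Host is a sketch of the host API extension discussed above: the participant
// asks the host for the power table to use for a given instance.
type Host interface {
	GetPowerTable(instance uint64) (PowerTable, error)
}

// staticHost serves one fixed table for every instance (demo helper).
type staticHost struct{ pt PowerTable }

func (h staticHost) GetPowerTable(uint64) (PowerTable, error) { return h.pt, nil }

// futureQueue validates messages for near-future instances against the
// appropriate power table and queues them until the participant reaches them.
type futureQueue struct {
	host     Host
	current  uint64 // instance the participant is currently running
	maxAhead uint64 // how far into the future messages are accepted
	queued   map[uint64][]string
}

func (q *futureQueue) Receive(instance uint64, sender, msg string) error {
	if instance < q.current || instance > q.current+q.maxAhead {
		return fmt.Errorf("instance %d outside [%d, %d]", instance, q.current, q.current+q.maxAhead)
	}
	pt, err := q.host.GetPowerTable(instance)
	if err != nil {
		return err
	}
	if pt[sender] <= 0 {
		return fmt.Errorf("sender %q has no power in instance %d", sender, instance)
	}
	q.queued[instance] = append(q.queued[instance], msg)
	return nil
}
```

Real validation would also check signatures and message contents; the point here is only that each queued message must be checked against the table for its own instance, not the current one.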

@Kubuxu Kubuxu added this to the F3 Alpha milestone Apr 22, 2024
@anorth anorth self-assigned this May 14, 2024
@anorth anorth changed the title Delayed power table Use delayed power table to validate or drop messages from future instances May 16, 2024
@anorth
Member Author

anorth commented May 16, 2024

I'm closing #130 and expanding the scope here a little with a task list including maintaining the delayed power tables and using them to validate messages.

@anorth
Member Author

anorth commented May 20, 2024

The host (Lotus) must end up with a store of (instance, finalised tipset) records somewhere in order to be able to bootstrap the protocol when a node starts up. F3 couldn't otherwise know what instance it's up to or what power table to use. So, the API will assume that F3 can ask the host for the power table corresponding(*) to an instance. The host can map instance -> Tipset -> Epoch and then go build the power table that F3 needs. F3 can cache the results.
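The mapping described above can be sketched as follows. All names here (`hostStore`, `cachingClient`, the map-backed stores) are hypothetical stand-ins for the real host state: the host resolves instance -> finalised tipset -> epoch -> power table, and F3 caches the answers per instance.

```go
package main

// Tipset and PowerTable are simplified illustrative types.
type Tipset struct{ Epoch int64 }
type PowerTable map[string]int64

// hostStore stands in for the host's (instance, finalised tipset) records
// plus its chain state.
type hostStore struct {
	finalised map[uint64]Tipset    // instance -> finalised tipset
	power     map[int64]PowerTable // epoch -> power table (from chain state)
}

// GetPowerTable resolves instance -> tipset -> epoch -> power table.
func (h *hostStore) GetPowerTable(instance uint64) (PowerTable, bool) {
	ts, ok := h.finalised[instance]
	if !ok {
		return nil, false
	}
	pt, ok := h.power[ts.Epoch]
	return pt, ok
}

// cachingClient shows F3 caching the host's results per instance.
type cachingClient struct {
	host  *hostStore
	cache map[uint64]PowerTable
}

func (c *cachingClient) PowerTable(instance uint64) (PowerTable, bool) {
	if pt, ok := c.cache[instance]; ok {
		return pt, true
	}
	pt, ok := c.host.GetPowerTable(instance)
	if ok {
		c.cache[instance] = pt
	}
	return pt, ok
}
```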

A significant design question is whether the lookback parameter should be internal to F3 or encapsulated by the host. That is, does (*) corresponding mean the power table finalised by an instance, or the power table to be used for an instance? It initially seems natural to make the parameter internal to F3, but a few things push back against that:

  • All the sim testing infrastructure is set up to associate each power table with the instance it is to be used for. That also makes tests much easier to write and independent of the parameter value. The sim would have to compute the reverse offset to feed F3 the power table from an instance (which would initially be a bunch of genesis tips), and the associations in the test code would differ from those in the production code, which would be confusing.
  • If the API means finalised by, then F3 also needs a way to ask for the genesis power table. With instance numbers as uint64 there's a very high risk of underflow when computing current - offset to find the right instance. We could add an explicit API for fetching genesis, but then we'd need underflow checks and branches everywhere that calls these methods. We could alternatively make instance numbers a signed int64 and infer genesis from any negative instance. (I would choose this option, using int64 throughout.)

Thus, I am first going to encapsulate the offset parameter in the host. F3 will ask for the power table it should use for an instance. This means the offset configuration lives in the host, and subtraction of the offset happens on the host side.

It's not perfect, but I think we'd introduce a bunch of unnecessary complexity to try to keep the parameter in F3: reworking the simulation testing setup to account for offsets, adjusting tests that use power table fluctuations, converting instance to int64 everywhere. We can always come back to do this later if we don't like it.

@Stebalien
Member

IMO, we should use int64 for instances regardless (I've seen too many issues with MaxUint64 overflowing int64).

That aside, I don't think having the lookback inside go-f3 will actually be all that difficult:

  1. We can still initialize instances with the desired power table (no lookback).
  2. A GetPowerTableFromInstance method can simply assert that the passed instance has the correct lookback for the instance being simulated, then return the power table.
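A sketch of that suggestion, with illustrative names rather than the real sim API: the sim keeps associating each power table with the instance it is used for (no lookback in the test setup), and the accessor asserts that the caller applied the expected lookback.

```go
package main

import "fmt"

// PowerTable is a hypothetical simplified type.
type PowerTable map[string]int64

type sim struct {
	lookback uint64
	current  uint64 // instance currently being simulated
	tables   map[uint64]PowerTable
}

// GetPowerTableFromInstance checks that the requested instance is exactly
// `lookback` behind the instance being simulated, then returns the table
// associated with the current instance.
func (s *sim) GetPowerTableFromInstance(instance uint64) (PowerTable, error) {
	if instance+s.lookback != s.current {
		return nil, fmt.Errorf("wrong lookback: got instance %d, simulating %d with lookback %d",
			instance, s.current, s.lookback)
	}
	return s.tables[s.current], nil
}
```

This keeps the test setup independent of the lookback value while still exercising the lookback arithmetic in F3 itself.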

@Kubuxu
Collaborator

Kubuxu commented May 21, 2024

Regarding Lotus having to store the power table, we are storing finality certificates, which should contain the power tables that are being finalized.

@Stebalien
Member

Well, the finality certificates only store power table diffs. But looking up the power table associated with an instance isn't difficult (instance - lookback_distance -> head ts -> power table).
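A sketch of rebuilding a table from diffs, under an assumed simplified diff format (participant -> new power, with 0 meaning removal; the real certificate diff format in go-f3 differs). It only illustrates the idea that any instance's table is recoverable by replaying diffs from a known base.

```go
package main

type PowerTable map[string]int64
type PowerDiff map[string]int64 // assumed format: participant -> new power, 0 = removed

// applyDiff returns a new table with the diff applied on top of base.
func applyDiff(base PowerTable, diff PowerDiff) PowerTable {
	out := make(PowerTable, len(base))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range diff {
		if v == 0 {
			delete(out, k) // zero power: participant removed
		} else {
			out[k] = v
		}
	}
	return out
}

// tableForInstance replays diffs [0..instance] on top of the genesis table.
func tableForInstance(genesis PowerTable, diffs []PowerDiff, instance int) PowerTable {
	pt := genesis
	for i := 0; i <= instance && i < len(diffs); i++ {
		pt = applyDiff(pt, diffs[i])
	}
	return pt
}
```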

@Stebalien
Member

See https://github.com/filecoin-project/go-f3/pull/273/files#r1610685278 for an example of the power tables I'll need if we implement #257.

Projects
Status: In progress

3 participants