Add Config.DisableProposalForwardingCallback for controlling message level proposal forwarding #88

mitake · 2023-07-24T12:07:37Z

This PR adds a new field, DisableProposalForwardingCallback, to the Config struct. The field holds a callback function that takes a raftpb.Message. A follower node calls this callback when a state machine tries to submit a MsgProp message. If the callback returns true for the message, the follower node drops it and returns ErrProposalDropped.

If we pass a callback function that always returns true, the Raft library behaves the same as it does with DisableProposalForwarding == true.
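A minimal sketch of the semantics described above, not the actual patch. `Message`, `Config`, and `stepFollowerProp` are simplified stand-ins for the library's types and follower step logic, and the `revoke:` payload prefix is a purely hypothetical way to tell proposals apart.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// MessageType and Message are simplified stand-ins for the raftpb types.
type MessageType int

const MsgProp MessageType = iota

type Message struct {
	Type MessageType
	Data []byte // hypothetical: payload of the proposed entry
}

// ErrProposalDropped mirrors the name of the library's error.
var ErrProposalDropped = errors.New("raft proposal dropped")

// Config holds only the two fields relevant to this PR.
type Config struct {
	DisableProposalForwarding         bool
	DisableProposalForwardingCallback func(m Message) bool
}

// stepFollowerProp sketches how a follower could treat a MsgProp: it returns
// ErrProposalDropped when forwarding is disabled globally or for this
// message, and nil when the message would be forwarded to the leader.
func stepFollowerProp(c Config, m Message) error {
	if c.DisableProposalForwarding {
		return ErrProposalDropped
	}
	if cb := c.DisableProposalForwardingCallback; cb != nil && cb(m) {
		return ErrProposalDropped
	}
	return nil
}

// alwaysDrop behaves like DisableProposalForwarding == true.
func alwaysDrop(Message) bool { return true }

// dropRevoke is a hypothetical selective policy keyed on a payload prefix.
func dropRevoke(m Message) bool {
	return bytes.HasPrefix(m.Data, []byte("revoke:"))
}

func main() {
	put := Message{Type: MsgProp, Data: []byte("put:k=v")}
	revoke := Message{Type: MsgProp, Data: []byte("revoke:lease-1")}

	selective := Config{DisableProposalForwardingCallback: dropRevoke}
	fmt.Println("put dropped:", errors.Is(stepFollowerProp(selective, put), ErrProposalDropped))
	fmt.Println("revoke dropped:", errors.Is(stepFollowerProp(selective, revoke), ErrProposalDropped))

	all := Config{DisableProposalForwardingCallback: alwaysDrop}
	fmt.Println("put dropped with alwaysDrop:", errors.Is(stepFollowerProp(all, put), ErrProposalDropped))
}
```

With a nil callback and the flag unset, every proposal is forwarded, so existing users would see no change in this sketch.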

The motivation is described in #73

ahrtr · 2023-07-30T17:09:53Z

@mitake

How will you define the callback in etcd?
Providing such flexibility might be a little dangerous, because raft's behavior/correctness would then depend on the application's code logic.

mitake · 2023-08-07T13:02:06Z

@ahrtr

> How will you define the callback in etcd?

The callback will be defined like this: https://github.com/etcd-io/etcd/pull/16285/files#diff-2f800da9d8120de4693c10b4171d1824f7382f7854c32f0666eb9e0241479a8dR529

> Providing such flexibility might be a little dangerous, because raft's behavior/correctness would then depend on the application's code logic.

Technically, I think the MsgProp mechanism (forwarding a proposal from a follower to the leader) isn't a part of Raft (the Raft spec doesn't include it, IIRC). It's an implementation detail specific to this raft library. The mechanism also works outside of the consensus protocol, so it might be safe to have the callback mechanism. But I understand your concern and need to double check the Raft paper to make sure.

mitake · 2023-08-24T13:54:11Z

I checked the dissertation. Section 6.2, "Routing requests to the leader", has descriptions related to this topic:

1. The first option, which we recommend and which LogCabin implements, is for the server to reject the 
request and return to the client the address of the leader, if known. This allows the client to reconnect to 
the leader directly, so future requests can proceed at full speed. It also takes very little additional code to 
implement, since clients already need to reconnect to a different server in the event of a leader failure.

2. Alternatively, the server can proxy the client’s request to the leader. This may be simpler in some cases. 
For example, if a client connects to any server for read requests (see Section 6.4), then proxying the 
client’s write requests would save the client from having to manage a distinct connection to the leader 
used only for writes.

According to this description, MsgProp (corresponding to option 2) is an optional mechanism for Raft. So DisableProposalForwarding can be provided by the library, and it would be safe to have an additional mechanism for dropping specific proposal messages. If clientv3 had a mechanism for selecting the leader on the client side, like what clientv2 does (etcd-io/etcd#4030), it might be possible to turn on DisableProposalForwarding in etcd (not fully sure, though).
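To make the two options from Section 6.2 concrete, here is a toy sketch under stated assumptions. `Server`, `ErrNotLeader`, and the `Proxy` switch are hypothetical names for illustration only; nothing here is etcd or raft library code.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotLeader is a hypothetical error that carries a leader hint (option 1).
type ErrNotLeader struct{ LeaderAddr string }

func (e *ErrNotLeader) Error() string { return "not leader; try " + e.LeaderAddr }

// Server is a toy node; Peers stands in for network connections.
type Server struct {
	Addr       string
	IsLeader   bool
	LeaderAddr string // last known leader, may be empty
	Proxy      bool   // false: reject with a hint (option 1); true: proxy (option 2)
	Peers      map[string]*Server
	Log        []string
}

// Write handles a client write according to the configured option.
func (s *Server) Write(req string) error {
	if s.IsLeader {
		s.Log = append(s.Log, req)
		return nil
	}
	if !s.Proxy {
		// Option 1: reject and tell the client where the leader is, if known.
		return &ErrNotLeader{LeaderAddr: s.LeaderAddr}
	}
	// Option 2: proxy the request to the leader on the client's behalf.
	leader, ok := s.Peers[s.LeaderAddr]
	if !ok || !leader.IsLeader {
		return errors.New("no reachable leader to proxy to")
	}
	leader.Log = append(leader.Log, req)
	return nil
}

func main() {
	leader := &Server{Addr: "n1", IsLeader: true}
	follower := &Server{Addr: "n2", LeaderAddr: "n1", Peers: map[string]*Server{"n1": leader}}

	err := follower.Write("x=1")
	var hint *ErrNotLeader
	if errors.As(err, &hint) {
		fmt.Println("rejected, retry against", hint.LeaderAddr)
	}

	follower.Proxy = true
	fmt.Println("proxy error:", follower.Write("x=1"), "leader log:", leader.Log)
}
```

In these terms, MsgProp forwarding plays the role of option 2, and DisableProposalForwarding moves a follower toward option 1, except that the library returns ErrProposalDropped rather than a leader hint.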

Leaders: A server might be in the leader state, but if it isn’t the current leader, it could be needlessly 
delaying client requests. For example, suppose a leader is partitioned from the rest of the cluster, but it 
can still communicate with a particular client. Without additional mechanism, it could delay a request 
from that client forever, being unable to replicate a log entry to any other servers. Meanwhile, there 
might be another leader of a newer term that is able to communicate with a majority of the cluster and 
would be able to commit the client’s request. Thus, a leader in Raft steps down if an election timeout 
elapses without a successful round of heartbeats to a majority of its cluster; this allows clients to retry 
their requests with another server.

I think this description is related to our problem of leases being revoked by a stale leader. In our case the client is the etcd server itself, so a delayed lease revoke message will be forwarded to the leader and accepted.
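The step-down rule quoted above can be sketched as follows. This is a toy model with hypothetical names (`Node`, `RecordHeartbeatAck`, `OnElectionTimeout`), not the library's implementation; the library's Config.CheckQuorum option serves a similar purpose.

```go
package main

import "fmt"

type Role int

const (
	Follower Role = iota
	Leader
)

// Node tracks which peers acknowledged a heartbeat in the current window.
type Node struct {
	Role        Role
	ClusterSize int
	acked       map[int]bool
}

// NewLeader returns a node that currently believes it is the leader.
func NewLeader(clusterSize int) *Node {
	return &Node{Role: Leader, ClusterSize: clusterSize, acked: map[int]bool{}}
}

// RecordHeartbeatAck notes a successful heartbeat response from a peer.
func (n *Node) RecordHeartbeatAck(peer int) { n.acked[peer] = true }

// OnElectionTimeout is called once per election timeout. A leader that did
// not hear from a majority (counting itself) during the window steps down.
func (n *Node) OnElectionTimeout() {
	if n.Role != Leader {
		return
	}
	quorum := n.ClusterSize/2 + 1
	if len(n.acked)+1 < quorum {
		n.Role = Follower
	}
	n.acked = map[int]bool{} // start a new window
}

func main() {
	partitioned := NewLeader(3)
	partitioned.OnElectionTimeout()
	fmt.Println("partitioned leader is still leader:", partitioned.Role == Leader)

	healthy := NewLeader(3)
	healthy.RecordHeartbeatAck(2)
	healthy.OnElectionTimeout()
	fmt.Println("healthy leader is still leader:", healthy.Role == Leader)
}
```

Note that the stale leader keeps accepting client requests until the timeout elapses, which is the window the lease revoke problem falls into.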

Anyway I'll try this change with the failpoint based testing when I have time.

mitake · 2023-08-24T13:59:48Z

Related idea: to control the risk of adding new parameters to the Raft library, it might be good to add a new struct (e.g. Experimental) to Config that collects such parameters. Library users would then know that the parameters in Experimental are experimental and may be deprecated in the future.

I think we have a few candidates for such parameters, like etcd-io/etcd#7782.
cc @chaochn47 you might be interested in this idea? It would be great to have your opinion.
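A rough sketch of what such a layout could look like. `ExperimentalConfig` and `shouldDropProposal` are hypothetical names, and `Message` is a stand-in for raftpb.Message; none of this exists in the library today.

```go
package main

import "fmt"

// Message is a simplified stand-in for raftpb.Message.
type Message struct{ Data []byte }

// ExperimentalConfig is a hypothetical home for parameters that may change
// or be removed in a future release.
type ExperimentalConfig struct {
	DisableProposalForwardingCallback func(m Message) bool
}

// Config shows only the fields needed for the sketch.
type Config struct {
	ID                        uint64
	DisableProposalForwarding bool
	Experimental              ExperimentalConfig
}

// shouldDropProposal combines the stable flag with the experimental callback.
func (c Config) shouldDropProposal(m Message) bool {
	cb := c.Experimental.DisableProposalForwardingCallback
	return c.DisableProposalForwarding || (cb != nil && cb(m))
}

func main() {
	cfg := Config{
		ID: 1,
		Experimental: ExperimentalConfig{
			DisableProposalForwardingCallback: func(m Message) bool { return len(m.Data) == 0 },
		},
	}
	fmt.Println(cfg.shouldDropProposal(Message{}))
	fmt.Println(cfg.shouldDropProposal(Message{Data: []byte("x")}))
}
```

Because the zero value of ExperimentalConfig has a nil callback, users who never touch Experimental keep the current behavior in this sketch.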

ahrtr · 2023-11-09T16:37:43Z

A couple of points:

1. In a network isolation scenario, a node may still consider itself the leader when it actually is not. It will still broadcast the leaseRevoke request, and eventually the request will be sent to other nodes when the network recovers.
2. This solution can't resolve the second case, "Leases are wrongly revoked by the new leader", mentioned in the Root Cause section of the doc.

ahrtr · 2023-11-09T17:10:50Z

> In a network isolation scenario, a node may still consider itself the leader when it actually is not. It will still broadcast the leaseRevoke request, and eventually the request will be sent to other nodes when the network recovers.

This doesn't seem to be a problem, because the request will be rejected by other nodes due to its smaller term.

mitake · 2023-11-12T16:17:04Z

Sorry, let me check your entire doc later and reply in a few days, @ahrtr. For now, let me reply to the comment below:

> This doesn't seem to be a problem, because the request will be rejected by other nodes due to its smaller term.

For the MsgApp of LeaseRevoke, this is true. But before that, the MsgProp of LeaseRevoke is sent from the stale leader (the etcd layer still thinks it is the leader, but the raft layer no longer is). That message cannot be rejected, because MsgProp doesn't carry term information, so it is simply forwarded to the new leader. I think this behavior is quite complicated, so I would like to write a doc that includes diagrams.
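The difference between the two message types can be sketched like this. It is a simplified model with stand-in types, assuming a forwarded MsgProp carries term 0; `accepts` is a hypothetical helper, not a library function.

```go
package main

import "fmt"

// MessageType and Message are simplified stand-ins for the raftpb types.
type MessageType int

const (
	MsgProp MessageType = iota
	MsgApp
)

type Message struct {
	Type MessageType
	Term uint64 // 0 for MsgProp in this sketch: proposals carry no term
}

// accepts reports whether a node at currentTerm would process the message.
// Messages that carry a term are rejected when the term is stale; a
// proposal has no term to compare, so it is always processed.
func accepts(currentTerm uint64, m Message) bool {
	if m.Type == MsgProp {
		return true
	}
	return m.Term >= currentTerm
}

func main() {
	const newLeaderTerm = 5

	staleApp := Message{Type: MsgApp, Term: 3}  // replication from the old leader
	staleProp := Message{Type: MsgProp, Term: 0} // proposal forwarded by the old leader

	fmt.Println("stale MsgApp accepted:", accepts(newLeaderTerm, staleApp))
	fmt.Println("stale MsgProp accepted:", accepts(newLeaderTerm, staleProp))
}
```

Under this model, the only place to stop a stale LeaseRevoke proposal is before it is forwarded, which is where the callback in this PR would run.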

mitake marked this pull request as draft July 24, 2023 12:08

mitake mentioned this pull request Jul 24, 2023

WIP, DO NOT MERGE: Disable proposal forwarding for lease revoke requests etcd-io/etcd#16285

Draft

Add Config.DisableProposalForwardingCallback for controlling message …

ddd5443

…level proposal forwarding Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>

mitake force-pushed the disable-proposal-forwarding-callback branch from 12b7737 to ddd5443 Compare October 23, 2023 07:33

ahrtr mentioned this pull request Nov 9, 2023

Ignore old leader's leases revoking request etcd-io/etcd#16822

Merged
