
Raft election using consul leave #9755

Open · ltagliamonte-dd opened this issue Feb 10, 2021 · 11 comments
Labels
theme/internals (Serf, Raft, SWIM, Lifeguard, Anti-Entropy, locking topics) · type/question (Not an "enhancement" or "bug". Please post on discuss.hashicorp)

Comments

ltagliamonte-dd (Contributor) commented Feb 10, 2021

Hello HashiCorp folks!
I'm using Consul v1.7.2; my setup has 5 Consul servers.
If I gracefully stop one of the Consul servers (not the current leader) using consul leave && systemctl stop consul and then start the process back up, a new leadership election starts.

If I just stop the process with systemctl stop consul, a new election isn't triggered.

Is this the correct behavior?
It seems wrong to me to trigger an election when a non-leader server is gracefully stopped.
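
For reference, a minimal sketch of the two shutdown sequences being compared (assuming the systemd unit is named consul, as in the commands above):

# graceful: tell the cluster this server is leaving, then stop and later restart the process
consul leave && systemctl stop consul
systemctl start consul

# hard stop: kill the process without notifying the cluster, then restart it
systemctl stop consul
systemctl start consul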

@jsosulska added the theme/internals and type/question labels Feb 24, 2021
ltagliamonte-dd (Contributor, author) commented Mar 2, 2021

Tested today with Consul 1.8.8; same behavior.
This is what I'm doing on a node that is not the leader:

core@ip-10-4-73-219 ~ $  docker exec -it consul-server consul info | grep leader_add
	leader_addr = 10.4.73.198:8300
core@ip-10-4-73-219 ~ $ docker exec consul-server /bin/sh -c 'CONSUL_HTTP_TOKEN=<token> consul leave' && sudo systemctl stop consul-server
Graceful leave complete
core@ip-10-4-73-219 ~ $ sudo systemctl start consul-server
core@ip-10-4-73-219 ~ $  docker exec -it consul-server consul info | grep leader_add
	leader_addr = 10.4.73.190:8300

These are the logs from the leader:

Mar 02 00:19:15 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:15.569Z [INFO]  agent.server.serf.wan: serf: EventMemberLeave: ip-10-4-73-219.us-west-2.compute.internal.us-west-2 10.4.73.219
Mar 02 00:19:15 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:15.570Z [INFO]  agent.server: Handled event for server in area: event=member-leave server=ip-10-4-73-219.us-west-2.compute.internal.us-west-2 area=wan
Mar 02 00:19:19 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:19.453Z [INFO]  agent.server.serf.lan: serf: EventMemberLeave: ip-10-4-73-219.us-west-2.compute.internal 10.4.73.219
Mar 02 00:19:19 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:19.454Z [INFO]  agent.server: Removing LAN server: server="ip-10-4-73-219.us-west-2.compute.internal (Addr: tcp/10.4.73.219:8300) (DC: us-west-2)"
Mar 02 00:19:19 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:19.454Z [INFO]  agent.server: removing server by ID: id=aec26945-b142-2890-3e1e-ce00b18322b8
Mar 02 00:19:19 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:19.454Z [INFO]  agent.server.raft: updating configuration: command=RemoveServer server-id=aec26945-b142-2890-3e1e-ce00b18322b8 server-addr= servers="[{Suffrage:Voter ID:720af5a7-eb42-5a28-c8ac-aeda03a848cd Address:10.4.73.222:8300} {Suffrage:Voter ID:076cfb09-d23f-533e-6dc2-26ae3df40d09 Address:10.4.73.185:8300} {Suffrage:Voter ID:22d2bfef-d7ce-9d11-77fa-cb83b3875ac7 Address:10.4.73.190:8300} {Suffrage:Voter ID:9bda6029-1a4d-f06c-2c48-d97dd85a01d4 Address:10.4.73.198:8300}]"
Mar 02 00:19:19 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:19.455Z [INFO]  agent.server.raft: removed peer, stopping replication: peer=aec26945-b142-2890-3e1e-ce00b18322b8 last-index=398166241
Mar 02 00:19:19 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:19.455Z [INFO]  agent.server.raft: aborting pipeline replication: peer="{Nonvoter aec26945-b142-2890-3e1e-ce00b18322b8 10.4.73.219:8300}"
Mar 02 00:19:19 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:19.458Z [INFO]  agent.server: deregistering member: member=ip-10-4-73-219.us-west-2.compute.internal reason=left
Mar 02 00:19:23 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:23.650Z [WARN]  agent.server.raft: rejecting vote request since we have a leader: from=10.4.73.219:8300 leader=10.4.73.198:8300
Mar 02 00:19:27 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:27.249Z [WARN]  agent.server.raft: rejecting vote request since we have a leader: from=10.4.73.219:8300 leader=10.4.73.198:8300
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.752Z [INFO]  agent.server.serf.lan: serf: EventMemberJoin: ip-10-4-73-219.us-west-2.compute.internal 10.4.73.219
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.752Z [INFO]  agent.server: Adding LAN server: server="ip-10-4-73-219.us-west-2.compute.internal (Addr: tcp/10.4.73.219:8300) (DC: us-west-2)"
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.752Z [INFO]  agent.server.raft: updating configuration: command=AddNonvoter server-id=aec26945-b142-2890-3e1e-ce00b18322b8 server-addr=10.4.73.219:8300 servers="[{Suffrage:Voter ID:720af5a7-eb42-5a28-c8ac-aeda03a848cd Address:10.4.73.222:8300} {Suffrage:Voter ID:076cfb09-d23f-533e-6dc2-26ae3df40d09 Address:10.4.73.185:8300} {Suffrage:Voter ID:22d2bfef-d7ce-9d11-77fa-cb83b3875ac7 Address:10.4.73.190:8300} {Suffrage:Voter ID:9bda6029-1a4d-f06c-2c48-d97dd85a01d4 Address:10.4.73.198:8300} {Suffrage:Nonvoter ID:aec26945-b142-2890-3e1e-ce00b18322b8 Address:10.4.73.219:8300}]"
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.754Z [INFO]  agent.server.raft: added peer, starting replication: peer=aec26945-b142-2890-3e1e-ce00b18322b8
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.754Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Nonvoter aec26945-b142-2890-3e1e-ce00b18322b8 10.4.73.219:8300}" error=EOF
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.756Z [INFO]  agent.server: member joined, marking health alive: member=ip-10-4-73-219.us-west-2.compute.internal
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.769Z [ERROR] agent.server.raft: peer has newer term, stopping replication: peer="{Nonvoter aec26945-b142-2890-3e1e-ce00b18322b8 10.4.73.219:8300}"
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.769Z [INFO]  agent.server.raft: entering follower state: follower="Node at 10.4.73.198:8300 [Follower]" leader=
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.769Z [INFO]  agent.server.raft: aborting pipeline replication: peer="{Voter 720af5a7-eb42-5a28-c8ac-aeda03a848cd 10.4.73.222:8300}"
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.769Z [INFO]  agent.server.raft: aborting pipeline replication: peer="{Voter 076cfb09-d23f-533e-6dc2-26ae3df40d09 10.4.73.185:8300}"
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.769Z [INFO]  agent.server: cluster leadership lost
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.769Z [INFO]  agent.server.raft: aborting pipeline replication: peer="{Voter 22d2bfef-d7ce-9d11-77fa-cb83b3875ac7 10.4.73.190:8300}"
Mar 02 00:19:52 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:52.216Z [INFO]  agent.server.serf.wan: serf: EventMemberJoin: ip-10-4-73-219.us-west-2.compute.internal.us-west-2 10.4.73.219
Mar 02 00:19:52 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:52.216Z [INFO]  agent.server: Handled event for server in area: event=member-join server=ip-10-4-73-219.us-west-2.compute.internal.us-west-2 area=wan
Mar 02 00:19:52 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:52.890Z [WARN]  agent.server.coordinate: Batch update failed: error="node is not the leader"
Mar 02 00:19:56 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:56.723Z [WARN]  agent.server.raft: rejecting vote request since our last index is greater: candidate=10.4.73.219:8300 last-index=398166260 last-candidate-index=398166241
Mar 02 00:20:00 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:20:00.415Z [ERROR] agent.server: error performing anti-entropy sync of federation state: error="node is not the leader"
Mar 02 00:20:00 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:20:00.868Z [INFO]  agent.server: New leader elected: payload=ip-10-4-73-190.us-west-2.compute.internal
Mar 02 00:20:04 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:20:04.868Z [INFO]  agent.server.serf.lan: serf: EventMemberUpdate: ip-10-4-73-219.us-west-2.compute.internal
Mar 02 00:20:04 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:20:04.868Z [INFO]  agent.server: Updating LAN server: server="ip-10-4-73-219.us-west-2.compute.internal (Addr: tcp/10.4.73.219:8300) (DC: us-west-2)"
Mar 02 00:20:09 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:20:09.716Z [INFO]  agent.server.serf.wan: serf: EventMemberUpdate: ip-10-4-73-219.us-west-2.compute.internal.us-west-2
Mar 02 00:20:09 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:20:09.716Z [INFO]  agent.server: Handled event for server in area: event=member-update server=ip-10-4-73-219.us-west-2.compute.internal.us-west-2 area=wan

Specifically:

Mar 02 00:20:00 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:20:00.868Z [INFO]  agent.server: New leader elected: payload=ip-10-4-73-190.us-west-2.compute.internal

If the cluster already has a leader, why does adding a node back trigger a re-election? It shouldn't, for the sake of the stability of the Raft consensus.
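
A quick way to watch the voter set and the Raft term while the node leaves and rejoins (a sketch using the same docker exec access as the commands above; a CONSUL_HTTP_TOKEN may be needed if ACLs are enabled):

# current voter set and which server is the leader
docker exec -it consul-server consul operator raft list-peers

# Raft term and state on this server; if "term" increases after the node
# rejoins, an election happened even though the old leader was healthy
docker exec -it consul-server consul info | grep -E 'term|state|leader_addr'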

ChipV223 (Contributor) commented:

Hi @ltagliamonte-dd!

I just tried to repro this in a 3-server DC (built via Vagrant) using Consul v1.9.5. When I performed consul leave on one of the follower nodes and then brought it back online a minute later, I didn't observe any leadership changes.

If you run this on a later release of Consul, do you still observe this behavior?

@ChipV223 added the waiting-reply label Apr 29, 2021
ltagliamonte-dd (Contributor, author) commented:

@ChipV223 thank you for looking into this. Locally I can still see elections happening; these are the steps I'm doing:

cat <<EOF > consul-config.json
{
  "telemetry": {
    "disable_hostname": true,
    "prometheus_retention_time": "60s"
  }
}
EOF

CONSUL_VERSION=1.9.5
docker run -d --rm --name=consul-0 -p 8500:8500 -p 8600:8600 -v $(pwd)/consul-config.json:/consul/config/consul-config.json -e CONSUL_BIND_INTERFACE=eth0 consul:${CONSUL_VERSION}
JOIN_IP=$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' consul-0)

for i in $(seq 4)
do
  docker run -d --rm --name=consul-${i} -p 850${i}:8500 -p 860${i}:8600 -v $(pwd)/consul-config.json:/consul/config/consul-config.json -e CONSUL_BIND_INTERFACE=eth0 consul:${CONSUL_VERSION}
done

for i in $(seq 4)
do
  docker exec consul-${i} consul join ${JOIN_IP} 
done


leader_ip=$(docker exec -it consul-0 consul info | grep leader_addr | tr -s " " | cut -d "=" -f2 | cut -d":" -f1)
leader_id=$(docker ps -q | xargs -n 1 docker inspect --format '{{ .NetworkSettings.IPAddress }} {{ .Name }}'| grep ${leader_ip} | sed 's/ \// /' | cut -d" " -f2)
echo ${leader_id}
docker logs -f ${leader_id}


## in another terminal
CONSUL_VERSION=1.9.5
JOIN_IP=$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' consul-0)
docker exec -it consul-1 consul leave
docker run -d --rm --name=consul-1 -p 8501:8500 -p 8601:8600 -v $(pwd)/consul-config.json:/consul/config/consul-config.json  -e CONSUL_BIND_INTERFACE=eth0 consul:${CONSUL_VERSION}
docker exec consul-1 consul join ${JOIN_IP} 

docker exec -it consul-2 consul leave
docker run -d --rm --name=consul-2 -p 8502:8500 -p 8602:8600 -v $(pwd)/consul-config.json:/consul/config/consul-config.json  -e CONSUL_BIND_INTERFACE=eth0 consul:${CONSUL_VERSION}
docker exec consul-2 consul join ${JOIN_IP} 

## check how many elections
curl http://127.0.0.1:8500/v1/agent/metrics?format=prometheus -sS | grep -i consul_raft_state_candidate


#cleanup
docker rm -f  consul-0 consul-1 consul-2 consul-3 consul-4
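
To make an election visible across the whole test cluster rather than on a single agent, one option is to scrape the same counter from every server's mapped HTTP port (a sketch based on the port mappings in the script above; consul.raft.state.candidate increments each time a server starts an election):

# consul-0..consul-4 are mapped to ports 8500..8504 in the script above
for port in 8500 8501 8502 8503 8504; do
  echo "== port ${port}"
  curl -sS "http://127.0.0.1:${port}/v1/agent/metrics?format=prometheus" | grep -i consul_raft_state_candidate
done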

ghost removed the waiting-reply label Apr 29, 2021
ChipV223 (Contributor) commented:

Thank you for the information, @ltagliamonte-dd!

This seems very environment-specific, particularly with Docker, since I was not able to repro this using Vagrant VMs. I'll try again using Docker containers and report back when I get the chance.

ltagliamonte commented:

@ChipV223 any update on this issue?

ChipV223 (Contributor) commented Jun 2, 2021

Hi @ltagliamonte!

I've not been able to look into this as I've been involved with other tasks. The plan is to try my repro again using Docker containers sometime this week, and I will update when I have any news to share.

mikemorris (Contributor) commented:

@ltagliamonte We're investigating a report of a similar issue and wanted to follow up here to confirm a few details:

  • Are you running Consul OSS or Enterprise in these reproductions?
    • If Enterprise, are any Enterprise features in use, including autopilot upgrades, read replicas, or redundancy zones?
  • Can you confirm that this issue is reproducible in both 1.8.8 and 1.9.5? I'm curious about this specifically because some of the logic around handling server leaves and leadership elections changed substantially in #9103 ("Switch to using the external autopilot module").

ltagliamonte-dd (Contributor, author) commented:

Hello @mikemorris

blake (Member) commented Jan 7, 2022

Hi @ltagliamonte-dd, this issue was fixed by hashicorp/raft#476, added to Consul in #11375, and released in 1.10.4.

Thanks for reporting this. I'm going to close this issue now. Feel free to re-open it if for some reason this does not resolve the issue.

blake closed this as completed Jan 7, 2022
ltagliamonte-dd (Contributor, author) commented Jan 7, 2022

@blake I did test with the steps I posted using 1.10.4, and I can still repro the issue.

dhiaayachi (Collaborator) commented:

Hey @ltagliamonte,
I was able to reproduce the issue with version 1.10.4. It seems related to this log line you see:
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]: 2021-03-02T00:19:51.754Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Nonvoter aec26945-b142-2890-3e1e-ce00b18322b8 10.4.73.219:8300}" error=EOF.
For some reason, the leader fails to append Raft logs to the newly rejoined node, which triggers an election. I will dig into it more and let you know.
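
One way to confirm that sequence in the Docker repro above is to grep the old leader's logs for the same messages that appear in the journal output earlier in this thread (a sketch; ${leader_id} is the container name printed by the leader-detection step in the repro script):

# look for the rejoining peer forcing the leader to step down
docker logs ${leader_id} 2>&1 | grep -E 'failed to appendEntries|peer has newer term|entering follower state|cluster leadership lost'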

dhiaayachi reopened this Jan 14, 2022