
Raft election using consul leave #9755

Open · ltagliamonte-dd opened this issue Feb 10, 2021 · 11 comments
Labels
theme/internals (Serf, Raft, SWIM, Lifeguard, Anti-Entropy, locking topics) · type/question (Not an "enhancement" or "bug". Please post on discuss.hashicorp)

Comments

ltagliamonte-dd (Contributor) commented Feb 10, 2021

Hello HashiCorp folks!
I'm using Consul v1.7.2; my setup has 5 Consul servers.
If I gracefully stop one of the Consul servers (not the current leader) using consul leave && systemctl stop consul and then start the process back up, a new leadership election starts.

If I just stop the process with systemctl stop consul, a new election isn't triggered.

Is this the correct behavior?
It seems wrong to me to trigger an election when a non-leader server is gracefully stopped.
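
For reference, a minimal sketch of the two shutdown sequences being compared (assuming the systemd unit is named consul, as in the commands above):

# graceful: tell the cluster this server is leaving, then stop and later restart the process
consul leave && systemctl stop consul
systemctl start consul

# hard stop: kill the process without notifying the cluster, then restart it
systemctl stop consul
systemctl start consul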

@jsosulska added the theme/internals and type/question labels Feb 24, 2021
ltagliamonte-dd (Contributor, author) commented Mar 2, 2021

Tested today with Consul 1.8.8; same behavior.
This is what I'm doing on a node that is not the leader:

core@ip-10-4-73-219 ~ $  docker exec -it consul-server consul info | grep leader_add
	leader_addr = 10.4.73.198:8300
core@ip-10-4-73-219 ~ $ docker exec consul-server /bin/sh -c 'CONSUL_HTTP_TOKEN=<token> consul leave' && sudo systemctl stop consul-server
Graceful leave complete
core@ip-10-4-73-219 ~ $ sudo systemctl start consul-server
core@ip-10-4-73-219 ~ $  docker exec -it consul-server consul info | grep leader_add
	leader_addr = 10.4.73.190:8300

These are the logs from the leader:

Mar 02 00:19:15 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:15.569Z [INFO]  agent.server.serf.wan: serf: EventMemberLeave: ip-10-4-73-219.us-west-2.compute.internal.us-west-2 10.4.73.219
Mar 02 00:19:15 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:15.570Z [INFO]  agent.server: Handled event for server in area: event=member-leave server=ip-10-4-73-219.us-west-2.compute.internal.us-west-2 area=wan
Mar 02 00:19:19 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:19.453Z [INFO]  agent.server.serf.lan: serf: EventMemberLeave: ip-10-4-73-219.us-west-2.compute.internal 10.4.73.219
Mar 02 00:19:19 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:19.454Z [INFO]  agent.server: Removing LAN server: server="ip-10-4-73-219.us-west-2.compute.internal (Addr: tcp/10.4.73.219:8300) (DC: us-west-2)"
Mar 02 00:19:19 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:19.454Z [INFO]  agent.server: removing server by ID: id=aec26945-b142-2890-3e1e-ce00b18322b8
Mar 02 00:19:19 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:19.454Z [INFO]  agent.server.raft: updating configuration: command=RemoveServer server-id=aec26945-b142-2890-3e1e-ce00b18322b8 server-addr= servers="[{Suffrage:Voter ID:720af5a7-eb42-5a28-c8ac-aeda03a848cd Address:10.4.73.222:8300} {Suffrage:Voter ID:076cfb09-d23f-533e-6dc2-26ae3df40d09 Address:10.4.73.185:8300} {Suffrage:Voter ID:22d2bfef-d7ce-9d11-77fa-cb83b3875ac7 Address:10.4.73.190:8300} {Suffrage:Voter ID:9bda6029-1a4d-f06c-2c48-d97dd85a01d4 Address:10.4.73.198:8300}]"
Mar 02 00:19:19 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:19.455Z [INFO]  agent.server.raft: removed peer, stopping replication: peer=aec26945-b142-2890-3e1e-ce00b18322b8 last-index=398166241
Mar 02 00:19:19 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:19.455Z [INFO]  agent.server.raft: aborting pipeline replication: peer="{Nonvoter aec26945-b142-2890-3e1e-ce00b18322b8 10.4.73.219:8300}"
Mar 02 00:19:19 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:19.458Z [INFO]  agent.server: deregistering member: member=ip-10-4-73-219.us-west-2.compute.internal reason=left
Mar 02 00:19:23 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:23.650Z [WARN]  agent.server.raft: rejecting vote request since we have a leader: from=10.4.73.219:8300 leader=10.4.73.198:8300
Mar 02 00:19:27 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:27.249Z [WARN]  agent.server.raft: rejecting vote request since we have a leader: from=10.4.73.219:8300 leader=10.4.73.198:8300
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.752Z [INFO]  agent.server.serf.lan: serf: EventMemberJoin: ip-10-4-73-219.us-west-2.compute.internal 10.4.73.219
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.752Z [INFO]  agent.server: Adding LAN server: server="ip-10-4-73-219.us-west-2.compute.internal (Addr: tcp/10.4.73.219:8300) (DC: us-west-2)"
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.752Z [INFO]  agent.server.raft: updating configuration: command=AddNonvoter server-id=aec26945-b142-2890-3e1e-ce00b18322b8 server-addr=10.4.73.219:8300 servers="[{Suffrage:Voter ID:720af5a7-eb42-5a28-c8ac-aeda03a848cd Address:10.4.73.222:8300} {Suffrage:Voter ID:076cfb09-d23f-533e-6dc2-26ae3df40d09 Address:10.4.73.185:8300} {Suffrage:Voter ID:22d2bfef-d7ce-9d11-77fa-cb83b3875ac7 Address:10.4.73.190:8300} {Suffrage:Voter ID:9bda6029-1a4d-f06c-2c48-d97dd85a01d4 Address:10.4.73.198:8300} {Suffrage:Nonvoter ID:aec26945-b142-2890-3e1e-ce00b18322b8 Address:10.4.73.219:8300}]"
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.754Z [INFO]  agent.server.raft: added peer, starting replication: peer=aec26945-b142-2890-3e1e-ce00b18322b8
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.754Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Nonvoter aec26945-b142-2890-3e1e-ce00b18322b8 10.4.73.219:8300}" error=EOF
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.756Z [INFO]  agent.server: member joined, marking health alive: member=ip-10-4-73-219.us-west-2.compute.internal
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.769Z [ERROR] agent.server.raft: peer has newer term, stopping replication: peer="{Nonvoter aec26945-b142-2890-3e1e-ce00b18322b8 10.4.73.219:8300}"
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.769Z [INFO]  agent.server.raft: entering follower state: follower="Node at 10.4.73.198:8300 [Follower]" leader=
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.769Z [INFO]  agent.server.raft: aborting pipeline replication: peer="{Voter 720af5a7-eb42-5a28-c8ac-aeda03a848cd 10.4.73.222:8300}"
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.769Z [INFO]  agent.server.raft: aborting pipeline replication: peer="{Voter 076cfb09-d23f-533e-6dc2-26ae3df40d09 10.4.73.185:8300}"
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.769Z [INFO]  agent.server: cluster leadership lost
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:51.769Z [INFO]  agent.server.raft: aborting pipeline replication: peer="{Voter 22d2bfef-d7ce-9d11-77fa-cb83b3875ac7 10.4.73.190:8300}"
Mar 02 00:19:52 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:52.216Z [INFO]  agent.server.serf.wan: serf: EventMemberJoin: ip-10-4-73-219.us-west-2.compute.internal.us-west-2 10.4.73.219
Mar 02 00:19:52 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:52.216Z [INFO]  agent.server: Handled event for server in area: event=member-join server=ip-10-4-73-219.us-west-2.compute.internal.us-west-2 area=wan
Mar 02 00:19:52 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:52.890Z [WARN]  agent.server.coordinate: Batch update failed: error="node is not the leader"
Mar 02 00:19:56 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:19:56.723Z [WARN]  agent.server.raft: rejecting vote request since our last index is greater: candidate=10.4.73.219:8300 last-index=398166260 last-candidate-index=398166241
Mar 02 00:20:00 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:20:00.415Z [ERROR] agent.server: error performing anti-entropy sync of federation state: error="node is not the leader"
Mar 02 00:20:00 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:20:00.868Z [INFO]  agent.server: New leader elected: payload=ip-10-4-73-190.us-west-2.compute.internal
Mar 02 00:20:04 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:20:04.868Z [INFO]  agent.server.serf.lan: serf: EventMemberUpdate: ip-10-4-73-219.us-west-2.compute.internal
Mar 02 00:20:04 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:20:04.868Z [INFO]  agent.server: Updating LAN server: server="ip-10-4-73-219.us-west-2.compute.internal (Addr: tcp/10.4.73.219:8300) (DC: us-west-2)"
Mar 02 00:20:09 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:20:09.716Z [INFO]  agent.server.serf.wan: serf: EventMemberUpdate: ip-10-4-73-219.us-west-2.compute.internal.us-west-2
Mar 02 00:20:09 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:20:09.716Z [INFO]  agent.server: Handled event for server in area: event=member-update server=ip-10-4-73-219.us-west-2.compute.internal.us-west-2 area=wan

Specifically:

Mar 02 00:20:00 ip-10-4-73-198.us-west-2.compute.internal docker[3182]:     2021-03-02T00:20:00.868Z [INFO]  agent.server: New leader elected: payload=ip-10-4-73-190.us-west-2.compute.internal

If the cluster already has a leader, why does adding a node back trigger a re-election? It shouldn't, for the sake of the stability of the Raft consensus.
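
A quick way to watch the voter set and the Raft term while the node leaves and rejoins (a sketch using the same docker exec access as the commands above; a CONSUL_HTTP_TOKEN may be needed if ACLs are enabled):

# current voter set and which server is the leader
docker exec -it consul-server consul operator raft list-peers

# Raft term and state on this server; if "term" increases after the node
# rejoins, an election happened even though the old leader was healthy
docker exec -it consul-server consul info | grep -E 'term|state|leader_addr'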

ChipV223 (Contributor) commented:

Hi @ltagliamonte-dd!

I just tried to repro this in a 3-server DC (built via Vagrant) using Consul v1.9.5. When I performed consul leave on one of the follower nodes and then brought it back online a minute later, I didn't observe any leadership changes.

If you run this on a later release of Consul, do you still observe this behavior?

@ChipV223 added the waiting-reply label Apr 29, 2021
ltagliamonte-dd (Contributor, author) commented:

@ChipV223 thank you for looking into this. Locally I can still see elections happening; these are the steps I'm doing:

cat <<EOF > consul-config.json
{
  "telemetry": {
    "disable_hostname": true,
    "prometheus_retention_time": "60s"
  }
}
EOF

CONSUL_VERSION=1.9.5
docker run -d --rm --name=consul-0 -p 8500:8500 -p 8600:8600 -v $(pwd)/consul-config.json:/consul/config/consul-config.json -e CONSUL_BIND_INTERFACE=eth0 consul:${CONSUL_VERSION}
JOIN_IP=$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' consul-0)

for i in $(seq 4)
do
  docker run -d --rm --name=consul-${i} -p 850${i}:8500 -p 860${i}:8600 -v $(pwd)/consul-config.json:/consul/config/consul-config.json -e CONSUL_BIND_INTERFACE=eth0 consul:${CONSUL_VERSION}
done

for i in $(seq 4)
do
  docker exec consul-${i} consul join ${JOIN_IP} 
done


leader_ip=$(docker exec -it consul-0 consul info | grep leader_addr | tr -s " " | cut -d "=" -f2 | cut -d":" -f1)
leader_id=$(docker ps -q | xargs -n 1 docker inspect --format '{{ .NetworkSettings.IPAddress }} {{ .Name }}'| grep ${leader_ip} | sed 's/ \// /' | cut -d" " -f2)
echo ${leader_id}
docker logs -f ${leader_id}


## in another terminal
CONSUL_VERSION=1.9.5
JOIN_IP=$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' consul-0)
docker exec -it consul-1 consul leave
docker run -d --rm --name=consul-1 -p 8501:8500 -p 8601:8600 -v $(pwd)/consul-config.json:/consul/config/consul-config.json  -e CONSUL_BIND_INTERFACE=eth0 consul:${CONSUL_VERSION}
docker exec consul-1 consul join ${JOIN_IP} 

docker exec -it consul-2 consul leave
docker run -d --rm --name=consul-2 -p 8502:8500 -p 8602:8600 -v $(pwd)/consul-config.json:/consul/config/consul-config.json  -e CONSUL_BIND_INTERFACE=eth0 consul:${CONSUL_VERSION}
docker exec consul-2 consul join ${JOIN_IP} 

## check how many elections
curl http://127.0.0.1:8500/v1/agent/metrics?format=prometheus -sS | grep -i consul_raft_state_candidate


#cleanup
docker rm -f  consul-0 consul-1 consul-2 consul-3 consul-4
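
To make an election visible across the whole test cluster rather than on a single agent, one option is to scrape the same counter from every server's mapped HTTP port (a sketch based on the port mappings in the script above; consul.raft.state.candidate increments each time a server starts an election):

# consul-0..consul-4 are mapped to ports 8500..8504 in the script above
for port in 8500 8501 8502 8503 8504; do
  echo "== port ${port}"
  curl -sS "http://127.0.0.1:${port}/v1/agent/metrics?format=prometheus" | grep -i consul_raft_state_candidate
done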

ghost removed the waiting-reply label Apr 29, 2021
ChipV223 (Contributor) commented:

Thank you for the information, @ltagliamonte-dd!

This seems very environment-specific, particularly with Docker, since I was not able to repro this using Vagrant VMs. I'll try again using Docker containers and report back when I get the chance.

ltagliamonte commented:

@ChipV223 any update on this issue?

ChipV223 (Contributor) commented Jun 2, 2021

Hi @ltagliamonte!

I've not been able to look into this as I've been involved with other tasks. The plan is to try my repro again using Docker containers sometime this week, and I will update when I have any news to share.

mikemorris (Contributor) commented:

@ltagliamonte We're investigating a report of a similar issue and wanted to follow up here to confirm a few details:

  • Are you running Consul OSS or Enterprise in these reproductions?
    • If Enterprise, are any Enterprise features in use, including autopilot upgrades, read replicas, or redundancy zones?
  • Can you confirm that this issue is reproducible in both 1.8.8 and 1.9.5? I'm curious about this specifically because some of the logic around handling server leaves and leadership elections changed substantially in #9103 ("Switch to using the external autopilot module").

ltagliamonte-dd (Contributor, author) commented:

Hello @mikemorris

blake (Member) commented Jan 7, 2022

Hi @ltagliamonte-dd, this issue was fixed by hashicorp/raft#476, added to Consul in #11375, and released in 1.10.4.

Thanks for reporting this. I'm going to close this issue now. Feel free to re-open it if for some reason this does not resolve the issue.

blake closed this as completed Jan 7, 2022
ltagliamonte-dd (Contributor, author) commented Jan 7, 2022

@blake I did test with the steps I posted using 1.10.4, and I can still repro the issue.

dhiaayachi (Collaborator) commented:

Hey @ltagliamonte,
I was able to reproduce the issue with version 1.10.4. It seems related to this log line you see:
Mar 02 00:19:51 ip-10-4-73-198.us-west-2.compute.internal docker[3182]: 2021-03-02T00:19:51.754Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Nonvoter aec26945-b142-2890-3e1e-ce00b18322b8 10.4.73.219:8300}" error=EOF.
For some reason, the leader fails to append Raft logs to the newly rejoined node, which triggers an election. I will dig into it more and let you know.
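
One way to confirm that sequence in the Docker repro above is to grep the old leader's logs for the same messages that appear in the journal output earlier in this thread (a sketch; ${leader_id} is the container name printed by the leader-detection step in the repro script):

# look for the rejoining peer forcing the leader to step down
docker logs ${leader_id} 2>&1 | grep -E 'failed to appendEntries|peer has newer term|entering follower state|cluster leadership lost'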

dhiaayachi reopened this Jan 14, 2022