85-improving-availability-with-failover.md

---
layout: post
title: Improving availability with failover
date: 2018-10-26 12:02
comments: true
categories: system design
language: en
references:
---

Cold Standby: Use heartbeats or metrics/alerts to detect failure. Provision new standby nodes when a failure occurs. Only suitable for stateless services, because the replacement node starts with no state.
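
The heartbeat-based detection described above can be sketched roughly as follows. This is a minimal illustration, not a production design: `HeartbeatMonitor`, `fail_over`, the `provision` callback, and the timeout value are all hypothetical names and parameters introduced for the example.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per node and reports nodes that went silent."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen = {}         # node name -> time of last heartbeat

    def beat(self, node):
        """Record a heartbeat from a node."""
        self.last_seen[node] = self.clock()

    def failed_nodes(self):
        """Return nodes whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


def fail_over(monitor, provision):
    """Cold standby: provision a replacement only after a node has failed.

    `provision` is a hypothetical callback that starts a new node and
    returns its name. The new node has no state, which is why this only
    suits stateless services.
    """
    replacements = {}
    for node in monitor.failed_nodes():
        del monitor.last_seen[node]
        new_node = provision(node)
        monitor.beat(new_node)
        replacements[node] = new_node
    return replacements
```

In practice the timeout has to balance detection speed against false positives from slow networks or long pauses on the monitored node.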

Hot Standby: Keep two active systems performing the same role. Data is mirrored in near real time, so both systems hold identical data.

Warm Standby: Keep two active systems, but the secondary one does not take traffic unless a failure occurs.
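
As a rough sketch of the warm standby pattern, the router below sends all traffic to the primary and promotes the secondary only when the primary fails. `Node` and `WarmStandbyRouter` are hypothetical names, and mirroring each write to the standby is an assumption added here so that the standby has data to serve after promotion.

```python
class Node:
    """A toy in-memory server; `healthy` simulates whether it is reachable."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.data = {}

    def handle(self, key, value):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value
        return self.name


class WarmStandbyRouter:
    """Routes every request to the active node; the standby stays idle."""

    def __init__(self, primary, secondary):
        self.active = primary
        self.standby = secondary

    def write(self, key, value):
        try:
            served_by = self.active.handle(key, value)
        except ConnectionError:
            # Failover: promote the standby and retry the request on it.
            self.active, self.standby = self.standby, self.active
            return self.active.handle(key, value)
        # Assumption: writes are mirrored so the standby is ready to take over.
        if self.standby.healthy:
            self.standby.data[key] = value
        return served_by
```

If both nodes are down, the retry raises `ConnectionError` to the caller, which is the point where a real system would alert an operator.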

Checkpointing (similar to a Redis snapshot): Use a write-ahead log (WAL) to record requests before processing them. The standby node recovers from the log during failover.

  • cons
    • recovery is time-consuming for large logs
    • data written since the last checkpoint is lost
  • use cases: Storm, MillWheel, Samza
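
To make the checkpoint-plus-log recovery concrete, here is a minimal in-memory sketch. `CheckpointedCounter` and `recover` are hypothetical names; the Python list and JSON string stand in for a durable log and a durable snapshot, which a real system would keep on disk or in replicated storage.

```python
import json


class CheckpointedCounter:
    """Per-key counters with a write-ahead log and periodic checkpoints."""

    def __init__(self):
        self.state = {}
        self.wal = []              # stand-in for a durable append-only log
        self.checkpoint = "{}"     # stand-in for a durable snapshot
        self.checkpoint_pos = 0    # number of WAL entries the snapshot covers

    def apply(self, key, delta):
        # Record the request in the log before processing it.
        self.wal.append(json.dumps({"key": key, "delta": delta}))
        self._process(key, delta)

    def _process(self, key, delta):
        self.state[key] = self.state.get(key, 0) + delta

    def take_checkpoint(self):
        self.checkpoint = json.dumps(self.state)
        self.checkpoint_pos = len(self.wal)


def recover(checkpoint, checkpoint_pos, wal):
    """Rebuild state on a standby: load the snapshot, then replay the log tail."""
    standby = CheckpointedCounter()
    standby.state = json.loads(checkpoint)
    standby.wal = list(wal)
    standby.checkpoint = checkpoint
    standby.checkpoint_pos = checkpoint_pos
    for entry in wal[checkpoint_pos:]:
        record = json.loads(entry)
        standby._process(record["key"], record["delta"])
    return standby
```

Replay time grows with the number of log entries after the checkpoint, and if the tail of the log does not survive the failure, the standby only recovers the state as of the last checkpoint.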

Active-active (or all active): Keep two active systems behind a load balancer. Both of them take traffic in parallel. Data replication is bi-directional.
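
A toy version of the active-active setup might look like the sketch below. `Replica` and `RoundRobinBalancer` are hypothetical names, and replication is done synchronously inside `write` purely to keep the example short; real deployments usually replicate asynchronously and need a conflict-resolution policy for concurrent writes to the same key.

```python
import itertools


class Replica:
    """An active node that applies writes locally and replicates to its peers."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.peers = []

    def write(self, key, value):
        self.data[key] = value
        for peer in self.peers:
            # Simplification: synchronous replication, so no conflicts arise.
            peer.data[key] = value
        return self.name


class RoundRobinBalancer:
    """Spreads requests across all replicas in turn."""

    def __init__(self, replicas):
        self._cycle = itertools.cycle(replicas)

    def write(self, key, value):
        return next(self._cycle).write(key, value)
```

Wiring the two replicas as peers of each other gives the bi-directional replication: a write accepted by either node ends up on both.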