Raft
#Distributed_System #Raft
Raft is a consensus algorithm designed to manage a replicated log. It’s used to ensure that a cluster of machines can work together in a reliable and fault-tolerant manner, providing strong consistency even in the presence of failures.
Roles of Raft
- Leader: The central node that handles all client requests, replicates log entries to followers, and manages the cluster.
 - Follower: Passive nodes that respond to requests from the leader and apply log entries.
 - Candidate: Nodes that attempt to become leaders during an election.
 
Key components
- Leader Election: Ensures that one and only one leader is elected among the cluster nodes.
 - Log Replication: Ensures that logs are replicated consistently across all nodes.
 - Safety: Ensures that the system behaves correctly and consistently, even in the presence of failures.
 - Membership Changes: Manages changes to the cluster’s membership dynamically.
 
Leader Election
- Election Timer: Each node starts as a follower and waits for a heartbeat from the leader. If a follower does not receive a heartbeat within a certain timeout, it becomes a candidate and starts an election.
 - Voting: Candidates request votes from other nodes. A node votes for the first candidate it receives a request from, resetting its election timer.
 - Winning the Election: A candidate becomes a leader if it receives a majority of votes. The new leader then sends heartbeats to all followers to assert its leadership.
 
Log Replication
- Append Entries: The leader appends new log entries from clients to its log and sends these entries to its followers.
 - Commitment: A log entry is committed once the leader has successfully replicated it to a majority of the cluster nodes.
 - Consistency: Followers apply committed entries to their state machines in the order they were appended.
 
Safety
- Term Number: Each node maintains a term number that increases with each election. This term number helps maintain a consistent state across the cluster.
 - Log Matching: Logs are replicated in the same order across all nodes, ensuring that the system remains consistent.
 - Leader Completeness: The leader always contains all committed entries from previous terms.
 
Membership Changes
- Configuration Changes: Raft supports changes to the cluster configuration (adding or removing nodes) through a two-phase commit process to ensure safety and consistency during transitions.
 
Algorithm in Detail
- Election Process:
 
- Followers wait for a timeout period without receiving a heartbeat.
 - The follower becomes a candidate and increments its term number.
 - The candidate requests votes from other nodes.
 - If the candidate receives a majority of votes, it becomes the leader.
 - The leader sends heartbeats to followers to maintain authority.
 
- Log Replication Process:
 
- The leader receives a command from the client and appends it to its log.
 - The leader sends an AppendEntries RPC to followers.
 - Followers append the entry to their logs and send an acknowledgment back.
 - Once the leader receives a majority of acknowledgments, it commits the entry.
 - The leader notifies followers of the committed entries.
 - Followers apply the committed entries to their state machines.
 
- Safety and Consistency:
 
- Logs are always ordered by term and index.
 - Entries are only committed if they are replicated on a majority of nodes.
 - Followers only accept entries from the current term or previous terms if they match their log’s previous entries.