all-things-risingwave
How does RW handle leader election and service discovery during network partitions?
I was looking up how RW avoids split-brain situations during network partitions and could not find much. My best guess is that right now this is delegated to etcd. From what the code indicates, the __meta_election_ key on etcd is used to detect which meta node is the leader. Is that correct?
Rafael Acevedo
Asked on Sep 21, 2023
- RW handles leader election and service discovery during network partitions by relying on etcd, which is built on the Raft consensus protocol.
- All meta nodes participate in the election process, ensuring only one node is elected as leader while others act as followers.
- If the leader crashes or fails to renew its lease within the validity period, the election is re-triggered and a new leader is elected.
- The old leader, if it exists, will automatically withdraw from the election or become a follower.
- Service discovery logic is encapsulated in the meta client to enable clients to discover the new leader and handle normal requests effectively.
- Initially, a different solution was planned for the cloud environment, but the decision was made to maintain leader election and service discovery within the kernel itself.
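
To make the lease-based election described above more concrete, here is a minimal sketch in Python. It is not RisingWave's or etcd's actual code or API: `LeaseStore`, `campaign`, `keep_alive`, `leader`, and `simulate` are hypothetical names, the store is an in-memory stand-in for etcd, and the 5-second TTL is an arbitrary value chosen for illustration. Only the key name `__meta_election_` comes from the question. Time is passed in explicitly so the behavior is deterministic.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

ELECTION_KEY = "__meta_election_"  # key name taken from the question above


@dataclass
class Lease:
    holder: str        # node that currently owns the key
    expires_at: float  # time after which the lease is no longer valid


class LeaseStore:
    """Hypothetical in-memory stand-in for etcd: a key guarded by a TTL lease."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        self.leases: Dict[str, Lease] = {}

    def campaign(self, key: str, node: str, now: float) -> bool:
        """Claim `key` if it is free or its lease has expired."""
        lease = self.leases.get(key)
        if lease is None or lease.expires_at <= now:
            self.leases[key] = Lease(node, now + self.ttl)
            return True
        # A valid lease exists: only its current holder "wins".
        return lease.holder == node

    def keep_alive(self, key: str, node: str, now: float) -> bool:
        """Renew the lease; fails if `node` no longer holds a valid lease."""
        lease = self.leases.get(key)
        if lease is None or lease.holder != node or lease.expires_at <= now:
            return False
        lease.expires_at = now + self.ttl
        return True

    def leader(self, key: str, now: float) -> Optional[str]:
        """What a client would see when it looks up the current leader."""
        lease = self.leases.get(key)
        if lease is None or lease.expires_at <= now:
            return None
        return lease.holder


def simulate() -> List[Optional[str]]:
    """Walk through a partition: meta-1 leads, stops renewing, meta-2 takes over."""
    store = LeaseStore(ttl=5.0)
    timeline: List[Optional[str]] = []
    store.campaign(ELECTION_KEY, "meta-1", now=0.0)    # meta-1 becomes leader
    store.campaign(ELECTION_KEY, "meta-2", now=1.0)    # rejected, meta-2 stays a follower
    timeline.append(store.leader(ELECTION_KEY, now=1.0))
    store.keep_alive(ELECTION_KEY, "meta-1", now=4.0)  # lease extended to t=9
    # meta-1 is now cut off from the store and cannot renew its lease.
    timeline.append(store.leader(ELECTION_KEY, now=9.5))
    store.campaign(ELECTION_KEY, "meta-2", now=10.0)   # election re-triggered
    timeline.append(store.leader(ELECTION_KEY, now=10.0))
    return timeline
```

In this sketch, a leader that gets `False` back from `keep_alive` has lost its lease and should stop acting as leader, which corresponds to the "withdraw or become a follower" point above. Clients that call something like `leader` again after a failed request would pick up the new leader. A real deployment also has to deal with clock skew and in-flight requests from the old leader, which this sketch ignores.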
Sep 22, 2023