
How does RW handle leader election and service discovery during network partitions?

I was looking into how RW avoids split-brain situations during network partitions and could not find much documentation. My best guess is that this is currently delegated to etcd. From what I can tell from the code, the __meta_election_ key on etcd is used to determine which meta node is the leader. Is that correct?


Rafael Acevedo

Asked on Sep 21, 2023

  • RW handles leader election and service discovery during network partitions by relying on etcd, which uses the Raft protocol internally to stay consistent.
  • All meta nodes participate in the election process, ensuring only one node is elected as leader while others act as followers.
  • If the leader crashes or fails to renew its lease within the valid time, the election is re-triggered and a new leader is elected.
  • The old leader, if it is still running, automatically withdraws from the election or becomes a follower.
  • Service discovery logic is encapsulated in the meta client, so clients can discover the new leader and keep sending requests as usual.
  • Initially, a different solution was planned for the cloud environment, but the decision was made to maintain leader election and service discovery within the kernel itself.
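
The lease-based flow summarized above can be sketched with a small in-memory simulation. This is not RisingWave's code and does not call etcd: `ElectionStore`, `MetaClient`, the node names, and the TTL value are all hypothetical, and time is passed in explicitly so the behaviour is deterministic. It only illustrates the idea that a single lease decides who is leader, and that a partitioned leader loses its role once the lease lapses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: str        # node id that currently holds the leader key
    expires_at: float  # simulated time at which the lease lapses


class ElectionStore:
    """In-memory stand-in for the coordination store (etcd in this thread)."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        self.lease: Optional[Lease] = None

    def leader(self, now: float) -> Optional[str]:
        # An expired lease means nobody is leader until a node campaigns again.
        if self.lease is not None and now < self.lease.expires_at:
            return self.lease.holder
        return None

    def campaign(self, node: str, now: float) -> bool:
        # The leader key can only be taken while no live lease exists.
        current = self.leader(now)
        if current is None:
            self.lease = Lease(node, now + self.ttl)
            return True
        return current == node

    def keep_alive(self, node: str, now: float) -> bool:
        # A node that no longer holds a live lease must step down.
        if self.lease is None or self.leader(now) != node:
            return False
        self.lease.expires_at = now + self.ttl
        return True


class MetaClient:
    """Hypothetical client that looks up the current leader before each request."""

    def __init__(self, store: ElectionStore) -> None:
        self.store = store

    def request(self, now: float) -> str:
        leader = self.store.leader(now)
        if leader is None:
            raise RuntimeError("no leader elected yet")
        return f"handled by {leader}"


store = ElectionStore(ttl=5.0)
client = MetaClient(store)

store.campaign("meta-1", now=0.0)    # meta-1 wins the election
store.campaign("meta-2", now=1.0)    # meta-2 loses and stays a follower
print(client.request(now=2.0))

# meta-1 is partitioned away and stops renewing; its lease lapses at t=5.
store.campaign("meta-2", now=6.0)    # election re-triggered, meta-2 wins
print(store.keep_alive("meta-1", now=7.0))  # old leader finds it lost the lease
print(client.request(now=7.0))
```

In a real deployment the store is a replicated service, so a node on the minority side of a partition cannot renew its lease at all, which is what prevents two leaders from being active at once.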
Sep 22, 2023