troubleshooting

What does it mean when I get the 'lease keep alive timeout' error on my RisingWave meta nodes?

I cannot seem to get my RisingWave cluster to ever last more than a day without some issue or another. Here is the error message I am encountering:

```
2024-04-12T19:41:58.46878663Z INFO risingwave_meta::rpc::election::etcd: keep alive loop for lease 207041146683978859 stopped
2024-04-12T19:41:58.468887662Z ERROR risingwave_meta::rpc::election::etcd: keep alive failed, stopping main loop
```
Neil

Asked on Apr 12, 2024

  • The 'lease keep alive timeout' error on RisingWave meta nodes indicates a failure in communication between the meta nodes and the etcd service.
  • The log lines above show the meta node failing to renew (keep alive) its etcd election lease, after which the keep-alive loop and then the main election loop stop.
  • Possible solutions include checking the etcd status, upgrading RisingWave to a newer version, monitoring etcd resource utilization, and ensuring network connectivity.
  • If the issue persists, consider trying out a new meta backend to replace etcd.
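
As a starting point for the "check etcd status" and "ensure network connectivity" suggestions above, here is a minimal Python sketch that could be run from the machine or pod hosting the meta node. It assumes etcd is reachable over plain HTTP on its default client port 2379 and that its `/health` endpoint is enabled. The host name `etcd` and the helper names `can_connect`, `parse_health`, and `etcd_healthy` are placeholders for illustration, not part of RisingWave or etcd.

```python
import json
import socket
import urllib.request


def parse_health(body: str) -> bool:
    """Interpret the JSON body of etcd's /health endpoint, e.g. {"health": "true"}."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and str(data.get("health")).lower() == "true"


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def etcd_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Query <base_url>/health and report whether etcd says it is healthy."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return parse_health(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # Covers connection errors, HTTP errors, timeouts, and malformed URLs.
        return False


if __name__ == "__main__":
    # Hypothetical endpoint: replace with the etcd address your meta node uses.
    host, port = "etcd", 2379
    print("TCP reachable:", can_connect(host, port))
    print("etcd healthy: ", etcd_healthy(f"http://{host}:{port}"))
```

If the TCP check fails, the problem is likely network or DNS related. If the TCP check passes but the health check fails or intermittently times out, etcd itself may be overloaded, which points back to the resource utilization suggestion. A TLS-enabled etcd would need an `https://` URL and client certificates, which this sketch does not handle.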
Apr 13, 2024 (edited)