troubleshooting

How does RisingWave manage cluster upgrades to ensure zero downtime? Is a rolling upgrade implemented, or does the company follow a different approach?

Redik asks how RisingWave handles cluster upgrades and whether zero downtime is guaranteed. Yufan Song explains why rolling upgrades are not preferred and suggests shutting down all compute nodes at once so that only a single recovery is needed. Yingjun offers a workaround: create a new instance, migrate the workload to it, and then shut down the old instance. The discussion also covers what zero downtime really means and how upgrades affect the user experience.

Redik

Asked on Aug 23, 2023

  • RisingWave does not prefer rolling upgrades because recovery is costly and nodes can become inconsistent with one another during the upgrade.
  • Instead, all compute nodes are shut down at once so that the cluster goes through only a single recovery.
  • As a workaround to minimize downtime, Yingjun suggests creating a new instance, migrating the workload to it, and then shutting down the old instance.
  • True zero downtime is hard to achieve; in practice the goal is a seamless user experience during upgrades.
  • Systems such as AWS Redshift and AWS Aurora may incur downtime during upgrades, typically a few minutes.
  • AWS Aurora can offer zero downtime for minor version upgrades by using replicas.
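
The trade-off described in the points above can be made concrete with a toy model. The sketch below is not RisingWave code. It assumes, purely for illustration, that restarting compute nodes one at a time triggers one cluster-wide recovery per node, and that the new-instance workaround exposes users only to a cutover window. The names `UpgradePlan`, `rolling_upgrade`, `full_restart`, and `new_instance_migration`, and all durations, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UpgradePlan:
    name: str
    recoveries: int           # cluster-wide recoveries triggered by the upgrade
    disruption_seconds: float  # time users may notice (toy estimate)


def rolling_upgrade(num_nodes: int, recovery_seconds: float) -> UpgradePlan:
    # Assumption: each single-node restart forces its own recovery.
    return UpgradePlan("rolling", num_nodes, num_nodes * recovery_seconds)


def full_restart(num_nodes: int, recovery_seconds: float) -> UpgradePlan:
    # All compute nodes stop together, so the cluster recovers only once.
    # num_nodes is accepted for symmetry but does not change the result.
    return UpgradePlan("full restart", 1, recovery_seconds)


def new_instance_migration(cutover_seconds: float) -> UpgradePlan:
    # Workaround from the thread: the new instance runs alongside the old
    # one, so in this model users only see the cutover window.
    return UpgradePlan("new instance + migrate", 0, cutover_seconds)


if __name__ == "__main__":
    for plan in (
        rolling_upgrade(num_nodes=4, recovery_seconds=30.0),
        full_restart(num_nodes=4, recovery_seconds=30.0),
        new_instance_migration(cutover_seconds=5.0),
    ):
        print(plan)
```

Under these assumptions, a 4-node cluster with a 30-second recovery pays for four recoveries in the rolling plan and one in the full-restart plan. The real costs depend on state size and workload, which this model ignores.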
Aug 24, 2023 (edited)