
How do I troubleshoot compute node failures when using a StarRocks sink for analytics data synchronization?

I am seeing compute node failures when using a StarRocks sink to sync analytics data. The compute nodes keep failing, and I am unsure of the root cause. The error log points to a barrier issue related to the sink. How can I troubleshoot this effectively?


Shelton Suen

Asked on Apr 30, 2024

  1. Check the number of compute nodes in the cluster to ensure it meets the requirements for the workload.

  2. Verify the resources allocated to the compute nodes, such as CPU and memory limits, to ensure they are sufficient for the workload.

  3. Review the logs and error messages to identify any patterns or specific errors related to the compute node failures.

  4. Consider the version compatibility between RisingWave and StarRocks to ensure there are no compatibility issues causing the failures.

  5. Investigate other tasks and sinks running in the cluster to see if they are impacting the stability of the compute nodes.

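  6. Look for chained errors or dependencies: the error reported last may only be a symptom, so trace back to the earliest error in the chain to find what is actually causing the compute node failures.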

  7. Compare compute node restarts before and after adding the StarRocks sink to pinpoint any specific triggers.

  8. Collaborate with the team to gather insights and perspectives on the issue for a comprehensive troubleshooting approach.
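
As a rough illustration of steps 3 and 6, the sketch below scans log lines for a few keyword categories and counts how often each appears, then pulls out lines that mention both a barrier and a sink. The keyword patterns, function names, and sample log lines are all hypothetical and are not taken from actual RisingWave or StarRocks output, so adjust them to whatever your compute node logs really contain.

```python
import re
from collections import Counter

# Hypothetical keyword patterns (assumptions, not real log formats).
PATTERNS = {
    "barrier": re.compile(r"barrier", re.IGNORECASE),
    "sink": re.compile(r"sink", re.IGNORECASE),
    "memory": re.compile(r"\boom\b|out of memory", re.IGNORECASE),
}


def summarize_log(lines):
    """Count how many lines match each pattern category."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def lines_matching_all(lines, names):
    """Return the lines that match every named pattern (e.g. barrier AND sink)."""
    return [line for line in lines if all(PATTERNS[n].search(line) for n in names)]


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [
        "ERROR failed to collect barrier: sink executor error",
        "WARN process killed: out of memory",
        "INFO checkpoint completed",
    ]
    print(summarize_log(sample))
    for line in lines_matching_all(sample, ["barrier", "sink"]):
        print(line)
```

If the memory category shows up shortly before the barrier errors, that would suggest revisiting step 2 (resource limits) before looking at the sink itself.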

Apr 30, 2024