How to troubleshoot compute node failures when using the StarRocks sink for analytics data synchronization?
I am seeing compute node failures when using the StarRocks sink to sync analytics data. The compute nodes keep failing, and I am unsure of the root cause. The error log indicates a barrier issue related to the sink. How can I troubleshoot this effectively?
Shelton Suen
Asked on Apr 30, 2024
- Check the number of compute nodes in the cluster to ensure it meets the requirements for the workload.
- Verify the resources allocated to the compute nodes, such as CPU and memory limits, to ensure they are sufficient for the workload.
- Review the logs and error messages to identify any patterns or specific errors related to the compute node failures.
- Consider the version compatibility between RisingWave and StarRocks to ensure there are no compatibility issues causing the failures.
- Investigate other tasks and sinks running in the cluster to see if they are impacting the stability of the compute nodes.
- Look for chained errors or dependencies that could be causing the compute node failures.
- Perform a detailed analysis of the compute node restarts when adding the StarRocks sink to pinpoint any specific triggers.
- Collaborate with the team to gather insights and perspectives on the issue for a comprehensive troubleshooting approach.
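The log review and restart analysis steps above could be sketched roughly as follows. This is only an illustration in plain Python, not a RisingWave or StarRocks tool: the keyword patterns, the function names `summarize_log` and `restarts_after`, the 10-minute window, and the sample log lines are all assumptions made up for the example, so adjust them to whatever your compute node logs actually contain.

```python
import re
from collections import Counter

# Hypothetical keywords to look for; these are assumptions, not an official
# list of RisingWave or StarRocks error messages.
PATTERNS = {
    "barrier": re.compile(r"barrier", re.IGNORECASE),
    "sink": re.compile(r"sink", re.IGNORECASE),
    "oom": re.compile(r"out of memory|\boom\b", re.IGNORECASE),
}


def summarize_log(lines):
    """Count how many log lines match each keyword pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def restarts_after(restart_times, sink_created_at, window_seconds=600):
    """Return restart timestamps (in seconds) that fall within
    window_seconds after the sink was created."""
    return [
        t for t in restart_times
        if 0 <= t - sink_created_at <= window_seconds
    ]


if __name__ == "__main__":
    # Made-up log lines, only to show the shape of the input.
    sample = [
        "ERROR failed to collect barrier: sink write error",
        "WARN memory usage high",
        "ERROR process killed: Out of memory",
    ]
    print(summarize_log(sample))
    # Made-up timestamps: restarts at 100s, 500s, 2000s; sink created at 90s.
    print(restarts_after([100, 500, 2000], sink_created_at=90))
```

If most restarts cluster shortly after the sink is created and the matching lines mention both the barrier and the sink, that points toward the sink itself rather than general resource pressure. A high out-of-memory count would instead suggest revisiting the CPU and memory limits first.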