How to remove a stuck job when the meta-node fails in RisingWave?
I'm facing an issue with RisingWave where the meta-node died and now I cannot remove a job. I've been following the instructions provided, but the job seems to be stuck. Here's what I've done so far:
- Set the system parameter via psql:
ALTER SYSTEM SET pause_on_next_bootstrap TO true;
- Restarted the meta node.
- Attempted to drop the relevant mviews, but they don't exist.
- Restarted the meta node again to resume.
However, when I try to cancel the job with CANCEL JOBS 62001, it returns (0 rows), and the job still appears in SHOW JOBS with 0.00% progress. Additionally, I get an error when trying to delete from rw_catalog.rw_ddl_progress.
The meta-node log shows an error related to failing to cancel a recovered streaming job. Any tips on how to proceed?
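For reference, the sequence described above can be written out as psql statements. This is only a recap of the steps in the question, not a confirmed fix. The view name my_stuck_mv is a hypothetical placeholder (the question does not name the views), and the final SELECT is an assumed read-only way to inspect the catalog view that the failed delete targeted.

```sql
-- Step 1: ask the meta node to start paused on its next bootstrap.
ALTER SYSTEM SET pause_on_next_bootstrap TO true;

-- (Restart the meta node here; this happens outside psql.)

-- Step 2: try to drop the materialized view behind the stuck job.
-- my_stuck_mv is a placeholder name, not taken from the question.
DROP MATERIALIZED VIEW IF EXISTS my_stuck_mv;

-- (Restart the meta node again to resume.)

-- Step 3: list in-progress jobs, then try to cancel the stuck one by ID.
SHOW JOBS;
CANCEL JOBS 62001;

-- Step 4: inspect DDL progress without modifying it (assumed read-only check).
SELECT * FROM rw_catalog.rw_ddl_progress;
```

In the situation described, Step 3 returns (0 rows) and the job remains listed at 0.00% progress.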
André Falk
Asked on Nov 30, 2023
I've been following the steps to remove a stuck job after a meta-node failure, but the job seems to be stuck at 0.00% progress and I can't cancel it or delete its progress from rw_catalog.rw_ddl_progress. The meta-node log indicates an error with canceling a recovered streaming job. What should I do next to resolve this issue?