How to handle multiple barrier messages arriving at different timestamps in a stream processing system?

In a stream processing system, barrier messages from different upstreams can reach a merge node at different times. How should the system handle this so that processing continues normally and snapshots still complete?

JJ

Asked on May 23, 2022

  1. A node with multiple inputs (fan-in), such as Merge, must wait until it has collected the barrier of the same epoch from every upstream, and only then emit that barrier downstream.
  2. The Merge node always waits for a barrier from every upstream node, including upstream nodes that run on remote machines.
  3. To produce a global snapshot, each parallel instance on each machine takes its own snapshot and hands it over to SharedBuffer for merging and uploading.
  4. At the cluster level, the StreamManager on the Meta Node waits for all machines to complete their checkpoints before committing the checkpoint globally.
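
Points 1, 2, and 4 above can be illustrated with a small sketch. It is not RisingWave's actual code: `Barrier`, `MergeAligner`, and `CheckpointCoordinator` are hypothetical names chosen for illustration, and the sketch assumes each upstream delivers its messages in order. Once an input has delivered its barrier, the aligner holds back that input's later messages until every other input has delivered the barrier of the same epoch.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Barrier:
    epoch: int


class MergeAligner:
    """Hypothetical barrier aligner for a node with several upstream inputs."""

    def __init__(self, num_inputs):
        self.queues = [deque() for _ in range(num_inputs)]
        # input index -> barrier already received for the current epoch
        self.pending = {}

    def push(self, input_id, message):
        """Accept one upstream message; return the messages that may be emitted now."""
        self.queues[input_id].append(message)
        out = []
        progress = True
        while progress:
            progress = False
            for i, queue in enumerate(self.queues):
                # An input that already delivered its barrier stays blocked.
                while i not in self.pending and queue:
                    msg = queue.popleft()
                    progress = True
                    if isinstance(msg, Barrier):
                        self.pending[i] = msg
                    else:
                        out.append(msg)
            if len(self.pending) == len(self.queues):
                # Every upstream has reached the barrier: emit it exactly once.
                epochs = {b.epoch for b in self.pending.values()}
                assert len(epochs) == 1, "upstreams disagree on the epoch"
                out.append(Barrier(epochs.pop()))
                self.pending.clear()
                progress = True
        return out


class CheckpointCoordinator:
    """Hypothetical cluster-level view: commit an epoch once all workers report."""

    def __init__(self, workers):
        self.workers = set(workers)
        self.reported = {}   # epoch -> workers that finished their checkpoint
        self.committed = []  # epochs committed globally, in commit order

    def report(self, worker, epoch):
        """Record one worker's completed checkpoint; return True if the epoch commits."""
        done = self.reported.setdefault(epoch, set())
        done.add(worker)
        if done == self.workers:
            self.committed.append(epoch)
            del self.reported[epoch]
            return True
        return False
```

With two inputs, if input 0 sends `Barrier(1)` followed by a record while input 1 is still sending records for the earlier epoch, the aligner emits input 1's records immediately, holds input 0's later record, and releases `Barrier(1)` followed by the held record only when input 1's `Barrier(1)` arrives. The coordinator applies the same "wait for everyone" rule one level up, across machines.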
May 23, 2022