troubleshooting

What does it mean when there is no `offset_info` available for internal CDC tables in RisingWave?

I'm using RisingWave for change data capture (CDC) from PostgreSQL, and I've noticed that some internal tables lack `offset_info`. Here's the process I follow to measure initial data load progress:

  1. Get a list of table partitions: `SHOW INTERNAL TABLES LIKE '__internal_%_source_%';`
  2. Get the current LSN offset:
     a. `SELECT offset_info->'split_info'->'pg_split'->'inner'->'start_offset' FROM {partition};`
     b. Convert that string to JSON, then pull out `sourceOffset.lsn`.
  3. Compare that to the current LSN in PostgreSQL: `SELECT (pg_current_wal_lsn() - '0/0') - {risingwave_lsn};`
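
For anyone following along, steps 2b and 3 can be sketched client-side in Python roughly as below. This is only an illustration: the function names are made up for the example, and the sample `start_offset` payload is an assumed shape (a JSON string with a numeric `sourceOffset.lsn`), not output copied from a real cluster. The one PostgreSQL fact it relies on is that an LSN prints as two hexadecimal numbers separated by a slash, the high and low 32 bits of the WAL byte position.

```python
import json
from typing import Optional


def pg_lsn_to_int(lsn_text: str) -> int:
    """Convert PostgreSQL's 'hi/lo' hexadecimal LSN text into a byte position."""
    hi, lo = lsn_text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def risingwave_lsn(start_offset: Optional[str]) -> Optional[int]:
    """Step 2b: parse the start_offset string and pull out sourceOffset.lsn.

    Returns None when the partition returned no row, so that case is
    reported as "unknown" rather than mistaken for zero lag.
    """
    if start_offset is None:
        return None
    payload = json.loads(start_offset)
    return int(payload["sourceOffset"]["lsn"])


def lag_bytes(current_wal_lsn: str, start_offset: Optional[str]) -> Optional[int]:
    """Step 3: bytes of WAL between PostgreSQL's current LSN and RisingWave's offset."""
    consumed = risingwave_lsn(start_offset)
    if consumed is None:
        return None
    return pg_lsn_to_int(current_wal_lsn) - consumed


# Hypothetical values for illustration only (assumed payload shape).
sample_offset = (
    '{"sourcePartition": {"server": "example"}, '
    '"sourceOffset": {"lsn": 29973552}, "isHeartbeat": false}'
)
print(lag_bytes("0/1C96000", sample_offset))  # lag for a partition with an offset row
print(lag_bytes("0/1C96000", None))           # partition with no rows
```

Here `current_wal_lsn` would be the text returned by `SELECT pg_current_wal_lsn();`, and `start_offset` the value from step 2a, or `None` when the partition has no rows.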

However, many of the table partitions have no rows, so `offset_info` is not returned. What does it mean when no `offset_info` is available at all for one of these internal tables?


Rick Otten

Asked on Dec 22, 2023

The absence of `offset_info` for an internal CDC table in RisingWave generally indicates that the connector has crashed or hit a problem, such as PostgreSQL reaching its `max_wal_senders` limit. Even when the upstream table is empty, the internal table in RisingWave should contain a row describing its consumption offset, because the Debezium connector emits heartbeat events. If `offset_info` is missing, check the compute node's log for errors.

Jan 03, 2024