Hello, I'm trying to understand how RisingWave handles missing column values in records ingested from Kafka via Change Data Capture (CDC). Is it possible to define a default missing value, such as NULL
or occasionally a different non-null value? Additionally, is RisingWave PostgreSQL-compatible in terms of how null-related semantics are handled?
For example, in our use case, we have a large number of materialized views generated based on an internal query language. It's critical that the CREATE SOURCE
command, which corresponds to tables in our transactional database streamed via CDC and Kafka, can accurately represent NULL
values. If not, the materialized views created on top of those sources will return incorrect results, especially when applying different aggregations like MIN/MAX/SUM to the same source table columns.
Nizar Hejazi
Asked on Jun 02, 2023
Currently, RisingWave only supports non-NULL default values for CREATE TABLE
without a connector, and the support for non-NULL default values for external sources is a work in progress. However, RisingWave is designed to strictly follow the behavior of PostgreSQL in handling NULL
values. The default missing value for columns in CREATE SOURCE
is implicitly set to NULL
, which should meet the requirement for handling missing values in materialized views and sources. If a schema change is needed, it seems that adding a column with a NULL
default value is not supported, so the materialized view would need to be dropped and recreated to ensure the new column has NULL
as the default value.