Does RisingWave (RW) work with partitions in the Iceberg sink?

Neil is having issues with the Iceberg sink and partitions in RisingWave (RW). He has an Iceberg table partitioned by the hour of a timestamp column, but the data written by RW doesn't seem to respect the partition. When the same table is written from Spark, Spark creates a partitioned folder and writes the data files into it. Neil is using AWS Glue as the Iceberg catalog, exposed via an Iceberg REST server that RW connects to. The table is created with a partition specification, and the RW sink is configured to write to an S3 path.



Asked on Mar 18, 2024

  • RW does work with partitions in the Iceberg sink.
  • Iceberg uses hidden partitioning: partition values are tracked in the table metadata, and the specification does not require any particular partition path layout.
  • The issue Neil is facing, with data apparently not respecting the partition in RW, might be related to how Iceberg handles partitions internally rather than to the partition spec being ignored.
  • RW respects the partition specification, but because partitions are hidden, the data files may be written directly to the /data folder instead of the partitioned folder that Spark creates.
  • It's recommended to consult the team or investigate further how Iceberg handles partitions to understand the behavior observed in RW.
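
To illustrate the hidden-partitioning point above, here is a minimal Python sketch. It is not RisingWave or Iceberg code. The `hour_transform` function follows the Iceberg spec's definition of the `hour` transform (whole hours since 1970-01-01 00:00:00 UTC). The `manifest` list, the `ts_hour` field name, the bucket, and the file paths are all hypothetical stand-ins for the per-file partition values that Iceberg records in its metadata.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def hour_transform(ts: datetime) -> int:
    """Iceberg-style `hour` transform: whole hours since the Unix epoch (UTC)."""
    return int((ts - EPOCH).total_seconds() // 3600)


# Toy stand-in for manifest entries (hypothetical paths and field name).
# Each data file carries its partition value in metadata; the path is free-form.
manifest = [
    {
        # A file written flat under /data, as observed with the RW sink.
        "file_path": "s3://example-bucket/tbl/data/00000-a.parquet",
        "partition": {"ts_hour": hour_transform(
            datetime(2024, 3, 18, 9, 15, tzinfo=timezone.utc))},
    },
    {
        # A file written into a partition-named folder, as Spark does.
        "file_path": "s3://example-bucket/tbl/data/ts_hour=2024-03-18-10/00001-b.parquet",
        "partition": {"ts_hour": hour_transform(
            datetime(2024, 3, 18, 10, 40, tzinfo=timezone.utc))},
    },
]


def files_for_hour(entries, ts: datetime) -> list[str]:
    """Select data files by the partition value in metadata, never by path."""
    target = hour_transform(ts)
    return [e["file_path"] for e in entries if e["partition"]["ts_hour"] == target]


# Both files are found through metadata, regardless of folder layout.
nine = files_for_hour(manifest, datetime(2024, 3, 18, 9, 59, tzinfo=timezone.utc))
ten = files_for_hour(manifest, datetime(2024, 3, 18, 10, 0, tzinfo=timezone.utc))
print(nine)
print(ten)
```

In this sketch the file under the flat /data folder is selected for its hour just as reliably as the file under the partition-named folder, which is why a missing partition folder does not by itself mean the partition spec was ignored. One way to check a real table is to inspect the partition values recorded for each data file in the table metadata instead of looking at the S3 paths.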
Mar 19, 2024