troubleshooting

Is there a way to buffer up messages from a source before sending to a sink in Kafka?

Neil is looking for a way to collect messages from a source for a specific duration before sending them to a sink in Kafka. He tried using WATERMARK but didn't get the desired result. He also mentioned the use case of sinking data to Iceberg and wanting to optimize the table in between batches.

Neil

Asked on Dec 11, 2023

  • Neil is trying to buffer up messages from a source before sending them to a sink in Kafka.
  • He attempted to use WATERMARK but it did not produce the expected outcome.
  • Neil's specific use case involves sinking data to Iceberg and wanting to optimize the table in between batches.
  • The underlying issue appears to be commit conflicts in the Iceberg table, likely because commits are coupled to checkpoints, so a commit can land while the table is being optimized.
  • To reduce the chance of commit conflicts, increase the checkpoint interval, for example by setting checkpoint_frequency to a larger value. Fewer checkpoints mean fewer commits and a longer gap between batches in which to optimize the table.
  • There is ongoing work to decouple commits from checkpoints in the system.
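
The effect of a longer checkpoint interval can be pictured as a time-based buffer: messages accumulate, and the sink receives one batch per interval instead of one write per message. The Python sketch below only illustrates that idea and is not how the system implements checkpoints or commits. TimedBuffer, interval_s, flush, and clock are hypothetical names introduced for this example.

```python
import time
from typing import Callable, List


class TimedBuffer:
    """Collects messages and passes them to `flush` at most once per interval.

    Hypothetical helper for illustration only; not part of any real sink API.
    """

    def __init__(
        self,
        interval_s: float,
        flush: Callable[[List[str]], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.interval_s = interval_s
        self.flush = flush
        self.clock = clock  # injectable so the behavior can be tested without sleeping
        self.pending: List[str] = []
        self.last_flush = clock()

    def add(self, message: str) -> None:
        """Buffer a message; flush the whole batch once the interval has elapsed."""
        self.pending.append(message)
        if self.clock() - self.last_flush >= self.interval_s:
            self.flush_now()

    def flush_now(self) -> None:
        """Send any buffered messages as one batch and restart the interval."""
        if self.pending:
            self.flush(self.pending)
            self.pending = []
        self.last_flush = self.clock()


if __name__ == "__main__":
    # Simulated clock: the list holds the current time in seconds.
    now = [0.0]
    batches: List[List[str]] = []
    buf = TimedBuffer(10.0, batches.append, clock=lambda: now[0])

    for t, msg in [(0.0, "a"), (5.0, "b"), (10.0, "c"), (12.0, "d")]:
        now[0] = t
        buf.add(msg)
    buf.flush_now()  # drain whatever is left at shutdown

    print(batches)
```

Note that this buffer only checks the clock when a message arrives, so a quiet source delays the flush until the next message. A production version would also need a timer, and the gap between flushes is where a table optimization job would run.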
Dec 12, 2023 (edited)