In this scenario, events are streamed to Kafka and then ingested into RisingWave as a table, with downstream materialized views for transformation and sink. When the event schema is updated, events with the new schema start streaming into RisingWave. The question is whether processing can be restricted to data from the last couple of days, to avoid rescanning all existing data, and whether indexing by time can speed up data processing.
Tuan Vuong
Asked on Sep 19, 2023
Yes, creating an index on the time column can benefit data processing in this scenario. When data is indexed by time, it is sorted by time, which makes filtering on a time range faster.
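
To illustrate why time-sorted data helps, here is a minimal Python sketch of the general idea. It is not RisingWave code and says nothing about how RisingWave implements indexes internally. The sample data, the `rows_since` helper, and the two-day window are all assumptions made up for the example.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

# Hypothetical data: one event per day for the 30 days before `now`,
# stored oldest first, the way a time-ordered index would keep it.
now = datetime(2023, 9, 19)
timestamps = [now - timedelta(days=d) for d in range(30, 0, -1)]
rows = [f"event-{i}" for i in range(30)]


def rows_since(timestamps, rows, cutoff):
    """Return the rows whose timestamp is at or after `cutoff`.

    Because `timestamps` is sorted, a binary search finds the first
    matching position in O(log n) and older rows are never visited.
    """
    start = bisect_left(timestamps, cutoff)
    return rows[start:]


# Keep only events from the last two days.
recent = rows_since(timestamps, rows, now - timedelta(days=2))
print(recent)
```

Without the sort order, the same filter would have to compare every timestamp against the cutoff, which is the full rescan the question is trying to avoid.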