How to replicate Flink event-time temporal joins in RW?
I'm looking to replicate a Flink query that involves event-time temporal joins in Real-time Warehouse (RW) environment. The query involves joining a temporal table based on event time. Here is the Flink query for reference:
SELECT
tr.event_time,
tr.price as price_usd,
tr.price * exc.rate as price_cad
FROM transactions tr
LEFT JOIN rates FOR SYSTEM_TIME AS OF tr.event_time exc
ON tr.currency = exc.currency_base AND exc.currency_target = 'CAD'
I'm considering expressing it as an interval join
in RW, but it seems to require additional steps like left join, filtering exchange rates, ranking rates, and handling late arrivals and retractions. How can I replicate Flink event-time temporal joins in RW effectively?
Sergii Mikhtoniuk
Asked on Apr 09, 2024
- Replicating Flink event-time temporal joins in RW involves several steps:
- Perform a left join between the transactions and rates tables.
- Filter the exchange rates that precede the transaction event time.
- Rank the rates and keep only the closest one.
- Handle late arrivals and retractions effectively.
-
Consider using an
interval join
approach in RW, but be prepared for additional complexity and steps compared to Flink's event-time temporal join. -
Ensure that the join conditions and filtering criteria align with the Flink query to achieve the desired results in RW.
-
Test the replication thoroughly, especially in scenarios involving late arrivals and retractions, to understand the behavior and performance of the solution.