I'm trying to use the TUMBLE function in SQL to compute averages over a 30-second window for new data coming in, without including old persisted data. I attempted the following query, but it's slower than when I include old data. Am I doing something wrong?
CREATE MATERIALIZED VIEW IF NOT EXISTS fibersensordata_30s_mean AS
WITH pre_filtered_data AS (
SELECT
machine_id,
subject_id,
timestamp,
raw_magnitude,
raw_phase,
nodes_per_meter,
node_quality
FROM
fibersensordata
WHERE
timestamp >= (SELECT MAX(timestamp) - INTERVAL '30 SECONDS' FROM fibersensordata)
)
SELECT
machine_id,
subject_id,
window_start AS starttime,
window_end AS endtime,
AVG(raw_magnitude) AS mean_raw_magnitude,
AVG(raw_phase) AS mean_raw_phase,
AVG(nodes_per_meter) AS mean_nodes_per_meter,
AVG(node_quality) AS mean_node_quality,
COUNT(window_end) AS fibersensordata_samples
FROM
TUMBLE(pre_filtered_data, pre_filtered_data.timestamp, INTERVAL '30 SECONDS')
GROUP BY
machine_id, subject_id, window_start, window_end
ORDER BY
endtime;
I want to only include data starting from 30 seconds before the current time and update every 30 seconds with new incoming data.
Atiqul Islam
Asked on Apr 06, 2024
To achieve real-time computation with the TUMBLE function, you should use now() - INTERVAL '30 SECONDS'
instead of selecting the maximum timestamp from the table. However, make sure to remove the FROM fibersensordata
part from the WHERE clause, as it's causing a syntax error. The corrected WHERE clause should look like this:
WHERE
timestamp >= now() - INTERVAL '30 SECONDS'
This will ensure that your materialized view only includes data from the last 30 seconds and updates with new data as it comes in.