Why does the frontend node crash with an OOM error when querying a large table?
My frontend node crashes with an out-of-memory (OOM) error when I query a large table with over 40 million data points using the postgres client. The server has about 16 GB of memory. The crash itself produces no special logs, although I've noticed some error logs that might be related to disconnection. Here's an example of the error logs:
ERROR pgwire::pg_protocol: flush error: Broken pipe (os error 32)
ERROR pgwire::pg_protocol: Error: ReadMsgError: Connection reset by peer (os error 104)
ERROR pgwire::pg_protocol: Error: ReadMsgError: unexpected end of file
During the query, the memory usage keeps increasing until the service node crashes, and I receive the following message:
dev=> select * from iot_terminal_log;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
I'm running RisingWave via docker-compose, and the frontend node is the one that crashes. I'm considering trying another server, since this one is also running other services and might not have enough memory. Could the result set be too large for the frontend node to hold in memory, causing the overflow?
xin
Asked on Mar 06, 2023
It looks like the frontend node is running out of memory while serving this query. A `select *` with no filter or limit asks for all 40 million+ data points at once, and on a 16 GB server shared with other services the frontend node may simply not have enough memory to handle a result of that size. The pgwire errors you posted (broken pipe, connection reset by peer, unexpected end of file) indicate that the client connection dropped; they are most likely a symptom of the crash rather than its cause. You could try giving the server more memory, or reduce what the query returns, for example by selecting only the columns you need, adding a WHERE filter, or adding a LIMIT. It is also worth checking the memory configuration of the frontend node in your docker-compose setup to see whether any setting can be adjusted to prevent these crashes.
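One way to reduce what a single query returns is to read the table in batches with LIMIT and OFFSET instead of one unbounded `select *`. The snippet below is only a minimal sketch of that idea: it generates the batched statements and does not connect to anything. The function name `batched_queries` is made up for illustration, the `id` ordering column is an assumption (the schema of `iot_terminal_log` was not posted), and the total row count is assumed to be known in advance, e.g. from a `count(*)`. It has not been tested against RisingWave, and large offsets may still be expensive on the server side.

```python
from typing import Iterator


def batched_queries(table: str, order_by: str, total_rows: int,
                    batch_size: int) -> Iterator[str]:
    """Yield SELECT statements that cover total_rows rows, batch_size at a time.

    NOTE: table and order_by are interpolated directly into the SQL text,
    so only pass trusted identifiers. order_by should be a column with a
    stable ordering, otherwise batches may overlap or skip rows.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for offset in range(0, total_rows, batch_size):
        yield (
            f"SELECT * FROM {table} ORDER BY {order_by} "
            f"LIMIT {batch_size} OFFSET {offset};"
        )


if __name__ == "__main__":
    # Hypothetical numbers: 25 rows in batches of 10 gives three statements.
    for query in batched_queries("iot_terminal_log", "id", 25, 10):
        print(query)
```

Each generated statement can then be run one at a time from the postgres client, so that only one batch is in flight per query rather than the whole table.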