Craig James wrote:
I managed to capture one such event using top(1) with the "batch" option as a background process. See output below.
You should add "-c" to your batch top capture, then you'll be able to see what the individual postmaster processes are actually doing when things get stuck.
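As a rough sketch of that kind of capture, the helper below starts top in batch mode with full command lines and appends its output to a log file. The function names, the log path, and the 5-second interval are my own assumptions for illustration; the flags used (-b, -c, -d, -n) are the standard procps top options.

```python
import subprocess


def build_top_command(interval_s=5, iterations=720):
    """Return the argv for a batch-mode top capture.

    -b  batch mode (plain text output, suitable for a log file)
    -c  show the full command line instead of just the program name
    -d  delay between samples, in seconds
    -n  number of samples to take before exiting
    """
    return ["top", "-b", "-c", "-d", str(interval_s), "-n", str(iterations)]


def capture_top(logfile_path, interval_s=5, iterations=720):
    """Start top in the background, appending its output to logfile_path.

    Returns the Popen handle so the caller can wait on it or terminate it.
    """
    with open(logfile_path, "a") as log:
        return subprocess.Popen(
            build_top_command(interval_s, iterations),
            stdout=log,
            stderr=subprocess.STDOUT,
        )


# Example (hypothetical path): sample every 5 seconds for about an hour.
# proc = capture_top("/tmp/top-capture.log")
```

With -c in place, each PostgreSQL backend line should show its process title rather than a bare "postmaster", which is what makes the capture useful for seeing what each process was doing.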
max_connections = 1000
shared_buffers = 2000MB
work_mem = 256MB

Mem:  8194800k total, 8118688k used, 76112k free, 36k buffers
Swap: 2031608k total, 169348k used, 1862260k free, 7313232k cached
These settings appear way too high for a server with 8GB of RAM. I'm not sure if max_connections is too large, or if it's work_mem that's too big, but one or both of them may need to be tuned way down from where they are now to get your memory usage under control. Your server might be running out of RAM during the periods where it becomes unresponsive--that could be the system paging stuff out to swap, which isn't necessarily a high user of I/O but it will block things. Having essentially no memory used for buffers (36k here) is never a good sign.
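To put rough numbers on that, here is a back-of-the-envelope calculation using the values quoted above. It assumes, pessimistically, that every allowed connection is in use and each one holds a single work_mem allocation at the same moment; in practice a query can use work_mem more than once (one per sort or hash step), and most connections will be idle, so treat this as an illustration of scale rather than a prediction. The variable names are mine.

```python
# Values taken from the posted configuration and top output.
max_connections = 1000
shared_buffers_mb = 2000
work_mem_mb = 256
total_ram_kb = 8194800  # "Mem: 8194800k total"

total_ram_mb = total_ram_kb / 1024

# Simplified worst case: shared_buffers plus one work_mem per connection.
worst_case_mb = shared_buffers_mb + max_connections * work_mem_mb
overcommit_ratio = worst_case_mb / total_ram_mb

# What is left per connection if RAM minus shared_buffers were split evenly
# (ignores the OS, the page cache, and every other process on the box).
per_connection_mb = (total_ram_mb - shared_buffers_mb) / max_connections

print(f"RAM:                  {total_ram_mb:,.0f} MB")
print(f"Worst-case demand:    {worst_case_mb:,} MB")
print(f"Overcommit ratio:     {overcommit_ratio:.1f}x")
print(f"Even split per conn.: {per_connection_mb:.1f} MB")
```

Under these assumptions the configured limits allow for roughly 32 times the physical RAM, and an even split would leave only about 6MB per connection, which is why one or both settings look like candidates for reduction.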
--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@xxxxxxxxxxxxxxx www.2ndQuadrant.us