(First, apologies if this post now appears twice - it appears our mail server rewrites my address!) Hello all. I'm a pgsql performance virgin so hope I cross all the 't's and dot the lower-case 'j's when posting this query... On our production database server we're experiencing behaviour that several engineers are unable to explain - hence this Email. First, our specs; Scientific Linux 6.2, kernel 2.6.32 PG version 9.1.3, release 1PGDG.rhel6 24GB RAM 8 cores 2x software SSD-based RAIDs: a) ~660GB, RAID 5, 4 SSDs (data) b) ~160GB, RAID 1, 2 SSDs (xlogs + tmp tables) We're seeing SELECT statements and even EXPLAIN (no ANAYLZE) statements hang indefinitely until *something* (we don't know what) releases some kind of resource or no longer becomes a massive bottle neck. These are the symptoms. However, the system seems healthy - no table ('heavyweight') locks are held by any session (this happens with only a few connected sessions), all indexes are used correctly, other transactions are writing data (we generally only have a few sessions running at a time - perhaps 10) etc. etc. In fact, we are writing (or bgwriter is), 2-3 hundred MB/s sometimes. We regularly run vacuum analyze at quiet periods - generally 1-2s daily. These sessions (that only read data) that are blocked can block from anything from between only 5 minutes to 10s of hours then miraculously complete successfully at once. Any suggestions for my next avenue of investigation? I'll try and capture more data by observation next time it happens (it is relatively intermittent). Regards, Jim PS. These are the settings that differ from the default: checkpoint_segments = 128 maintenance_work_mem = 256MB synchronous_commit = off random_page_cost = 3.0 wal_buffers = 16MB shared_buffers = 8192MB checkpoint_completion_target = 0.9 effective_cache_size = 18432MB work_mem = 32MB effective_io_concurrency = 12 max_stack_depth = 8MB log_autovacuum_min_duration = 0 log_lock_waits = on autovacuum_vacuum_scale_factor = 0.1 autovacuum_naptime = 8 autovacuum_max_workers = 4 PPS. I've just noticed that our memory configuration is over subscribed! shared_buffers + effective_cache_size > Total available RAM! Could this be the root cause somehow? -- Jim Vanns Systems Programmer Framestore -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance