On Fri, Nov 16, 2012 at 11:19 AM, Vlad <marchenko@xxxxxxxxx> wrote:
>
>> We're looking for spikes in 'blk' which represents when lwlocks bump.
>> If you're not seeing any then this is suggesting a buffer pin related
>> issue -- this is also supported by the fact that raising shared
>> buffers didn't help. If you're not seeing 'blk's, go ahead and
>> disable the stats macro.
>
>
> most blk comes with 0, some with 1, few hitting 100. I can't say that during
> stall times the number of blk 0 vs blk non-0 are very different.

right. this is feeling more and more like a buffer pin issue. but even
then we can't be certain -- it could be a symptom, not the cause. to
prove it we need to demonstrate that everyone is spinning and waiting,
which we haven't done. classic spinlock contention manifests in high
user cpu. we are bound in the kernel, so I wonder if it's all the
select() calls. we haven't yet ruled out a kernel regression.

If I were you, I'd be investigating pgbouncer to see if your app is
compliant with transaction mode processing, if for no other reason
than it will mitigate high load issues.

>> *) How many specific query plans are needed to introduce the
>> condition? Hopefully, it's not too many. If so, let's start
>> gathering the plans. If you have a lot of plans to sift through, one
>> thing we can attempt to eliminate noise is to tweak
>> log_min_duration_statement so that during stall times (only) it logs
>> offending queries that are unexpectedly blocking.
>
>
> unfortunately, there are quite a few query plans... also, I don't think
> setting log_min_duration_statement will help us, cause when server is
> hitting high load average, it reacts slowly even on a key press. So even
> non-offending queries will be taking long to execute. I see all sorts of
> queries being executed long during stall: spanning from simple
> LOG: duration: 1131.041 ms statement: SELECT 'DBD::Pg ping test'
> to complex ones, joining multiple tables.
> We are still looking into all the logged queries in attempt to find the ones
> that are causing the problem, I'll report if we find any clues.

right.

merlin

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
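
One way to put numbers on the point that even trivial queries run long during a stall is to scan the server log for duration lines and count how many slow statements are just the ping query. The following is only a minimal sketch, assuming the log lines look like the "LOG: duration: ... ms statement: ..." sample quoted in this thread. The function name summarize, the 1000 ms threshold, and the use of the 'DBD::Pg ping test' string as a marker for a trivial query are illustrative assumptions, not anything from the thread.

```python
import re
import sys

# Matches duration lines such as the one quoted in the thread:
#   LOG: duration: 1131.041 ms statement: SELECT 'DBD::Pg ping test'
DURATION_RE = re.compile(
    r"LOG:\s+duration:\s+([\d.]+)\s+ms\s+statement:\s+(.*)"
)


def summarize(lines, trivial_marker="DBD::Pg ping test", slow_ms=1000.0):
    """Count logged statements, slow ones, and slow ones that are trivial.

    trivial_marker and slow_ms are hypothetical defaults for illustration.
    """
    stats = {"total": 0, "slow": 0, "slow_trivial": 0}
    for line in lines:
        match = DURATION_RE.search(line)
        if match is None:
            continue  # not a duration line, skip it
        duration_ms = float(match.group(1))
        statement = match.group(2)
        stats["total"] += 1
        if duration_ms >= slow_ms:
            stats["slow"] += 1
            if trivial_marker in statement:
                stats["slow_trivial"] += 1
    return stats


if __name__ == "__main__":
    # Usage (hypothetical file name): python summarize.py < postgresql.log
    print(summarize(sys.stdin))
```

If a large share of the slow statements during a stall window are the ping query, that would be consistent with the whole server being starved rather than a few specific plans misbehaving. It would not by itself distinguish buffer pin contention from a kernel-side problem.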