On Tue, Sep 10, 2013 at 02:04:57PM -0400, David Whittaker wrote:
> Hi Andrew,
>
> On Tue, Sep 10, 2013 at 11:26 AM, Andrew Dunstan <andrew@xxxxxxxxxxxx> wrote:
>
> > On 09/10/2013 11:04 AM, David Whittaker wrote:
> >
> >> Hi All,
> >>
> >> I've been seeing a strange issue with our Postgres install for about
> >> a year now, and I was hoping someone might be able to help point me
> >> at the cause. At what seem like fairly random intervals Postgres will
> >> become unresponsive to the 3 application nodes it services. These
> >> periods tend to last for 10-15 minutes before everything rights
> >> itself and the system goes back to normal.
> >>
> >> During these periods the server will report a spike in outbound
> >> bandwidth (from about 1 Mb/s to about 5 Mb/s most recently), a huge
> >> spike in context switches / interrupts (normal peaks are around
> >> 2k/8k respectively; during these periods they've gone to 15k/22k),
> >> and a load average of 100+. CPU usage stays relatively low, but it's
> >> all system time; user time goes to zero. It doesn't seem to be disk
> >> related, since we're running with a shared_buffers setting of 24G,
> >> which will fit just about our entire database into memory, and both
> >> the I/O transactions reported by the server and the disk reads
> >> reported by Postgres stay consistently low.
> >>
> >> We've recently started tracking how long statements take to execute,
> >> and we're seeing some really odd numbers. A simple delete by primary
> >> key, for example, from a table that contains about 280,000 rows,
> >> reportedly took 18h59m46.900s. An update by primary key in that same
> >> table was reported as 7d17h58m30.415s. That table is frequently
> >> accessed, but obviously those numbers don't seem reasonable at all.
> >>
> >> Some other changes we've made to postgresql.conf:
> >>
> >> synchronous_commit = off
> >> maintenance_work_mem = 1GB
> >> wal_level = hot_standby
> >> wal_buffers = 16MB
> >> max_wal_senders = 10
> >> wal_keep_segments = 5000
> >> checkpoint_segments = 128
> >> checkpoint_timeout = 30min
> >> checkpoint_completion_target = 0.9
> >> max_connections = 500
> >>
> >> The server is a Dell PowerEdge R900 with 4 Xeon E7430 processors,
> >> 48GB of RAM, running CentOS 6.3.
> >>
> >> So far we've tried disabling Transparent Huge Pages after I found a
> >> number of resources online that indicated similar interrupt/context
> >> switch issues, but it hasn't resolved the problem. I managed to catch
> >> it happening once and run perf, which showed:
> >>
> >>   +  41.40%  48154  postmaster  0x347ba9  f  0x347ba9
> >>   +   9.55%  10956  postmaster  0x2dc820  f  set_config_option
> >>   +   8.64%   9946  postmaster  0x5a3d4   f  writeListPage
> >>   +   5.75%   6609  postmaster  0x5a2b0   f  ginHeapTupleFastCollect
> >>   +   2.68%   3084  postmaster  0x192483  f  build_implied_join_equality
> >>   +   2.61%   2990  postmaster  0x187a55  f  build_paths_for_OR
> >>   +   1.86%   2131  postmaster  0x794aa   f  get_collation_oid
> >>   +   1.56%   1822  postmaster  0x5a67e   f  ginHeapTupleFastInsert
> >>   +   1.53%   1766  postmaster  0x1929bc  f  distribute_qual_to_rels
> >>   +   1.33%   1558  postmaster  0x249671  f  cmp_numerics
> >>
> >> I'm not sure what 0x347ba9 represents, or why it's an address rather
> >> than a method name.
> >>
> >> That's about the sum of it. Any help would be greatly appreciated,
> >> and if you want any more information about our setup, please feel
> >> free to ask.
> >
> > I have seen cases like this with very high shared_buffers settings.
> >
> > 24GB for shared_buffers is quite high, especially on a 48GB box. What
> > happens if you dial that back to, say, 12GB?
> >
> > cheers
> >
> > andrew
>
> I'd be willing to give it a try. I'd really like to understand what's
> going on here though. Can you elaborate on that? Why would 24G of
> shared buffers be too high in this case? The machine is devoted
> entirely to PG, so having PG use half of the available RAM to cache
> data doesn't feel unreasonable.

Some of the overhead of the bgwriter and of checkpoints is more or less
linear in the size of shared_buffers. If shared_buffers is large, a lot
of data can be dirty when a checkpoint starts, resulting in an I/O spike
(although checkpoint writes are spread out in recent PostgreSQL
versions, so this is less of a problem nowadays). Another reason is that
the OS cache is also used for reads and writes, so with a large
shared_buffers there is a risk of "doubly cached data" (in the OS cache
and in shared_buffers). In an ideal world the most frequently used
blocks would be in shared_buffers and the less frequently used ones in
the OS cache.
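
If you want to quantify both effects before dialing shared_buffers
back, a rough sketch along these lines could help. It assumes the stock
pg_buffercache contrib module (it has to be installed first with
CREATE EXTENSION pg_buffercache) plus the built-in pg_stat_bgwriter
view; column names are from the 9.x-era catalogs, and the usagecount
threshold below is illustrative, not canonical:

    -- How much of shared_buffers is in use, dirty, or "hot"?
    -- Unused buffers have a NULL relfilenode.
    SELECT count(relfilenode)                          AS buffers_used,
           count(CASE WHEN isdirty THEN 1 END)         AS buffers_dirty,
           count(CASE WHEN usagecount >= 3 THEN 1 END) AS buffers_hot
      FROM pg_buffercache;

    -- Where are buffer writes coming from?  A buffers_checkpoint that
    -- dwarfs buffers_clean and buffers_backend suggests checkpoints
    -- are flushing large amounts of dirty shared_buffers at once.
    SELECT checkpoints_timed, checkpoints_req,
           buffers_checkpoint, buffers_clean, buffers_backend
      FROM pg_stat_bgwriter;

If buffers_used sits well below the 24GB allocated, or most of what is
cached is cold, the OS cache is probably already doing part of the job
and a smaller shared_buffers is unlikely to hurt.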