On 24 November 2011, 16:39, Robert Treat wrote:
> On Thu, Nov 24, 2011 at 9:09 AM, Tomas Vondra <tv@xxxxxxxx> wrote:
>> On 24 November 2011, 14:51, Gaëtan Allart wrote:
>>> Postgresql.conf :
>>>
>>> max_connections = 50
>>> shared_buffers = 12G
>>> temp_buffers = 40MB
>>> work_mem = 128MB
>>> maintenance_work_mem = 256MB
>>> max_files_per_process = 8192
>>> checkpoint_segments = 256
>>> checkpoint_timeout = 30min
>>> checkpoint_completion_target = 0.9
>>
>> Fine. Let's see the options that look suspicious.
>>
>
> I think you missed some suspicious settings... I'd recommend setting
> shared_buffers to 8GB, and I'd likely reduce checkpoint_segments to 30
> and set the checkpoint timeout back to 5 minutes. Everything about the
> way this server is configured (including those vm settings) is pushing
> it towards delaying the WAL/buffer/checkpoint work as long as possible,
> which matches the pattern of good performance initially, followed by a
> period of poor performance and heavy I/O.

Yes, checkpoints were my first thought too. OTOH the OP reported that most
of the I/O is caused by the WAL writer - and that's not the part that does
the heavy lifting during a checkpoint. Plus WAL writes can't really be
postponed, as the WAL is usually written with O_DIRECT and fsynced, right?

You're right that the data-file writes are postponed, but I generally see
that as a good thing when combined with spread checkpoints. And even with
those vm settings (about 3.2GB allowed for background writes), I wouldn't
expect this behaviour, because dirty pages in the page cache usually
expire after 30 seconds.

Say 100% of the shared buffers are dirty and need to be written. You have
27 minutes (30 * 0.9) to do that, i.e. 12GB / 1620 s, which is roughly
8 MB/s. With the 30-second expiry, there might be about 240 MB of dirty
data (30 s x 8 MB/s) accumulated before pdflush starts writing it out to
the SSD, and the SSD can surely handle more than 50 MB/s. So why the long
delay? The question is what else is going on there.

But all this is just guessing - I want to see the log_checkpoints
messages, iostat results etc.

> On a side note, I'd guess your work_mem is probably too high. 50
> (connections) x 128 (MB work_mem) x 2 (sorts per query) = 12GB RAM,
> which is 25% of the total RAM on the box. That doesn't necessarily mean
> game over, but it seems like it wouldn't be that hard to get thrashing
> set up that way. YMMV.

This is one of the reasons why effective_cache_size should be lower than
32GB, probably ...

Tomas
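P.S. In case it helps, a minimal sketch of how I'd collect those numbers -
the data directory path and the iostat interval are just placeholders,
adjust them for your setup:

    # in postgresql.conf (a reload is enough, no restart needed):
    log_checkpoints = on

    # reload the configuration:
    pg_ctl reload -D /path/to/data
    # or from psql:
    #   SELECT pg_reload_conf();

    # extended per-device I/O statistics, in kB, every 5 seconds:
    iostat -x -k 5

With log_checkpoints on, each checkpoint logs how many buffers it wrote
and how long the write and sync phases took, so we can see whether the
I/O storms line up with checkpoints or with something else entirely.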