On Thu, Nov 24, 2011 at 9:09 AM, Tomas Vondra <tv@xxxxxxxx> wrote: > On 24 Listopad 2011, 14:51, Gaëtan Allart wrote: >> Hello everyone, >> >> I'm having some troubles with a Postgresql server. >> We're using PG has a database backend for a very big website (lots of data >> and much traffic). >> >> The issue : server suddenly (1H after restart) becomes slow (queries not >> responding), load rises (>20 instead of 1), iowait rises (20 to 70%) >> >> Version : 9.0.5 >> Server : Dual Xeon X5650 (24 cores total) >> Memory : 48 GB >> Disks : SSD >> >> >> Top when overloaded : > > Top is not the most useful tool here, I guess. Use "iotop" (will show you > which processes are doing the I/O) and tools like vmstat / iostat. > >> Postgresql.conf : >> >> max_connections = 50 >> shared_buffers = 12G >> temp_buffers = 40MB >> work_mem = 128MB >> maintenance_work_mem = 256MB >> max_files_per_process = 8192 >> checkpoint_segments = 256 >> checkpoint_timeout = 30min >> checkpoint_completion_target = 0.9 > > Fine. Let's see the options that look suspicious. > I think you missed some suspicious settings... I'd recommend setting shared buffers to 8gb, and I'd likely reduce checkpoint segements to 30 and set the checkpoint timeout back to 5 minutes. Everything about the way this server is configured (including those vm settings) is pushing it towards delaying the WAL/Buffer/Checkpoint as long as possible, which matches with the idea of good performance initial followed by a period of poor performance and heavy i/o. On a side note, I'd guess your work_mem is probably too high. 50 (connections) x 128 (mb work mem) x 2 (sorts per query) = 12GB RAM, which is 25% of total ram on the box. That doesn't necessarily mean game over, but it seem like it wouldn't be that hard to get thrashing being set up that way. YMMV. Robert Treat conjecture: xzilla.net consulting: omniti.com -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general