On 24 Listopad 2011, 23:19, Cédric Villemain wrote: > Le 24 novembre 2011 20:38, Gaëtan Allart <gaetan@xxxxxxxxxxx> a écrit : >> Finally, it "crashed" againŠ :-( >> >> Here's the output of iotop while databased was inaccessible : >> >> 32361 be/4 postgres 0.00 B/s 0.00 B/s 0.00 % 99.99 % postgres: >> mydb >> mydb host(34847) idle >> 32244 be/4 postgres 163.48 K/s 0.00 B/s 0.00 % 99.99 % postgres: >> mydb >> mydb host(34660) SELECT >> 32045 be/4 postgres 7.78 K/s 0.00 B/s 0.00 % 99.99 % postgres: >> mydb >> mydb host(33765) SELECT >> 32158 be/4 postgres 7.78 K/s 0.00 B/s 0.00 % 99.99 % postgres: >> mydb >> mydb host(34112) SELECT >> 32242 be/4 postgres 7.78 K/s 0.00 B/s 0.00 % 99.99 % postgres: >> mydb >> mydb host(34632) SELECT >> 32372 be/4 postgres 0.00 B/s 0.00 B/s 0.00 % 99.99 % postgres: >> mydb >> mydb host(38858) idle in transaction >> 32231 be/4 postgres 15.57 K/s 0.00 B/s 0.00 % 99.99 % postgres: >> mydb >> mydb host(38602) SELECT >> 28811 be/4 postgres 3.89 K/s 0.00 B/s 0.00 % 99.99 % postgres: >> mydb >> mydb host(40594) SELECT >> 32190 be/4 postgres 3.89 K/s 0.00 B/s 0.00 % 99.99 % postgres: >> mydb >> mydb host(38497) SELECT This iotop output is rather incomplete - it does not show which of the processes actually did the I/O. The fact that a process does 99% of the I/O and yet performs no reads or writes is a bit strange. Use "iotop -o" and post the result. >> Does this help? > > yes. > It seem you have an issue with your checkpoint syncing time, it is > fixed in 9.1 and backported in 9.0 here : > http://projects.2ndquadrant.com/backports People generally don't want to apply backports on their own, especially when it's a production server and when it's unclear it actually fixes the issue they have. I'm not sure about that. > It is possible you have other problems that explains the issue you > have. An immediate solution before trying a patch is to reduce your > shared_buffer setting to something very low, like 1GB. Well, using low shared_buffers was used especially before 8.3, when the spread checkpoints were not available. It prevents the I/O overload when the database suddenly decides to write all of the dirty buffers. But he's on 9.0 (so he already has spread checkpoints). Plus the number of buffers he's writing is negligible - usually about 700 buffers (6MB), 3192 buffers (25MB) at most. That surely should not be a problem for the SSD he's using. Tomas -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general