Here are the latest checkpoint logs:

LOG:  checkpoint complete: wrote 842 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=168.970 s, sync=0.005 s, total=168.977 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 318 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 2 recycled; write=63.818 s, sync=0.006 s, total=63.825 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 744 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=147.035 s, sync=0.006 s, total=147.043 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 108 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 3 recycled; write=35.410 s, sync=14.921 s, total=54.811 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 393 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=88.835 s, sync=43.210 s, total=135.728 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 914 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=148.162 s, sync=14.249 s, total=170.481 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 202 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=53.152 s, sync=0.004 s, total=53.159 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 897 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=42.414 s, sync=1.175 s, total=58.957 s
LOG:  checkpoint starting: shutdown immediate
LOG:  checkpoint complete: wrote 666 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=0.027 s, sync=1.600 s, total=1.630 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 627 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=125.856 s, sync=0.006 s, total=125.864 s
LOG:  checkpoint starting: time

And pg_stat_bgwriter:

postgres=# select *,now() from pg_stat_bgwriter;
 checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc |              now
-------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------+-------------------------------
               388 |              13 |             494948 |       4306591 |            13555 |         7458743 |    7835244602 | 2011-11-26 00:43:47.232924+01
(1 row)

postgres=# select *,now() from pg_stat_bgwriter;
 checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc |              now
-------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------+-------------------------------
               389 |              13 |             501802 |       4352198 |            13809 |         7469220 |    7839778941 | 2011-11-26 00:49:00.680779+01
(1 row)

Processes that were writing were "SELECT" queries against the database.
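For what it's worth, read as deltas those two snapshots say that over the
~5 minute window roughly 6,854 buffers (~54 MB) were written by checkpoints,
~45,600 (~356 MB) by the background writer and ~10,500 (~82 MB) directly by
backends, assuming the default 8 kB block size. A minimal sketch of how to
capture and diff the counters, using a hypothetical bgwriter_snap table
(not something that ships with PostgreSQL):

  -- capture a snapshot now, and another one a few minutes later
  CREATE TABLE bgwriter_snap AS
    SELECT now() AS snap_time, * FROM pg_stat_bgwriter;
  INSERT INTO bgwriter_snap SELECT now(), * FROM pg_stat_bgwriter;

  -- per-interval deltas: who wrote how many buffers between snapshots
  SELECT snap_time,
         buffers_checkpoint - lag(buffers_checkpoint) OVER w AS by_checkpoint,
         buffers_clean      - lag(buffers_clean)      OVER w AS by_bgwriter,
         buffers_backend    - lag(buffers_backend)    OVER w AS by_backends,
         buffers_alloc      - lag(buffers_alloc)      OVER w AS allocated
  FROM bgwriter_snap
  WINDOW w AS (ORDER BY snap_time)
  ORDER BY snap_time;

(lag() needs 8.4 or later; with only two rows the subtraction can of course
be done by hand.)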
Gaëtan


On 26/11/11 00:17, « Cédric Villemain » <cedric.villemain.debian@xxxxxxxxx> wrote:

>On 25 November 2011 23:47, Gaëtan Allart <gaetan@xxxxxxxxxxx> wrote:
>> Hello Tomas and Cédric,
>>
>> Right now, the server is not all right. Load is above 30 and queries are
>> slow like hell.
>>
>> Here's the complete iotop. Note the 71 MB/s writes (apparently on SELECT
>> queries).
>>
>> Total DISK READ: 633.35 K/s | Total DISK WRITE: 71.06 M/s
>>   TID  PRIO  USER       DISK READ   DISK WRITE  SWAPIN     IO>    COMMAND
>> 27352 be/4  postgres   15.64 K/s    86.01 K/s  0.00 %  99.99 %  postgres: database database 176.31.228.6(38816) SELECT
>> 20226 be/4  postgres    7.82 K/s     0.00 B/s  0.00 %  99.99 %  postgres: database database 176.31.228.6(34166) SELECT
>> 26950 be/4  postgres   23.46 K/s     0.00 B/s  0.00 %  82.14 %  postgres: database database 46.105.104.205(40820) SELECT
>> 23160 be/4  postgres    3.91 K/s     0.00 B/s  0.00 %  81.14 %  postgres: database database 46.105.104.205(58091) SELECT
>> 29184 be/4  postgres    7.82 K/s     0.00 B/s  0.00 %  79.17 %  postgres: database database 46.105.104.205(51047) SELECT
>> 27271 be/4  postgres   23.46 K/s   234.58 K/s  0.00 %  77.15 %  postgres: database database 46.105.104.205(42315) SELECT
>> 28224 be/4  postgres    0.00 B/s     0.00 B/s  0.00 %  74.09 %  postgres: database database 46.105.104.205(49871) SELECT
>> 27450 be/4  postgres   23.46 K/s     0.00 B/s  0.00 %  70.12 %  postgres: database database 46.105.104.205(49096) SELECT
>> 29206 be/4  postgres  121.20 K/s     0.00 B/s  0.00 %  62.99 %  postgres: database database 176.31.228.6(39787) SELECT
>>  8202 be/4  postgres  129.02 K/s     0.00 B/s  0.00 %  60.59 %  postgres: database database 91.121.89.14(57291) SELECT
>> 29209 be/4  postgres    3.91 K/s     0.00 B/s  0.00 %  55.75 %  postgres: database database 176.31.228.6(39807) idle
>> 27046 be/4  postgres    3.91 K/s     0.00 B/s  0.00 %  52.35 %  postgres: database database 46.105.104.205(41285) SELECT
>> 29217 be/4  postgres   54.73 K/s     0.00 B/s  0.00 %  50.18 %  postgres: database database 46.105.104.205(51178) SELECT
>>  3249 be/4  root        3.91 K/s   320.59 K/s  0.00 %  40.31 %  [kjournald]
>> 28918 be/4  postgres    7.82 K/s     0.00 B/s  0.00 %  34.83 %  postgres: database database 46.105.104.205(50164) SELECT
>> 26856 be/4  postgres    7.82 K/s   234.58 K/s  0.00 %  34.30 %  postgres: database database 46.105.104.205(40589) SELECT
>> 29205 be/4  postgres    7.82 K/s     0.00 B/s  0.00 %  34.17 %  postgres: database database 176.31.228.6(39784) SELECT
>>  4175 be/4  root        0.00 B/s     0.00 B/s  0.00 %   1.84 %  [flush-8:0]
>> 28905 be/4  postgres    0.00 B/s     0.00 B/s  0.00 %   0.02 %  postgres: database database 46.105.104.205(50125) SELECT
>> 28919 be/4  postgres    0.00 B/s     0.00 B/s  0.00 %   0.01 %  postgres: database database 46.105.104.205(50167) SELECT
>> 19807 be/4  postgres    3.91 K/s     0.00 B/s  0.00 %   0.00 %  postgres: database database 176.31.228.6(33598) SELECT
>> 28703 be/4  postgres    0.00 B/s    86.01 K/s  0.00 %   0.00 %  postgres: database database 176.31.228.6(39260) SELECT
>> 29183 be/4  postgres    7.82 K/s     0.00 B/s  0.00 %   0.00 %  postgres: database database 46.105.104.205(51044) SELECT
>>
>> Here is dirty_expire_centisecs:
>>
>> cat /proc/sys/vm/dirty_expire_centisecs
>> 3000
>>
>> Bgwriter configuration is default:
>>
>> #bgwriter_delay = 200ms                 # 10-10000ms between rounds
>> #bgwriter_lru_maxpages = 100            # 0-1000 max buffers written/round
>> #bgwriter_lru_multiplier = 2.0          # 0-10.0 multiplier on buffers scanned/round
>>
>> Is there anything I can provide to help you?
>
>The checkpoint logs and the output of pg_stat_bgwriter (as asked by
>Tomas).
>
>It is probable that shared_buffers is too small for your workload
>(expected). Do you still have the issue with checkpoint sync duration?
>
>You should increase shared_buffers (you can iterate) *and* try not to
>have the long checkpoint sync (keep shared_buffers <= 8GB, as already
>mentioned upthread).
>
>The output you provided does not show everything, and the SELECTs do not
>look like they write that much. What processes are writing?
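>One way to answer that is to match the PIDs iotop shows against
>pg_stat_activity -- a minimal sketch, assuming the pre-9.2 column names
>(procpid, current_query) and using the top writers from the iotop output
>above:
>
>  -- which queries are the heavy writers actually running?
>  SELECT procpid, usename, waiting, now() - query_start AS running_for,
>         current_query
>  FROM pg_stat_activity
>  WHERE procpid IN (27352, 27271, 26856, 28703)
>  ORDER BY query_start;
>
>If those SELECTs are spilling sorts or hashes to temporary files, setting
>log_temp_files in postgresql.conf should make that visible in the logs --
>just a guess from here, though.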
>>
>> Thanks again,
>>
>> Gaëtan
>>
>>
>> On 25/11/11 14:12, « Tomas Vondra » <tv@xxxxxxxx> wrote:
>>
>>>On 25 November 2011, 12:43, Cédric Villemain wrote:
>>>> On 25 November 2011 11:25, Tomas Vondra <tv@xxxxxxxx> wrote:
>>>>> On 24 November 2011, 23:19, Cédric Villemain wrote:
>>>>>>
>>>>>> It seems you have an issue with your checkpoint syncing time; it is
>>>>>> fixed in 9.1 and backported to 9.0 here:
>>>>>> http://projects.2ndquadrant.com/backports
>>>>>
>>>>> People generally don't want to apply backports on their own, especially
>>>>> when it's a production server and when it's unclear it actually fixes
>>>>> the issue they have. I'm not sure about that.
>>>>
>>>> I agree that most people don't want to do that themselves, but if it
>>>> happens to be the solution they can proceed or ask someone to do it.
>>>> People want to see their production system back to a normal situation;
>>>> here the limited information is not enough to be sure, but the
>>>> checkpoint sync times are clear: the sync times are not correct.
>>>> It is very probable that compacting the fsyncs will help, but it is not
>>>> yet certain that it is required.
>>>
>>>Yes, the sync times are quite crazy. Especially given the tiny number of
>>>buffers to be written and the fact that the SSD should handle random I/O
>>>quite well.
>>>
>>>>>> It is possible you have other problems that explain the issue you
>>>>>> have. An immediate solution before trying a patch is to reduce your
>>>>>> shared_buffers setting to something very low, like 1GB.
>>>>>
>>>>> Well, using low shared_buffers was common especially before 8.3, when
>>>>> spread checkpoints were not available. It prevents the I/O overload
>>>>> when the database suddenly decides to write all of the dirty buffers.
>>>>> But he's on 9.0 (so he already has spread checkpoints).
>>>>
>>>> It is a different animal here.
>>>>
>>>>> Plus the number of buffers he's writing is negligible - usually about
>>>>> 700 buffers (6MB), 3192 buffers (25MB) at most. That surely should not
>>>>> be a problem for the SSD he's using.
>>>>
>>>> See the blog entry from Greg Smith:
>>>> http://blog.2ndquadrant.com/en/2011/06/backporting-and-checkpoint-tro.html
>>>>
>>>> And the slides of his talk at pgconf2011:
>>>> http://www.2ndquadrant.com/static/2quad/media/pdfs/talks/WriteStuff-PGCon2011.pdf
>>>>
>>>> I was just pointing out that there are known issues in this area, with
>>>> known solutions.
>>>
>>>Thanks for the links, interesting stuff. Still, my impression is that the
>>>SSD is stressed by something else, and the high fsync values during a
>>>checkpoint are merely a symptom. So fixing the checkpoint (using the
>>>backpatch) won't actually fix the issue. But I'm just guessing here.
>>>
>>>> Getting more information on vacuum activity and bgwriter activity
>>>> should help too.
>>>
>>>Yes, that'd be nice. Gaëtan, can you post the bgwriter-related options
>>>from postgresql.conf and two snapshots from pg_stat_bgwriter (say 5
>>>minutes apart, collected when the db is slow)? A complete 'iotop -o'
>>>output would be nice too.
>>>
>>>BTW what filesystem are we talking about? What mount options have you
>>>used?
>>>
>>>What about /proc/sys/vm/dirty_expire_centisecs?
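>>>
>>>(If it is easier than digging through postgresql.conf, something along
>>>these lines lists the same settings -- just a sketch, any equivalent
>>>query works:)
>>>
>>>  SELECT name, setting, unit, source
>>>  FROM pg_settings
>>>  WHERE name LIKE 'bgwriter%'
>>>     OR name LIKE 'checkpoint%'
>>>     OR name = 'shared_buffers';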
>>>
>>>Tomas
>>
>>
>
>
>--
>Cédric Villemain +33 (0)6 20 30 22 52
>http://2ndQuadrant.fr/
>PostgreSQL: Support 24x7 - Développement, Expertise et Formation

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general