Here are the latest checkpoint logs:

LOG:  checkpoint complete: wrote 842 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=168.970 s, sync=0.005 s, total=168.977 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 318 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 2 recycled; write=63.818 s, sync=0.006 s, total=63.825 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 744 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=147.035 s, sync=0.006 s, total=147.043 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 108 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 3 recycled; write=35.410 s, sync=14.921 s, total=54.811 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 393 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=88.835 s, sync=43.210 s, total=135.728 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 914 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=148.162 s, sync=14.249 s, total=170.481 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 202 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=53.152 s, sync=0.004 s, total=53.159 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 897 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=42.414 s, sync=1.175 s, total=58.957 s
LOG:  checkpoint starting: shutdown immediate
LOG:  checkpoint complete: wrote 666 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=0.027 s, sync=1.600 s, total=1.630 s
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 627 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=125.856 s, sync=0.006 s, total=125.864 s
LOG:  checkpoint starting: time

And pg_stat_bgwriter:

postgres=# select *,now() from pg_stat_bgwriter;
 checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc |              now
-------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------+-------------------------------
               388 |              13 |             494948 |       4306591 |            13555 |         7458743 |    7835244602 | 2011-11-26 00:43:47.232924+01
(1 row)

postgres=# select *,now() from pg_stat_bgwriter;
 checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc |              now
-------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------+-------------------------------
               389 |              13 |             501802 |       4352198 |            13809 |         7469220 |    7839778941 | 2011-11-26 00:49:00.680779+01
(1 row)

Processes that were writing were "SELECT" queries against the database.
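For what it's worth, read as deltas those two snapshots say that over the
~5 minute window roughly 6,854 buffers (~54 MB) were written by checkpoints,
~45,600 (~356 MB) by the background writer and ~10,500 (~82 MB) directly by
backends, assuming the default 8 kB block size. A minimal sketch of how to
capture and diff the counters, using a hypothetical bgwriter_snap table
(not something that ships with PostgreSQL):

  -- capture a snapshot now, and another one a few minutes later
  CREATE TABLE bgwriter_snap AS
    SELECT now() AS snap_time, * FROM pg_stat_bgwriter;
  INSERT INTO bgwriter_snap SELECT now(), * FROM pg_stat_bgwriter;

  -- per-interval deltas: who wrote how many buffers between snapshots
  SELECT snap_time,
         buffers_checkpoint - lag(buffers_checkpoint) OVER w AS by_checkpoint,
         buffers_clean      - lag(buffers_clean)      OVER w AS by_bgwriter,
         buffers_backend    - lag(buffers_backend)    OVER w AS by_backends,
         buffers_alloc      - lag(buffers_alloc)      OVER w AS allocated
  FROM bgwriter_snap
  WINDOW w AS (ORDER BY snap_time)
  ORDER BY snap_time;

(lag() needs 8.4 or later; with only two rows the subtraction can of course
be done by hand.)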
Gaëtan


On 26/11/11 00:17, « Cédric Villemain » <cedric.villemain.debian@xxxxxxxxx> wrote:

>On 25 November 2011 23:47, Gaëtan Allart <gaetan@xxxxxxxxxxx> wrote:
>> Hello Tomas and Cédric,
>>
>> Right now, the server is not all right. Load is above 30 and queries are
>> slow like hell.
>>
>> Here's the complete iotop. Note the 71 MB/s writes (apparently on SELECT
>> queries).
>>
>> Total DISK READ: 633.35 K/s | Total DISK WRITE: 71.06 M/s
>>   TID  PRIO  USER       DISK READ   DISK WRITE  SWAPIN     IO>    COMMAND
>> 27352 be/4  postgres   15.64 K/s    86.01 K/s  0.00 %  99.99 %  postgres: database database 176.31.228.6(38816) SELECT
>> 20226 be/4  postgres    7.82 K/s     0.00 B/s  0.00 %  99.99 %  postgres: database database 176.31.228.6(34166) SELECT
>> 26950 be/4  postgres   23.46 K/s     0.00 B/s  0.00 %  82.14 %  postgres: database database 46.105.104.205(40820) SELECT
>> 23160 be/4  postgres    3.91 K/s     0.00 B/s  0.00 %  81.14 %  postgres: database database 46.105.104.205(58091) SELECT
>> 29184 be/4  postgres    7.82 K/s     0.00 B/s  0.00 %  79.17 %  postgres: database database 46.105.104.205(51047) SELECT
>> 27271 be/4  postgres   23.46 K/s   234.58 K/s  0.00 %  77.15 %  postgres: database database 46.105.104.205(42315) SELECT
>> 28224 be/4  postgres    0.00 B/s     0.00 B/s  0.00 %  74.09 %  postgres: database database 46.105.104.205(49871) SELECT
>> 27450 be/4  postgres   23.46 K/s     0.00 B/s  0.00 %  70.12 %  postgres: database database 46.105.104.205(49096) SELECT
>> 29206 be/4  postgres  121.20 K/s     0.00 B/s  0.00 %  62.99 %  postgres: database database 176.31.228.6(39787) SELECT
>>  8202 be/4  postgres  129.02 K/s     0.00 B/s  0.00 %  60.59 %  postgres: database database 91.121.89.14(57291) SELECT
>> 29209 be/4  postgres    3.91 K/s     0.00 B/s  0.00 %  55.75 %  postgres: database database 176.31.228.6(39807) idle
>> 27046 be/4  postgres    3.91 K/s     0.00 B/s  0.00 %  52.35 %  postgres: database database 46.105.104.205(41285) SELECT
>> 29217 be/4  postgres   54.73 K/s     0.00 B/s  0.00 %  50.18 %  postgres: database database 46.105.104.205(51178) SELECT
>>  3249 be/4  root        3.91 K/s   320.59 K/s  0.00 %  40.31 %  [kjournald]
>> 28918 be/4  postgres    7.82 K/s     0.00 B/s  0.00 %  34.83 %  postgres: database database 46.105.104.205(50164) SELECT
>> 26856 be/4  postgres    7.82 K/s   234.58 K/s  0.00 %  34.30 %  postgres: database database 46.105.104.205(40589) SELECT
>> 29205 be/4  postgres    7.82 K/s     0.00 B/s  0.00 %  34.17 %  postgres: database database 176.31.228.6(39784) SELECT
>>  4175 be/4  root        0.00 B/s     0.00 B/s  0.00 %   1.84 %  [flush-8:0]
>> 28905 be/4  postgres    0.00 B/s     0.00 B/s  0.00 %   0.02 %  postgres: database database 46.105.104.205(50125) SELECT
>> 28919 be/4  postgres    0.00 B/s     0.00 B/s  0.00 %   0.01 %  postgres: database database 46.105.104.205(50167) SELECT
>> 19807 be/4  postgres    3.91 K/s     0.00 B/s  0.00 %   0.00 %  postgres: database database 176.31.228.6(33598) SELECT
>> 28703 be/4  postgres    0.00 B/s    86.01 K/s  0.00 %   0.00 %  postgres: database database 176.31.228.6(39260) SELECT
>> 29183 be/4  postgres    7.82 K/s     0.00 B/s  0.00 %   0.00 %  postgres: database database 46.105.104.205(51044) SELECT
>>
>> Here is dirty_expire_centisecs:
>>
>> cat /proc/sys/vm/dirty_expire_centisecs
>> 3000
>>
>> Bgwriter configuration is default:
>>
>> #bgwriter_delay = 200ms                 # 10-10000ms between rounds
>> #bgwriter_lru_maxpages = 100            # 0-1000 max buffers written/round
>> #bgwriter_lru_multiplier = 2.0          # 0-10.0 multiplier on buffers scanned/round
>>
>> Is there anything I can provide to help you?
>
>The checkpoint logs and the output of pg_stat_bgwriter (as asked by
>Tomas).
>
>It is probable that shared_buffers is too small for your workload
>(expected). Do you still have the issue with checkpoint sync duration?
>
>You should increase shared_buffers (you can iterate) *and* try not to
>have the long checkpoint sync (keep shared_buffers <= 8GB, as already
>mentioned upthread).
>
>The output you provided does not show everything, and the SELECTs do not
>look like they write that much. What processes are writing?
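>One way to answer that is to match the PIDs iotop shows against
>pg_stat_activity -- a minimal sketch, assuming the pre-9.2 column names
>(procpid, current_query) and using the top writers from the iotop output
>above:
>
>  -- which queries are the heavy writers actually running?
>  SELECT procpid, usename, waiting, now() - query_start AS running_for,
>         current_query
>  FROM pg_stat_activity
>  WHERE procpid IN (27352, 27271, 26856, 28703)
>  ORDER BY query_start;
>
>If those SELECTs are spilling sorts or hashes to temporary files, setting
>log_temp_files in postgresql.conf should make that visible in the logs --
>just a guess from here, though.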
>>
>> Thanks again,
>>
>> Gaëtan
>>
>>
>> On 25/11/11 14:12, « Tomas Vondra » <tv@xxxxxxxx> wrote:
>>
>>>On 25 November 2011, 12:43, Cédric Villemain wrote:
>>>> On 25 November 2011 11:25, Tomas Vondra <tv@xxxxxxxx> wrote:
>>>>> On 24 November 2011, 23:19, Cédric Villemain wrote:
>>>>>>
>>>>>> It seems you have an issue with your checkpoint syncing time; it is
>>>>>> fixed in 9.1 and backported to 9.0 here:
>>>>>> http://projects.2ndquadrant.com/backports
>>>>>
>>>>> People generally don't want to apply backports on their own, especially
>>>>> when it's a production server and when it's unclear it actually fixes
>>>>> the issue they have. I'm not sure about that.
>>>>
>>>> I agree that most people don't want to do that themselves, but if it
>>>> happens to be the solution they can proceed or ask someone to do it.
>>>> People want to see their production system back to a normal situation;
>>>> here the limited information is not enough to be sure, but the
>>>> checkpoint sync times are clear: the sync times are not correct.
>>>> It is very probable that compacting the fsyncs will help, but it is not
>>>> yet certain that it is required.
>>>
>>>Yes, the sync times are quite crazy. Especially given the tiny number of
>>>buffers to be written and the fact that the SSD should handle random I/O
>>>quite well.
>>>
>>>>>> It is possible you have other problems that explain the issue you
>>>>>> have. An immediate solution before trying a patch is to reduce your
>>>>>> shared_buffers setting to something very low, like 1GB.
>>>>>
>>>>> Well, using low shared_buffers was common especially before 8.3, when
>>>>> spread checkpoints were not available. It prevents the I/O overload
>>>>> when the database suddenly decides to write all of the dirty buffers.
>>>>> But he's on 9.0 (so he already has spread checkpoints).
>>>>
>>>> It is a different animal here.
>>>>
>>>>> Plus the number of buffers he's writing is negligible - usually about
>>>>> 700 buffers (6MB), 3192 buffers (25MB) at most. That surely should not
>>>>> be a problem for the SSD he's using.
>>>>
>>>> See the blog entry from Greg Smith:
>>>> http://blog.2ndquadrant.com/en/2011/06/backporting-and-checkpoint-tro.html
>>>>
>>>> And the slides of his talk at pgconf2011:
>>>> http://www.2ndquadrant.com/static/2quad/media/pdfs/talks/WriteStuff-PGCon2011.pdf
>>>>
>>>> I was just pointing out that there are known issues in this area, with
>>>> known solutions.
>>>
>>>Thanks for the links, interesting stuff. Still, my impression is that the
>>>SSD is stressed by something else, and the high fsync values during a
>>>checkpoint are merely a symptom. So fixing the checkpoint (using the
>>>backpatch) won't actually fix the issue. But I'm just guessing here.
>>>
>>>> Getting more information on vacuum activity and bgwriter activity
>>>> should help too.
>>>
>>>Yes, that'd be nice. Gaëtan, can you post the bgwriter-related options
>>>from postgresql.conf and two snapshots from pg_stat_bgwriter (say 5
>>>minutes apart, collected when the db is slow)? A complete 'iotop -o'
>>>output would be nice too.
>>>
>>>BTW what filesystem are we talking about? What mount options have you
>>>used?
>>>
>>>What about /proc/sys/vm/dirty_expire_centisecs?
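>>>
>>>(If it is easier than digging through postgresql.conf, something along
>>>these lines lists the same settings -- just a sketch, any equivalent
>>>query works:)
>>>
>>>  SELECT name, setting, unit, source
>>>  FROM pg_settings
>>>  WHERE name LIKE 'bgwriter%'
>>>     OR name LIKE 'checkpoint%'
>>>     OR name = 'shared_buffers';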
>>>
>>>Tomas
>>
>>
>
>
>--
>Cédric Villemain +33 (0)6 20 30 22 52
>http://2ndQuadrant.fr/
>PostgreSQL: Support 24x7 - Développement, Expertise et Formation

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general