Re: Amazon High I/O instances

Sébastien Lorion <sl@xxxxxxxxxxxxxxxxxxxxx> · Thu, 13 Sep 2012 01:01:11 -0400

pgbench initialization has been running for almost 5 hours now and is still stuck before the vacuum starts. Something is definitely wrong, as I don't remember it taking this long the first time I created the DB. Here are the current stats:

iostat (xbd13-14 are WAL zpool)

device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
xbd8     161.3 109.8  1285.4  3450.5    0  12.5  19
xbd7     159.5 110.6  1272.3  3450.5    0  11.4  14
xbd6     161.1 108.8  1284.4  3270.6    0  10.9  14
xbd5     159.5 109.0  1273.1  3270.6    0  11.6  15
xbd14      0.0   0.0     0.0     0.0    0   0.0   0
xbd13      0.0   0.0     0.0     0.0    0   0.0   0
xbd12    204.6 110.8  1631.3  3329.2    0   9.1  15
xbd11    216.0 111.2  1722.5  3329.2    1   8.6  16
xbd10    197.2 109.4  1573.5  3285.8    0   9.8  15
xbd9     195.0 109.4  1557.1  3285.8    0   9.9  15

zpool iostat (db pool)
pool        alloc   free   read  write   read  write
db           143G   255G  1.40K  1.53K  11.2M  12.0M
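
As a rough cross-check of the numbers above, the per-device iostat rows can be summed and compared with the pool-level figures. The sketch below is a hypothetical helper in Python (the names `parse_iostat` and `total` are made up for illustration and are not part of any tool used in this thread); it assumes the eight-column `iostat -x` layout shown above and copies the sample rows verbatim.

```python
# Sample rows copied from the iostat output above (eight-column layout).
IOSTAT_SAMPLE = """\
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
xbd8     161.3 109.8  1285.4  3450.5    0  12.5  19
xbd7     159.5 110.6  1272.3  3450.5    0  11.4  14
xbd6     161.1 108.8  1284.4  3270.6    0  10.9  14
xbd5     159.5 109.0  1273.1  3270.6    0  11.6  15
xbd14      0.0   0.0     0.0     0.0    0   0.0   0
xbd13      0.0   0.0     0.0     0.0    0   0.0   0
xbd12    204.6 110.8  1631.3  3329.2    0   9.1  15
xbd11    216.0 111.2  1722.5  3329.2    1   8.6  16
xbd10    197.2 109.4  1573.5  3285.8    0   9.8  15
xbd9     195.0 109.4  1557.1  3285.8    0   9.9  15
"""

# Hypothetical column names for the seven numeric fields of each row.
COLUMNS = ("r_s", "w_s", "kr_s", "kw_s", "qlen", "svc_t", "busy")


def parse_iostat(text):
    """Return {device: {column: value}} for every data row in the text."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header, and anything without 8 columns.
        if len(fields) != 8 or fields[0] == "device":
            continue
        rows[fields[0]] = dict(zip(COLUMNS, map(float, fields[1:])))
    return rows


def total(rows, column):
    """Sum one column across all devices."""
    return sum(row[column] for row in rows.values())


if __name__ == "__main__":
    rows = parse_iostat(IOSTAT_SAMPLE)
    # kr/s and kw/s are in kilobytes per second; convert to megabytes.
    print(f"read : {total(rows, 'kr_s') / 1024:.1f} MB/s")
    print(f"write: {total(rows, 'kw_s') / 1024:.1f} MB/s")
```

On this sample the summed reads land close to the 11.2M that zpool iostat reports, while the summed writes are roughly twice the pool's 12.0M. That would be consistent with mirrored vdevs, where each logical write hits two disks, but the pool layout is not stated in this thread, so treat that as a guess.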

vmstat

procs      memory      page                    disks     faults         cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr ad0 xb8   in   sy   cs us sy id
 0 0 0   5634M    28G     7   0   0   0  7339   0   0 245 2091 6358 20828  2  5 93
 0 0 0   5634M    28G    10   0   0   0  6989   0   0 312 1993 6033 20090  1  4 95
 0 0 0   5634M    28G     7   0   0   0  6803   0   0 292 1974 6111 22763  2  5 93
 0 0 0   5634M    28G    10   0   0   0  7418   0   0 339 2041 6170 20838  2  4 94
 0 0 0   5634M    28G   123   0   0   0  6980   0   0 282 1977 5906 19961  2  4 94

top

last pid:  2430;  load averages:  0.72,  0.73,  0.69         up 0+04:56:16  04:52:53
32 processes:  1 running, 31 sleeping
CPU:  1.8% user,  0.0% nice,  5.3% system,  1.4% interrupt, 91.5% idle
Mem: 1817M Active, 25M Inact, 36G Wired, 24K Cache, 699M Buf, 28G Free
Swap:

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 1283 pgsql       1  34    0  3967M  1896M zio->i  5  80:14 21.00% postgres
 1282 pgsql       1  25    0 25740K  3088K select  2  10:34  0.00% pgbench
 1274 pgsql       1  20    0  2151M 76876K select  1   0:09  0.00% postgres

On Wed, Sep 12, 2012 at 9:16 PM, Sébastien Lorion <sl@xxxxxxxxxxxxxxxxxxxxx> wrote:

I recreated the DB and WAL pools, and launched pgbench -i -s 10000. Here are the stats during the load (still running):
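
For a sense of how much data this load generates: per the pgbench documentation, a scale factor of 1 creates 1 branch, 10 tellers, and 100,000 accounts, and each table grows linearly with the scale factor. A small sketch in Python (the function name `pgbench_rows` is hypothetical, chosen only for illustration):

```python
# Row counts per unit of scale factor, as documented for `pgbench -i`.
ROWS_PER_SCALE = {
    "pgbench_branches": 1,
    "pgbench_tellers": 10,
    "pgbench_accounts": 100_000,
}


def pgbench_rows(scale):
    """Return the number of rows `pgbench -i -s <scale>` creates per table."""
    if scale < 1:
        raise ValueError("scale factor must be at least 1")
    return {table: per_unit * scale for table, per_unit in ROWS_PER_SCALE.items()}


if __name__ == "__main__":
    # Scale factor taken from the command quoted in this message.
    for table, count in pgbench_rows(10_000).items():
        print(f"{table:18s} {count:>15,d}")
```

At -s 10000 that works out to one billion rows in pgbench_accounts, so a lengthy initial load is to be expected in any case.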

iostat (xbd13-14 are WAL zpool)
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
xbd8       0.0 471.5     0.0 14809.3   40  67.9  84
xbd7       0.0 448.1     0.0 14072.6   39  62.0  74
xbd6       0.0 472.3     0.0 14658.6   39  61.3  77
xbd5       0.0 464.7     0.0 14433.1   39  61.4  76
xbd14      0.0   0.0     0.0     0.0    0   0.0   0
xbd13      0.0   0.0     0.0     0.0    0   0.0   0
xbd12      0.0 460.1     0.0 14189.7   40  63.4  78
xbd11      0.0 462.9     0.0 14282.8   40  61.8  76
xbd10      0.0 477.0     0.0 14762.1   38  61.2  77
xbd9       0.0 477.6     0.0 14796.2   38  61.1  77

zpool iostat (db pool)
pool        alloc   free   read  write   read  write
db          11.1G   387G      0  6.62K      0  62.9M

vmstat
procs      memory      page                    disks     faults         cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr ad0 xb8   in   sy   cs us sy id
 0 0 0   3026M    35G   126   0   0   0 29555   0   0 478 2364 31201 26165 10  9 81

top
last pid:  1333;  load averages:  1.89,  1.65,  1.08      up 0+01:17:08  01:13:45
32 processes:  2 running, 30 sleeping
CPU: 10.3% user,  0.0% nice,  7.8% system,  1.2% interrupt, 80.7% idle
Mem: 26M Active, 19M Inact, 33G Wired, 16K Cache, 25M Buf, 33G Free

On Wed, Sep 12, 2012 at 9:02 PM, Sébastien Lorion <sl@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
> One more question: I could not set wal_sync_method to anything other than fsync. Is that expected, or should other choices also be available? I am not sure how SSD cache flushing is handled on EC2, but I hope the whole cache is flushed on every sync. As a side note, I got corrupted databases (errors about pg_xlog directories not found, etc.) at first when running my tests, and I suspect it was because of vfs.zfs.cache_flush_disable=1, though I cannot prove it.
>
> Sébastien
>
>
> On Wed, Sep 12, 2012 at 8:49 PM, Sébastien Lorion <sl@xxxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> Is dedicating 2 drives to the WAL too much? Since my whole RAID consists of SSD drives, should I just put the WAL in the main pool?
>>
>> Sébastien
>>
>>
>>> On Wed, Sep 12, 2012 at 8:28 PM, Sébastien Lorion <sl@xxxxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> OK, that makes sense. I will update that as well and report back. Thank you for your advice.
>>>
>>> Sébastien
>>>
>>>
>>> On Wed, Sep 12, 2012 at 8:04 PM, John R Pierce <pierce@xxxxxxxxxxxx> wrote:
>>>>
>>>> On 09/12/12 4:49 PM, Sébastien Lorion wrote:
>>>>>
>>>>> You set shared_buffers way below what is suggested in Greg Smith's book (25% or more of RAM). What is the rationale behind that rule of thumb? The other values are more or less what I set, though I could lower effective_cache_size and vfs.zfs.arc_max and see how it goes.
>>>>
>>>>
>>>> I think those 25% rules were typically created when RAM was no more than 4-8GB.
>>>>
>>>> For our highly transactional workload, at least, too large a shared_buffers seems to slow us down, perhaps due to the higher overhead of managing that many 8k buffers. I've heard that other, read-mostly workloads, such as data warehousing, can take advantage of larger buffer counts.
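
To put "that many 8k buffers" into numbers, the sketch below counts the buffers behind a given shared_buffers size, assuming PostgreSQL's default 8 kB block size. The RAM sizes are illustrative assumptions, not the configuration of any machine in this thread, and `buffer_count` is a hypothetical helper name.

```python
BLOCK_SIZE = 8 * 1024  # bytes; PostgreSQL's default block size (assumed here)
GIB = 1024 ** 3


def buffer_count(ram_gib, fraction=0.25):
    """Number of 8 kB buffers if shared_buffers is `fraction` of RAM."""
    shared_buffers_bytes = int(ram_gib * GIB * fraction)
    return shared_buffers_bytes // BLOCK_SIZE


if __name__ == "__main__":
    for ram_gib in (4, 8, 64):  # illustrative RAM sizes only
        count = buffer_count(ram_gib)
        print(f"{ram_gib:3d} GiB RAM -> {count:>9,d} buffers at 25%")
```

Applied to 4 GiB, the 25% rule gives about 131 thousand buffers; applied to 64 GiB it gives about 2.1 million, which may be why a rule of thumb tuned on small machines does not carry over unchanged.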

>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> john r pierce                            N 37, W 122
>>>> santa cruz ca                         mid-left coast
