Re: Extremely slow small files rewrite performance

Any update?

On Tue, Oct 21, 2014 at 3:32 PM, Sergey Nazarov <natarajaya@xxxxxxxxx> wrote:
> Ouch, I think the client log is missing.
> Here it goes:
> https://www.dropbox.com/s/650mjim2ldusr66/ceph-client.admin.log.gz?dl=0
>
> On Tue, Oct 21, 2014 at 3:22 PM, Sergey Nazarov <natarajaya@xxxxxxxxx> wrote:
>> I enabled logging and performed the same tests.
>> Here is a link to an archive with the logs; they are only from one node
>> (the node where the active MDS was running):
>> https://www.dropbox.com/s/80axovtoofesx5e/logs.tar.gz?dl=0
>>
>> Rados bench results:
>>
>> # rados bench -p test 10 write
>>  Maintaining 16 concurrent writes of 4194304 bytes for up to 10
>> seconds or 0 objects
>>  Object prefix: benchmark_data_atl-fs11_4630
>>    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>      0       0         0         0         0         0         -         0
>>      1      16        46        30   119.967       120  0.201327  0.348463
>>      2      16        88        72   143.969       168  0.132983  0.353677
>>      3      16       124       108   143.972       144  0.930837  0.383018
>>      4      16       155       139   138.976       124  0.899468  0.426396
>>      5      16       203       187   149.575       192  0.236534  0.400806
>>      6      16       243       227   151.309       160  0.835213  0.397673
>>      7      16       276       260   148.549       132  0.905989  0.406849
>>      8      16       306       290   144.978       120  0.353279  0.422106
>>      9      16       335       319   141.757       116   1.12114  0.428268
>>     10      16       376       360    143.98       164  0.418921   0.43351
>>     11      16       377       361   131.254         4  0.499769  0.433693
>>  Total time run:         11.206306
>> Total writes made:      377
>> Write size:             4194304
>> Bandwidth (MB/sec):     134.567
>>
>> Stddev Bandwidth:       60.0232
>> Max bandwidth (MB/sec): 192
>> Min bandwidth (MB/sec): 0
>> Average Latency:        0.474923
>> Stddev Latency:         0.376038
>> Max latency:            1.82171
>> Min latency:            0.060877
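As a sanity check on the write-bench summary above, assuming rados bench computes bandwidth as total bytes written divided by total runtime (all figures are taken from the output):

```python
# Recompute the summary line of the write benchmark above.
total_writes = 377             # objects written
write_size = 4 * 1024 * 1024   # 4194304 bytes per object
runtime = 11.206306            # seconds

bandwidth = total_writes * write_size / runtime / (1024 * 1024)  # MB/s
print(round(bandwidth, 3))  # ~134.567, matching the reported figure
```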
>>
>>
>> # rados bench -p test 10 seq
>>    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>      0       0         0         0         0         0         -         0
>>      1      16        61        45   179.957       180  0.010405   0.25243
>>      2      16       109        93   185.962       192  0.908263  0.284303
>>      3      16       151       135   179.965       168  0.255312  0.297283
>>      4      16       191       175    174.97       160  0.836727  0.330659
>>      5      16       236       220   175.971       180  0.009995  0.330832
>>      6      16       275       259   172.639       156   1.06855  0.345418
>>      7      16       311       295   168.545       144  0.907648  0.361689
>>      8      16       351       335   167.474       160  0.947688  0.363552
>>      9      16       390       374   166.196       156  0.140539  0.369057
>>  Total time run:        9.755367
>> Total reads made:     401
>> Read size:            4194304
>> Bandwidth (MB/sec):    164.422
>>
>> Average Latency:       0.387705
>> Max latency:           1.33852
>> Min latency:           0.008064
>>
>> # rados bench -p test 10 rand
>>    sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>      0       0         0         0         0         0         -         0
>>      1      16        55        39   155.938       156  0.773716  0.257267
>>      2      16        93        77   153.957       152  0.006573  0.339199
>>      3      16       135       119   158.629       168  0.009851  0.359675
>>      4      16       171       155   154.967       144  0.892027  0.359015
>>      5      16       209       193   154.369       152   1.13945  0.378618
>>      6      16       256       240    159.97       188  0.009965  0.368439
>>      7      16       295       279     159.4       156  0.195812  0.371259
>>      8      16       343       327   163.472       192  0.880587  0.370759
>>      9      16       380       364    161.75       148  0.113111  0.377983
>>     10      16       424       408   163.173       176  0.772274  0.379497
>>  Total time run:        10.518482
>> Total reads made:     425
>> Read size:            4194304
>> Bandwidth (MB/sec):    161.620
>>
>> Average Latency:       0.393978
>> Max latency:           1.36572
>> Min latency:           0.006448
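The two read benchmarks can be cross-checked the same way, again assuming bandwidth is total data read over total runtime:

```python
# Recompute the read-bench summary figures reported above.
MB = 1024 * 1024

seq_bw = 401 * 4 * MB / 9.755367 / MB    # sequential: 401 reads in 9.755 s
rand_bw = 425 * 4 * MB / 10.518482 / MB  # random: 425 reads in 10.518 s
print(round(seq_bw, 3), round(rand_bw, 3))  # ~164.422 and ~161.620
```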
>>
>> On Tue, Oct 21, 2014 at 2:03 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>> Can you enable debugging on the client ("debug ms = 1", "debug client
>>> = 20") and mds ("debug ms = 1", "debug mds = 20"), run this test
>>> again, and post them somewhere for me to look at?
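For reference, one way to apply those settings is a ceph.conf fragment on the client and MDS hosts (a sketch using the exact options Greg names above; restart the daemons or remount the client afterwards):

```
[client]
    debug ms = 1
    debug client = 20

[mds]
    debug ms = 1
    debug mds = 20
```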
>>>
>>> While you're at it, can you try rados bench and see what sort of
>>> results you get?
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>
>>>
>>> On Tue, Oct 21, 2014 at 10:57 AM, Sergey Nazarov <natarajaya@xxxxxxxxx> wrote:
>>>> It is CephFS mounted via ceph-fuse.
>>>> I am getting the same results regardless of how many other clients
>>>> have this fs mounted or what they are doing.
>>>> Cluster is working on Debian Wheezy, kernel 3.2.0-4-amd64.
>>>>
>>>> On Tue, Oct 21, 2014 at 1:44 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>>>> Are these tests conducted using a local fs on RBD, or using CephFS?
>>>>> If CephFS, do you have multiple clients mounting the FS, and what are
>>>>> they doing? What client (kernel or ceph-fuse)?
>>>>> -Greg
>>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>>>
>>>>>
>>>>> On Tue, Oct 21, 2014 at 9:05 AM, Sergey Nazarov <natarajaya@xxxxxxxxx> wrote:
>>>>>> Hi
>>>>>>
>>>>>> I just built a new cluster following these quickstart instructions:
>>>>>> http://ceph.com/docs/master/start/
>>>>>>
>>>>>> And here is what I am seeing:
>>>>>>
>>>>>> # time for i in {1..10}; do echo $i > $i.txt ; done
>>>>>> real 0m0.081s
>>>>>> user 0m0.000s
>>>>>> sys 0m0.004s
>>>>>>
>>>>>> And if I repeat the same command (when the files already exist):
>>>>>>
>>>>>> # time for i in {1..10}; do echo $i > $i.txt ; done
>>>>>> real 0m48.894s
>>>>>> user 0m0.000s
>>>>>> sys 0m0.004s
>>>>>>
>>>>>> I was very surprised, so I then tried rewriting just a single file:
>>>>>>
>>>>>> # time echo 1 > 1.txt
>>>>>> real 0m3.133s
>>>>>> user 0m0.000s
>>>>>> sys 0m0.000s
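For what it's worth, the loop timing works out to a comparable per-file cost to the single rewrite, which looks like a fixed per-file round-trip penalty on rewrite rather than a throughput problem (simple arithmetic on the timings above):

```python
# Per-file cost of the 10-file rewrite loop vs. the single-file rewrite above.
loop_total = 48.894   # seconds for 10 rewrites
single = 3.133        # seconds for 1 rewrite

per_file = loop_total / 10
print(round(per_file, 4))  # 4.8894 s per rewritten file
```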
>>>>>>
>>>>>> BTW, I don't think it is a problem with OSD speed or the network:
>>>>>>
>>>>>> # time sysbench --num-threads=1 --test=fileio --file-total-size=1G
>>>>>> --file-test-mode=rndrw prepare
>>>>>> 1073741824 bytes written in 23.52 seconds (43.54 MB/sec).
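Note that sysbench's figure checks out against the raw byte count and elapsed time, assuming its "MB" means 2^20 bytes:

```python
# Verify the sysbench throughput figure: bytes written / elapsed time, in MiB/s.
bytes_written = 1073741824  # 1 GiB, as reported by sysbench
elapsed = 23.52             # seconds

mib_per_sec = bytes_written / (1024 * 1024) / elapsed
print(round(mib_per_sec, 2))  # ~43.54, matching the reported "43.54 MB/sec"
```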
>>>>>>
>>>>>> Here is my ceph cluster status and version:
>>>>>>
>>>>>> # ceph -w
>>>>>>     cluster d3dcacc3-89fb-4db0-9fa9-f1f6217280cb
>>>>>>      health HEALTH_OK
>>>>>>      monmap e4: 4 mons at
>>>>>> {atl-fs10=10.44.101.70:6789/0,atl-fs11=10.44.101.91:6789/0,atl-fs12=10.44.101.92:6789/0,atl-fs9=10.44.101.69:6789/0},
>>>>>> election epoch 40, quorum 0,1,2,3 atl-fs9,atl-fs10,atl-fs11,atl-fs12
>>>>>>      mdsmap e33: 1/1/1 up {0=atl-fs12=up:active}, 3 up:standby
>>>>>>      osdmap e92: 4 osds: 4 up, 4 in
>>>>>>       pgmap v8091: 192 pgs, 3 pools, 123 MB data, 1658 objects
>>>>>>             881 GB used, 1683 GB / 2564 GB avail
>>>>>>                  192 active+clean
>>>>>>   client io 1820 B/s wr, 1 op/s
>>>>>>
>>>>>> # ceph -v
>>>>>> ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
>>>>>>
>>>>>> All nodes are connected via gigabit Ethernet.
>>>>>> _______________________________________________
>>>>>> ceph-users mailing list
>>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
