Re: fio librbd result is poor

Hi Christian,
Thanks for your reply.

At 2016-12-19 14:01:57, "Christian Balzer" <chibi@xxxxxxx> wrote:
>
>Hello,
>
>On Mon, 19 Dec 2016 13:29:07 +0800 (CST) 马忠明 wrote:
>
>> Hi guys,
>>
>> So recently I was testing our ceph cluster, which is mainly used for
>> block storage (rbd).
>>
>> We have 30 SSD drives in total (5 storage nodes, 6 SSD drives per
>> node). However, the fio results are very poor.
>>
>All relevant details are missing.
>SSD exact models, CPU/RAM config, network config, Ceph, OS/kernel, fio
>versions, the config you tested this with, as in replication.
SSD: Intel® SSD DC S3510 Series, 1.2TB, 2.5"
CPU: 2× Intel E5-2630 v4
MEM: 128GB
Network: 2×10G, bonded (LACP, mode 4)
Ceph: Hammer 0.94.6
OS/kernel: Ubuntu 14.04.5 LTS / 3.13.0-96-generic
fio: 2.12

>
>> We tested the workload on the ssd pool with the following parameters:
>>
>> "fio --size=50G \
>>      --ioengine=rbd \
>>      --direct=1 \
>>      --numjobs=1 \
>>      --rw=randwrite(randread) \
>>      --name=com_ssd_4k_randwrite(randread) \
>>      --bs=4k \
>>      --iodepth=32 \
>>      --pool=ssd_volumes \
>>      --runtime=60 \
>>      --ramp_time=30 \
>>      --rbdname=4k_test_image"
>>
>> and here is the result:
>>
>> random write: 4631; random read: 21127
>>
>> I also tested a pool (size=1, min_size=1, pg_num=256) consisting of
>> only one single ssd drive with the same workload pattern, which is
>> more acceptable (random write: 8303; random read: 27859).
>>
>I'm only going to comment on the write part.
>
>On my staging cluster (* see below) I ran your fio against the cache tier
>(so only SSDs involved) with this result:
>
>  write: io=4206.3MB, bw=71784KB/s, iops=17945, runt= 60003msec
>    slat (usec): min=0, max=531, avg= 3.26, stdev=11.33
>    clat (usec): min=5, max=41996, avg=1770.23, stdev=2260.61
>     lat (usec): min=9, max=41997, avg=1773.36, stdev=2260.60
>
>So more than 2 times better than your non-replicated test.
>
>4k randwrites stress the CPUs (run atop or such on your OSD nodes
>when doing a test run), so this might be your limit here.
>Along with less than optimal SSDs or a high latency network.
>
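As an aside, the command line we used (with its ambiguous `randwrite(randread)` shorthand) can be written as one fio job file per workload; a sketch assuming the same pool and image names as above:

```ini
; 4k random write against ssd_volumes; for the read case change
; rw= to randread (values copied from the command line above)
[global]
ioengine=rbd
pool=ssd_volumes
rbdname=4k_test_image
direct=1
bs=4k
iodepth=32
numjobs=1
size=50G
runtime=60
ramp_time=30

[com_ssd_4k_randwrite]
rw=randwrite
```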
Yes, CPU usage might be the bottleneck of the whole system. BTW, our ceph cluster is integrated with Mirantis OpenStack; the results above were run from one compute node. I also ran a stress test from all 10 compute nodes at once. The results were almost the same, with CPU usage on every storage node at roughly 50-60% and CPU usage for each SSD OSD at roughly 250-300%.
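For a rough sense of how hard the SSDs themselves are being pushed, here is a back-of-envelope write-amplification calculation. It assumes (not measured) that with size=3 and Hammer's filestore, every client write becomes three replica writes, each written twice (journal + data) on journals colocated with the OSDs:

```python
# Back-of-envelope: device-level write IOPS implied by the fio result.
client_iops = 4631       # observed 4k randwrite result from fio
replicas = 3             # ssd_volumes pool size
journal_factor = 2       # filestore journal double-write per replica
num_ssds = 30            # 5 nodes x 6 SSDs

device_write_iops = client_iops * replicas * journal_factor
per_ssd = device_write_iops / num_ssds
print(device_write_iops, round(per_ssd, 1))  # 27786 total, 926.2 per SSD
```

Under those assumptions each SSD only sees on the order of 900 small writes per second, far below what a DC-class SSD can sustain, which is consistent with the bottleneck being CPU or latency rather than the drives.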

Pool parameters for ssd_volumes: size=3, min_size=1, pg_num=2048, pgp_num=2048.
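For comparison, the common rule of thumb from the Ceph docs (roughly 100 PGs per OSD) suggests a somewhat smaller pg_num for a 30-OSD, size=3 pool; a quick sanity check:

```python
# Rule of thumb: pg_num ~= (num_osds * 100) / pool_size,
# rounded up to the next power of two.
num_osds = 30
pool_size = 3

target = num_osds * 100 / pool_size            # 1000.0
pg_num = 1 << (int(target) - 1).bit_length()   # next power of two
print(pg_num)  # 1024

# With pg_num=2048 and size=3, each OSD carries about
# 2048 * 3 / 30 PG replicas, roughly double the guideline.
print(round(2048 * pool_size / num_osds))  # 205
```

A higher-than-guideline PG count is not necessarily wrong, but it does add some per-OSD CPU and memory overhead, which may matter if CPU is already the suspected bottleneck.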


>Christian
>
>
>* Staging cluster:
>---
>4 nodes running latest Hammer under Debian Jessie (with sysvinit, kernel
>4.6) and manually created OSDs.
>Infiniband (IPoIB) QDR (40Gb/s, about 30Gb/s effective) between all nodes.
>
>2 HDD OSD nodes with 32GB RAM, fast enough CPU (E5-2620 v3), 2x 200GB DC
>S3610 for OS and journals (2 per SSD), 4x 1TB 2.5" SATAs for OSDs.
>For my amusement and edification the OSDs of one node are formatted with
>XFS, the other one EXT4 (as all my production clusters).
>
>The 2 SSD OSD nodes have 1x 200GB DC S3610 (OS and 4 journal partitions)
>and 2x 400GB DC S3610s (2 180GB partitions, so 8 SSD OSDs total), same
>specs as the HDD nodes otherwise.
>Also one node with XFS, the other EXT4.
>
>Pools are size=2, min_size=1, obviously.
>---
>
>> We have optimized the linux kernel (read_ahead, disk_scheduler, numa,
>> swappiness) and ceph.conf (client_message, filestore_queue,
>> journal_queue, rbd_cache), and checked the RAID cache setting.
>>
>> The only deficiency in the architecture is the unbalanced weight
>> between the three racks, as one rack has only one storage node.
>>
>> So can anybody tell us whether these numbers are reasonable? If not,
>> any suggestions to improve them would be appreciated.
>
>--
>Christian Balzer        Network/Systems Engineer
>chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
>http://www.gol.com/
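For concreteness, the tuning areas we touched correspond to stock ceph.conf option names along these lines. This is a sketch only: the option names are the standard Hammer ones, but the values here are illustrative, not the ones we actually deployed (those weren't posted):

```ini
[client]
rbd cache = true

[osd]
; client message throttling
osd client message cap = 1000
osd client message size cap = 524288000
; filestore and journal queue depths
filestore queue max ops = 500
filestore queue max bytes = 104857600
journal queue max ops = 3000
journal queue max bytes = 104857600
```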


 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
