On Tue, Dec 29, 2015 at 5:20 PM, Francois Lafont <flafdivers@xxxxxxx> wrote:
> Hi,
>
> On 28/12/2015 09:04, Yan, Zheng wrote:
>
>>> Ok, so on a client node, I have mounted cephfs (via ceph-fuse) and a rados
>>> block device formatted in XFS. If I have understood correctly, cephfs uses
>>> sync IO (not async IO) and, with ceph-fuse, cephfs can't do O_DIRECT IO.
>>> So, I have tested this fio command on cephfs _and_ on rbd:
>>>
>>> fio --randrepeat=1 --ioengine=sync --direct=0 --gtod_reduce=1 --name=readwrite \
>>>     --filename=rw.data --bs=4k --iodepth=1 --size=300MB --readwrite=randrw \
>>>
>>> and indeed with cephfs _and_ rbd, I have approximately the same result:
>>> - cephfs => ~516 iops
>>> - rbd    => ~587 iops
>>>
>>> Is it consistent?
>>>
>> yes
>
> Ok, cool. ;)
>
>>> That being said, I'm unable to tell whether this is good performance with
>>> regard to my hardware configuration. I'm curious to know the result on
>>> other clusters with the same fio command.
>>
>> This fio command checks the performance of single-thread SYNC IO. If you
>> want to check overall throughput, you can try using buffered IO or
>> increasing the thread number.
>
> Ok, I have increased the thread number via the --numjobs option of fio
> and indeed, if I add up the iops of each job, it seems that I can reach
> something like ~1000 iops with ~5 jobs. This result seems more in line
> with my hardware configuration, isn't it?

yes

>
> And it seems that I can see the bottleneck of my little cluster (only
> 5 OSD servers with 4 osd daemons each). According to the "atop" command, I
> can see that some disks (4TB SATA 7200rpm Western Digital WD4000FYYZ) are
> very busy. It's curious because during the bench some disks are very busy
> and other disks are not so busy. But I think the reason is that it is a
> little cluster and, with just 15 osds (the 5 other osds are full-SSD osds
> dedicated to cephfs metadata), I can't have a perfectly even distribution
> of data, especially when the bench concerns just a specific file of a few
> hundred MB.

Do these disks have the same size and performance? Large disks (with higher
weights) or slow disks are likely to be busy.

>
> That being said, when you talk about "using buffered IO" I'm not sure which
> fio option that refers to. Is it the --buffered option? Because with this
> option I have noticed no change in iops. Personally, I was able to increase
> the global iops only with the --numjobs option.
>

I didn't make it clear. I actually meant buffered writes (add the
--rwmixread=0 option to fio); sample fio invocations for the multi-job,
buffered-write and AIO cases are sketched at the bottom of this message.
In your test case, writes are mixed with reads, and a read is synchronous
whenever it misses the cache.

Regards
Yan, Zheng

>> FYI, I have written a patch to add AIO support to the cephfs kernel client:
>> https://github.com/ceph/ceph-client/commits/testing
>
> Ok, thanks for the information, but I'm afraid I won't be able to test it
> immediately.
>
>>> * --direct=1 => ~1400 iops
>>> * --direct=0 => ~570 iops
>>>
>>> Why do I have this behavior? I thought it would be the opposite (better
>>> performance with --direct=0). Is it normal?
>>>
>> The Linux kernel only supports AIO for fds opened in O_DIRECT mode; when
>> the file is not opened in O_DIRECT mode, AIO is actually SYNC IO.
>
> Ok, so this is not ceph specific, it is a behavior of the Linux kernel.
> Good to know.
>
> Anyway, thanks _a_ _lot_ Yan for your very efficient help. I have learned
> lots of very interesting things.
>
> Regards.
>
> --
> François Lafont
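
For reference, a multi-job variant of the fio command discussed above might
look like the following; --numjobs=5 and --group_reporting are only
illustrative (--group_reporting makes fio sum the iops of all jobs into a
single report):

    fio --randrepeat=1 --ioengine=sync --direct=0 --gtod_reduce=1 --name=readwrite \
        --filename=rw.data --bs=4k --iodepth=1 --size=300MB --readwrite=randrw \
        --numjobs=5 --group_reporting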
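Likewise, a buffered-write run along the lines Yan suggests (writes only, so
no synchronous cache-miss reads) could be sketched as follows, assuming the
same test file:

    fio --randrepeat=1 --ioengine=sync --direct=0 --gtod_reduce=1 --name=bufwrite \
        --filename=rw.data --bs=4k --iodepth=1 --size=300MB --readwrite=randrw \
        --rwmixread=0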
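Finally, since the kernel only performs real AIO on file descriptors opened
with O_DIRECT, an async-IO run that actually benefits from a deeper queue
would combine --ioengine=libaio with --direct=1; the iodepth value here is
only illustrative:

    fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=aiorw \
        --filename=rw.data --bs=4k --iodepth=32 --size=300MB --readwrite=randrw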