Re: cephfs, low performances

Because Ceph is not perfectly distributed, there will be more PGs/objects on some drives than on others, and those drives become a bottleneck for the entire cluster. The current IO scheduler poses some challenges in this regard. I've implemented a new scheduler with which I've seen much better drive utilization across the cluster, a 3-17% performance increase, and a substantial reduction in client performance deviation (all clients get roughly the same amount of performance). Hopefully we will be able to get that into Jewel.
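
For anyone who wants to see this kind of imbalance on their own cluster, a
quick check (assuming a Hammer or later ceph CLI; no extra tooling needed)
is:

    # Per-OSD CRUSH weight and utilization: a wide spread in the %USE/VAR
    # columns means some drives carry noticeably more data than others
    # (newer releases also print a per-OSD PG count).
    ceph osd df

    # Watch actual device utilization while a bench runs; the busiest
    # drive is usually the one holding the most PGs of the busy pool.
    iostat -dxm 5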

Robert LeBlanc

Sent from a mobile device, please excuse any typos.

On Dec 31, 2015 12:20 AM, "Francois Lafont" <flafdivers@xxxxxxx> wrote:
Hi,

On 30/12/2015 10:23, Yan, Zheng wrote:

>> And it seems to me that I can see the bottleneck of my little cluster (only
>> 5 OSD servers, each with 4 osd daemons). According to the "atop" command, I
>> can see that some disks (4TB SATA 7200rpm Western Digital WD4000FYYZ) are
>> very busy. It's curious because during the bench some disks are very busy
>> and some other disks are not so busy. But I think the reason is that it is a
>> little cluster and with just 15 osds (the 5 other osds are full-SSD osds
>> dedicated to cephfs metadata), I can't have a perfect distribution of the
>> data, especially when the bench concerns just a specific file of a few
>> hundred MB.
>
> Do these disks have the same size and performance? Large disks (with
> higher weights) or slow disks are likely to be busy.

The disks are exactly the same model with the same size (4TB SATA 7200rpm
Western Digital WD4000FYYZ). I'm not completely sure, but it seems to me
that in one specific node I have a disk which is a little slower than the
others (maybe ~50-75 IOPS fewer), and it seems to me that it's the busiest
disk during a bench.

Is it possible (or frequent) to have performance differences between disks
of exactly the same model?
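
In case it helps, a direct read-only test against each raw device makes that
kind of difference measurable. A minimal sketch (here /dev/sdX is just a
placeholder, and the numbers are only comparable if the OSD on that disk is
otherwise idle):

    # Read-only random-read test against the raw device;
    # --readonly refuses to issue any write, so the OSD data stays safe.
    fio --name=disk-check --filename=/dev/sdX --readonly \
        --rw=randread --direct=1 --bs=4k --iodepth=32 \
        --ioengine=libaio --runtime=60 --time_based --group_reporting

Running the same job on every OSD disk would make a ~50-75 IOPS gap easy to
confirm or rule out.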

>> That being said, when you talk about "using buffered IO" I'm not sure I
>> understand which fio option is concerned by that. Is it the --buffered
>> option? Because with this option I have noticed no change concerning iops.
>> Personally, I was able to increase global iops only with the --numjobs option.
>>
>
> I didn't make it clear. I actually meant buffered writes (add the
> --rwmixread=0 option to fio).

But with fio, if I set "--readwrite=randrw --rwmixread=0", isn't that
completely equivalent to just setting "--readwrite=randwrite"?
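
To make the comparison concrete, the two invocations in question would look
something like this (directory, block size and job size are only placeholders):

    # Mixed random read/write with --rwmixread=0, i.e. 0% reads.
    fio --name=mix0 --directory=/mnt/cephfs/bench --rw=randrw --rwmixread=0 \
        --bs=4k --size=512m --runtime=60 --time_based --group_reporting

    # Plain random-write job for comparison.
    fio --name=randw --directory=/mnt/cephfs/bench --rw=randwrite \
        --bs=4k --size=512m --runtime=60 --time_based --group_reporting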

> In your test case, writes mix with reads.

Yes indeed.

> Reads are synchronous on a cache miss.

You mean that I get SYNC IO for reads even if I set --direct=0, is that
correct? Is that valid for any file system, or just for cephfs?
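
For reference, a pure buffered-write job, which avoids the synchronous
read-on-cache-miss path entirely, could look something like this (the cephfs
mount point and sizes are only placeholders):

    # Buffered random writes only: --direct=0 keeps the page cache in play,
    # and --rw=randwrite means no reads, hence no cache-miss stalls.
    fio --name=bufwrite --directory=/mnt/cephfs/bench --rw=randwrite \
        --direct=0 --bs=4k --size=512m --numjobs=4 \
        --runtime=60 --time_based --group_reporting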

Regards.

--
François Lafont
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
