Re: IO scheduler & osd_disk_thread_ioprio_class

Thank you for your reply, answers below.


On 23 Jun 2015, at 13:15, Christian Balzer <chibi@xxxxxxx> wrote:


Hello,

On Tue, 23 Jun 2015 12:53:45 +0200 Jan Schermer wrote:

I use CFQ but I have just discovered it completely _kills_ writes when
the device is also being read (when doing backfill, for example)

I've seen similar things, but for the record and so people can correctly
reproduce things, please be specific.

For starters, what version of Ceph?


0.67.12 dumpling (newest git)
I know it’s ancient :-)

CFQ with what kernel, with what filesystem, on what type of OSD (HDD, HDD
with on disk journal, HDD with SSD journal)?


My test was done on the raw block device, not through a filesystem, on an SSD.
I tested several scenarios, but the simplest one is to run

fio --filename=/dev/sda --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=test --ioengine=aio
and
fio --filename=/dev/sda --direct=1 --sync=1 --rw=randread --bs=32k --numjobs=1 --iodepth=8 --runtime=60 --time_based --group_reporting --name=test --ioengine=aio

You will see the IOPS of the first fio job drop to ~10.

This will of course depend on the drive, and the read job also saturates the SATA 2 bandwidth on my test machine (which might be the real cause).

I am still testing various combinations; different drives have different thresholds (some only hit bottom with a 128k block size, which is larger than my average IO on those drives - not accounting for backfills).

There’s a point, though, where it just hits the bottom and no amount of CFQ-tuning magic can help.
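
For completeness, the combined test is simply the two jobs above run at the same time against the same device (destructive, obviously - /dev/sda is just my test disk):

# DANGER: writes raw to the device; use a scratch disk only
fio --filename=/dev/sda --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=writer --ioengine=aio &
fio --filename=/dev/sda --direct=1 --sync=1 --rw=randread --bs=32k --numjobs=1 --iodepth=8 --runtime=60 --time_based --group_reporting --name=reader --ioengine=aio
wait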


If I run a fio job for synchronous writes and at the same time run a fio
job for random reads, writes drop to 10 IOPS (oops!). Setting IO
priority with ionice works nicely, maintaining ~250 IOPS for writes while
throttling reads.

Setting the priority to what (level and type) on which process?
The fio ones, the OSD ones?

ionice -c3 fio-for-read-test

this sets the class to idle
setting the priority to 7 but leaving it in the best-effort class helps, but not much (10 → 30 IOPS)
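
To spell out the variants (ionice classes: 1 = realtime, 2 = best-effort, 3 = idle; -n takes 0-7, lower meaning higher priority within the class):

# idle class: the reads only get disk time when nothing else wants it
ionice -c3 fio <read job arguments as above>
# best-effort at the lowest priority: helps far less
ionice -c2 -n7 fio <read job arguments as above>
# the class of an already-running process can be changed too
ionice -c3 -p <pid of the fio process>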


Scrub and friends can really wreak havoc on one of my clusters, which is 99%
writes; the same goes for the few times it has to do reads (VMs booting).

Scrubbing is fine on my cluster; backfilling kills it when I add new drives - that’s what I’m investigating right now, which is how I ran into this. So before I go scratching my head I thought I’d ask here - I’m probably not the first one to have this kind of problem :-)

Thanks

Jan


Christian

I looked at osd_disk_thread_ioprio_class - for some reason the documentation
lists “idle”, “rt” and “be” as the possible values, but in my case it only
accepts numbers (3 should be idle) - and it doesn’t seem to do anything
with regard to slow requests. Do I need to restart the OSD for it to take
effect? It actually looks like it made things even worse for me…
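
For reference, what I’m trying - the documented config syntax plus runtime injection (if injectargs doesn’t actually accept this option on dumpling, that might explain what I’m seeing):

# at runtime (assuming the option is injectable in this version):
ceph tell osd.\* injectargs '--osd_disk_thread_ioprio_class idle --osd_disk_thread_ioprio_priority 7'

# or persistently in ceph.conf under [osd], followed by an OSD restart;
# per the docs this only has an effect when the disk uses the CFQ scheduler:
osd disk thread ioprio class = idle
osd disk thread ioprio priority = 7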

Changing the scheduler to deadline improves the bottom line a lot for
my benchmark, but a large amount of reads can still drop writes to 30 IOPS -
contrary to CFQ, which maintains a steady 250 IOPS for writes even under
read load.
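
For reference, I switch schedulers on the fly like this (sda as the example again; it takes effect immediately):

cat /sys/block/sda/queue/scheduler   # the active scheduler is shown in brackets
echo deadline > /sys/block/sda/queue/scheduler
# deadline's tunables then appear here; write_expire (ms) bounds how long a write can wait:
ls /sys/block/sda/queue/iosched/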

What would be the recommendation here? Has anyone tested this extensively
before?

thanks

Jan



-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx    Global OnLine Japan/Fusion Communications
http://www.gol.com/

