Right now, with the utilization that a Ceph OSD can do on one drive, deadline would probably do the same. But I think having CFQ working is good practice - you never know when you might need to ionice something, and it is also much more powerful if you use things like cgroups (although deadline is also group-aware, AFAIK). It also does nice things like merging requests, and you can really tune it for the workload - though that probably went out of the window the moment I set both slice_idle and group_idle to 0, as that effectively disables any coalescing… I hope someone with more experience will chime in - maybe this is misguided and deadline should ultimately be used for OSDs.
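
To make the ionice part concrete, a rough sketch (the thread id below is just a placeholder, and I/O classes only have an effect while the device is on CFQ):

# show which scheduler each disk is using (the active one is in brackets)
grep . /sys/block/sd[a-z]/queue/scheduler

# put a single OSD thread into the idle I/O class; find the TID e.g. with
# "iotop -b -n1" or "ps -eLf | grep ceph-osd"
TID=12345                # placeholder thread id
ionice -c 3 -p "$TID"    # -c 3 = idle; or -c 2 -n 7 for lowest best-effort
ionice -p "$TID"         # read the class back to verify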

Jan

> On 23 Jun 2015, at 17:07, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> 
> Looks OK, at least the docs suggest 0 for the idle values with "high
> performance storage".
> 
> Stepping back a bit, did you try the noop scheduler instead of
> cfq/deadline? I'm kind of surprised that you want/need to do any IO
> scheduling when your journal and FileStore are on a good SSD.
> 
> Cheers, Dan
> 
> 
> On Tue, Jun 23, 2015 at 4:57 PM, Jan Schermer <jan@xxxxxxxxxxx> wrote:
>> For future generations:
>> 
>> I persuaded CFQ to play nice in the end:
>> 
>> find /sys/block/sd[a-z]/queue/iosched/group_idle -exec sh -c 'echo 0 > {}' \;
>> find /sys/block/sd[a-z]/queue/iosched/slice_idle -exec sh -c 'echo 0 > {}' \;
>> find /sys/block/sd[a-z]/queue/iosched/quantum -exec sh -c 'echo 32 > {}' \;
>> 
>> Feedback welcome :-)
>> 
>> It behaves very close to deadline, with the benefit that it can still do IO classes (and they work).
>> 
>> Jan
>> 
>> 
>>> On 23 Jun 2015, at 14:21, Eneko Lacunza <elacunza@xxxxxxxxx> wrote:
>>> 
>>> Hi Jan,
>>> 
>>> What SSD model?
>>> 
>>> I've seen SSDs usually work quite well but suddenly give totally awful performance for some time (not the 8K you see, though).
>>> 
>>> I think there was some kind of firmware process involved; I had to replace the drive with a serious DC one.
>>> 
>>> On 23/06/15 at 14:07, Jan Schermer wrote:
>>>> Yes, but that’s a separate issue :-)
>>>> Some drives are just slow (100 IOPS) for synchronous writes with no other load.
>>>> The drives I’m testing have ~8K IOPS when not under load - having them drop to 10 IOPS is a huge problem. If it’s indeed a CFQ problem (as I suspect), then no matter what drive you have, you will have problems.
>>>> 
>>>> Jan
>>>> 
>>>>> On 23 Jun 2015, at 14:03, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>>>>> 
>>>>> Oh sorry, I had missed that. Indeed that is surprising. Did you read
>>>>> the recent thread ("SSD IO performance") discussing the relevance of
>>>>> O_DSYNC performance for the journal?
>>>>> 
>>>>> Cheers, Dan
>>>>> 
>>>>> On Tue, Jun 23, 2015 at 1:54 PM, Jan Schermer <jan@xxxxxxxxxxx> wrote:
>>>>>> I only use SSDs, which is why I’m so surprised at the CFQ behaviour - the drive can sustain tens of thousands of reads per second and thousands of writes - yet saturating it with reads drops the writes to 10 IOPS. That’s mind-boggling to me.
>>>>>> 
>>>>>> Jan
>>>>>> 
>>>>>>> On 23 Jun 2015, at 13:43, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>>>>>>> 
>>>>>>> On Tue, Jun 23, 2015 at 1:37 PM, Jan Schermer <jan@xxxxxxxxxxx> wrote:
>>>>>>>> Yes, I use the same drive:
>>>>>>>> 
>>>>>>>> one partition for the journal,
>>>>>>>> the other for XFS with the FileStore.
>>>>>>>> 
>>>>>>>> I am seeing slow requests when backfills are occurring - backfills hit the FileStore, but the slow requests are (most probably) writes going to the journal - 10 IOPS is just too few for anything.
>>>>>>>> 
>>>>>>>> My Ceph version is dumpling - that explains the integers.
>>>>>>>> So it’s possible it doesn’t work at all?
>>>>>>> I thought that bug was fixed. You can check whether it worked by using
>>>>>>> "iotop -b -n1" and looking for threads with the idle priority.
>>>>>>> 
>>>>>>>> Bad news about the backfills not being in the disk thread; I might have to use deadline after all.
>>>>>>> If your experience follows the same path as most users', eventually
>>>>>>> deep scrubs will cause latency issues and you'll switch back to cfq
>>>>>>> plus ionicing the disk thread.
>>>>>>> 
>>>>>>> Are you using Ceph RBD or object storage? If RBD, eventually you'll
>>>>>>> find that you need to put the journals on an SSD.
>>>>>>> 
>>>>>>> Cheers, Dan
>>> 
>>> 
>>> -- 
>>> Technical Director
>>> Binovo IT Human Project, S.L.
>>> Tel. 943575997
>>> 943493611
>>> Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
>>> www.binovo.es

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com