Yes, I use the same drive: one partition for the journal, the other for XFS with FileStore.
I am seeing slow requests when backfills are occurring - backfills hit the filestore, but the slow requests are (most probably) writes going to the journal - 10 IOPS is just too few for anything.

My Ceph version is dumpling - that explains the integers. So it’s possible it doesn’t work at all?

Bad news about the backfills not being in the disk thread; I might have to use deadline after all.

Thanks
Jan

> On 23 Jun 2015, at 13:09, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>
> Hi Jan,
> I guess you have the OSD journal on the same spinning disk as the
> FileStore? Otherwise the synchronous writes go to the separate journal
> device so the fio test is less relevant.
> We've looked a lot at IO schedulers and concluded that ionice'ing the
> disk thread to idle is the best we can do. Note that the disk thread
> is used for scrubs, not backfilling. There is no way to ionice
> backfills/recoveries.
> Which Ceph version are you using? When the
> osd_disk_thread_ioprio_class feature was first implemented it was
> integers, then it switched to strings e.g. idle. And for a couple
> versions the setting didn't work at all.
>
> As of latest firefly you can do:
>
>   osd disk thread ioprio class = idle
>   osd disk thread ioprio priority = 0
>
> BTW, if backfills are causing you grief, you can throttle those down
> to minimums with:
>
>   osd max backfills = 1
>   osd recovery max active = 1
>   osd recovery max single start = 1
>   osd recovery op priority = 1
>
>
> Hope that helps,
>
> Dan
>
>
>
> On Tue, Jun 23, 2015 at 12:53 PM, Jan Schermer <jan@xxxxxxxxxxx> wrote:
>> I use CFQ but I have just discovered it completely _kills_ writes when also reading (doing backfill for example)
>>
>> If I run a fio job for synchronous writes and at the same time run a fio job for random reads, writes drop to 10 IOPS (oops!). Setting io priority with ionice works nicely, maintaining ~250 IOPS for writes while throttling reads.
>>
>> I looked at osd_disk_thread_ioprio_class - for some reason the documentation says “idle”, “rt”, “be” for possible values, but it only accepts numbers (3 should be idle) in my case - and it doesn’t seem to do anything with regard to slow requests. Do I need to restart the OSD for it to take effect? It actually looks like it made things even worse for me…
>>
>> Changing the scheduler to deadline improves the bottom line a lot for my benchmark, but a large number of reads can still drop that to 30 IOPS - unlike CFQ with ionice, which maintains a steady 250 IOPS for writes even under read load.
>>
>> What would be the recommendation here? Has anyone tested this extensively before?
>>
>> thanks
>>
>> Jan
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com