> -----Original Message-----
> From: Dan van der Ster [mailto:dan@xxxxxxxxxxxxxx]
> Sent: 08 November 2016 08:38
> To: Nick Fisk <nick@xxxxxxxxxx>
> Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
> Subject: Re: Scrubbing not using Idle thread?
>
> Hi Nick,
>
> That's expected since Jewel, which moved the scrub IOs out of the disk thread and into the "op" thread. They can now be prioritized
> using osd_scrub_priority, and you can experiment with osd_op_queue = prio/wpq to see if scrubs can be made more transparent
> with the latter, newer queuing implementation.
> I don't recall what's still left in the disk thread -- perhaps snap trimming or PG removal.
>
> Cheers, Dan

Thanks for the pointer, Dan. The Jewel docs don't seem to mention this, hence me going down the wrong path. Some of the default values don't match the docs either... I'm feeling a PR coming along for the docs.

I'm going to drop osd_scrub_priority down to 1, as I think scrubs are having a negative effect while recovery/backfilling is going on, leading to OSDs dropping out.

Thanks again,
Nick

>
>
> On Tue, Nov 8, 2016 at 9:16 AM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> > Hi,
> >
> > I have all the normal options set in ceph.conf (priority and class for the disk threads); however, scrubs look like they are running as the standard be/4 class in iotop. Running 10.2.3.
> >
> > E.g.:
> >
> > PG dump (shows that OSD 1 will be scrubbing):
> > pg_stat objects mip degr misp unf bytes log disklog state state_stamp v reported up up_primary acting acting_primary last_scrub scrub_stamp last_deep_scrub deep_scrub_stamp
> > 1.8fc 3314 0 0 0 0 13899923456 3043 3043 active+clean+scrubbing+deep 2016-11-08 07:51:25.708169 1468442'173537 1468443:139543 [15,32,1] 15 [15,32,1] 15 1435301'169725 2016-11-06 21:46:26.537632 1338333'158008 2016-10-30 23:59:50.794774
> >
> > sudo iotop -o --batch --iter 1 | grep ceph
> > 3958453 be/4 ceph 92.17 M/s 0.00 B/s 0.00 % 66.59 % ceph-osd -f --cluster ceph --id 1 --setuser ceph --setgroup ceph [tp_osd_tp]
> > 3956201 be/4 ceph 12.47 K/s 0.00 B/s 0.00 % 4.72 % ceph-osd -f --cluster ceph --id 11 --setuser ceph --setgroup ceph [tp_osd_tp]
> > 3954970 be/4 ceph 361.68 K/s 0.00 B/s 0.00 % 1.98 % ceph-osd -f --cluster ceph --id 6 --setuser ceph --setgroup ceph [tp_osd_tp]
> > 3956653 be/4 ceph 12.47 K/s 0.00 B/s 0.00 % 1.26 % ceph-osd -f --cluster ceph --id 2 --setuser ceph --setgroup ceph [tp_osd_tp]
> > 3957297 be/4 ceph 12.47 K/s 0.00 B/s 0.00 % 0.60 % ceph-osd -f --cluster ceph --id 7 --setuser ceph --setgroup ceph [tp_osd_tp]
> > 3956648 be/4 ceph 12.47 K/s 0.00 B/s 0.00 % 0.14 % ceph-osd -f --cluster ceph --id 2 --setuser ceph --setgroup ceph [tp_osd_tp]
> >
> > Only OSD 1 is doing IO; the rest are doing next to nothing in comparison, so this must be the scrubbing. But why is it not running as idle?
> >
> > There are idle threads, though; they just don't seem to do anything:
> >
> > iotop --batch --iter 1 | grep ceph | grep idle
> > 3954974 idle ceph 0.00 B/s 0.00 B/s 0.00 % 0.00 % ceph-osd -f --cluster ceph --id 6 --setuser ceph --setgroup ceph [tp_osd_disk]
> > 3956209 idle ceph 0.00 B/s 0.00 B/s 0.00 % 0.00 % ceph-osd -f --cluster ceph --id 11 --setuser ceph --setgroup ceph [tp_osd_disk]
> > 3956655 idle ceph 0.00 B/s 0.00 B/s 0.00 % 0.00 % ceph-osd -f --cluster ceph --id 2 --setuser ceph --setgroup ceph [tp_osd_disk]
> > 3957253 idle ceph 0.00 B/s 0.00 B/s 0.00 % 0.00 % ceph-osd -f --cluster ceph --id 8 --setuser ceph --setgroup ceph [tp_osd_disk]
> > 3957306 idle ceph 0.00 B/s 0.00 B/s 0.00 % 0.00 % ceph-osd -f --cluster ceph --id 7 --setuser ceph --setgroup ceph [tp_osd_disk]
> > 3957350 idle ceph 0.00 B/s 0.00 B/s 0.00 % 0.00 % ceph-osd -f --cluster ceph --id 10 --setuser ceph --setgroup ceph [tp_osd_disk]
> > 3958464 idle ceph 0.00 B/s 0.00 B/s 0.00 % 0.00 % ceph-osd -f --cluster ceph --id 1 --setuser ceph --setgroup ceph [tp_osd_disk]
> > 3958505 idle ceph 0.00 B/s 0.00 B/s 0.00 % 0.00 % ceph-osd -f --cluster ceph --id 4 --setuser ceph --setgroup ceph [tp_osd_disk]
> > 3958832 idle ceph 0.00 B/s 0.00 B/s 0.00 % 0.00 % ceph-osd -f --cluster ceph --id 3 --setuser ceph --setgroup ceph [tp_osd_disk]
> > 3960488 idle ceph 0.00 B/s 0.00 B/s 0.00 % 0.00 % ceph-osd -f --cluster ceph --id 5 --setuser ceph --setgroup ceph [tp_osd_disk]
> > 3960851 idle ceph 0.00 B/s 0.00 B/s 0.00 % 0.00 % ceph-osd -f --cluster ceph --id 9 --setuser ceph --setgroup ceph [tp_osd_disk]
> > 3959437 idle ceph 0.00 B/s 0.00 B/s 0.00 % 0.00 % ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph [tp_osd_disk]
> >
> > ceph daemon osd.1 config get osd_disk_thread_ioprio_class
> > {
> >     "osd_disk_thread_ioprio_class": "idle"
> > }
> > ceph daemon osd.1 config get osd_disk_thread_ioprio_priority
> > {
> >     "osd_disk_thread_ioprio_priority": "7"
> > }
> >
> > cat /sys/block/sd*/queue/scheduler
> > noop deadline [cfq]
> > noop deadline [cfq]
> > noop deadline [cfq]
> > noop deadline [cfq]
> > noop deadline [cfq]
> > noop deadline [cfq]
> > noop deadline [cfq]
> > noop deadline [cfq]
> > noop deadline [cfq]
> > noop deadline [cfq]
> > noop deadline [cfq]
> > noop deadline [cfq]
> >
> > Any ideas, or have I misunderstood something?
> > Nick

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
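For reference, a minimal sketch of the tuning Dan describes above. The option names (osd_scrub_priority, osd_op_queue) come from his reply and the value 1 is what Nick says he intends to try; the ceph.conf placement, the wpq choice and the injectargs invocation are illustrative rather than anything confirmed in this thread.

    # Persistent settings in ceph.conf on the OSD hosts (illustrative values)
    [osd]
        osd scrub priority = 1     # lower priority for scrub ops in the op queue
        osd op queue = wpq         # newer weighted-priority queue; "prio" is the other option

    # osd_scrub_priority can also be changed on running OSDs:
    ceph tell osd.* injectargs '--osd_scrub_priority 1'

    # As far as I know osd_op_queue is only read at OSD startup, so that one
    # needs an OSD restart rather than injectargs. Verify what an OSD is using:
    ceph daemon osd.1 config get osd_scrub_priority
    ceph daemon osd.1 config get osd_op_queue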
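And on the original question of whether the tp_osd_disk threads really got the idle class: iotop already shows it per thread, but it can also be checked directly with ionice against a thread ID. This is a generic Linux check, not something run in the thread; the TID below is simply OSD 1's tp_osd_disk thread from the iotop output above.

    # list an OSD's threads with their names (tp_osd_tp vs tp_osd_disk)
    ps -C ceph-osd -L -o pid,tid,comm

    # query the IO scheduling class of a single thread; with
    # osd_disk_thread_ioprio_class = idle and the cfq scheduler in use,
    # a tp_osd_disk thread should report "idle"
    ionice -p 3958464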