What about drive IOPS? An HDD tops out at around 150 IOPS on average. You can use iostat -xmt to get these values (the last column also shows disk utilization, which is very useful).

On Sun, May 26, 2024, 09:37 Mazzystr <mazzystr@xxxxxxxxx> wrote:

> I can't explain the problem. I have to recover three discs that are HDDs.
> I figured on just replacing one to give the full recovery capacity of the
> cluster to that one disc. I was never able to achieve a higher recovery
> rate than about 22 MiB/s, so I just added the other two discs. Recovery
> bounced up to 129 MiB/s for a while. Then things settled at 50 MiB/s.
> I kept tinkering to try to get back to 120, and now things are back to
> 23 MiB/s again. This is very irritating.
>
> CPU usage is minimal, in the single-digit percent range. Memory is right
> on target per the target setting in ceph.conf. Discs and network appear
> to be 20% utilized.
>
> I'm not a normal Ceph user. I don't care about client access at all. The
> mclock assumptions are wrong for me. I want my data to be replicated
> correctly as fast as possible.
>
> How do I open up the floodgates for maximum recovery performance?
>
>
> On Sat, May 25, 2024 at 8:13 PM Zakhar Kirpichenko <zakhar@xxxxxxxxx>
> wrote:
>
> > Hi!
> >
> > Could you please elaborate on what you meant by "adding another disc to
> > the recovery process"?
> >
> > /Z
> >
> > On Sat, 25 May 2024, 22:49 Mazzystr, <mazzystr@xxxxxxxxx> wrote:
> >
> >> Well, this was an interesting journey through the bowels of Ceph. I put
> >> about 6 hours into tweaking every setting imaginable just to circle
> >> back to my basic configuration and a 2 GB memory target per OSD. I was
> >> never able to exceed a 22 MiB/s recovery rate during that journey.
> >>
> >> I did end up fixing the issue, and now I see the following:
> >>
> >>   io:
> >>     recovery: 129 MiB/s, 33 objects/s
> >>
> >> This is normal for my measly cluster. I like micro Ceph clusters. I
> >> have a lot of them. :)
> >>
> >> What was the fix? Adding another disc to the recovery process! I was
> >> recovering to one disc; now I'm recovering to two. I have three total
> >> that need to be recovered. Somehow that one disc was completely
> >> swamped. I was unable to see it in htop, atop, or iostat. Disc busyness
> >> was 6% max.
> >>
> >> My config is back to the mclock scheduler, profile high_recovery_ops,
> >> and backfills of 256.
> >>
> >> Thank you to everyone who took the time to review and contribute.
> >> Hopefully this provides some modern information for the next person
> >> who has slow recovery.
> >>
> >> /Chris C
> >>
> >>
> >> On Fri, May 24, 2024 at 1:43 PM Kai Stian Olstad <ceph+list@xxxxxxxxxx>
> >> wrote:
> >>
> >> > On 24.05.2024 21:07, Mazzystr wrote:
> >> > > I did the obnoxious task of updating ceph.conf and restarting all
> >> > > my OSDs.
> >> > >
> >> > > ceph --admin-daemon /var/run/ceph/ceph-osd.*.asok config get
> >> > > osd_op_queue
> >> > > {
> >> > >     "osd_op_queue": "wpq"
> >> > > }
> >> > >
> >> > > I have some spare memory on my target host/OSD and increased the
> >> > > target memory of that OSD to 10 GB and restarted. No effect
> >> > > observed. In fact, memory usage on the host is stable, so I don't
> >> > > think the change took effect even with the ceph.conf update, a
> >> > > restart, and a direct asok config set. The target memory value is
> >> > > confirmed to be set via asok config get.
> >> > >
> >> > > Nothing has helped. I still cannot break the 21 MiB/s barrier.
> >> > >
> >> > > Does anyone have any more ideas?
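A quick way to cross-check what a running OSD has actually picked up, beyond editing ceph.conf, is to query the daemon and the monitor config database directly. A minimal sketch; osd.0 is only a placeholder id, and the 10 GiB value just mirrors the memory target discussed above:

    # what the running daemon believes right now (run on the OSD's host;
    # same idea as the asok query above)
    ceph daemon osd.0 config get osd_op_queue
    ceph daemon osd.0 config get osd_memory_target

    # set the value centrally in the mon config database and read back the
    # running value
    ceph config set osd.0 osd_memory_target 10737418240   # 10 GiB in bytes
    ceph config show osd.0 osd_memory_target

A value set in the local ceph.conf can mask one set with "ceph config set", so comparing "ceph config get" (the stored value) against "ceph config show" (the running value) helps spot which source wins.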
> >> >
> >> > For recovery you can adjust the following.
> >> >
> >> > osd_max_backfills defaults to 1; in my system I get the best
> >> > performance with 3 and wpq.
> >> >
> >> > The following I have not adjusted myself, but you can try:
> >> > osd_recovery_max_active defaults to 3.
> >> > osd_recovery_op_priority defaults to 3; a higher number increases the
> >> > priority of recovery relative to client I/O.
> >> >
> >> > All of them can be adjusted at runtime.
> >> >
> >> >
> >> > --
> >> > Kai Stian Olstad
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
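Since all of these can be changed at runtime, a minimal sketch of opening things up for recovery, assuming the mon config database is used; the values are illustrative starting points, not recommendations:

    # wpq: raise the backfill/recovery limits mentioned above
    ceph config set osd osd_max_backfills 3
    ceph config set osd osd_recovery_max_active 5
    ceph config set osd osd_recovery_op_priority 63   # default 3; 63 matches client op priority

    # mclock: let the scheduler favor recovery, as in the profile used upthread
    ceph config set osd osd_mclock_profile high_recovery_ops

On releases where mclock is the default scheduler, the backfill and recovery limits may be capped by the active profile unless osd_mclock_override_recovery_settings is enabled; check the documentation for the release in use before relying on that option.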