Re: Lousy recovery for mclock and reef

Well, this was an interesting journey through the bowels of Ceph.  I put
about six hours into tweaking every setting imaginable, only to circle back
to my basic configuration and a 2 GB memory target per OSD.  I was never
able to exceed 22 MiB/s of recovery throughput during that journey.

I did end up fixing the issue and now I see the following -

  io:
    recovery: 129 MiB/s, 33 objects/s
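
(That io: block is the recovery line from the cluster status output,
presumably where the figure above comes from; to keep an eye on it while
recovery runs:)

  # overall cluster status, including the io/recovery line
  ceph -s
  # or refresh it every few seconds
  watch -n 5 ceph -s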

This is normal for my measly cluster.  I like micro Ceph clusters; I have
a lot of them. :)

What was the fix?  Adding another disk to the recovery process!  I was
recovering to one disk; now I'm recovering to two, out of three total that
need to be recovered.  Somehow that one disk was completely swamped, yet I
couldn't see it in htop, atop, or iostat; disk busy never went above 6%.
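
For anyone checking the same thing, per-device utilization can be sampled
with iostat's extended statistics (the interval below is arbitrary):

  # extended per-device stats every 5 seconds; %util is the busy figure
  iostat -x 5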

My config is back to the mclock scheduler, the high_recovery_ops profile,
and osd_max_backfills at 256.
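
In config-command form that works out to roughly the following (a sketch;
on recent Reef/mclock builds the backfill limit is only honored when the
override flag is enabled, and the scheduler change needs an OSD restart):

  ceph config set osd osd_op_queue mclock_scheduler
  ceph config set osd osd_mclock_profile high_recovery_ops
  ceph config set osd osd_mclock_override_recovery_settings true
  ceph config set osd osd_max_backfills 256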

Thank you to everyone who took the time to review and contribute.
Hopefully this provides some up-to-date information for the next person
who runs into slow recovery.

/Chris C





On Fri, May 24, 2024 at 1:43 PM Kai Stian Olstad <ceph+list@xxxxxxxxxx>
wrote:

> On 24.05.2024 21:07, Mazzystr wrote:
> > I did the obnoxious task of updating ceph.conf and restarting all my
> > osds.
> >
> > ceph --admin-daemon /var/run/ceph/ceph-osd.*.asok config get
> > osd_op_queue
> > {
> >     "osd_op_queue": "wpq"
> > }
> >
> > I have some spare memory on my target host/osd and increased the target
> > memory of that OSD to 10 Gb and restarted.  No effect observed.  In
> > fact
> > mem usage on the host is stable so I don't think the change took effect
> > even with updating ceph.conf, restart and a direct asok config set.
> > target
> > memory value is confirmed to be set via asok config get
> >
> > Nothing has helped.  I still cannot break the 21 MiB/s barrier.
> >
> > Does anyone have any more ideas?
>
> For recovery you can adjust the following.
>
> osd_max_backfills default is 1, in my system I get the best performance
> with 3 and wpq.
>
> The following I have not adjusted myself, but you can try.
> osd_recovery_max_active is default to 3.
> osd_recovery_op_priority is default to 3, a lower number increases the
> priority for recovery.
>
> All of them can be runtime adjusted.
>
>
> --
> Kai Stian Olstad
>
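
(As noted above, all of those are runtime-adjustable; a sketch of the usual
forms, with the values here purely illustrative:)

  # persisted in the cluster config database
  ceph config set osd osd_max_backfills 3
  ceph config set osd osd_recovery_max_active 3
  ceph config set osd osd_recovery_op_priority 3
  # or injected into the running daemons directly
  ceph tell osd.* injectargs '--osd-max-backfills 3'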
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



