Re: Lousy recovery for mclock and reef

Now that you're on wpq, you can try tweaking osd_max_backfills (up)
and osd_recovery_sleep (down).
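
For anyone following along, here is a minimal sketch of what that tweaking might look like from the CLI. The helper name tune_wpq_recovery, the CEPH_CMD override, and the example values are hypothetical and only for illustration; they are not recommendations. It also assumes the options are managed through the central config (`ceph config set`) rather than ceph.conf. The device-specific variants (osd_recovery_sleep_hdd, osd_recovery_sleep_ssd) may be the ones actually in effect on a given OSD, so the sleep option name is left as an optional third argument.

```shell
# Hypothetical helper: raise osd_max_backfills and lower the recovery sleep
# for all OSDs. Values are illustrative; start small and watch client latency.
# Set CEPH_CMD=echo to preview the arguments without touching the cluster.
tune_wpq_recovery() {
  ceph_cmd="${CEPH_CMD:-ceph}"
  backfills="$1"                          # new osd_max_backfills value
  sleep_secs="$2"                         # new recovery sleep, in seconds
  sleep_opt="${3:-osd_recovery_sleep}"    # or osd_recovery_sleep_hdd, etc.

  "$ceph_cmd" config set osd osd_max_backfills "$backfills"
  "$ceph_cmd" config set osd "$sleep_opt" "$sleep_secs"
}
```

Something like `CEPH_CMD=echo tune_wpq_recovery 4 0` would print the arguments that would be handed to ceph, and dropping the CEPH_CMD override would apply them. The effective values can then be checked per OSD with the same asok `config get` shown further down the thread.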

Josh

On Fri, May 24, 2024 at 1:07 PM Mazzystr <mazzystr@xxxxxxxxx> wrote:
>
> I did the obnoxious task of updating ceph.conf and restarting all my osds.
>
> ceph --admin-daemon /var/run/ceph/ceph-osd.*.asok config get osd_op_queue
> {
>     "osd_op_queue": "wpq"
> }
>
> I have some spare memory on my target host/OSD, so I increased the target memory of that OSD to 10 GB and restarted.  No effect observed.  In fact, memory usage on the host is stable, so I don't think the change took effect, even after updating ceph.conf, restarting, and doing a direct asok config set.  The target memory value is confirmed to be set via asok config get.
>
> Nothing has helped.  I still cannot break the 21 MiB/s barrier.
>
> Does anyone have any more ideas?
>
> /C
>
> On Fri, May 24, 2024 at 10:20 AM Joshua Baergen <jbaergen@xxxxxxxxxxxxxxxx> wrote:
>>
>> It requires an OSD restart, unfortunately.
>>
>> Josh
>>
>> On Fri, May 24, 2024 at 11:03 AM Mazzystr <mazzystr@xxxxxxxxx> wrote:
>> >
>> > Is that a setting that can be applied at runtime, or does it require an OSD restart?
>> >
>> > On Fri, May 24, 2024 at 9:59 AM Joshua Baergen <jbaergen@xxxxxxxxxxxxxxxx> wrote:
>> >
>> > > Hey Chris,
>> > >
>> > > A number of users have been reporting issues with recovery on Reef
>> > > with mClock. Most folks have had success reverting to
>> > > osd_op_queue=wpq. AIUI 18.2.3 should have some mClock improvements but
>> > > I haven't looked at the list myself yet.
>> > >
>> > > Josh
>> > >
>> > > On Fri, May 24, 2024 at 10:55 AM Mazzystr <mazzystr@xxxxxxxxx> wrote:
>> > > >
>> > > > Hi all,
>> > > > Goodness, I'd say it's been at least 3 major releases since I had to do a
>> > > > recovery.  I have disks with 60,000-75,000 power_on_hours.  I just updated
>> > > > from Octopus to Reef last month and I'm hit with 3 disk failures and the
>> > > > mclock ugliness.  My recovery is moving at a wondrous 21 MB/sec after some
>> > > > serious hacking.  It started out at 9 MB/sec.
>> > > >
>> > > > My hosts are showing minimal CPU use and normal memory use, with disks
>> > > > 0-6% busy.  Load is minimal, so processes aren't blocked by disk I/O.
>> > > >
>> > > > I tried changing all the sleeps and recovery_max, and setting
>> > > > osd_mclock_profile to high_recovery_ops, with no change in performance.
>> > > >
>> > > > Does anyone have any suggestions to improve performance?
>> > > >
>> > > > Thanks,
>> > > > /Chris C
>> > > > _______________________________________________
>> > > > ceph-users mailing list -- ceph-users@xxxxxxx
>> > > > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>> > >