Thank you, Wesley, for the clear explanation of the difference between the two methods! The tracker issue you mentioned, https://tracker.ceph.com/issues/44400, talks about primary-affinity. Could primary-affinity help remove an OSD with a hardware issue from the cluster gracefully?

Thanks,
Mary

On Fri, Apr 26, 2024 at 8:43 AM Wesley Dillingham <wes@xxxxxxxxxxxxxxxxx> wrote:

> What you want to do is to stop the OSD (and all the copies of data it contains) by stopping the OSD service immediately. The downside of this approach is that it causes the PGs on that OSD to become degraded, but the upside is that the OSD with bad hardware is immediately no longer participating in any client IO (the source of your RGW 503s). In this situation the PGs go into degraded+backfilling.
>
> The alternative method is to keep the failing OSD up and in the cluster but slowly migrate the data off of it. This would be a long, drawn-out period of time in which the failing disk would continue to serve client reads and also facilitate backfill, but you wouldn't take a copy of the data out of the cluster and cause degraded PGs. In this scenario the PGs would be remapped+backfilling.
>
> I tried to find a way to have your cake and eat it too in relation to this "predicament" in this tracker issue: https://tracker.ceph.com/issues/44400, but it was deemed "won't fix".
>
> Respectfully,
>
> *Wes Dillingham*
> LinkedIn <http://www.linkedin.com/in/wesleydillingham>
> wes@xxxxxxxxxxxxxxxxx
>
> On Fri, Apr 26, 2024 at 11:25 AM Mary Zhang <maryzhang0920@xxxxxxxxx> wrote:
>
>> Thank you Eugen for your warm help!
>>
>> I'm trying to understand the difference between the two methods.
>>
>> For method 1, or "ceph orch osd rm osd_id", OSD Service — Ceph Documentation <https://docs.ceph.com/en/latest/cephadm/services/osd/#remove-an-osd> says it involves two steps:
>>
>>    1. evacuating all placement groups (PGs) from the OSD
>>    2. removing the PG-free OSD from the cluster
>>
>> For method 2, or the procedure you recommended, Adding/Removing OSDs — Ceph Documentation <https://docs.ceph.com/en/latest/rados/operations/add-or-rm-osds/#removing-osds-manual> says "After the OSD has been taken out of the cluster, Ceph begins rebalancing the cluster by migrating placement groups out of the OSD that was removed."
>>
>> What's the difference between "evacuating PGs" in method 1 and "migrating PGs" in method 2? I think method 1 must read the OSD to be removed; otherwise, we would not see the slow ops warning. Does method 2 not involve reading this OSD?
>>
>> Thanks,
>> Mary
>>
>> On Fri, Apr 26, 2024 at 5:15 AM Eugen Block <eblock@xxxxxx> wrote:
>>
>> > Hi,
>> >
>> > if you remove the OSD this way, it will be drained, which means that it will try to recover PGs from this OSD, and in case of hardware failure it might lead to slow requests. It might make sense to forcefully remove the OSD without draining:
>> >
>> > - stop the osd daemon
>> > - mark it as out
>> > - osd purge <id|osd.id> [--force] [--yes-i-really-mean-it]
>> >
>> > Regards,
>> > Eugen
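For readers following the procedure above, here is a minimal, hedged sketch of that forceful-removal sequence, assuming a cephadm-managed cluster and using osd.12 purely as a placeholder ID (not taken from this thread); non-cephadm deployments would stop the daemon with systemctl instead:

    # stop the failing OSD daemon so it no longer serves any client IO
    ceph orch daemon stop osd.12     # non-cephadm hosts: systemctl stop ceph-osd@12

    # mark it out so its PGs are recovered from the surviving replicas
    ceph osd out 12

    # remove it from the CRUSH map, auth keys, and OSD map
    ceph osd purge 12 --yes-i-really-mean-it

The trade-off, as described earlier in the thread, is that the PGs that had a copy on this OSD stay degraded until backfill completes.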
>> >
>> > Quoting Mary Zhang <maryzhang0920@xxxxxxxxx>:
>> >
>> > > Hi,
>> > >
>> > > We recently removed an OSD from our Ceph cluster. Its underlying disk has a hardware issue.
>> > >
>> > > We used the command: ceph orch osd rm osd_id --zap
>> > >
>> > > During the process, the Ceph cluster sometimes enters a warning state with slow ops on this OSD. Our RGW also failed to respond to requests and returned 503.
>> > >
>> > > We restarted the RGW daemon to make it work again, but the same failure occurred from time to time. Eventually we noticed that the RGW 503 errors are a result of the OSD slow ops.
>> > >
>> > > Our cluster has 18 hosts and 210 OSDs. We expect that removing an OSD with a hardware issue won't impact cluster performance and RGW availability. Is our expectation reasonable? What's the best way to handle OSDs with hardware failures?
>> > >
>> > > Thank you in advance for any comments or suggestions.
>> > >
>> > > Best Regards,
>> > > Mary Zhang
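For completeness, a minimal, hedged sketch of how a drain-based removal like the one described above can be started and monitored, again assuming a cephadm-managed cluster and using 12 purely as a placeholder OSD ID:

    # queue the OSD for draining, removal, and zapping of the underlying device
    ceph orch osd rm 12 --zap

    # check evacuation progress of the queued removal
    ceph orch osd rm status

    # watch overall cluster health, including any slow ops reported against the OSD
    ceph -s
    ceph health detail

If the drain keeps generating slow ops because of the failing disk, the forceful removal sketched earlier avoids reading from that OSD, at the cost of temporarily degraded PGs.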