Re: Remove an OSD with hardware issue caused rgw 503

If the rest of the cluster is healthy and your resiliency is configured properly, for example to sustain the loss of one or more hosts at a time, you don’t need to worry about a single disk. Just take it out and remove it (forcefully) so it doesn’t have any clients anymore. Ceph will immediately assign different primary OSDs and your clients will be happy again. ;-)
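If you want a quick sanity check before doing that, something along these
lines should tell you whether the cluster can take it (the OSD id is a
placeholder):

   ceph -s                     # confirm the rest of the cluster is otherwise healthy
   ceph osd ok-to-stop <id>    # reports whether stopping this OSD would leave PGs unable to serve IO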

Quoting Mary Zhang <maryzhang0920@xxxxxxxxx>:

Thank you Wesley for the clear explanation of the difference between the 2 methods!
The tracker issue you mentioned, https://tracker.ceph.com/issues/44400, talks
about primary-affinity. Could primary-affinity help remove an OSD with a
hardware issue from the cluster gracefully?

Thanks,
Mary


On Fri, Apr 26, 2024 at 8:43 AM Wesley Dillingham <wes@xxxxxxxxxxxxxxxxx>
wrote:

What you want to do is stop the OSD (and all the copies of data it
contains) by stopping the OSD service immediately. The downside of this
approach is that it causes the PGs on that OSD to become degraded. But the
upside is that the OSD with the bad hardware immediately stops participating
in any client IO (the source of your RGW 503s). In this situation the PGs go
into degraded+backfilling.

The alternative method is to keep the failing OSD up and in the cluster
but slowly migrate the data off of it. This would be a long, drawn-out
period of time in which the failing disk would continue to serve client
reads and also facilitate backfill, but you wouldn't take a copy of the data
out of the cluster and cause degraded PGs. In this scenario the PGs would
be remapped+backfilling.
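
To make the contrast concrete, here is a rough sketch (the OSD id is a
placeholder and the commands assume a cephadm deployment, adjust as needed):

   # option 1: stop it now and accept degraded PGs while they backfill
   ceph orch daemon stop osd.<id>
   ceph osd out <id>

   # option 2: keep it up and drain it slowly; PGs stay remapped, not degraded
   ceph orch osd rm <id>        # drains the OSD, then removes it once PG-free
   ceph orch osd rm status      # watch the drain progress

Either way "ceph pg stat" will show which of the two PG states you end up with.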

I tried to find a way to have your cake and eat it too in relation to this
"predicament" in this tracker issue: https://tracker.ceph.com/issues/44400
but it was deemed "won't fix".
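
For reference, primary affinity is set per OSD like this (the id is a
placeholder):

   ceph osd primary-affinity osd.<id> 0

As far as I know, setting it to 0 stops the OSD from being chosen as primary
for its PGs, so it stops serving normal client reads, but it still holds its
data, still takes replica writes, and still gets read during backfill, so on
its own it doesn't take the failing disk out of the IO path.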

Respectfully,

*Wes Dillingham*
LinkedIn <http://www.linkedin.com/in/wesleydillingham>
wes@xxxxxxxxxxxxxxxxx




On Fri, Apr 26, 2024 at 11:25 AM Mary Zhang <maryzhang0920@xxxxxxxxx>
wrote:

Thank you Eugen for your warm help!

I'm trying to understand the difference between the 2 methods.
For method 1, or "ceph orch osd rm osd_id", OSD Service — Ceph Documentation
<https://docs.ceph.com/en/latest/cephadm/services/osd/#remove-an-osd>
says it involves 2 steps:

   1. evacuating all placement groups (PGs) from the OSD
   2. removing the PG-free OSD from the cluster

For method 2, or the procedure you recommended, Adding/Removing OSDs — Ceph
Documentation
<https://docs.ceph.com/en/latest/rados/operations/add-or-rm-osds/#removing-osds-manual>
says "After the OSD has been taken out of the cluster, Ceph begins rebalancing
the cluster by migrating placement groups out of the OSD that was removed."

What's the difference between "evacuating PGs" in method 1 and "migrating
PGs" in method 2? I think method 1 must read from the OSD being removed.
Otherwise, we would not see the slow ops warning. Does method 2 not involve
reading this OSD?

Thanks,
Mary

On Fri, Apr 26, 2024 at 5:15 AM Eugen Block <eblock@xxxxxx> wrote:

> Hi,
>
> if you remove the OSD this way, it will be drained, which means that
> it will try to recover PGs from this OSD, and in case of a hardware
> failure that might lead to slow requests. It might make sense to
> forcefully remove the OSD without draining:
>
> - stop the osd daemon
> - mark it as out
> - osd purge <id|osd.id> [--force] [--yes-i-really-mean-it]
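>
> With cephadm those steps would look roughly like this (the OSD id, host
> name and device path are placeholders):
>
>    ceph orch daemon stop osd.<id>
>    ceph osd out <id>
>    ceph osd purge <id> --force --yes-i-really-mean-it
>    ceph orch device zap <host> /dev/sdX --force   # only if you also want to wipe the disk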
>
> Regards,
> Eugen
>
> Quoting Mary Zhang <maryzhang0920@xxxxxxxxx>:
>
> > Hi,
> >
> > We recently removed an OSD from our Ceph cluster. Its underlying disk
> > has a hardware issue.
> >
> > We used the command: ceph orch osd rm osd_id --zap
> >
> > During the process, the ceph cluster sometimes enters a warning state
> > with slow ops on this OSD. Our rgw also failed to respond to requests
> > and returned 503.
> >
> > We restarted the rgw daemon to make it work again, but the same failure
> > occurred from time to time. Eventually we noticed that the rgw 503
> > errors are a result of the OSD slow ops.
> >
> > Our cluster has 18 hosts and 210 OSDs. We expect that removing an OSD
> > with a hardware issue won't impact cluster performance & rgw
> > availability. Is our expectation reasonable? What's the best way to
> > handle OSDs with hardware failures?
> >
> > Thank you in advance for any comments or suggestions.
> >
> > Best Regards,
> > Mary Zhang




_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



