Thank you, Wesley, for the clear explanation of the difference between the two methods! The tracker issue you mentioned, https://tracker.ceph.com/issues/44400, talks about primary-affinity. Could primary-affinity help remove an OSD with a hardware issue from the cluster gracefully?

Thanks,
Mary

On Fri, Apr 26, 2024 at 8:43 AM Wesley Dillingham <wes@xxxxxxxxxxxxxxxxx> wrote:

> What you want to do is to stop the OSD (and all the copies of data it contains) by stopping the OSD service immediately. The downside of this approach is that it causes the PGs on that OSD to become degraded, but the upside is that the OSD with bad hardware is immediately no longer participating in any client IO (the source of your RGW 503s). In this situation the PGs go into degraded+backfilling.
>
> The alternative method is to keep the failing OSD up and in the cluster but slowly migrate the data off of it. This would be a long, drawn-out period of time in which the failing disk would continue to serve client reads and also facilitate backfill, but you wouldn't take a copy of the data out of the cluster and cause degraded PGs. In this scenario the PGs would be remapped+backfilling.
>
> I tried to find a way to have your cake and eat it too in relation to this "predicament" in this tracker issue: https://tracker.ceph.com/issues/44400, but it was deemed "won't fix".
>
> Respectfully,
>
> *Wes Dillingham*
> LinkedIn <http://www.linkedin.com/in/wesleydillingham>
> wes@xxxxxxxxxxxxxxxxx
>
> On Fri, Apr 26, 2024 at 11:25 AM Mary Zhang <maryzhang0920@xxxxxxxxx> wrote:
>
>> Thank you Eugen for your warm help!
>>
>> I'm trying to understand the difference between the two methods.
>>
>> For method 1, or "ceph orch osd rm osd_id", OSD Service — Ceph Documentation <https://docs.ceph.com/en/latest/cephadm/services/osd/#remove-an-osd> says it involves two steps:
>>
>>    1. evacuating all placement groups (PGs) from the OSD
>>    2. removing the PG-free OSD from the cluster
>>
>> For method 2, or the procedure you recommended, Adding/Removing OSDs — Ceph Documentation <https://docs.ceph.com/en/latest/rados/operations/add-or-rm-osds/#removing-osds-manual> says "After the OSD has been taken out of the cluster, Ceph begins rebalancing the cluster by migrating placement groups out of the OSD that was removed."
>>
>> What's the difference between "evacuating PGs" in method 1 and "migrating PGs" in method 2? I think method 1 must read the OSD to be removed; otherwise, we would not see the slow ops warning. Does method 2 not involve reading this OSD?
>>
>> Thanks,
>> Mary
>>
>> On Fri, Apr 26, 2024 at 5:15 AM Eugen Block <eblock@xxxxxx> wrote:
>>
>> > Hi,
>> >
>> > if you remove the OSD this way, it will be drained, which means that it will try to recover PGs from this OSD, and in case of hardware failure it might lead to slow requests. It might make sense to forcefully remove the OSD without draining:
>> >
>> > - stop the osd daemon
>> > - mark it as out
>> > - osd purge <id|osd.id> [--force] [--yes-i-really-mean-it]
>> >
>> > Regards,
>> > Eugen
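For readers following the procedure above, here is a minimal, hedged sketch of that forceful-removal sequence, assuming a cephadm-managed cluster and using osd.12 purely as a placeholder ID (not taken from this thread); non-cephadm deployments would stop the daemon with systemctl instead:

    # stop the failing OSD daemon so it no longer serves any client IO
    ceph orch daemon stop osd.12     # non-cephadm hosts: systemctl stop ceph-osd@12

    # mark it out so its PGs are recovered from the surviving replicas
    ceph osd out 12

    # remove it from the CRUSH map, auth keys, and OSD map
    ceph osd purge 12 --yes-i-really-mean-it

The trade-off, as described earlier in the thread, is that the PGs that had a copy on this OSD stay degraded until backfill completes.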
>> >
>> > Quoting Mary Zhang <maryzhang0920@xxxxxxxxx>:
>> >
>> > > Hi,
>> > >
>> > > We recently removed an OSD from our Ceph cluster. Its underlying disk has a hardware issue.
>> > >
>> > > We used the command: ceph orch osd rm osd_id --zap
>> > >
>> > > During the process, the Ceph cluster sometimes enters a warning state with slow ops on this OSD. Our RGW also failed to respond to requests and returned 503.
>> > >
>> > > We restarted the RGW daemon to make it work again, but the same failure occurred from time to time. Eventually we noticed that the RGW 503 errors are a result of the OSD slow ops.
>> > >
>> > > Our cluster has 18 hosts and 210 OSDs. We expect that removing an OSD with a hardware issue won't impact cluster performance and RGW availability. Is our expectation reasonable? What's the best way to handle OSDs with hardware failures?
>> > >
>> > > Thank you in advance for any comments or suggestions.
>> > >
>> > > Best Regards,
>> > > Mary Zhang
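For completeness, a minimal, hedged sketch of how a drain-based removal like the one described above can be started and monitored, again assuming a cephadm-managed cluster and using 12 purely as a placeholder OSD ID:

    # queue the OSD for draining, removal, and zapping of the underlying device
    ceph orch osd rm 12 --zap

    # check evacuation progress of the queued removal
    ceph orch osd rm status

    # watch overall cluster health, including any slow ops reported against the OSD
    ceph -s
    ceph health detail

If the drain keeps generating slow ops because of the failing disk, the forceful removal sketched earlier avoids reading from that OSD, at the cost of temporarily degraded PGs.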