Dave,

Actually, my failure domain is OSD since I so far only have 9 OSD nodes but EC 8+2. However, the drives are still functioning, except that one has failed multiple times in the last few days, requiring a node power-cycle to recover. I will certainly mark that one out immediately.

The other two pending failures are behaving more politely, so I am assuming that the cluster could copy the data elsewhere as part of the rebalance. I think I'm also concerned about the rebalance process moving data to these drives with pending failures. Since I'm EC 8+2, perhaps it is safe to mark two out simultaneously?

Thanks.

-Dave

--
Dave Hall
Binghamton University
kdhall@xxxxxxxxxxxxxx

On Fri, Aug 4, 2023 at 10:16 AM Dave Holland <dh3@xxxxxxxxxxxx> wrote:

> On Fri, Aug 04, 2023 at 09:44:57AM -0400, Dave Hall wrote:
> > My inclination is to mark these 3 OSDs 'OUT' before they crash completely,
> > but I want to confirm my understanding of Ceph's response to this. Mainly,
> > given my EC pools (or replicated pools for that matter), if I mark all 3
> > OSD out all at once will I risk data loss?
>
> It depends on your crush map and failure domain layout. In the
> unlikeliest and unluckiest case, all those 3 OSDs are in different
> failure domains, and some data has 1 replica on each of those OSDs. In
> that situation, if you take them out simultaneously, you would lose
> data. If you're unsure, then do them one at a time and wait for the
> rebalance/backfill to complete before doing the next.
>
> We arrange our OSDs so that the failure domain is the rack; losing an
> entire rack is safe (and we've had that happen) so we know it's safe
> to pull any number of OSDs in the same rack and we won't lose data.
>
> Dave
> --
> ** Dave Holland ** Systems Support -- Informatics Systems Group **
> ** dh3@xxxxxxxxxxxx ** Wellcome Sanger Institute, Hinxton, UK **
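
To make the EC 8+2 question concrete, here is a minimal sketch of the counting argument, not a Ceph tool. It assumes that a PG in a k+m pool stays recoverable as long as no more than m of its shards are unavailable at once, and that the PG-to-OSD placement is known. The PG IDs, OSD numbers, and the function name `pgs_at_risk` are hypothetical, made up for illustration. It also models only the worst case, where every candidate OSD stops responding before backfill finishes; an OSD that is marked out but stays up should, as far as I understand it, still be usable as a source for its shards.

```python
from typing import Dict, List, Set


def pgs_at_risk(pg_to_osds: Dict[str, List[int]],
                candidates: Set[int],
                m: int) -> Dict[str, int]:
    """Return the PGs that keep shards on more than m of the candidate OSDs.

    m is the number of coding shards (2 for EC 8+2). A PG listed here would
    lose more shards than the code can rebuild if every candidate OSD went
    down at the same moment.
    """
    at_risk = {}
    for pg, osds in pg_to_osds.items():
        overlap = len(candidates.intersection(osds))
        if overlap > m:
            at_risk[pg] = overlap
    return at_risk


# Hypothetical placement: each PG maps to k + m = 10 OSDs.
pg_to_osds = {
    "7.1a": [3, 11, 17, 24, 30, 42, 51, 60, 66, 73],
    "7.1b": [3, 17, 42, 5, 8, 12, 19, 27, 33, 48],
    "7.1c": [1, 2, 4, 6, 9, 13, 17, 21, 25, 29],
}

# Three failing OSDs: two PGs hold shards on all three of them.
three_out = pgs_at_risk(pg_to_osds, {3, 17, 42}, m=2)
print(three_out)

# Two failing OSDs: no PG can lose more than m = 2 shards.
two_out = pgs_at_risk(pg_to_osds, {3, 17}, m=2)
print(two_out)
```

With a failure domain of OSD, nothing stops a single PG from having shards on all three suspect drives, which is why three at once is the risky case in this sketch and two at once is not. Availability during the move (the pool's min_size) is a separate question that this sketch does not model.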