On Fri, Aug 04, 2023 at 09:44:57AM -0400, Dave Hall wrote:
> My inclination is to mark these 3 OSDs 'OUT' before they crash completely,
> but I want to confirm my understanding of Ceph's response to this. Mainly,
> given my EC pools (or replicated pools for that matter), if I mark all 3
> OSD out all at once will I risk data loss?

It depends on your crush map and failure domain layout. In the unlikeliest
and unluckiest case, all those 3 OSDs are in different failure domains, and
some data has 1 replica on each of those OSDs. In that situation, if you
take them out simultaneously, you would lose data. If you're unsure, then do
them one at a time and wait for the rebalance/backfill to complete before
doing the next.

We arrange our OSDs so that the failure domain is the rack; losing an entire
rack is safe (and we've had that happen) so we know it's safe to pull any
number of OSDs in the same rack and we won't lose data.

Dave
-- 
** Dave Holland ** Systems Support -- Informatics Systems Group **
** dh3@xxxxxxxxxxxx ** Wellcome Sanger Institute, Hinxton, UK **
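
To make the failure-domain reasoning in the reply above concrete, here is a rough sketch in Python. It is not a Ceph tool and does not talk to a cluster: the function names, the OSD-to-rack map, the PG-to-OSD map and the `tolerated` value are all hypothetical and hand-written for illustration. On a real cluster the equivalent information would have to come from something like `ceph osd tree` and `ceph pg dump`. The sketch only considers whether data survives, not whether PGs stay active.

```python
from typing import Dict, Iterable, List


def share_failure_domain(osd_to_domain: Dict[int, str], osds: Iterable[int]) -> bool:
    """Return True if every OSD in `osds` sits in the same failure domain."""
    return len({osd_to_domain[osd] for osd in osds}) <= 1


def pgs_at_risk(pg_to_osds: Dict[str, List[int]], osds: Iterable[int],
                tolerated: int) -> List[str]:
    """Return the PGs that keep more than `tolerated` shards on `osds`.

    `tolerated` is an assumption supplied by the caller: for a replicated
    pool of size 3 it would be 2 (one copy must survive); for an EC pool
    with k data and m coding chunks it would be m.
    """
    removed = set(osds)
    return sorted(
        pg for pg, acting in pg_to_osds.items()
        if len(removed.intersection(acting)) > tolerated
    )


# Hypothetical layout: 7 OSDs across 3 racks, failure domain = rack.
osd_to_rack = {0: "rack1", 1: "rack1", 2: "rack1",
               3: "rack2", 4: "rack2",
               5: "rack3", 6: "rack3"}

# Hypothetical replicated pool (size 3), one copy per rack for each PG.
pg_to_osds = {"1.0": [0, 3, 5], "1.1": [1, 4, 6], "1.2": [2, 3, 6]}

same_rack = [0, 1, 2]   # all three OSDs in rack1
spread = [0, 3, 5]      # one OSD in each rack: the unlucky case

print(share_failure_domain(osd_to_rack, same_rack), pgs_at_risk(pg_to_osds, same_rack, 2))
print(share_failure_domain(osd_to_rack, spread), pgs_at_risk(pg_to_osds, spread, 2))
```

With this made-up layout, the three OSDs in one rack never hold more than one copy of any PG, so no PG is flagged. The three OSDs spread across racks hold every copy of PG 1.0, which is the situation where doing them one at a time and waiting for backfill is the safer route.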