On Tue, Mar 18, 2025, 5:02 PM Kai Stian Olstad <ceph+list@xxxxxxxxxx> wrote:

> On Mon, Mar 17, 2025 at 03:08:54PM +0000, Eugen Block wrote:
> >Before I replied, I wanted to renew my confidence and do a small test
> >in a lab environment. I also created a k4m2 pool with host as
> >failure-domain, started to write data chunks into it in a while loop
> >and then marked three of the OSDs "out" simultaneously. After a few
> >seconds of repeering, backfill kicks in and I/O to the pool continues
> >without interruption. So yeah, I also think it's safe to mark them out
> >at the same time.
>
> That is just awesome that you tested it Eugen, thank you very much.
>
> --
> Kai Stian Olstad
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

You can do it as others have said. The risks are:

- If OSDs fail after being marked out but before they have fully
  offloaded their PGs, the effect is the same as stopping OSDs in
  different CRUSH failure domains. IOW, you are tempting fate a bit if
  you wait for a bunch of drives to pick up soft failures and then
  weight them out simultaneously. You are only really "safe" again, and
  can stop OSDs or let them fail, once all PGs are active+clean.

- You must have enough free space and enough OSDs to inherit all the
  additional PGs. If you get stuck in a backfill_toofull situation, or
  out so many OSDs that the remaining ones exceed their per-OSD PG
  limits, etc., you can quickly find yourself in a pickle.

Tyler
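
To put rough numbers on the free-space and PG-count risk before marking several OSDs out at once, a back-of-the-envelope check along these lines may help. It is only a sketch: the `Osd` record and the `check_mark_out` function are hypothetical names made up for illustration, not part of any Ceph tooling. It assumes the displaced data and PGs spread over the remaining OSDs in proportion to their size, which CRUSH only approximates, and it ignores failure-domain constraints entirely. The 0.90 and 250 thresholds are assumed values, meant to mirror the usual backfillfull ratio and per-OSD PG limit; check what your cluster actually uses.

```python
from dataclasses import dataclass


@dataclass
class Osd:
    """Hypothetical per-OSD record; fill it from your own cluster data."""
    osd_id: int
    size_gib: float
    used_gib: float
    pgs: int


def check_mark_out(osds, out_ids, backfillfull_ratio=0.90, max_pgs_per_osd=250):
    """Return a list of warnings for a planned simultaneous mark-out.

    Assumption: data and PGs from the outed OSDs are redistributed to the
    remaining OSDs in proportion to their size. Real CRUSH placement is
    less even, so treat the result as optimistic.
    """
    out_ids = set(out_ids)
    leaving = [o for o in osds if o.osd_id in out_ids]
    staying = [o for o in osds if o.osd_id not in out_ids]
    if not staying:
        raise ValueError("no OSDs would remain to take the data")

    moved_gib = sum(o.used_gib for o in leaving)
    moved_pgs = sum(o.pgs for o in leaving)
    total_size = sum(o.size_gib for o in staying)

    problems = []
    for o in staying:
        share = o.size_gib / total_size
        projected_fill = (o.used_gib + moved_gib * share) / o.size_gib
        projected_pgs = o.pgs + moved_pgs * share
        if projected_fill >= backfillfull_ratio:
            problems.append(
                f"osd.{o.osd_id}: projected {projected_fill:.0%} full "
                f"(risk of backfill_toofull)"
            )
        if projected_pgs > max_pgs_per_osd:
            problems.append(
                f"osd.{o.osd_id}: projected {projected_pgs:.0f} PGs "
                f"(limit {max_pgs_per_osd})"
            )
    return problems


if __name__ == "__main__":
    # Made-up example cluster: six equal OSDs, each half full with 100 PGs.
    cluster = [Osd(i, size_gib=1000, used_gib=500, pgs=100) for i in range(6)]
    for warning in check_mark_out(cluster, out_ids=[0, 1, 2]):
        print(warning)
```

The per-OSD size, usage and PG counts could be read off `ceph osd df`. An empty result does not make the operation safe; it only means this crude model found no obvious capacity problem.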