Ceph always aims to provide high availability: unless you set cluster flags that tell it otherwise, it will try to self-heal. Based on your description, it sounds like you want the 'noout' flag. By default, once an OSD has been down for 10 minutes (the mon_osd_down_out_interval option, 600 seconds), the monitors mark it "out" and Ceph begins recovering its data onto other OSDs; noout suppresses that.

Two caveats, though. For latency's sake, you likely still want to pre-emptively mark the affected OSDs down just ahead of the planned maintenance, so clients fail over immediately instead of waiting out heartbeat timeouts. And be cognisant of whether your replication policy puts you in a position where an unrelated failure during the maintenance can result in inactive PGs. A rough sketch of the command sequence is at the bottom of this message.

Cheers,
Tyler

On Sun, May 29, 2022, 5:30 PM Jeremy Hansen <jeremy@xxxxxxxxxx> wrote:

> Is there a maintenance mode for Ceph that would allow me to do work on
> underlying network equipment without causing Ceph to panic? In our test
> lab, we don't have redundant networking, and when doing switch upgrades
> and such, Ceph has a panic attack and we end up having to reboot the
> Ceph nodes anyway. Like an HDFS-style read-only mode or something?
>
> Thanks!
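A minimal sketch of that sequence, assuming systemd-managed OSDs; the OSD
IDs 3 and 4 are placeholders for whichever OSDs the network work affects:

    # See how long the mons wait before marking a down OSD out
    # (defaults to 600 seconds, i.e. 10 minutes):
    ceph config get mon mon_osd_down_out_interval

    # Before the window: keep down OSDs from being marked out
    ceph osd set noout

    # Optionally mark the affected OSDs down right before the cut so
    # clients fail over immediately. Note a running OSD that can still
    # reach the mons will mark itself back up, so time this for just
    # before connectivity drops, or stop the daemons instead
    # (e.g. systemctl stop ceph-osd@3):
    ceph osd down 3 4

    # ... do the switch work ...

    # Afterwards: confirm the OSDs rejoined, then drop the flag
    ceph osd tree
    ceph osd unset noout
    ceph status

Keep in mind noout does not pause client I/O; with size=3/min_size=2 pools
you can usually keep serving through the window, but a second, unrelated
failure can drop PGs below min_size and leave them inactive, which is the
risk mentioned above.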