Ceph always aims to provide high availability: unless you set cluster flags that tell it otherwise, it will try to self-heal. Based on your description, it sounds like you want the 'noout' flag. By default, once an OSD has been down for 10 minutes (the mon_osd_down_out_interval option, 600 seconds), the monitors mark it "out" and Ceph begins recovering its data onto other OSDs; noout suppresses that.

Two caveats, though. For latency's sake, you likely still want to pre-emptively mark the affected OSDs down just ahead of the planned maintenance, so clients fail over immediately instead of waiting out heartbeat timeouts. And be cognisant of whether your replication policy puts you in a position where an unrelated failure during the maintenance can result in inactive PGs. A rough sketch of the command sequence is at the bottom of this message.

Cheers,
Tyler

On Sun, May 29, 2022, 5:30 PM Jeremy Hansen <jeremy@xxxxxxxxxx> wrote:

> Is there a maintenance mode for Ceph that would allow me to do work on
> underlying network equipment without causing Ceph to panic? In our test
> lab, we don't have redundant networking, and when doing switch upgrades
> and such, Ceph has a panic attack and we end up having to reboot the
> Ceph nodes anyway. Like an HDFS-style read-only mode or something?
>
> Thanks!
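A minimal sketch of that sequence, assuming systemd-managed OSDs; the OSD
IDs 3 and 4 are placeholders for whichever OSDs the network work affects:

    # See how long the mons wait before marking a down OSD out
    # (defaults to 600 seconds, i.e. 10 minutes):
    ceph config get mon mon_osd_down_out_interval

    # Before the window: keep down OSDs from being marked out
    ceph osd set noout

    # Optionally mark the affected OSDs down right before the cut so
    # clients fail over immediately. Note a running OSD that can still
    # reach the mons will mark itself back up, so time this for just
    # before connectivity drops, or stop the daemons instead
    # (e.g. systemctl stop ceph-osd@3):
    ceph osd down 3 4

    # ... do the switch work ...

    # Afterwards: confirm the OSDs rejoined, then drop the flag
    ceph osd tree
    ceph osd unset noout
    ceph status

Keep in mind noout does not pause client I/O; with size=3/min_size=2 pools
you can usually keep serving through the window, but a second, unrelated
failure can drop PGs below min_size and leave them inactive, which is the
risk mentioned above.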