Hi Dan,

thanks for your reply. There is still a problem.

Firstly, I did indeed forget to restart the mon even though I looked at the help for mon_osd_down_out_subtree_limit and it says it requires a restart. Stupid me. Well, now I did a restart and it still doesn't work. Here is the situation:

# ceph config dump | grep subtree
  mon  advanced  mon_osd_down_out_subtree_limit  host        *
  mon  advanced  mon_osd_reporter_subtree_level  datacenter

# ceph config get mon.ceph-01 mon_osd_down_out_subtree_limit
host

# ceph daemon mon.ceph-01 config get mon_osd_down_out_subtree_limit
{
    "mon_osd_down_out_subtree_limit": "rack"
}

# ceph config show mon.ceph-01 | grep subtree
mon_osd_down_out_subtree_limit  rack        default  mon
mon_osd_reporter_subtree_level  datacenter  mon

The default overrides the mon config database setting. What is going on here? I restarted all 3 monitors.

Best regards and thanks for your help,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Dan van der Ster <dan@xxxxxxxxxxxxxx>
Sent: 14 July 2020 10:53:13
To: Frank Schilder
Cc: Anthony D'Atri; ceph-users
Subject: Re: Re: mon_osd_down_out_subtree_limit not working?

mon_osd_down_out_subtree_limit has been working well here. Did you restart the mon's after making that config change?
Can you do this just to make sure it took effect?

  ceph daemon mon.`hostname -s` config get mon_osd_down_out_subtree_limit

-- dan

On Tue, Jul 14, 2020 at 8:57 AM Frank Schilder <frans@xxxxxx> wrote:
>
> Yes. After the time-out of 600 secs the OSDs got marked down, all PGs got remapped and recovery/rebalancing started as usual. In the past, I did service on servers with the flag noout set and would expect that mon_osd_down_out_subtree_limit=host has the same effect when shutting down an entire host. Unfortunately, in my case these two settings behave differently.
>
> If I understand the documentation correctly, the OSDs should not get marked out automatically.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Anthony D'Atri <anthony.datri@xxxxxxxxx>
> Sent: 14 July 2020 04:32:05
> To: Frank Schilder
> Subject: Re: mon_osd_down_out_subtree_limit not working?
>
> Did it start rebalancing?
>
> > On Jul 13, 2020, at 4:29 AM, Frank Schilder <frans@xxxxxx> wrote:
> >
> > if I shut down all OSDs on this host, these OSDs should not be marked out automatically after mon_osd_down_out_interval(=600) seconds. I did a test today and, unfortunately, the OSDs do get marked as out. Ceph status was showing 1 host down as expected.
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
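
The check discussed in this thread (value in the mon config database versus value in the running daemon) could be scripted roughly as follows. This is only a sketch: the function name check_mon_option is hypothetical and not part of Ceph, it uses just the two commands quoted in the thread, the sed extraction assumes the single-key JSON output shown there, and `ceph daemon` has to be run on the host where that monitor's admin socket lives.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph): compare the config-database value
# of an option with the value the running monitor reports.
# Usage: check_mon_option OPTION [MON_NAME]
check_mon_option() {
    opt="$1"
    # Default to the local monitor, as in "mon.`hostname -s`".
    mon="${2:-mon.$(hostname -s)}"

    # Value stored in the mon config database.
    stored="$(ceph config get "$mon" "$opt")"

    # Value in the running daemon, read through the admin socket.
    # Assumed output shape: { "<option>": "<value>" }
    running="$(ceph daemon "$mon" config get "$opt" \
        | sed -n 's/.*"'"$opt"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')"

    if [ "$stored" = "$running" ]; then
        echo "$mon $opt: OK ($running)"
    else
        echo "$mon $opt: MISMATCH (database=$stored, running=$running)"
        return 1
    fi
}
```

With the values reported in this thread, `check_mon_option mon_osd_down_out_subtree_limit mon.ceph-01` would print a MISMATCH line (database=host, running=rack) and return a non-zero status.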