Re: mon_osd_down_out_subtree_limit not working?

Hrmm that is strange.

We set it via /etc/ceph/ceph.conf, not the config framework. Maybe try that?
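
For reference, a minimal sketch of the ceph.conf approach, assuming the option is placed in the [mon] section (putting it under [global] should work as well). The file would need to be updated on every monitor host:

```ini
# /etc/ceph/ceph.conf (fragment; merge into the existing file, do not replace it)
[mon]
# Do not automatically mark OSDs out when a whole subtree of this
# CRUSH type (here: a host) goes down.
mon_osd_down_out_subtree_limit = host
```

After restarting each mon, `ceph daemon mon.ceph-01 config get mon_osd_down_out_subtree_limit` (with your own mon name) should then report host instead of rack.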

-- dan

On Wed, Jul 15, 2020 at 9:59 AM Frank Schilder <frans@xxxxxx> wrote:
>
> Hi Dan,
>
> it still does not work. When I execute
>
> # ceph config set global mon_osd_down_out_subtree_limit host
> 2020-07-15 09:17:11.890 7f36cf7fe700 -1 set_mon_vals failed to set mon_osd_down_out_subtree_limit = host: Configuration option 'mon_osd_down_out_subtree_limit' may not be modified at runtime
>
> I now get a warning that the value cannot be changed at runtime. However, restarting all monitors still does not apply the value:
>
> # ceph config show mon.ceph-01 | grep -e NAME -e mon_osd_down_out_subtree_limit | sed -e "s/  */\t/g"
> NAME    VALUE   SOURCE  OVERRIDES       IGNORES
> mon_osd_down_out_subtree_limit  rack    default mon
>
> so the setting in the config database is still ignored. Any ideas? I cannot shut down the entire cluster for something that simple.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Dan van der Ster <dan@xxxxxxxxxxxxxx>
> Sent: 14 July 2020 17:38:27
> To: Frank Schilder
> Cc: Anthony D'Atri; ceph-users
> Subject: Re:  Re: mon_osd_down_out_subtree_limit not working?
>
> Seems that
>
>     ceph config set mon mon_osd_down_out_subtree_limit host
>
> isn't working. (I've seen this sort of config namespace issue in the past).
>
> I'd try `ceph config set global mon_osd_down_out_subtree_limit host`
> then restart the mon and check `ceph daemon mon.ceph-01 config get
> mon_osd_down_out_subtree_limit` again.
>
> -- dan
>
>
> On Tue, Jul 14, 2020 at 1:35 PM Frank Schilder <frans@xxxxxx> wrote:
> >
> > Hi Dan,
> >
> > thanks for your reply. There is still a problem.
> >
> > Firstly, I did indeed forget to restart the mon even though I looked at the help for mon_osd_down_out_subtree_limit and it says it requires a restart. Stupid me. Well, now I did a restart and it still doesn't work. Here is the situation:
> >
> > # ceph config dump | grep subtree
> >   mon                      advanced mon_osd_down_out_subtree_limit    host                                                         *
> >   mon                      advanced mon_osd_reporter_subtree_level    datacenter
> >
> > # ceph config get mon.ceph-01 mon_osd_down_out_subtree_limit
> > host
> >
> > # ceph daemon mon.ceph-01 config get mon_osd_down_out_subtree_limit
> > {
> >     "mon_osd_down_out_subtree_limit": "rack"
> > }
> >
> > # ceph config show mon.ceph-01 | grep subtree
> > mon_osd_down_out_subtree_limit rack                           default               mon
> > mon_osd_reporter_subtree_level datacenter                     mon
> >
> > The default overrides the mon config database setting. What is going on here? I restarted all 3 monitors.
> >
> > Best regards and thanks for your help,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: Dan van der Ster <dan@xxxxxxxxxxxxxx>
> > Sent: 14 July 2020 10:53:13
> > To: Frank Schilder
> > Cc: Anthony D'Atri; ceph-users
> > Subject: Re:  Re: mon_osd_down_out_subtree_limit not working?
> >
> > mon_osd_down_out_subtree_limit has been working well here. Did you
> > restart the mons after making that config change?
> > Can you do this just to make sure it took effect?
> >
> >    ceph daemon mon.`hostname -s` config get mon_osd_down_out_subtree_limit
> >
> > -- dan
> >
> > On Tue, Jul 14, 2020 at 8:57 AM Frank Schilder <frans@xxxxxx> wrote:
> > >
> > > Yes. After the time-out of 600 secs the OSDs got marked out, all PGs got remapped and recovery/rebalancing started as usual. In the past, I have serviced servers with the noout flag set, and I would expect mon_osd_down_out_subtree_limit=host to have the same effect when shutting down an entire host. Unfortunately, in my case these two settings behave differently.
> > >
> > > If I understand the documentation correctly, the OSDs should not get marked out automatically.
> > >
> > > Best regards,
> > > =================
> > > Frank Schilder
> > > AIT Risø Campus
> > > Bygning 109, rum S14
> > >
> > > ________________________________________
> > > From: Anthony D'Atri <anthony.datri@xxxxxxxxx>
> > > Sent: 14 July 2020 04:32:05
> > > To: Frank Schilder
> > > Subject: Re:  mon_osd_down_out_subtree_limit not working?
> > >
> > > Did it start rebalancing?
> > >
> > > > On Jul 13, 2020, at 4:29 AM, Frank Schilder <frans@xxxxxx> wrote:
> > > >
> > > > if I shut down all OSDs on this host, these OSDs should not be marked out automatically after mon_osd_down_out_interval(=600) seconds. I did a test today and, unfortunately, the OSDs do get marked as out. Ceph status was showing 1 host down as expected.
> > > _______________________________________________
> > > ceph-users mailing list -- ceph-users@xxxxxxx
> > > To unsubscribe send an email to ceph-users-leave@xxxxxxx



