On Sat, Feb 11, 2017 at 1:08 PM, Michael Andersen <michael@xxxxxxxxxxxxx> wrote:
> I believe I did shut down the mon process. Is that not done by the
>
> sudo systemctl stop ceph\*.service ceph\*.target

Oh, that's what I missed.

> command? Also, as I noted, the mon process does not show up in ps after I do
> that, but I still get the shutdown halting.
>
> The libceph kernel module may be installed. I did not do so deliberately, but
> I used ceph-deploy, so if it installs that, then that is why it's there. I
> also run some kubernetes pods with rbd persistent volumes on these machines,
> although no rbd volumes are in use or mounted when I try to shut down. In fact
> I unmapped all rbd volumes across the whole cluster to make sure. Is libceph
> required for rbd?
>
> But even so, is it normal for the libceph kernel module to prevent shutdown?
> Is there another stage in the shutdown procedure that I am missing?
>
>
> On Feb 10, 2017 7:49 PM, "Brad Hubbard" <bhubbard@xxxxxxxxxx> wrote:
>
> That looks like dmesg output from the libceph kernel module. Do you
> have the libceph kernel module loaded?
>
> If the answer to that question is "yes", the follow-up question is
> "Why?", as it is not required for a MON or OSD host.
>
> On Sat, Feb 11, 2017 at 1:18 PM, Michael Andersen <michael@xxxxxxxxxxxxx> wrote:
>> Yeah, all three mons have OSDs on the same machines.
>>
>> On Feb 10, 2017 7:13 PM, "Shinobu Kinjo" <skinjo@xxxxxxxxxx> wrote:
>>>
>>> Is your primary MON running on a host which some OSDs are also running on?
>>>
>>> On Sat, Feb 11, 2017 at 11:53 AM, Michael Andersen
>>> <michael@xxxxxxxxxxxxx> wrote:
>>> > Hi
>>> >
>>> > I am running a small cluster of 8 machines (80 osds), with three
>>> > monitors, on Ubuntu 16.04. Ceph version 10.2.5.
>>> >
>>> > I cannot reboot the monitors without physically going into the
>>> > datacenter and power cycling them.
>>> > What happens is that while shutting down, ceph gets
>>> > stuck trying to contact the other monitors, but networking has already
>>> > shut down, or something like that. I get an endless stream of:
>>> >
>>> > libceph: connect 10.20.0.10:6789 error -101
>>> > libceph: connect 10.20.0.13:6789 error -101
>>> > libceph: connect 10.20.0.17:6789 error -101
>>> >
>>> > where in this case 10.20.0.10 is the machine I am trying to shut down,
>>> > and all three IPs are the MONs.
>>> >
>>> > At this stage of the shutdown, the machine doesn't respond to pings, and
>>> > I cannot even log in on any of the virtual terminals. Nothing to do but
>>> > power it off at the server.
>>> >
>>> > The other non-mon servers shut down just fine, and the cluster was
>>> > healthy at the time I was rebooting the mon (I only reboot one machine
>>> > at a time, waiting for it to come up before I do the next one).
>>> >
>>> > Also worth mentioning that if I execute
>>> >
>>> > sudo systemctl stop ceph\*.service ceph\*.target
>>> >
>>> > on the server, the only things I see left are:
>>> >
>>> > root 11143 2 0 18:40 ? 00:00:00 [ceph-msgr]
>>> > root 11162 2 0 18:40 ? 00:00:00 [ceph-watch-noti]
>>> >
>>> > and even then, when no ceph daemons are left running, doing a reboot
>>> > goes into the same loop.
>>> >
>>> > I can't really find any mention of this online, but I feel someone must
>>> > have hit this. Any idea how to fix it? It's really annoying because it's
>>> > hard for me to get access to the datacenter.
>>> >
>>> > Thanks
>>> > Michael
>>> >
>>> > _______________________________________________
>>> > ceph-users mailing list
>>> > ceph-users@xxxxxxxxxxxxxx
>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> >
>>
>>
>
> --
> Cheers,
> Brad
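A note for anyone who finds this thread later: one way to catch the situation Brad describes before issuing a reboot is to check whether any ceph-related kernel modules are still loaded. A minimal sketch (the `ceph_modules` helper name is mine, not from this thread); it filters lsmod-style output, read from stdin here so the sketch can be exercised on a machine without ceph installed:

```shell
#!/bin/sh
# Print any ceph-related kernel modules found in lsmod-style input.
# Reading from stdin (rather than running lsmod directly) keeps this
# testable on a host that has no ceph modules loaded.
ceph_modules() {
  awk 'NR > 1 && $1 ~ /^(libceph|rbd|ceph)$/ { print $1 }'
}

# Demo with a captured listing: note libceph is pinned by rbd
# ("Used by" column), so rbd must be removed before libceph can be.
printf 'Module                  Size  Used by
rbd                   102400  0
libceph               315392  1 rbd
ext4                  614400  2
' | ceph_modules
```

On a live host the equivalent check is `lsmod | ceph_modules`. If anything prints, the usual cleanup is to unmap any remaining rbd devices (`rbd unmap <device>`) and then remove the modules with `modprobe -r rbd libceph` (assuming nothing still uses them) before rebooting, so the kernel client has no MON connections left to retry during shutdown.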