ceph-mon terminated with status 28

"Deneau, Tom" <tom.deneau@xxxxxxx> · Sun, 13 Dec 2015 13:49:16 +0000

I am trying to understand the following failure:

A small cluster was running fine and was then left unused for a while.
When I tried to use it again, the mon socket wasn't there and I could see that
ceph-mon was not running.  I saw the lines below at the end of the dmesg output.
When I tried to restart ceph-mon using "sudo start ceph-mon id=monhost",
the same set of errors was appended to the dmesg output.

I don't see anything more descriptive in /var/log/ceph/ceph-mon.log, just
the recording of new mon processes starting.

In this particular small cluster, the mon process was running on the same
node as 7 osd processes.  "sudo initctl list" shows that the osd procs are still
up, although they are logging that they can't communicate with the mon socket.

Is there someplace else I should look for more details as to why mon is down
and can't be restarted?

-- Tom Deneau

dmesg output:
--------------
 init: ceph-mon (ceph/monhost) main process (16538) terminated with status 28
 init: ceph-mon (ceph/monhost) main process ended, respawning
 init: ceph-create-keys main process (16227) killed by TERM signal
 init: ceph-mon (ceph/monhost) main process (16546) terminated with status 28
 init: ceph-mon (ceph/monhost) main process ended, respawning
 init: ceph-create-keys main process (16548) killed by TERM signal
 init: ceph-mon (ceph/monhost) main process (16556) terminated with status 28
 init: ceph-mon (ceph/monhost) main process ended, respawning
 init: ceph-create-keys main process (16558) killed by TERM signal
 init: ceph-mon (ceph/monhost) main process (16566) terminated with status 28
 init: ceph-mon (ceph/monhost) respawning too fast, stopped
 init: ceph-create-keys main process (16568) killed by TERM signal
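
A possible lead, offered as a guess rather than a diagnosis: on Linux, 28 is
the numeric value of errno ENOSPC ("No space left on device"). If ceph-mon is
exiting with an errno value as its status, a full or nearly full filesystem
under the mon data directory would be one thing to rule out. The Python sketch
below decodes the status and reports free space. The path /var/lib/ceph/mon is
assumed to be the default mon data location and may differ on this cluster,
and the helper names are made up for illustration.

```python
import errno
import os
import shutil


def describe_exit_status(status):
    """Return (errno name, message) for a status, if it matches a known errno."""
    name = errno.errorcode.get(status, "UNKNOWN")
    return name, os.strerror(status)


def free_fraction(path):
    """Fraction of the filesystem holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.free / usage.total


if __name__ == "__main__":
    # Status taken from the dmesg lines above.
    print(describe_exit_status(28))

    # Assumption: default mon data location; fall back to "/" if it is absent.
    mon_data = "/var/lib/ceph/mon"
    path = mon_data if os.path.exists(mon_data) else "/"
    print(path, f"{free_fraction(path):.1%} free")
```

If the reported free space is very low, that would at least be consistent with
the status code, though it does not prove that this is why the mon exits.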
