I am trying to understand the following failure:

A small cluster was running fine, and then was left unused for a while.
When I went to try to use it again, the mon socket wasn't there and I
could see that ceph-mon was not running. I saw the lines below at the
end of dmesg output.

When I tried to restart ceph-mon using

    sudo start ceph-mon id=monhost

I got the same set of errors newly appended to dmesg output.

I don't see anything more descriptive in /var/log/ceph/ceph-mon.log,
just the recording of new mon processes starting.

In this particular small cluster, the mon process was running on the
same node with 7 osd processes. sudo initctl list shows that the osd
procs are still up, although logging the fact that they can't
communicate with the mon socket.

Is there someplace else I should look for more details as to why mon is
down and can't be restarted?

-- Tom Deneau

dmesg output:
--------------
init: ceph-mon (ceph/monhost) main process (16538) terminated with status 28
init: ceph-mon (ceph/monhost) main process ended, respawning
init: ceph-create-keys main process (16227) killed by TERM signal
init: ceph-mon (ceph/monhost) main process (16546) terminated with status 28
init: ceph-mon (ceph/monhost) main process ended, respawning
init: ceph-create-keys main process (16548) killed by TERM signal
init: ceph-mon (ceph/monhost) main process (16556) terminated with status 28
init: ceph-mon (ceph/monhost) main process ended, respawning
init: ceph-create-keys main process (16558) killed by TERM signal
init: ceph-mon (ceph/monhost) main process (16566) terminated with status 28
init: ceph-mon (ceph/monhost) respawning too fast, stopped
init: ceph-create-keys main process (16568) killed by TERM signal
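One thing that might be worth checking is whether the "status 28" in the dmesg lines corresponds to a standard errno value. The sketch below is only a guess at a diagnostic: it assumes (without confirmation from the logs) that ceph-mon exits with an errno number as its status, and that the mon data lives under /var/lib/ceph/mon, which is an assumed default path rather than something taken from this cluster. The helper names are made up for illustration.

```python
import errno
import os
import shutil


def describe_exit_status(status: int) -> str:
    """Interpret an exit status as an errno value (an assumption, not a given)."""
    name = errno.errorcode.get(status, "unknown")
    return f"{status} ({name}): {os.strerror(status)}"


def free_fraction(path: str) -> float:
    """Return the fraction of the filesystem holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


if __name__ == "__main__":
    # 28 is the status reported by init for ceph-mon in the dmesg output.
    print(describe_exit_status(28))

    # /var/lib/ceph/mon is an assumed default location for mon data;
    # /var/log/ceph is the log directory mentioned in the post.
    for path in ("/var/lib/ceph/mon", "/var/log/ceph"):
        if os.path.exists(path):
            print(f"{path}: {free_fraction(path):.1%} free")
```

If the status does map to an errno, the free-space figures for those directories would be the next thing to look at; if it does not, this check simply rules one possibility out.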