> On two separate occasions I have lost power to my Ceph cluster. Both times,
> I had trouble bringing the cluster back to good health. I am wondering if I
> need to config something that would solve this problem?

No special configuration should be necessary. I've had the unfortunate luck of witnessing several power loss events with large Ceph clusters, and in both cases something other than Ceph was the source of frustration once power was returned.

That said, the monitor daemons should be started first and must form a quorum before the cluster will be usable. It sounds like you have made it that far if you're getting output from "ceph health" commands.

The next step is to get your Ceph OSD daemons running, which requires the data partitions to be mounted and the journal device present. On Ubuntu installations this is handled by udev scripts installed by the Ceph packages (I think this may also be true for RHEL/CentOS, but I have not verified it). Short of the udev method, you can mount the data partition manually.

Once the data partition is mounted you can start the OSDs manually if init still doesn't work. To do so you will need to know the location of your keyring, your ceph.conf, and the OSD id. If you are unsure of the OSD id, look in the root of the mounted OSD data partition for a file named "whoami".

To manually start:

/usr/bin/ceph-osd -i ${OSD_ID} --pid-file /var/run/ceph/osd.${OSD_ID}.pid -c /etc/ceph/ceph.conf

After that, it's a matter of examining the logs if you're still having issues getting the OSDs to boot.

> After powering back up the cluster, “ceph health” revealed stale pgs, mds
> cluster degraded, 3/3 OSDs down. I tried to issue “sudo /etc/init.d/ceph -a
> start” but I got no output from the command and the health status did not
> change.

The placement groups are stale because none of the OSDs have reported their state recently, since they are down.

> I ended up having to re-install the cluster to fix the issue, but as my
> group wants to use Ceph for VM storage in the future, we need to find a
> solution.

That's a shame, but at least you will be better prepared if it happens again. Hopefully your luck is not as unfortunate as mine!

--
Kyle Bader
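
For reference, here is a rough sketch of that manual recovery sequence end to end. The device name (/dev/sdb1), mount point, and OSD id below are placeholders for illustration only; substitute the values from your own cluster, and note the paths assume the default cluster name "ceph".

  # 1. Confirm the monitors are up and have formed a quorum
  ceph quorum_status

  # 2. Mount the OSD data partition by hand (this is what udev normally
  #    does for you); the device and mount point here are examples
  mount /dev/sdb1 /var/lib/ceph/osd/ceph-0

  # 3. Read the OSD id from the root of the data partition
  cat /var/lib/ceph/osd/ceph-0/whoami

  # 4. Start the OSD daemon manually with that id
  OSD_ID=0
  /usr/bin/ceph-osd -i ${OSD_ID} --pid-file /var/run/ceph/osd.${OSD_ID}.pid -c /etc/ceph/ceph.conf

  # 5. Watch the placement groups peer and return to active+clean
  ceph -w

If an OSD still fails to start, its log under /var/log/ceph/ is the first place to look.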