Hi Mandell,

On Thu, 7 Jun 2012, Mandell Degerness wrote:
> I am thinking about data reliability issues and I'd like to know if we
> can recover a cluster if we have most of the OSD data intact (i.e.
> there are enough copies of all of the PGs), but we have lost all of
> the monitor data.

All of the monitor data can be found elsewhere:

 - cluster fsid
 - current mon map
 - past osd maps
 - the osd secret keys

The mon secret key is only needed by the monitors, so a new one will do.

So yes, in principle, a monitor can be rebuilt. However, it can only be
done manually (and tediously) at the moment... there is no tool or
process to do it easily. Making it easy wouldn't be too difficult, if
there was a real need.

However, given that you need a majority of the monitors online for the
cluster to function, I would expect people to notice things are going
wrong well before the last remaining monitor has a media failure. This
is really just a protection against a catastrophic event that took out
many nodes, and in that case I wouldn't expect all PGs to survive
either. If they did, they were well distributed physically (or
something), and the monitors should have been too...

sage

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html