Hi Mike,

Sorry to hear that. I hope this can help you recover your RBD images:
http://www.sebastien-han.fr/blog/2015/01/29/ceph-recover-a-rbd-image-from-a-dead-cluster/

Since you don't have your monitors, you can still walk through the OSD data dirs and look for the RBD identifiers. Something like this might help (a rough reassembly sketch also follows below, after the signature):

    sudo find /var/lib/ceph/osd/ -type f -name 'rbd*data.*' | cut -d'.' -f 3 | sort | uniq

Hope it helps.

> On 29 Jan 2015, at 21:36, Mike Winfield <mike.winfield@xxxxxxxxxxxxxxxxxx> wrote:
>
> Hi, I'm hoping desperately that someone can help. I have a critical issue with a tiny 'cluster'...
>
> There was a power glitch earlier today (not an outage, might have been a brownout; some things went down, others didn't) and I came home to a CPU machine check exception on the single host on which I keep a trio of Ceph monitors. No option but to hard reset. When the system came back up, the monitors didn't.
>
> Each mon is reporting possible corruption of its leveldb store; files are missing, and one might surmise an fsck decided to discard them. See the attached txt files for the ceph-mon output and the corresponding store.db directory listings.
>
> Is there any way to recover the leveldb for the monitors? I am more than capable and willing to dig into the structure of these files - or any similar measures - if necessary. Perhaps correlate a complete picture from the data files that are available?
>
> I do have a relevant backup of the monitor data, but it is now three months old. I would prefer not to resort to it if there is any chance of recovering monitor operability by other means.
>
> Also, what would the consequences be of restoring such a backup when the (12 TB worth of) OSDs are perfectly fine and contain the latest up-to-date PG associations? Would there be a risk of data loss?
>
> Unfortunately I don't have any backups of the actual user data (being poor and scraping along on a shoestring budget is not exactly conducive to anything approaching an ideal hardware setup), unless one counts a set of old disks from a cluster that failed six months ago.
>
> My last recourse will likely be to try to scavenge and piece together my most important files from whatever I find on the OSDs. Far from an exciting prospect, but I am seriously desperate.
>
> I would be terribly grateful for any input.
>
> Mike
>
> <ceph-mon-0.txt><ceph-mon-1.txt><ceph-mon-2.txt><ls-mon-storedb.txt>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Cheers.

––––
Sébastien Han
Cloud Architect

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72
Mail: sebastien.han@xxxxxxxxxxxx
Address: 11 bis, rue Roquépine - 75008 Paris
Web: www.enovance.com - Twitter: @enovance
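[Editor's note] For anyone following the same recovery path: below is a minimal bash sketch of the reassembly step the linked blog post describes, i.e. writing each object file collected from the OSD filestores back at its offset inside a raw image. It assumes format-2 object names and the default 4 MiB object size; IMAGE_ID, the ./objects/ staging directory, and the recovered.img output name are placeholders, not values taken from this thread, and this is only an illustration of the idea, not the exact script from the post.

    # Minimal sketch (bash). Assumes the default 4 MiB RBD object size and
    # that all objects belonging to the image were copied into ./objects/
    # keeping their original filestore names (rbd\udata.<id>.<hexnum>__head_...).
    IMAGE_ID="1234abc5678de"          # hypothetical image identifier
    OBJ_SIZE=$((4 * 1024 * 1024))     # default RBD object size (4 MiB)
    OUT="recovered.img"

    : > "$OUT"                        # start from an empty (sparse) file

    for f in ./objects/*"${IMAGE_ID}"*; do
        # the hex object number sits between the image id and "__head"
        hexnum=$(basename "$f" | sed -n "s/.*${IMAGE_ID}\.\([0-9a-f]*\)__head.*/\1/p")
        [ -n "$hexnum" ] || continue
        # write this chunk at its offset (object number * object size)
        dd if="$f" of="$OUT" bs="$OBJ_SIZE" seek="$((16#$hexnum))" conv=notrunc 2>/dev/null
    done

The result is a raw image file, which can then be loop-mounted for inspection or imported into a healthy cluster with rbd import.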
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com