On 04.05.15 at 09:00, Yujian Peng wrote:
> Hi,
> I'm encountering a data disaster. I have a ceph cluster with 145 OSDs. The
> data center had a power problem yesterday, and all of the ceph nodes went down.
> Now I find that 6 disks (xfs) in 4 nodes have data corruption. Some disks
> cannot be mounted, and some disks show IO errors in syslog.
>     mount: Structure needs cleaning
>     xfs_log_force: error 5 returned
> I tried to repair one with xfs_repair -L /dev/sdx1, but the ceph-osd
> reported a leveldb error:
>     Error initializing leveldb: Corruption: checksum mismatch
> I cannot start the 6 OSDs, and 22 PGs are down.
> This is really a tragedy for me. Can you give me some ideas on how to recover
> the xfs filesystems? Thanks very much!

We had a similar issue last year. We ended up building a new Ceph cluster
and manually importing all objects in a tedious, one-week process. The folks
at Inktank were invaluable, providing us with the tools to recover every
object from the broken cluster (we did not lose a single object to
corruption!), but without a support contract, we would have been lost.

I know this is a community list, and commercial offers are usually frowned
upon, but this is the best advice I can give: if what you are running is a
production cluster, you should contact an Inktank/Red Hat representative and
negotiate if and how they can assist you with recovery.

I am not sure that there are many other options for getting your data back.
On the upside, however, you can be fairly sure that, although your cluster is
totally lost now, most if not all objects can be recovered.

Sorry I couldn't be of more help, but that's how we experienced this issue.

Regards,
--ck

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com