Hey,
After a bright idea to pause my 10.2.2 Ceph cluster for a minute to see
whether it would speed up backfill, I managed to corrupt my MDS journal
(is this expected after a cluster pause/unpause, or is it some sort of
bug?). I got "Overall journal integrity: DAMAGED", etc.
I was following
http://docs.ceph.com/docs/jewel/cephfs/disaster-recovery/ and have some
questions/feedback (the command sequence from that page, as I read it,
is sketched after the list):
* It would be great to have some info on when the 'snap' or 'inode' tables should be reset
* It is not clear at which point starting the MDS should be attempted
* Can scan_extents/scan_inodes be run while the MDS is running?
* "online MDS scrub" is mentioned in docs. Is it
scan_extents/scan_inodes or some other command?
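For context, here is roughly the sequence from that page that I am
referring to; the data pool name is a placeholder, and I am assuming the
MDS is stopped while these run:

    # Back up the journal before doing anything destructive
    cephfs-journal-tool journal export backup.bin

    # Salvage what dentries we can from the damaged journal, then truncate it
    cephfs-journal-tool event recover_dentries summary
    cephfs-journal-tool journal reset

    # Reset the session table (the docs also mention 'snap' and 'inode' here,
    # hence my first question above)
    cephfs-table-tool all reset session

    # Rebuild metadata from the data pool
    cephfs-data-scan scan_extents <data pool>
    cephfs-data-scan scan_inodes <data pool>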
Now CephFS seems to be working (I still have "mds0: Metadata damage
detected", but scan_extents is currently running); let's see what
happens once scan_extents/scan_inodes finish.
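If it helps, I believe the specific damage behind that health message
can be listed via the admin socket on the active MDS (the mds ID below
is a placeholder):

    # Show the damage entries the MDS has flagged
    ceph daemon mds.<id> damage ls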
Will these actions also take care of any orphaned objects left in the
pools? What else should I look into?