Problems recovering MDS

I was running Ceph 10.2.9 on the servers, with 10.2.6 clients (I think; whatever is in CentOS’s Jewel SIG repo).

I had an issue where the MDS cluster stopped working and wasn’t responding to cache pressure. I restarted the MDS daemons, and they failed to replay the journal.

Long story short, I managed to get things sort of working. I upgraded to Luminous 12.1.4rc because it has the more developed cephfs-data-scan tools, including scan_links (which Jewel did not have). Even though things are mostly working, there is obviously still some corruption in links and metadata, as I’m still getting log messages about them.

What I need to know is how to fix this so that all of the data corruption is cleared. I’ve gone through the steps documented in the disaster recovery guide. As a last-ditch attempt, I’m re-ordering the steps slightly, running scan_frags first, then scan_extents and scan_inodes, in the hope that this repairs some of the damage.
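
For reference, the scan order described above can be written out as a small dry-run sketch. It only builds and prints the command lines; it does not run anything against a cluster. The pool name cephfs_data and the helper names build_scan_plan and render are hypothetical. It is also an assumption that scan_frags takes no pool argument and that scan_links runs last, so check the tool's own help output before using any of this.

```python
import shlex


def build_scan_plan(data_pool, include_scan_frags=True):
    """Return cephfs-data-scan invocations in the order described in the post.

    Nothing is executed here; the result is a list of argument lists
    that can be reviewed before being run by hand.
    """
    plan = []
    if include_scan_frags:
        # scan_frags is named in the post; assumed here to take no pool argument.
        plan.append(["cephfs-data-scan", "scan_frags"])
    # scan_extents and scan_inodes operate on the CephFS data pool.
    plan.append(["cephfs-data-scan", "scan_extents", data_pool])
    plan.append(["cephfs-data-scan", "scan_inodes", data_pool])
    # scan_links is assumed to run last, after inodes have been recovered.
    plan.append(["cephfs-data-scan", "scan_links"])
    return plan


def render(plan):
    """Render each command as a shell-quoted string for review."""
    return [" ".join(shlex.quote(arg) for arg in cmd) for cmd in plan]


if __name__ == "__main__":
    # "cephfs_data" is a placeholder pool name, not taken from the post.
    for line in render(build_scan_plan("cephfs_data")):
        print(line)
```

Printing the plan first makes it easier to review the order, and to try a different one, before touching the metadata pool.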

At the very least, since nothing important seems to be corrupted or damaged, I want to repair or delete the damaged links/references and clean all of that up so things run reliably again.

Eric Renfro
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



