MDS damaged


 



Hi,

After the upgrade to Luminous 12.2.6 today, all our MDSes have been marked as damaged. Restarting the instances only results in standby MDSes. We currently have 2 active filesystems with 2 MDSes each.

I found the following error messages in the mon:


mds.0 <node1_IP>:6800/2412911269 down:damaged
mds.1 <node2_IP>:6800/830539001 down:damaged
mds.0 <node3_IP>:6800/4080298733 down:damaged


Whenever I try to force the repaired state with ceph mds repaired <fs_name>:<rank> I get something like this in the MDS logs:


2018-07-11 13:20:41.597970 7ff7e010e700  0 mds.1.journaler.mdlog(ro) error getting journal off disk
2018-07-11 13:20:41.598173 7ff7df90d700 -1 log_channel(cluster) log [ERR] : Error recovering journal 0x201: (5) Input/output error


Any attempt to run the journal export fails with errors like this one:


cephfs-journal-tool --rank=cephfs:0 journal export backup.bin
Error ((5) Input/output error)
2018-07-11 17:01:30.631571 7f94354fff00 -1 Header 200.00000000 is unreadable

2018-07-11 17:01:30.631584 7f94354fff00 -1 journal_export: Journal not readable, attempt object-by-object dump with `rados`


The same happens for recover_dentries:

cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary
Events by type:
2018-07-11 17:04:19.770779 7f05429fef00 -1 Header 200.00000000 is unreadable
Errors:
0
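
For reference, the journal_export message above suggests an object-by-object dump with `rados`. A minimal sketch of what that might look like follows. It assumes the journal for rank N is stored in objects named <hex(0x200 + N)>.<8-digit hex index>, which is what the "Header 200.00000000" and "journal 0x201" messages suggest. The function names and the object count are hypothetical, and the script only prints the `rados get` commands rather than running them:

```shell
#!/bin/sh
# Sketch: build the names of the RADOS objects that hold an MDS journal
# and print the `rados` commands that would copy them out one by one.
# Assumption: rank N's journal objects are named
# <hex(0x200 + N)>.<8-digit hex index>, e.g. 200.00000000 for rank 0.

journal_object() {
    # $1 = MDS rank, $2 = object index within the journal
    printf '%x.%08x\n' "$((0x200 + $1))" "$2"
}

dump_commands() {
    # $1 = metadata pool, $2 = rank, $3 = number of objects to try
    # (the count is a placeholder; the real journal length is unknown here)
    i=0
    while [ "$i" -lt "$3" ]; do
        obj=$(journal_object "$2" "$i")
        echo "rados -p $1 get $obj $obj.bin"
        i=$((i + 1))
    done
}

# Print the commands for the first two journal objects of rank 0.
dump_commands cephfs_metadata 0 2
```

If the header object itself returns an I/O error, as it does above, the corresponding `rados get` would presumably fail the same way, so this is only useful for finding out which objects are still readable.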

Is there anything I could try to get the cluster back?

I was able to dump the contents of the metadata pool with rados export -p cephfs_metadata <filename>, and I'm currently trying the procedure described in http://docs.ceph.com/docs/master/cephfs/disaster-recovery-experts/#using-an-alternate-metadata-pool-for-recovery. I'm not sure it will work, though: it appears to be doing nothing at the moment (maybe it's just very slow).

Any help is appreciated, thanks!


    Alessandro

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



