cephfs degraded on ceph luminous 12.2.2

Hi,

I'm running Ceph Luminous 12.2.2 and my CephFS has suddenly become degraded.

I have two active MDS instances and one standby. Both active instances are now stuck in the replay state and show the same messages in their logs:


---- mds1 ----

2018-01-08 16:04:15.765637 7fc2e92451c0  0 ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable), process (unknown), pid 164
starting mds.mds1 at -
2018-01-08 16:04:15.785849 7fc2e92451c0  0 pidfile_write: ignore empty --pid-file
2018-01-08 16:04:20.168178 7fc2e1ee1700  1 mds.mds1 handle_mds_map standby
2018-01-08 16:04:20.278424 7fc2e1ee1700  1 mds.1.20635 handle_mds_map i am now mds.1.20635
2018-01-08 16:04:20.278432 7fc2e1ee1700  1 mds.1.20635 handle_mds_map state change up:boot --> up:replay
2018-01-08 16:04:20.278443 7fc2e1ee1700  1 mds.1.20635 replay_start
2018-01-08 16:04:20.278449 7fc2e1ee1700  1 mds.1.20635  recovery set is 0
2018-01-08 16:04:20.278458 7fc2e1ee1700  1 mds.1.20635  waiting for osdmap 21467 (which blacklists prior instance)


---- mds2 ----

2018-01-08 16:04:16.870459 7fd8456201c0  0 ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable), process (unknown), pid 295
starting mds.mds2 at -
2018-01-08 16:04:16.881616 7fd8456201c0  0 pidfile_write: ignore empty --pid-file
2018-01-08 16:04:21.274543 7fd83e2bc700  1 mds.mds2 handle_mds_map standby
2018-01-08 16:04:21.314438 7fd83e2bc700  1 mds.0.20637 handle_mds_map i am now mds.0.20637
2018-01-08 16:04:21.314459 7fd83e2bc700  1 mds.0.20637 handle_mds_map state change up:boot --> up:replay
2018-01-08 16:04:21.314479 7fd83e2bc700  1 mds.0.20637 replay_start
2018-01-08 16:04:21.314492 7fd83e2bc700  1 mds.0.20637  recovery set is 1
2018-01-08 16:04:21.314517 7fd83e2bc700  1 mds.0.20637  waiting for osdmap 21467 (which blacklists prior instance)
2018-01-08 16:04:21.393307 7fd837aaf700  0 mds.0.cache creating system inode with ino:0x100
2018-01-08 16:04:21.397246 7fd837aaf700  0 mds.0.cache creating system inode with ino:0x1
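
Both ranks report up:replay and "waiting for osdmap 21467 (which blacklists prior instance)". For reference, the rank states and the current osdmap epoch can be checked with something like the following (a rough sketch, exact output differs between versions):

    # MDS ranks and their states as seen by the monitors
    ceph mds stat
    ceph fs dump

    # current osdmap epoch, to compare against the 21467 the MDSes are waiting for
    ceph osd dump | head -n 1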

The cluster is recovering because we are replacing some of the OSDs, and there are a few slow/stuck requests, but I'm not sure whether this is the cause, since there has apparently been no data loss (so far).
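
For context, the recovery progress and the slow/stuck requests mentioned above are the ones reported by the usual status commands, roughly:

    # overall cluster state and recovery progress
    ceph -s

    # details of the slow/stuck requests and any other warnings
    ceph health detail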

How can I force the MDSes out of the replay state?
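
If it helps with diagnosing this, I can also collect the output of the MDS admin sockets on each host; as far as I know something like the following reports each daemon's current state and the osdmap epoch it is waiting on:

    # run on the host where each MDS daemon lives
    ceph daemon mds.mds1 status
    ceph daemon mds.mds2 status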

Thanks for any help,


    Alessandro

