Hi John,

On 10/09/2017 10:47 AM, John Spray wrote:
> When a rank is "damaged", that means the MDS rank is blocked from
> starting because Ceph thinks the on-disk metadata is damaged -- no
> amount of restarting things will help.

Thanks.

> The place to start with the investigation is to find the source of the
> damage. Look in your monitor log for "marking rank 6 damaged"

I found this in the mon log:

2017-10-09 03:24:28.207424 7f3290710700 0 log_channel(cluster) log [DBG] : mds.6 147.87.226.187:6800/1120166215 down:damaged

So at the time it was marked damaged, rank 6 was running on mds7.

> and then look in your MDS logs at that timestamp (find the MDS that held
> rank 6 at the time).

Looking at the mds7 log for that timespan, this is what I think happened:

* at "early" 03:24, mds7 was serving rank 5, crashed, restarted automatically twice, and then picked up rank 6 at 03:24:21.
* at 03:24:21, mds7 came back up, took over rank 6 and went into replay, then hit a purge queue error, respawned, and ended up in standby-replay for rank 1(?):

2017-10-09 03:24:21.598446 7f70ca01c240 0 set uid:gid to 64045:64045 (ceph:ceph)
2017-10-09 03:24:21.598469 7f70ca01c240 0 ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable), process (unknown), pid 1337
2017-10-09 03:24:21.601958 7f70ca01c240 0 pidfile_write: ignore empty --pid-file
2017-10-09 03:24:26.108545 7f70c2580700 1 mds.mds7 handle_mds_map standby
2017-10-09 03:24:26.115469 7f70c2580700 1 mds.6.95474 handle_mds_map i am now mds.6.95474
2017-10-09 03:24:26.115479 7f70c2580700 1 mds.6.95474 handle_mds_map state change up:boot --> up:replay
2017-10-09 03:24:26.115493 7f70c2580700 1 mds.6.95474 replay_start
2017-10-09 03:24:26.115502 7f70c2580700 1 mds.6.95474 recovery set is 0,1,2,3,4,5,7,8
2017-10-09 03:24:26.115511 7f70c2580700 1 mds.6.95474 waiting for osdmap 18284 (which blacklists prior instance)
2017-10-09 03:24:26.536629 7f70bc574700 0 mds.6.cache creating system inode with ino:0x106
2017-10-09 03:24:26.537009 7f70bc574700 0 mds.6.cache creating system inode with ino:0x1
2017-10-09 03:24:27.233759 7f70bd576700 -1 mds.6.journaler.pq(ro) _decode error from assimilate_prefetch
2017-10-09 03:24:27.233780 7f70bd576700 -1 mds.6.purge_queue _recover: Error -22 recovering write_pos
2017-10-09 03:24:27.238820 7f70bd576700 1 mds.mds7 respawn
2017-10-09 03:24:27.238828 7f70bd576700 1 mds.mds7 e: '/usr/bin/ceph-mds'
2017-10-09 03:24:27.238831 7f70bd576700 1 mds.mds7 0: '/usr/bin/ceph-mds'
2017-10-09 03:24:27.238833 7f70bd576700 1 mds.mds7 1: '-f'
2017-10-09 03:24:27.238835 7f70bd576700 1 mds.mds7 2: '--cluster'
2017-10-09 03:24:27.238836 7f70bd576700 1 mds.mds7 3: 'ceph'
2017-10-09 03:24:27.238838 7f70bd576700 1 mds.mds7 4: '--id'
2017-10-09 03:24:27.238839 7f70bd576700 1 mds.mds7 5: 'mds7'
2017-10-09 03:24:27.239567 7f70bd576700 1 mds.mds7 6: '--setuser'
2017-10-09 03:24:27.239579 7f70bd576700 1 mds.mds7 7: 'ceph'
2017-10-09 03:24:27.239580 7f70bd576700 1 mds.mds7 8: '--setgroup'
2017-10-09 03:24:27.239581 7f70bd576700 1 mds.mds7 9: 'ceph'
2017-10-09 03:24:27.239612 7f70bd576700 1 mds.mds7 respawning with exe /usr/bin/ceph-mds
2017-10-09 03:24:27.239614 7f70bd576700 1 mds.mds7 exe_path /proc/self/exe
2017-10-09 03:24:27.268448 7f9c7eafa240 0 ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable), process (unknown), pid 1337
2017-10-09 03:24:27.271987 7f9c7eafa240 0 pidfile_write: ignore empty --pid-file
2017-10-09 03:24:31.325891 7f9c7789c700 1 mds.mds7 handle_mds_map standby
2017-10-09 03:24:31.332376 7f9c7789c700 1 mds.1.0 handle_mds_map i am now mds.28178286.0 replaying mds.1.0
2017-10-09 03:24:31.332388 7f9c7789c700 1 mds.1.0 handle_mds_map state change up:boot --> up:standby-replay
2017-10-09 03:24:31.332401 7f9c7789c700 1 mds.1.0 replay_start
2017-10-09 03:24:31.332410 7f9c7789c700 1 mds.1.0 recovery set is 0,2,3,4,5,6,7,8
2017-10-09 03:24:31.332425 7f9c7789c700 1 mds.1.0 waiting for osdmap 18285 (which blacklists prior instance)
2017-10-09 03:24:31.351850 7f9c7108f700 0 mds.1.cache creating system inode with ino:0x101
2017-10-09 03:24:31.352204 7f9c7108f700 0 mds.1.cache creating system inode with ino:0x1
2017-10-09 03:24:32.144505 7f9c7008d700 0 mds.1.cache creating system inode with ino:0x100
2017-10-09 03:24:32.144671 7f9c7008d700 1 mds.1.0 replay_done (as standby)
2017-10-09 03:24:33.150117 7f9c71890700 1 mds.1.0 replay_done (as standby)

After that, the last line repeats unchanged every second for about two hours.

Where can I go from here? Is there anything further I can do?

Also, just in case it matters: at the time of the crash a large 'rm -rf' (lots and lots of small files) was running. All clients mount the cephfs with kernel 4.13.4, not fuse.

Regards,
Daniel
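
P.S. For reference, this is roughly how I correlated the two logs. The grep patterns and log paths are just what we use here (default /var/log/ceph layout, MDS daemon id "mds7"), so adjust them to your own setup:

  # on the mon host: when/why was rank 6 marked damaged?
  grep 'marking rank 6 damaged' /var/log/ceph/ceph-mon.*.log
  grep 'mds\.6 .*down:damaged' /var/log/ceph/ceph-mon.*.log

  # on mds7 (the daemon that held rank 6 at that time): the minute around the event
  grep '2017-10-09 03:24:2' /var/log/ceph/ceph-mds.mds7.log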