Hi John,

thank you very much for your help.

On 10/10/2017 12:57 PM, John Spray wrote:
> A) Do a "rados -p <metadata pool> ls | grep "^506\." or similar, to
> get a list of the objects

Done, that gives me these:

506.00000000
506.00000017
506.0000001b
506.00000019
506.0000001a
506.0000001c
506.00000018
506.00000016
506.0000001e
506.0000001f
506.0000001d

> B) Write a short bash loop to do a "rados -p <metadata pool> get" on
> each of those objects into a file.

Done; I saved each object to a file named after it, resulting in these 11 files:

  90 Oct 10 13:17 506.00000000
4.0M Oct 10 13:17 506.00000016
4.0M Oct 10 13:17 506.00000017
4.0M Oct 10 13:17 506.00000018
4.0M Oct 10 13:17 506.00000019
4.0M Oct 10 13:17 506.0000001a
4.0M Oct 10 13:17 506.0000001b
4.0M Oct 10 13:17 506.0000001c
4.0M Oct 10 13:17 506.0000001d
4.0M Oct 10 13:17 506.0000001e
4.0M Oct 10 13:17 506.0000001f

> C) Stop the MDS, set "debug mds = 20" and "debug journaler = 20",
> mark the rank repaired, start the MDS again, and then gather the
> resulting log (it should end in the same "Error -22 recovering
> write_pos", but have much much more detail about what came before).

I've attached the entire log, from right before issuing "repaired" until after the MDS drops back to standby.

> Because you've hit a serious bug, it's really important to gather all
> this and share it, so that we can try to fix it and prevent it
> happening again to you or others.

Absolutely, sure. If you need anything more, I'm happy to share.

> You have two options, depending on how much downtime you can tolerate:
> - carefully remove all the metadata objects that start with 506. --

Given the outage (and people need access to their data), I'd go with this. Just to be safe: would that go like this?

rados -p <metadata_pool> rm 506.00000000
rados -p <metadata_pool> rm 506.00000016
[...]

Regards,
Daniel
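P.S. In case it helps anyone else following this thread: the "short bash loop" from step B was roughly of the shape sketched below (<metadata_pool> stands in for the real pool name, and the backup directory is just an example), and I assume the removal could be scripted the same way once you confirm that deleting the 506.* objects is the right call:

# Step B: back up every 506.* object before touching anything
mkdir -p ./mds6-506-backup
for obj in $(rados -p <metadata_pool> ls | grep "^506\."); do
    rados -p <metadata_pool> get "$obj" "./mds6-506-backup/$obj"
done

# Only after verifying the backups: remove the same objects
for obj in $(rados -p <metadata_pool> ls | grep "^506\."); do
    rados -p <metadata_pool> rm "$obj"
done
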
2017-10-10 13:21:55.413752 7f3f3011a700 5 mds.mds9 handle_mds_map epoch 96224 from mon.0
2017-10-10 13:21:55.413836 7f3f3011a700 10 mds.mds9 my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=file layout v2}
2017-10-10 13:21:55.413847 7f3f3011a700 10 mds.mds9 mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
2017-10-10 13:21:55.413852 7f3f3011a700 10 mds.mds9 map says I am 147.87.226.189:6800/1634944095 mds.6.96224 state up:replay
2017-10-10 13:21:55.414088 7f3f3011a700 4 mds.6.purge_queue operator(): data pool 7 not found in OSDMap
2017-10-10 13:21:55.414141 7f3f3011a700 10 mds.mds9 handle_mds_map: initializing MDS rank 6
2017-10-10 13:21:55.414410 7f3f3011a700 10 mds.6.0 update_log_config log_to_monitors {default=true}
2017-10-10 13:21:55.414415 7f3f3011a700 10 mds.6.0 create_logger
2017-10-10 13:21:55.414635 7f3f3011a700 7 mds.6.server operator(): full = 0 epoch = 0
2017-10-10 13:21:55.414644 7f3f3011a700 4 mds.6.purge_queue operator(): data pool 7 not found in OSDMap
2017-10-10 13:21:55.414648 7f3f3011a700 4 mds.6.0 handle_osd_map epoch 0, 0 new blacklist entries
2017-10-10 13:21:55.414660 7f3f3011a700 10 mds.6.server apply_blacklist: killed 0
2017-10-10 13:21:55.414830 7f3f3011a700 10 mds.mds9 handle_mds_map: handling map as rank 6
2017-10-10 13:21:55.414839 7f3f3011a700 1 mds.6.96224 handle_mds_map i am now mds.6.96224
2017-10-10 13:21:55.414843 7f3f3011a700 1 mds.6.96224 handle_mds_map state change up:boot --> up:replay
2017-10-10 13:21:55.414855 7f3f3011a700 10 mds.beacon.mds9 set_want_state: up:standby -> up:replay
2017-10-10 13:21:55.414859 7f3f3011a700 1 mds.6.96224 replay_start
2017-10-10 13:21:55.414873 7f3f3011a700 7 mds.6.cache set_recovery_set 0,1,2,3,4,5,7,8
2017-10-10 13:21:55.414883 7f3f3011a700 1 mds.6.96224 recovery set is 0,1,2,3,4,5,7,8
2017-10-10 13:21:55.414893 7f3f3011a700 1 mds.6.96224 waiting for osdmap 18607 (which blacklists prior instance)
2017-10-10 13:21:55.414901 7f3f3011a700 4 mds.6.purge_queue operator(): data pool 7 not found in OSDMap
2017-10-10 13:21:55.416011 7f3f3011a700 7 mds.6.server operator(): full = 0 epoch = 18608
2017-10-10 13:21:55.416024 7f3f3011a700 4 mds.6.96224 handle_osd_map epoch 18608, 0 new blacklist entries
2017-10-10 13:21:55.416027 7f3f3011a700 10 mds.6.server apply_blacklist: killed 0
2017-10-10 13:21:55.416076 7f3f2a10e700 10 MDSIOContextBase::complete: 12C_IO_Wrapper
2017-10-10 13:21:55.416095 7f3f2a10e700 10 MDSInternalContextBase::complete: 15C_MDS_BootStart
2017-10-10 13:21:55.416101 7f3f2a10e700 2 mds.6.96224 boot_start 0: opening inotable
2017-10-10 13:21:55.416120 7f3f2a10e700 10 mds.6.inotable: load
2017-10-10 13:21:55.416301 7f3f2a10e700 2 mds.6.96224 boot_start 0: opening sessionmap
2017-10-10 13:21:55.416310 7f3f2a10e700 10 mds.6.sessionmap load
2017-10-10 13:21:55.416421 7f3f2a10e700 2 mds.6.96224 boot_start 0: opening mds log
2017-10-10 13:21:55.416430 7f3f2a10e700 5 mds.6.log open discovering log bounds
2017-10-10 13:21:55.416542 7f3f2990d700 4 mds.6.journalpointer Reading journal pointer '406.00000000'
2017-10-10 13:21:55.416603 7f3f2910c700 10 mds.6.log _submit_thread start
2017-10-10 13:21:55.417795 7f3f2a10e700 10 MDSIOContextBase::complete: 12C_IO_MT_Load
2017-10-10 13:21:55.417810 7f3f2a10e700 10 mds.6.inotable: load_2 got 34 bytes
2017-10-10 13:21:55.417816 7f3f2a10e700 10 mds.6.inotable: load_2 loaded v1
2017-10-10 13:21:55.417946 7f3f2a10e700 10 MDSIOContextBase::complete: N12_GLOBAL__N_112C_IO_SM_LoadE
2017-10-10 13:21:55.417953 7f3f2a10e700 10 mds.6.sessionmap _load_finish loaded version 0
2017-10-10 13:21:55.417956 7f3f2a10e700 10 mds.6.sessionmap _load_finish: omap load complete
2017-10-10 13:21:55.417958 7f3f2a10e700 10 mds.6.sessionmap _load_finish: v 0, 0 sessions
2017-10-10 13:21:55.417960 7f3f2a10e700 10 mds.6.sessionmap dump
2017-10-10 13:21:55.418287 7f3f2990d700 1 mds.6.journaler.mdlog(ro) recover start
2017-10-10 13:21:55.418298 7f3f2990d700 1 mds.6.journaler.mdlog(ro) read_head
2017-10-10 13:21:55.418406 7f3f2990d700 4 mds.6.log Waiting for journal 0x206 to recover...
2017-10-10 13:21:55.419926 7f3f2a10e700 1 mds.6.journaler.mdlog(ro) _finish_read_head loghead(trim 26013073408, expire 26013073408, write 26013073408, stream_format 1). probing for end of log (from 26013073408)...
2017-10-10 13:21:55.419947 7f3f2a10e700 1 mds.6.journaler.mdlog(ro) probing for end of the log
2017-10-10 13:21:55.420677 7f3f2a10e700 1 mds.6.journaler.mdlog(ro) _finish_probe_end write_pos = 26013073456 (header had 26013073408). recovered.
2017-10-10 13:21:55.420740 7f3f2990d700 4 mds.6.log Journal 0x206 recovered.
2017-10-10 13:21:55.420756 7f3f2990d700 4 mds.6.log Recovered journal 0x206 in format 1
2017-10-10 13:21:55.420763 7f3f2990d700 10 MDSInternalContextBase::complete: 15C_MDS_BootStart
2017-10-10 13:21:55.420765 7f3f2990d700 2 mds.6.96224 boot_start 1: loading/discovering base inodes
2017-10-10 13:21:55.420784 7f3f2990d700 0 mds.6.cache creating system inode with ino:0x106
2017-10-10 13:21:55.420831 7f3f2990d700 10 mds.6.cache.ino(0x106) fetch
2017-10-10 13:21:55.421116 7f3f2990d700 4 mds.6.purge_queue open: opening
2017-10-10 13:21:55.421122 7f3f2990d700 1 mds.6.journaler.pq(ro) recover start
2017-10-10 13:21:55.421124 7f3f2990d700 1 mds.6.journaler.pq(ro) read_head
2017-10-10 13:21:55.421231 7f3f2990d700 0 mds.6.cache creating system inode with ino:0x1
2017-10-10 13:21:55.422532 7f3f2a10e700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched
2017-10-10 13:21:55.422548 7f3f2a10e700 10 mds.6.cache.ino(0x106) _fetched got 0 and 536
2017-10-10 13:21:55.422556 7f3f2a10e700 10 mds.6.cache.ino(0x106) magic is 'ceph fs volume v011' (expecting 'ceph fs volume v011')
2017-10-10 13:21:55.422584 7f3f2a10e700 10 mds.6.cache.snaprealm(0x106 seq 1 0x55b192f65c00) open_parents [1,head]
2017-10-10 13:21:55.422593 7f3f2a10e700 20 mds.6.cache.ino(0x106) decode_snap_blob snaprealm(0x106 seq 1 lc 0 cr 0 cps 1 snaps={} 0x55b192f65c00)
2017-10-10 13:21:55.422598 7f3f2a10e700 10 mds.6.cache.ino(0x106) _fetched [inode 0x106 [...2,head] ~mds6/ auth v19 snaprealm=0x55b192f65c00 f(v0 10=0+10) n(v3 rc2017-10-03 22:56:32.400835 b6253 88=11+77)/n(v0 11=0+11) (iversion lock) 0x55b193176700]
2017-10-10 13:21:55.831091 7f3f2b110700 1 mds.6.journaler.pq(ro) _finish_read_head loghead(trim 104857600, expire 108687220, write 108868115, stream_format 1). probing for end of log (from 108868115)...
2017-10-10 13:21:55.831107 7f3f2b110700 1 mds.6.journaler.pq(ro) probing for end of the log
2017-10-10 13:21:55.841213 7f3f2b110700 1 mds.6.journaler.pq(ro) _finish_probe_end write_pos = 134217728 (header had 108868115). recovered.
2017-10-10 13:21:55.841234 7f3f2b110700 4 mds.6.purge_queue operator(): open complete
2017-10-10 13:21:55.841236 7f3f2b110700 4 mds.6.purge_queue operator(): recovering write_pos
2017-10-10 13:21:55.841239 7f3f2b110700 10 mds.6.journaler.pq(ro) _prefetch
2017-10-10 13:21:55.841241 7f3f2b110700 10 mds.6.journaler.pq(ro) _prefetch 41943040 requested_pos 108868115 < target 134217728 (150811155), prefetching 25349613
2017-10-10 13:21:55.841246 7f3f2b110700 10 mds.6.journaler.pq(ro) _issue_read reading 108868115~25349613, read pointers 108868115/108868115/134217728
2017-10-10 13:21:55.841564 7f3f2b110700 10 mds.6.journaler.pq(ro) wait_for_readable at 108868115 onreadable 0x55b193232840
2017-10-10 13:21:55.842864 7f3f2b110700 10 mds.6.journaler.pq(ro) _finish_read got 108868115~183789
2017-10-10 13:21:55.842882 7f3f2b110700 10 mds.6.journaler.pq(ro) _assimilate_prefetch 108868115~183789
2017-10-10 13:21:55.842886 7f3f2b110700 10 mds.6.journaler.pq(ro) _assimilate_prefetch read_buf now 108868115~183789, read pointers 108868115/109051904/134217728
2017-10-10 13:21:55.842965 7f3f2b110700 -1 mds.6.journaler.pq(ro) _decode error from assimilate_prefetch
2017-10-10 13:21:55.842979 7f3f2b110700 -1 mds.6.purge_queue _recover: Error -22 recovering write_pos
2017-10-10 13:21:55.842983 7f3f2b110700 10 mds.beacon.mds9 set_want_state: up:replay -> down:damaged
2017-10-10 13:21:55.842990 7f3f2b110700 20 mds.beacon.mds9 0 slow request found
2017-10-10 13:21:55.843019 7f3f2b110700 10 mds.beacon.mds9 _send down:damaged seq 33
2017-10-10 13:21:55.843059 7f3f2b110700 20 mds.beacon.mds9 send_and_wait: awaiting 33 for up to 5s
2017-10-10 13:21:55.860786 7f3f3011a700 10 mds.beacon.mds9 handle_mds_beacon down:damaged seq 33 rtt 0.017763
2017-10-10 13:21:55.860818 7f3f2b110700 1 mds.mds9 respawn
2017-10-10 13:21:55.860824 7f3f2b110700 1 mds.mds9 e: 'ceph-mds'
2017-10-10 13:21:55.860826 7f3f2b110700 1 mds.mds9 0: 'ceph-mds'
2017-10-10 13:21:55.860829 7f3f2b110700 1 mds.mds9 1: '-i'
2017-10-10 13:21:55.860831 7f3f2b110700 1 mds.mds9 2: 'mds9'
2017-10-10 13:21:55.860834 7f3f2b110700 1 mds.mds9 3: '-f'
2017-10-10 13:21:55.860863 7f3f2b110700 1 mds.mds9 4: '--debug-mds=20'
2017-10-10 13:21:55.860864 7f3f2b110700 1 mds.mds9 5: '--debug-journaler=20'
2017-10-10 13:21:55.860911 7f3f2b110700 1 mds.mds9 respawning with exe /usr/bin/ceph-mds
2017-10-10 13:21:55.860919 7f3f2b110700 1 mds.mds9 exe_path /proc/self/exe
2017-10-10 13:21:55.887868 7f1f31970240 0 ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable), process (unknown), pid 3286
2017-10-10 13:21:55.891577 7f1f31970240 0 pidfile_write: ignore empty --pid-file
2017-10-10 13:21:55.894535 7f1f31970240 10 mds.mds9 112 MDSCacheObject
2017-10-10 13:21:55.894541 7f1f31970240 10 mds.mds9 1624 CInode
2017-10-10 13:21:55.894543 7f1f31970240 10 mds.mds9 16 elist<>::item *7=112
2017-10-10 13:21:55.894546 7f1f31970240 10 mds.mds9 504 inode_t
2017-10-10 13:21:55.894548 7f1f31970240 10 mds.mds9 48 nest_info_t
2017-10-10 13:21:55.894549 7f1f31970240 10 mds.mds9 40 frag_info_t
2017-10-10 13:21:55.894551 7f1f31970240 10 mds.mds9 40 SimpleLock *5=200
2017-10-10 13:21:55.894553 7f1f31970240 10 mds.mds9 48 ScatterLock *3=144
2017-10-10 13:21:55.894554 7f1f31970240 10 mds.mds9 456 CDentry
2017-10-10 13:21:55.894556 7f1f31970240 10 mds.mds9 16 elist<>::item
2017-10-10 13:21:55.894557 7f1f31970240 10 mds.mds9 40 SimpleLock
2017-10-10 13:21:55.894558 7f1f31970240 10 mds.mds9 824 CDir
2017-10-10 13:21:55.894559 7f1f31970240 10 mds.mds9 16 elist<>::item *2=32
2017-10-10 13:21:55.894561 7f1f31970240 10 mds.mds9 232 fnode_t
2017-10-10 13:21:55.894562 7f1f31970240 10 mds.mds9 48 nest_info_t *2
2017-10-10 13:21:55.894563 7f1f31970240 10 mds.mds9 40 frag_info_t *2
2017-10-10 13:21:55.894565 7f1f31970240 10 mds.mds9 272 Capability
2017-10-10 13:21:55.894566 7f1f31970240 10 mds.mds9 32 xlist<>::item *2=64
2017-10-10 13:21:55.895560 7f1f2ceef700 10 mds.mds9 MDSDaemon::ms_get_authorizer type=mon
2017-10-10 13:21:55.899616 7f1f31970240 10 mds.beacon.mds9 _send up:boot seq 1
2017-10-10 13:21:55.899932 7f1f2a712700 5 mds.mds9 handle_mds_map epoch 96224 from mon.0
2017-10-10 13:21:55.899982 7f1f2a712700 10 mds.mds9 my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=file layout v2}
2017-10-10 13:21:55.899992 7f1f2a712700 10 mds.mds9 mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
2017-10-10 13:21:55.899999 7f1f2a712700 10 mds.mds9 map says I am 147.87.226.189:6800/3326997988 mds.-1.-1 state ???
2017-10-10 13:21:55.900004 7f1f2a712700 10 mds.mds9 handle_mds_map: handling map in rankless mode
2017-10-10 13:21:55.900007 7f1f2a712700 10 mds.mds9 not in map yet
2017-10-10 13:21:55.900583 7f1f2c6ee700 10 mds.mds9 MDSDaemon::ms_get_authorizer type=mgr
2017-10-10 13:21:56.459839 7f1f2a712700 5 mds.mds9 handle_mds_map epoch 96225 from mon.0
2017-10-10 13:21:56.459868 7f1f2a712700 10 mds.mds9 my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=file layout v2}
2017-10-10 13:21:56.459874 7f1f2a712700 10 mds.mds9 mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
2017-10-10 13:21:56.459877 7f1f2a712700 10 mds.mds9 map says I am 147.87.226.189:6800/3326997988 mds.-1.-1 state ???
2017-10-10 13:21:56.459882 7f1f2a712700 10 mds.mds9 handle_mds_map: handling map in rankless mode
2017-10-10 13:21:56.459884 7f1f2a712700 10 mds.mds9 not in map yet
2017-10-10 13:21:59.899754 7f1f2770c700 10 mds.beacon.mds9 _send up:boot seq 2
2017-10-10 13:22:00.622000 7f1f2a712700 5 mds.mds9 handle_mds_map epoch 96226 from mon.0
2017-10-10 13:22:00.622048 7f1f2a712700 10 mds.mds9 my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=file layout v2}
2017-10-10 13:22:00.622055 7f1f2a712700 10 mds.mds9 mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
2017-10-10 13:22:00.622060 7f1f2a712700 10 mds.mds9 map says I am 147.87.226.189:6800/3326997988 mds.-1.0 state up:standby
2017-10-10 13:22:00.622067 7f1f2a712700 10 mds.mds9 handle_mds_map: handling map in rankless mode
2017-10-10 13:22:00.622082 7f1f2a712700 10 mds.beacon.mds9 set_want_state: up:boot -> up:standby
2017-10-10 13:22:00.622089 7f1f2a712700 1 mds.mds9 handle_mds_map standby
2017-10-10 13:22:00.622250 7f1f2a712700 10 mds.beacon.mds9 handle_mds_beacon up:boot seq 2 rtt 0.722471
2017-10-10 13:22:03.899866 7f1f2770c700 10 mds.beacon.mds9 _send up:standby seq 3
2017-10-10 13:22:03.900760 7f1f2a712700 10 mds.beacon.mds9 handle_mds_beacon up:standby seq 3 rtt 0.000866
2017-10-10 13:22:07.899968 7f1f2770c700 10 mds.beacon.mds9 _send up:standby seq 4
2017-10-10 13:22:07.900735 7f1f2a712700 10 mds.beacon.mds9 handle_mds_beacon up:standby seq 4 rtt 0.000743
2017-10-10 13:22:11.900072 7f1f2770c700 10 mds.beacon.mds9 _send up:standby seq 5
2017-10-10 13:22:11.900898 7f1f2a712700 10 mds.beacon.mds9 handle_mds_beacon up:standby seq 5 rtt 0.000799
2017-10-10 13:22:15.900170 7f1f2770c700 10 mds.beacon.mds9 _send up:standby seq 6
2017-10-10 13:22:15.900927 7f1f2a712700 10 mds.beacon.mds9 handle_mds_beacon up:standby seq 6 rtt 0.000737
2017-10-10 13:22:19.900296 7f1f2770c700 10 mds.beacon.mds9 _send up:standby seq 7
2017-10-10 13:22:19.901092 7f1f2a712700 10 mds.beacon.mds9 handle_mds_beacon up:standby seq 7 rtt 0.000759
2017-10-10 13:22:23.900398 7f1f2770c700 10 mds.beacon.mds9 _send up:standby seq 8
2017-10-10 13:22:23.901256 7f1f2a712700 10 mds.beacon.mds9 handle_mds_beacon up:standby seq 8 rtt 0.000830
2017-10-10 13:22:27.900492 7f1f2770c700 10 mds.beacon.mds9 _send up:standby seq 9
2017-10-10 13:22:27.901297 7f1f2a712700 10 mds.beacon.mds9 handle_mds_beacon up:standby seq 9 rtt 0.000784