Did you do any special operation (such as reset journal) before this happened On Sat, Nov 24, 2018 at 3:58 PM 关云飞 <gyfelectric@xxxxxxxxx> wrote: > > hi, > > According to my understanding, the parent directory CDir fnode > version should be incremented if creating file or directory operation > happened. But there was a strange problem in my environment parent > fnode version is far less then subdirectories inode version which > leading a assert fail and the mds can’t work now. > > Does anyone know how shuod i do can recover my environment? > Greatly appreciated if there has any help! > > ceph version: jewel 10.2.2 > mds mode: single mds > > mds.log > 2018-11-07 17:44:49.892205 7f4015824700 15 mds.0.journal try_to_expire > committing [dir 1 / [2,head] auth v=94977 cv=0/0 dir_auth=0 > state=1610612738|complete f(v0 m2018-10-17 16:39:14.350401 3=0+3) > n(v7782 rc2018-11-07 17:44:20.552395 b3272234942693 > 171405=170108+1297)/n(v7782 rc2018-10-17 16:39:14.350401 > b3272234942693 171394=170108+1286) hs=200+2,ss=0+0 dirty=2 | child=1 > subtree=1 subtreetemp=0 dirty=1 waiter=0 authpin=0 0x7f402a7aa000] > 2018-11-07 17:44:49.892249 7f4015824700 10 mds.0.cache.dir(1) commit > want 0 on [dir 1 / [2,head] auth v=94977 cv=0/0 dir_auth=0 > state=1610612738|complete f(v0 m2018-10-17 16:39:14.350401 3=0+3) > n(v7782 rc2018-11-07 17:44:20.552395 b3272234942693 > 171405=170108+1297)/n(v7782 rc2018-10-17 16:39:14.350401 > b3272234942693 171394=170108+1286) hs=200+2,ss=0+0 dirty=2 | child=1 > subtree=1 subtreetemp=0 dirty=1 waiter=0 authpin=0 0x7f402a7aa000] > 2018-11-07 17:44:49.892280 7f4015824700 10 mds.0.cache.dir(1) auth_pin > by 0x7f402a7aa000 on [dir 1 / [2,head] auth v=94977 cv=0/0 dir_auth=0 > ap=1+0+0 state=1610612738|complete f(v0 m2018-10-17 16:39:14.350401 > 3=0+3) n(v7782 rc2018-11-07 17:44:20.552395 b3272234942693 > 171405=170108+1297)/n(v7782 rc2018-10-17 16:39:14.350401 > b3272234942693 171394=170108+1286) hs=200+2,ss=0+0 dirty=2 | child=1 > subtree=1 subtreetemp=0 dirty=1 waiter=0 authpin=1 0x7f402a7aa000] > count now 1 + 0 > 2018-11-07 17:44:49.892312 7f4015824700 10 mds.0.cache.dir(1) _commit > want 94977 on [dir 1 / [2,head] auth v=94977 cv=0/0 dir_auth=0 > ap=1+0+0 state=1610612738|complete f(v0 m2018-10-17 16:39:14.350401 > 3=0+3) n(v7782 rc2018-11-07 17:44:20.552395 b3272234942693 > 171405=170108+1297)/n(v7782 rc2018-10-17 16:39:14.350401 > b3272234942693 171394=170108+1286) hs=200+2,ss=0+0 dirty=2 | child=1 > subtree=1 subtreetemp=0 dirty=1 waiter=0 authpin=1 0x7f402a7aa000] > 2018-11-07 17:44:49.892349 7f4015824700 10 mds.0.cache.dir(1) marking committing > 2018-11-07 17:44:49.892355 7f4015824700 10 mds.0.cache.dir(1) _omap_commit > 2018-11-07 17:44:49.892610 7f4015824700 10 mds.0.cache.dir(1) set > bioflow [dentry #1/bioflow [2,head] auth (dversion lock) v=94976 > inode=0x7f402a798c40 | lock=0 inodepin=1 dirty=1 authpin=0 > 0x7f402a7c4000] > 2018-11-07 17:44:49.892632 7f4015824700 14 mds.0.cache.dir(1) pos 8 dn > 'bioflow' inode [inode 10000000001 [...2,head] /bioflow/ auth v94976 > f(v0 m2018-10-17 14:48:45.785088 4=0+4) n(v6615 rc2018-11-07 > 17:44:20.552395 b3212972601995 169081=168070+1011) (inest lock) > (iversion lock) caps={1716375=pAsLsXsFs/-@1,1721730=pAsLsXsFs/-@3} | > dirtyscattered=0 request=0 lock=0 dirfrag=1 caps=1 dirtyrstat=0 > dirty=1 authpin=0 0x7f402a798c40 mode: 16877] > 2018-11-07 17:44:49.892924 7f4015824700 10 mds.0.cache.dir(1) rm test > [dentry #1/test [2,head] auth NULL (dversion lock) v=94948 inode=0 | > inodepin=0 dirty=1 0x7f402b156bc0] > 2018-11-07 17:44:49.893070 7f4015824700 1 -- 192.168.1.7:6807/26144 > --> 192.168.1.6:6801/31980 -- osd_op(mds.0.63911:110775 8.6b2cdaff > 1.00000000 [tmap2omap 1~0,omap-set-header 0~258,omap-set-vals > 0~458,omap-rm-keys 0~17] snapc 0=[] > ondisk+write+known_if_redirected+full_force e6778) v7 -- ?+0 > 0x7f40304ef8c0 con 0x7f402d350600 > > 2018-11-07 17:44:49.893134 7f4015824700 15 mds.0.journal try_to_expire > committing [dir 10000000001 /bioflow/ [2,head] auth v=94464 cv=0/0 > state=1610612736 f(v0 m2018-10-17 14:48:45.785088 4=0+4) n(v6615 > rc2018-11-07 17:44:20.552395 b3212972601995 169080=168070+1010) > hs=4+0,ss=0+0 dirty=2 | child=1 dirty=1 authpin=0 0x7f402a7aa600] > //expire next logsegment > > 2018-11-07 17:44:50.045914 7f4013820700 10 MDSIOContextBase::complete: > 18C_IO_Dir_Committed > 2018-11-07 17:44:50.045917 7f4013820700 10 mds.0.cache.dir(1) > _committed v 94977 on [dir 1 / [2,head] auth v=94977 cv=94977/0 > dir_auth=0 ap=1+0+1 state=1610612802|complete f(v0 m2018-10-17 > 16:39:14.350401 3=0+3) n(v7782 rc2018-11-07 17:44:20.552395 > b3272234942693 171405=170108+1297)/n(v7782 rc2018-10-17 > 16:39:14.350401 b3272234942693 171394=170108+1286) hs=200+2,ss=0+0 > dirty=2 | child=1 subtree=1 subtreetemp=0 dirty=1 waiter=0 authpin=1 > 0x7f402a7aa000] > 2018-11-07 17:44:50.045936 7f4013820700 10 mds.0.cache.dir(1) > mark_clean [dir 1 / [2,head] auth v=94977 cv=94977/94977 dir_auth=0 > ap=1+0+1 state=1610612738|complete f(v0 m2018-10-17 16:39:14.350401 > 3=0+3) n(v7782 rc2018-11-07 17:44:20.552395 b3272234942693 > 171405=170108+1297)/n(v7782 rc2018-10-17 16:39:14.350401 > b3272234942693 171394=170108+1286) hs=200+2,ss=0+0 dirty=2 | child=1 > subtree=1 subtreetemp=0 dirty=1 waiter=0 authpin=1 0x7f402a7aa000] > version 94977 > 2018-11-07 17:44:50.045955 7f4013820700 15 mds.0.cache.dir(1) dir > 94977 < inode 1755186 still dirty [inode 10000003557 [...2,head] > /tree112/ auth v1755186 f(v0 m2018-09-11 17:56:17.483220 1=1+0) n(v0 > rc2018-09-11 17:56:17.483220 2=1+1) (inest lock) (iversion lock) > caps={1716375=pAsLsXsFs/-@1},l=1716375 | caps=1 0x7f402d9aa820 mode: > 16877] > 2018-11-07 17:44:50.047480 7f4013820700 -1 mds/CDir.cc: In function > 'void CDir::_committed(int, version_t)' thread 7f4013820700 time > 2018-11-07 17:44:50.045990 > mds/CDir.cc: 2292: FAILED assert(in->is_dirty() || in->last < ((__u64)(-2))) > > ceph version 10.2.2-Summit3.0-beta2-43-g9c6f44d > (9c6f44d72d16134cc2504f76d6ee48517d046efd) > 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char > const*)+0x85) [0x7f401e56d0d5] > 2: (CDir::_committed(int, unsigned long)+0xec2) [0x7f401e34baf2] > 3: (MDSIOContextBase::complete(int)+0xa4) [0x7f401e3bf754] > 4: (Finisher::finisher_thread_entry()+0x216) [0x7f401e49c9e6] > 5: (()+0x7dc5) [0x7f401d340dc5] > 6: (clone()+0x6d) [0x7f401be0d28d]