Christoph,

I would like to share an update, with more data, on the issue I reported
on the RT subvolume. I have backported the three patches to 2.6.37:

> > xfs: only lock the rt bitmap inode once per allocation
> > xfs: fix xfs_get_extsz_hint for a zero extent size hint
> > xfs: add lockdep annotations for the rt inodes

With the patches the situation is slightly better; however, there seems to
be a recursive deadlock in xfs_fs_evict_inode when multiple extents are
associated with the same inode. The stack trace below was taken during a
mount after a reboot, while the log is replayed, but exactly the same path
fails and deadlocks when the evict operation is attempted before a reboot.

xfs_ilock(ip, XFS_ILOCK_EXCL) is acquired twice on this path, recursively,
and deadlocks:

#0  xfs_ilock (ip=0xcf879980, lock_flags=33554436)
    at fs/xfs/xfs_iget.c:498
#1  0x801ee674 in xfs_iget_cache_hit (mp=0xcf640400, tp=0xcf0c0e58, ino=<value optimized out>, flags=0, lock_flags=33554436, ipp=0xcf60f950)
    at fs/xfs/xfs_iget.c:238
#2  xfs_iget (mp=0xcf640400, tp=0xcf0c0e58, ino=<value optimized out>, flags=0, lock_flags=33554436, ipp=0xcf60f950)
    at fs/xfs/xfs_iget.c:391
#3  0x80215b50 in xfs_trans_iget (mp=<value optimized out>, tp=0xcf0c0e58, ino=<value optimized out>, flags=0, lock_flags=33554436, ipp=0xcf60f950)
    at fs/xfs/xfs_trans_inode.c:60
#4  0x801a7044 in xfs_rtfree_extent (tp=0xcf0c0e58, bno=<value optimized out>, len=9)
    at fs/xfs/xfs_rtalloc.c:2166
#5  0x801c05d0 in xfs_bmap_del_extent (ip=0xcf879380, tp=<value optimized out>, idx=0, flist=0xcf60fbb0, cur=0x0, del=0xcf60fad0, logflagsp=0xcf60fac0, whichfork=0, rsvd=0)
    at fs/xfs/xfs_bmap.c:2892
#6  0x801c5460 in xfs_bunmapi (tp=0xcf0c0e58, ip=0xcf879380, bno=2303, len=4294967297, flags=0, nexts=2, firstblock=0xcf60fba8, flist=0xcf60fbb0, done=0xcf60fba0)
    at fs/xfs/xfs_bmap.c:5256
#7  0x801f0a88 in xfs_itruncate_finish (tp=0xcf60fc14, ip=0xcf879380, new_size=<value optimized out>, fork=0, sync=1)
    at fs/xfs/xfs_inode.c:1585
#8  0x80218428 in xfs_inactive (ip=0xcf879380)
    at fs/xfs/xfs_vnodeops.c:1102
#9  0x800e2be4 in evict (inode=0xcf8794c0)
    at fs/inode.c:450
#10 0x800e3300 in iput_final (inode=0xcf8794c0)
    at fs/inode.c:1401
#11 iput (inode=0xcf8794c0)
    at fs/inode.c:1423
#12 0x80208740 in xlog_recover_process_one_iunlink (mp=0xcf640400, agno=<value optimized out>, agino=<value optimized out>, bucket=29)
    at fs/xfs/xfs_log_recover.c:3212
#13 0x8020884c in xlog_recover_process_iunlinks (log=<value optimized out>)
    at fs/xfs/xfs_log_recover.c:3289
#14 0x80209928 in xlog_recover_finish (log=0xcf638000)
    at fs/xfs/xfs_log_recover.c:3926
#15 0x8020de74 in xfs_mountfs (mp=0xcf640400)
    at fs/xfs/xfs_mount.c:1386
#16 0x8022d228 in xfs_fs_fill_super (sb=0xcf5ff400, data=<value optimized out>, silent=<value optimized out>)
    at fs/xfs/linux-2.6/xfs_super.c:1539
#17 0x800cbe68 in mount_bdev (fs_type=<value optimized out>, flags=32768, dev_name=<value optimized out>, data=0xcfc52000, fill_super=0x8022d04c <xfs_fs_fill_super>)
    at fs/super.c:820
#18 0x8022a6a4 in xfs_fs_mount (fs_type=<value optimized out>, flags=<value optimized out>, dev_name=<value optimized out>, data=<value optimized out>)
    at fs/xfs/linux-2.6/xfs_super.c:1616
#19 0x800ca6e0 in vfs_kern_mount (type=0x80597e10, flags=<value optimized out>, name=<value optimized out>, data=<value optimized out>)
    at fs/super.c:986
#20 0x800ca888 in do_kern_mount (fstype=0xcff42580 "xfs", flags=<value optimized out>, name=<value optimized out>, data=<value optimized out>)
    at fs/super.c:1155
#21 0x800e9f08 in do_new_mount (dev_name=0xcf600100 "/dev/sda2", dir_name=<value optimized out>, type_page=0xcff42580 "xfs", flags=32768, data_page=0xcfc52000)
    at fs/namespace.c:1746
#22 do_mount (dev_name=0xcf600100 "/dev/sda2", dir_name=<value optimized out>, type_page=0xcff42580 "xfs", flags=32768, data_page=0xcfc52000)
    at fs/namespace.c:2066
#23 0x800ea9d0 in sys_mount (dev_name=0x46e5d4 "/dev/sda2", dir_name=<value optimized out>, type=<value optimized out>, flags=33792, data=0x4700b0)
    at fs/namespace.c:2210
#24 0x800117bc in handle_sys ()
    at arch/mips/kernel/scall32-o32.S:59
#25 0x0041ff1c in ?? ()
warning: GDB can't find the start of the function at 0x41ff1b.

The code deadlocks here:

xfs_iget.c
515         if (lock_flags & XFS_ILOCK_EXCL)
516                 mrupdate_nested(&ip->i_lock,

On 2.6.37, xfs_iget_cache_hit tries to take the lock repeatedly during the
evict. I had to fix the locking by detecting whether the inode is already
locked and is part of the transaction tp, and in that case also skipping
the call to xfs_trans_ijoin(). I can post the patch, but first I would
like to know whether this deadlock makes sense to you.

I suspect the same occurs with 2.6.39 as well. Although xfs_trans_iget()
was replaced with xfs_ilock() there, the deadlock can still happen in
xfs_rtfree_extent().

Code on 2.6.37:

xfs_rt
int
xfs_rtfree_extent()
{
        ...
        ...
        /*
         * Synchronize by locking the bitmap inode.
         */
        error = xfs_trans_iget(mp, tp, mp->m_sb.sb_rbmino, 0,
                               XFS_ILOCK_EXCL | XFS_ILOCK_RTBITMAP, &ip);
        ...
        ...
}

Code on 2.6.39:

int
xfs_rtfree_extent()
{
        ...
        ...
        /*
         * Synchronize by locking the bitmap inode.
         */
        xfs_ilock(mp->m_rbmip, XFS_ILOCK_EXCL); /* called from the upstream calling function's while loop */
        xfs_trans_ijoin_ref(tp, mp->m_rbmip, XFS_ILOCK_EXCL);
        ..
        ..
}

Kamal


Christoph Hellwig wrote:
>
> On Thu, Feb 02, 2012 at 11:26:28AM -0500, Kamal Dasu wrote:
>> > xfs: only lock the rt bitmap inode once per allocation
>> > xfs: fix xfs_get_extsz_hint for a zero extent size hint
>> > xfs: add lockdep annotations for the rt inodes
>> >
>> > But in general the RT subvolume code is not regularly tested and only
>> > fixed when issues arise.
>>
>>
>> Thanks for quick reply and clarifying this, if upgrading the kernel is
>> not an option, should I be
>> considering backporting changes to 2.6.37, should I use the entire
>> 2.6.39 or 3.0
>> xfs implementation as is or cherry pick the above three changes?
>
> I don't remember if we have other changes in that area.
> If backporting the changes is easy enough, go for it, if not stick to
> your original workaround. Either way make sure you don't introduce
> other regressions by running xfstests.
>
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs
>
>

--
View this message in context: http://old.nabble.com/Inode-lockdep-problem-observed-on-2.6.37.6-xfs-with-RT-subvolume-tp33247492p33297927.html
Sent from the Xfs - General mailing list archive at Nabble.com.
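
For illustration only: the recursive acquisition described in the message, and the workaround of checking whether the bitmap inode is already part of the transaction, can be modelled in a few lines of userspace C. This is a rough sketch and not the patch mentioned above. Every toy_* name is hypothetical, the "lock" is a plain flag standing in for a non-recursive exclusive ILOCK, and a failed try-lock stands in for the point where the real code would block forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the real XFS types. */
#define TOY_MAX_JOINED 4

struct toy_inode {
    bool locked;                          /* models a non-recursive exclusive lock */
};

struct toy_trans {
    struct toy_inode *joined[TOY_MAX_JOINED]; /* inodes joined to this transaction */
    size_t njoined;
};

/* Returns false where a real blocking lock would deadlock on itself. */
static bool toy_trylock(struct toy_inode *ip)
{
    if (ip->locked)
        return false;
    ip->locked = true;
    return true;
}

/* Is this inode already joined to the transaction? */
static bool toy_trans_has(const struct toy_trans *tp, const struct toy_inode *ip)
{
    for (size_t i = 0; i < tp->njoined; i++)
        if (tp->joined[i] == ip)
            return true;
    return false;
}

static void toy_trans_join(struct toy_trans *tp, struct toy_inode *ip)
{
    assert(tp->njoined < TOY_MAX_JOINED);
    tp->joined[tp->njoined++] = ip;
}

/*
 * Free one extent. With guarded == false the bitmap inode is locked and
 * joined on every call; with guarded == true both steps are skipped when
 * the inode is already part of the transaction.
 */
static bool toy_rtfree_one(struct toy_trans *tp, struct toy_inode *rbmip,
                           bool guarded)
{
    if (guarded && toy_trans_has(tp, rbmip))
        return true;                      /* already locked by this transaction */
    if (!toy_trylock(rbmip))
        return false;                     /* the real code would hang here */
    toy_trans_join(tp, rbmip);
    return true;
}

/* Free nexts extents in one transaction; returns how many were freed. */
static int toy_free_extents(struct toy_trans *tp, struct toy_inode *rbmip,
                            int nexts, bool guarded)
{
    int freed = 0;

    for (int i = 0; i < nexts; i++) {
        if (!toy_rtfree_one(tp, rbmip, guarded))
            break;
        freed++;
    }
    return freed;
}
```

In this model, freeing two extents in one transaction (compare nexts=2 in frame #6 of the trace) stops after the first extent in the unguarded variant, because the second call tries to take a lock the same transaction already holds. The guarded variant frees both and joins the inode only once.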