Re: lockdep warning on 4.11.0-rc5 kernel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Apr 07, 2017 at 09:58:43AM -0700, Darrick J. Wong wrote:
> On Fri, Apr 07, 2017 at 11:10:09AM -0400, Brian Foster wrote:
> > On Wed, Apr 05, 2017 at 04:52:11PM -0400, Vivek Goyal wrote:
> > > Hi,
> > > 
> > > I am running 4.11.0-rc5 kernel and did a kernel build and noticed
> > > following lockdep warning on console. Have not analyzed it. Lots of
> > > xfs in backtrace, so sending it to xfs mailing list.
> > > 
> > 
> > Darrick pointed out on irc yesterday that this is likely due to the
> > lock_inode() call in chmod_common(). I was confused as to where the
> > iolock came into play here, but apparently we now reuse the core
> > inode->i_rwsem for that.
> > 
> > In any event, I was playing around with this and reproduce pretty easily
> > by populating an fs with a bunch files with speculative preallocation
> > and then generating some memory pressure. I reproduce with the following
> > stack, however:
> > 
> > [  434.220605] =================================
> > [  434.222286] [ INFO: inconsistent lock state ]
> > [  434.224092] 4.11.0-rc4+ #36 Tainted: G           OE  
> > [  434.225839] ---------------------------------
> > [  434.227587] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W}
> > usage.
> > [  434.229995] kswapd0/59 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > [  434.231851]  (&sb->s_type->i_mutex_key#17){+.+.?.}, at:
> > [<ffffffffc078e0aa>] xfs_ilock+0x20a/0x300 [xfs]
> > [  434.235473] {RECLAIM_FS-ON-W} state was registered at:
> > [  434.237427]   mark_held_locks+0x76/0xa0
> > [  434.238840]   lockdep_trace_alloc+0x7d/0xe0
> > [  434.240362]   kmem_cache_alloc+0x2f/0x2d0
> > [  434.241871]   kmem_zone_alloc+0x81/0x120 [xfs]
> > [  434.243559]   xfs_trans_alloc+0x6c/0x130 [xfs]
> > [  434.245233]   xfs_vn_update_time+0x75/0x230 [xfs]
> > [  434.247031]   file_update_time+0xbc/0x110
> > [  434.248593]   xfs_file_aio_write_checks+0x19b/0x1c0 [xfs]
> > [  434.250762]   xfs_file_buffered_aio_write+0x75/0x350 [xfs]
> > [  434.252978]   xfs_file_write_iter+0x103/0x150 [xfs]
> > [  434.254935]   __vfs_write+0xe8/0x160
> > [  434.256325]   vfs_write+0xcb/0x1f0
> > [  434.257625]   SyS_pwrite64+0x98/0xc0
> > [  434.258963]   entry_SYSCALL_64_fastpath+0x1f/0xc2
> > 
> > ... so this isn't just a chmod thing. OTOH, I think we agree that this
> > is not a real deadlock vector because the iolock is taken in the
> > destroy_inode() path and so there should be no other reference to the
> > inode.
> > 
> > That aside, the IOLOCK_EXCL was added to xfs_inactive() in commit
> > a36b926180 ("xfs: pull up iolock from xfs_free_eofblocks()") purely to
> > honor the cleaner call semantics that patch defined for
> > xfs_free_eofblocks(). We could probably either drop the iolock from here
> > (though we would then have to kill the assert in xfs_free_eofblocks()),
> > or use something like the diff below that quiets the lockdep splat for
> > me. Thoughts?
> 
> Well... the inode is inactive, which means there won't be any io to
> protect ourselves against, so we don't need to take the iolock, right?
> Why not remove the iolock grabbing and either change the iolock assert
> in _free_eofblocks to detect that we're coming from _inactive and not
> blow the assert, or just make a __xfs_free_eofblocks that skips the
> assert.
> 

Correct, the iolock is not necessary here and historically was not
taken. It was added in the commit above to clean up the function, more
specifically to support the assert. I'd rather not go and uglify the
whole thing again (might as well just revert the patch in that case).

I guess we could just kill the assert then. How about the appended?

Brian

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 4d1920e..de94798 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -903,9 +903,9 @@ xfs_can_free_eofblocks(struct xfs_inode *ip, bool force)
 }
 
 /*
- * This is called by xfs_inactive to free any blocks beyond eof
- * when the link count isn't zero and by xfs_dm_punch_hole() when
- * punching a hole to EOF.
+ * This is called to free any blocks beyond eof. The caller must hold
+ * IOLOCK_EXCL unless we are in the inode reclaim path and have the only
+ * reference to the inode.
  */
 int
 xfs_free_eofblocks(
@@ -920,8 +920,6 @@ xfs_free_eofblocks(
 	struct xfs_bmbt_irec	imap;
 	struct xfs_mount	*mp = ip->i_mount;
 
-	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL));
-
 	/*
 	 * Figure out if there are any blocks beyond the end
 	 * of the file.  If not, then there is nothing to do.
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 7605d83..e36c2c3 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1906,12 +1906,13 @@ xfs_inactive(
 		 * force is true because we are evicting an inode from the
 		 * cache. Post-eof blocks must be freed, lest we end up with
 		 * broken free space accounting.
+		 *
+		 * Note: don't bother with iolock here since lockdep complains
+		 * about acquiring it in reclaim context. We have the only
+		 * reference to the inode at this point anyways.
 		 */
-		if (xfs_can_free_eofblocks(ip, true)) {
-			xfs_ilock(ip, XFS_IOLOCK_EXCL);
+		if (xfs_can_free_eofblocks(ip, true))
 			xfs_free_eofblocks(ip);
-			xfs_iunlock(ip, XFS_IOLOCK_EXCL);
-		}
 
 		return;
 	}
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux