Re: lockdep recursive locking warning on for-next

On Thu, Feb 18, 2021 at 10:49:26AM -0800, Darrick J. Wong wrote:
> On Thu, Feb 18, 2021 at 01:14:50PM -0500, Brian Foster wrote:
> > Hi Darrick,
> > 
> > I'm seeing the warning below via xfs/167 on a test machine. It looks
> > like it's just complaining about nested freeze protection between the
> > scan invocation and an underlying transaction allocation for an inode
> > eofblocks trim. I suppose we could either refactor xfs_trans_alloc() to
> > drop and reacquire freeze protection around the scan, or alternatively
> > call __sb_writers_release() and __sb_writers_acquire() around the scan
> > to retain freeze protection and quiet lockdep. Hm?
> 
> Erk, isn't that a potential log grant livelock too?
> 
> Fill up the filesystem with real data and cow blocks until it's full,
> then spawn exactly enough file writer threads to eat up all the log
> reservation, then each _reserve() fails, so every thread starts a scan
> and tries to allocate /another/ transaction ... but there's no space
> left in the log, so those scans just block indefinitely.
> 
> So... I think the solution here is to go back to a previous version of
> what that patchset did, where we'd drop the whole transaction, run the
> scan, and jump back to the top of the function to get a fresh
> transaction.
> 
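
Just to make sure I follow: you're suggesting the retry moves out to
the caller, right? I.e., something like this in xfs_trans_alloc_inode()
(hand-wavy sketch; I'm guessing the scan entry point is
xfs_blockgc_free_space() and eliding the real argument list):

	bool		retried = false;

retry:
	error = xfs_trans_alloc(mp, resp, blocks, rtextents, 0, &tp);
	if (error == -ENOSPC && !retried) {
		/*
		 * xfs_trans_alloc() cancels the transaction on failure,
		 * which also drops freeze protection, so nothing is
		 * held while the scan runs and allocates transactions
		 * of its own.
		 */
		xfs_blockgc_free_space(mp, NULL);
		retried = true;
		goto retry;
	}
	if (error)
		return error;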

But we don't call into the scan while holding log reservation. We hold
the transaction memory and freeze protection. It's probably debatable
whether we'd want to scan with freeze protection held or not, but I
don't see how dropping either of those changes anything wrt log
reservation?
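
FWIW, the __sb_writers_*() dance I mentioned above would look something
like this inside xfs_trans_alloc() (completely untested, and again I'm
guessing the scan entry point is xfs_blockgc_free_space(); note the
reacquire helper in fs.h is actually spelled __sb_writers_acquired()):

	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
	if (error == -ENOSPC) {
		/*
		 * The scan may allocate nested transactions, so hand
		 * our freeze protection over to lockdep for the
		 * duration. Only the lockdep bookkeeping changes; the
		 * task still holds SB_FREEZE_FS protection, but the
		 * nested sb_start_intwrite() no longer looks recursive.
		 */
		__sb_writers_release(mp->m_super, SB_FREEZE_FS);
		error = xfs_blockgc_free_space(mp, NULL);
		__sb_writers_acquired(mp->m_super, SB_FREEZE_FS);
		if (!error)
			error = xfs_trans_reserve(tp, resp, blocks,
					rtextents);
	}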

> > BTW, the stack report also had me wondering whether we have or need any
> > nesting protection in these new scan invocations. For example, if we
> > have an fs with a bunch of tagged inodes and concurrent allocation
> > activity, would anything prevent an in-scan transaction allocation from
> > jumping back into the scan code to complete outstanding work? It looks
> > like that might not be possible right now because neither scan reserves
> > blocks, but they do both use transactions and that's quite a subtle
> > balance.
> 
> Yes, that's a subtlety that screams for better documentation.
> 

TBH, I'm not sure that's enough. I think we should at least have some
kind of warning, even if only in DEBUG mode, that explicitly calls out
when we've become susceptible to this kind of scan reentry. Otherwise I
suspect that if this problem is ever truly introduced, the person who
first discovers it will probably be a user with a blown stack. :( Could
we set a flag on the task or something that warns accordingly (i.e.
"WARNING: attempted block reservation in block reclaim context") or
perhaps just prevents scan reentry in the first place?
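
Rough sketch of the task flag idea (PF_BLOCKGC and the helpers below
are made-up names, and a real version would need a free PF_* bit, but
it's the same pattern memalloc_nofs_save()/memalloc_nofs_restore() use
for per-task state):

	/* hypothetical task flag, not an actual upstream PF_* bit */
	#define PF_BLOCKGC	0x00800000

	static inline unsigned int xfs_blockgc_enter(void)
	{
		unsigned int flags = current->flags & PF_BLOCKGC;

		current->flags |= PF_BLOCKGC;
		return flags;
	}

	static inline void xfs_blockgc_exit(unsigned int flags)
	{
		current->flags = (current->flags & ~PF_BLOCKGC) | flags;
	}

The scan would wrap itself in xfs_blockgc_enter()/xfs_blockgc_exit(),
and then xfs_trans_reserve() (or the scan entry point) could do:

	WARN_ONCE(current->flags & PF_BLOCKGC,
		  "attempted block reservation in block reclaim context");

...or just bail out of the scan entirely when the flag is set, which
would prevent the reentry in the first place.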

Brian

> --D
> 
> > 
> > Brian
> > 
> > [  316.631387] ============================================
> > [  316.636697] WARNING: possible recursive locking detected
> > [  316.642010] 5.11.0-rc4 #35 Tainted: G        W I      
> > [  316.647148] --------------------------------------------
> > [  316.652462] fsstress/17733 is trying to acquire lock:
> > [  316.657515] ffff8e0fd1d90650 (sb_internal){++++}-{0:0}, at: xfs_free_eofblocks+0x104/0x1d0 [xfs]
> > [  316.666405] 
> >                but task is already holding lock:
> > [  316.672239] ffff8e0fd1d90650 (sb_internal){++++}-{0:0}, at: xfs_trans_alloc_inode+0x5f/0x160 [xfs]
> > [  316.681269] 
> > ...
> >                stack backtrace:
> > [  316.774735] CPU: 38 PID: 17733 Comm: fsstress Tainted: G        W I       5.11.0-rc4 #35
> > [  316.782819] Hardware name: Dell Inc. PowerEdge R740/01KPX8, BIOS 1.6.11 11/20/2018
> > [  316.790386] Call Trace:
> > [  316.792844]  dump_stack+0x8b/0xb0
> > [  316.796168]  __lock_acquire.cold+0x159/0x2ab
> > [  316.800441]  lock_acquire+0x116/0x370
> > [  316.804106]  ? xfs_free_eofblocks+0x104/0x1d0 [xfs]
> > [  316.809045]  ? rcu_read_lock_sched_held+0x3f/0x80
> > [  316.813750]  ? kmem_cache_alloc+0x287/0x2b0
> > [  316.817937]  xfs_trans_alloc+0x1ad/0x310 [xfs]
> > [  316.822445]  ? xfs_free_eofblocks+0x104/0x1d0 [xfs]
> > [  316.827376]  xfs_free_eofblocks+0x104/0x1d0 [xfs]
> > [  316.832134]  xfs_blockgc_scan_inode+0x24/0x60 [xfs]
> > [  316.837074]  xfs_inode_walk_ag+0x202/0x4b0 [xfs]
> > [  316.841754]  ? xfs_inode_free_cowblocks+0xf0/0xf0 [xfs]
> > [  316.847040]  ? __lock_acquire+0x382/0x1e10
> > [  316.851142]  ? xfs_inode_free_cowblocks+0xf0/0xf0 [xfs]
> > [  316.856425]  xfs_inode_walk+0x66/0xc0 [xfs]
> > [  316.860670]  xfs_trans_alloc+0x160/0x310 [xfs]
> > [  316.865179]  xfs_trans_alloc_inode+0x5f/0x160 [xfs]
> > [  316.870119]  xfs_alloc_file_space+0x105/0x300 [xfs]
> > [  316.875048]  ? down_write_nested+0x30/0x70
> > [  316.879148]  xfs_file_fallocate+0x270/0x460 [xfs]
> > [  316.883913]  ? lock_acquire+0x116/0x370
> > [  316.887752]  ? __x64_sys_fallocate+0x3e/0x70
> > [  316.892026]  ? selinux_file_permission+0x105/0x140
> > [  316.896820]  vfs_fallocate+0x14d/0x3d0
> > [  316.900572]  __x64_sys_fallocate+0x3e/0x70
> > [  316.904669]  do_syscall_64+0x33/0x40
> > [  316.908250]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > ...
> > 
> 



