On Tue, Jul 23, 2024 at 12:45:33PM +0200, Jan Kara wrote: > On Tue 23-07-24 09:59:54, David Howells wrote: > > ====================================================== > > WARNING: possible circular locking dependency detected > > 6.10.0-build2+ #956 Not tainted > > ------------------------------------------------------ > > fsstress/6050 is trying to acquire lock: > > ffff888138fd82f0 (mapping.invalidate_lock#3){++++}-{3:3}, at: filemap_fault+0x26e/0x8b0 > > > > but task is already holding lock: > > ffff888113f26d18 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x165/0x250 > > > > which lock already depends on the new lock. > > > > the existing dependency chain (in reverse order) is: > > > > -> #4 (&vma->vm_lock->lock){++++}-{3:3}: > > __lock_acquire+0xaf0/0xd80 > > lock_acquire.part.0+0x103/0x280 > > down_write+0x3b/0x50 > > vma_start_write+0x6b/0xa0 > > vma_link+0xcc/0x140 > > insert_vm_struct+0xb7/0xf0 > > alloc_bprm+0x2c1/0x390 > > kernel_execve+0x65/0x1a0 > > call_usermodehelper_exec_async+0x14d/0x190 > > ret_from_fork+0x24/0x40 > > ret_from_fork_asm+0x1a/0x30 > > > > -> #3 (&mm->mmap_lock){++++}-{3:3}: > > __lock_acquire+0xaf0/0xd80 > > lock_acquire.part.0+0x103/0x280 > > __might_fault+0x7c/0xb0 > > strncpy_from_user+0x25/0x160 > > removexattr+0x7f/0x100 > > __do_sys_fremovexattr+0x7e/0xb0 > > do_syscall_64+0x9f/0x100 > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > > > -> #2 (sb_writers#14){.+.+}-{0:0}: > > __lock_acquire+0xaf0/0xd80 > > lock_acquire.part.0+0x103/0x280 > > percpu_down_read+0x3c/0x90 > > vfs_iocb_iter_write+0xe9/0x1d0 > > __cachefiles_write+0x367/0x430 > > cachefiles_issue_write+0x299/0x2f0 > > netfs_advance_write+0x117/0x140 > > netfs_write_folio.isra.0+0x5ca/0x6e0 > > netfs_writepages+0x230/0x2f0 > > afs_writepages+0x4d/0x70 > > do_writepages+0x1e8/0x3e0 > > filemap_fdatawrite_wbc+0x84/0xa0 > > __filemap_fdatawrite_range+0xa8/0xf0 > > file_write_and_wait_range+0x59/0x90 > > afs_release+0x10f/0x270 > > __fput+0x25f/0x3d0 > > __do_sys_close+0x43/0x70 > > do_syscall_64+0x9f/0x100 > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > This is the problematic step - from quite deep in the locking chain holding > invalidate_lock and having PG_Writeback set you suddently jump to very outer > locking context grabbing sb_writers. Now AFAICT this is not a real deadlock > problem because the locks are actually on different filesystems, just > lockdep isn't able to see this. So I don't think you will get rid of these > lockdep splats unless you somehow manage to convey to lockdep that there's > the "upper" fs (AFS in this case) and the "lower" fs (the one behind > cachefiles) and their locks are different. Actually, that can deadlock. We've just been over this with the NFS localio proposal. Network filesystem writeback that can recurse into a local filesystem needs to be using (PF_LOCAL_THROTTLE | PF_MEMALLOC_NOFS) for the writeback context. This is to prevent the deadlocks on upper->lower->upper and lower->upper->lower filesystem recursion via GFP_KERNEL memory allocation and reclaim recursing between the two filesystems. This is especially relevant for filesystems with ->writepage methods that can be called from direct reclaim. Hence allocations in this path need to be at least NOFS to prevent recursion back into the upper filesystem from writeback into the lower filesystem. Further, anywhere that dirty page writeback recurses into the front end write path of a filesystem can deadlock in balance_dirty_pages(). The upper filesystem can consume all the dirty threshold and then the lower filesystem blocks trying to clean dirty upper filesystem pages. Hence PF_LOCAL_THROTTLE is needed for the lower filesystem IO to prevent it from being throttled when the upper filesystem is throttled. -Dave. -- Dave Chinner david@xxxxxxxxxxxxx